Inference

How to run a pipeline from our Model Hub

Example use-case: GPT-J

Here's how to run a Model Hub pipeline in the cloud on Pipeline, our fully managed GPU compute platform. Pricing is pay-as-you-go and billed per millisecond, so you pay only for what you use, with no hourly fees. You get an API endpoint that you can send requests to from any environment or client, and we handle all the scaling, distribution, and task retry logic for you. Let's go!

We're going to use GPT-J as an example, but the same technique works for every pipeline on the Model Hub.

Sign up and setup

First, sign up to Pipeline via our dashboard. You'll be prompted to subscribe to a Developer account, which gives you access to our public pipelines.

Once you're in the dashboard, head to Settings > API Tokens. We use a unique token to identify your requests on Pipeline, and you can generate one here. Click 'Create API token', then give the token a name and an expiry date. You can leave the expiry date set to None if you don't want the token to expire automatically.

Creating an API token in Pipeline.

Warning: Your API tokens can directly edit your Pipeline account and objects, so treat them with care!
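One way to treat a token with care is to keep it out of your source code and read it from the environment at runtime. The sketch below assumes an environment variable named `PIPELINE_API_TOKEN`. That name is hypothetical and not something Pipeline requires, so use whatever fits your setup.

```python
import os

# Hypothetical variable name chosen for this example; Pipeline does not mandate it.
TOKEN_ENV_VAR = "PIPELINE_API_TOKEN"


def load_api_token(env=None):
    """Return the API token from the environment, or raise if it is missing."""
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before sending requests.")
    return token
```

Reading the token this way means it never lands in version control, and rotating it only requires updating the environment.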

Deploy GPT-J for inference

Now head to the Model Hub by clicking 'Model Hub' in the navigation bar. Here you'll find all our published pipelines ready for you to use. Select the GPT-J pipeline, and when you're on that page, hit the 'Deploy' button.

You'll be asked where to deploy the pipeline: choose an existing project or create a new one. Then hit 'Deploy pipeline'. You've successfully added a pipeline to Pipeline!

Deploying GPT-J in Pipeline.

Send an API request

You'll find a sample request on the pipeline's page. Copy the code snippet, paste it into your chosen client, and update both the pipeline ID and the API token to match your own pipeline and token.

Then post the request to send your inference task to our cloud infrastructure. Just hit go!
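For orientation, here is a rough sketch of what such a request could look like in Python using only the standard library. The endpoint URL, the bearer-token header, and the payload fields (`pipeline_id`, `data`) are all assumptions made for illustration. The sample request on your pipeline's page is the authoritative version, so copy the real values from there.

```python
import json
import urllib.request

# Placeholder endpoint: replace it with the URL shown in your pipeline's sample request.
API_URL = "https://api.example.com/runs"


def build_run_request(api_token, pipeline_id, prompt, url=API_URL):
    """Build (but do not send) a POST request for one inference run."""
    # Assumed payload shape; check the sample request for the real field names.
    payload = {"pipeline_id": pipeline_id, "data": [prompt]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme: the API token sent as a bearer token.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it lets you inspect the headers and body before anything leaves your machine. Once the placeholders are replaced, calling `send(build_run_request(token, pipeline_id, "Hello, my name is"))` would post the run.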

Sending a request to the Pipeline API.

Monitor usage

In the dashboard you can track your pipeline's usage. See when each of the last ten runs started and how much compute time it used, or view a graph of your activity over a day, week, or year.

You can also enter the project where you deployed that pipeline and you'll see all the runs in list format. This can be useful when you're debugging and learning how Pipeline Cloud works.
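Because billing is per millisecond of compute, the compute times shown for your runs translate directly into cost. The sketch below shows the arithmetic only. The rate and the run durations are made-up placeholders, not Pipeline's actual pricing or real run data, so substitute the figures from your own account.

```python
from decimal import Decimal


def estimate_cost(compute_times_ms, rate_per_ms):
    """Return the estimated charge for a list of run compute times in milliseconds."""
    total_ms = sum(compute_times_ms)
    # Decimal avoids floating-point rounding surprises with very small rates.
    return Decimal(total_ms) * rate_per_ms


# Placeholder values for illustration only.
example_runs_ms = [1200, 850, 430]
example_rate = Decimal("0.00001")  # hypothetical price per millisecond

if __name__ == "__main__":
    print(estimate_cost(example_runs_ms, example_rate))
```

Here the three example runs total 2,480 ms of compute, so the estimate is 2,480 multiplied by the placeholder rate.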

Monitoring usage in the Pipeline dashboard.

