Inference
How to run a pipeline from our Model Hub
Example use-case: GPT-J
Here's how to run a Model Hub pipeline in the cloud on Pipeline, our fully managed GPU compute platform. Pricing is pay-as-you-go and billed per millisecond, so you pay only for what you use, with no hourly fees. You'll get an API endpoint that you can send requests to from any environment or client, and we'll handle all the scaling, distribution, and task retry logic for you. Let's go!
We're going to use GPT-J as an example, but the same technique works for every pipeline in the Model Hub.
Sign up and setup
First, sign up to Pipeline via our dashboard. You'll be prompted to subscribe to a Developer account, which gives you access to our public pipelines.
Once you're in the dashboard, head to Settings > API Tokens. We use a unique token to identify your requests on Pipeline, and you can generate one here. Simply click 'Create API token', then give it a name and an expiry date. You can leave the expiry date set to None if you don't want the token to expire automatically.


Creating an API token in Pipeline.
Warning
Your API tokens can directly edit your Pipeline account and objects, so treat them with care!
Deploy GPT-J for inference
Now head to the Model Hub by clicking 'Model Hub' in the navigation bar. Here you'll find all our published pipelines ready for you to use. Select the GPT-J pipeline, and when you're on that page, hit the 'Deploy' button.
You'll be asked where to deploy the pipeline: choose an existing project or create a new one. Then hit 'Deploy pipeline'. You've successfully added a pipeline to Pipeline!


Deploying GPT-J in Pipeline.
Send an API request
You'll find a sample request on the pipeline's page. Simply copy the code snippet, paste it into your chosen client, and make sure you've updated both the pipeline ID and the API token to match your pipeline and token.
Then post the request to send your inference task to our cloud infrastructure. Just hit go!


Sending a request to the Pipeline API.
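If you'd rather see the general shape of such a request before copying the snippet, here is a minimal Python sketch. It is not the official sample: the endpoint URL, the JSON field names, the bearer-token header, and the environment variable names are all assumptions made for illustration, so take the real values from the sample request on your pipeline's page.

```python
import json
import os
import urllib.request

# Assumption: placeholder endpoint. Copy the real URL from the sample
# request shown on your pipeline's page.
API_URL = "https://api.example.com/v2/runs"


def build_run_request(pipeline_id, api_token, prompt, url=API_URL):
    """Build (but do not send) a POST request for one inference run."""
    # Assumption: these body field names are illustrative, not the documented schema.
    payload = {"pipeline_id": pipeline_id, "data": [prompt]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the token is sent as a bearer token.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_run(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Read the token from the environment instead of hard-coding it.
    # Both variable names are hypothetical.
    token = os.environ.get("PIPELINE_API_TOKEN")
    pipeline_id = os.environ.get("PIPELINE_ID")
    if token and pipeline_id:
        print(send_run(build_run_request(pipeline_id, token, "Hello, my name is")))
    else:
        print("Set PIPELINE_API_TOKEN and PIPELINE_ID to send a request.")
```

Keeping the token in an environment variable rather than in the script follows the warning above: the token can edit your account, so it shouldn't end up in shared code.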
Monitor usage
In the dashboard you can track your pipeline's usage. See when the last ten runs started and how much compute time each one used, or look at a graph of your activity over a day, a week, or a year.
You can also enter the project where you deployed that pipeline and you'll see all the runs in list format. This can be useful when you're debugging and learning how Pipeline Cloud works.


Monitoring usage in the Pipeline dashboard.
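Because billing is per millisecond, you can roughly estimate what a batch of runs cost from their compute times. The sketch below assumes you've copied the compute times out of the dashboard by hand. The `compute_time_ms` field name, the sample values, and the per-millisecond rate are all hypothetical, so substitute the figures from your own account.

```python
from decimal import Decimal


def total_compute_ms(runs):
    """Sum the compute time, in milliseconds, across a list of runs."""
    return sum(run["compute_time_ms"] for run in runs)


def estimate_cost(runs, price_per_ms):
    """Multiply total compute time by a per-millisecond price.

    price_per_ms is passed as a string so Decimal keeps it exact.
    """
    return Decimal(total_compute_ms(runs)) * Decimal(price_per_ms)


# Hypothetical values for illustration only, not real usage or pricing.
example_runs = [
    {"compute_time_ms": 1200},
    {"compute_time_ms": 850},
    {"compute_time_ms": 430},
]
example_price_per_ms = "0.00001"

print("Total compute time (ms):", total_compute_ms(example_runs))
print("Estimated cost:", estimate_cost(example_runs, example_price_per_ms))
```

Using `Decimal` rather than floats avoids rounding drift when very small per-millisecond prices are multiplied across many runs.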