
Full overview


Pipeline is a Python library for wrapping your ML inference code (no training yet) into a format that can then be serialised and sent to a cloud inference service to be used in production. The library interfaces directly with Pipeline Cloud, which provides a compute engine to run pipelines at scale. You can define how to cache models, run models on enterprise GPUs, perform arbitrary pre/post-processing steps, add logging, use custom Python environments and much more. In addition, there are utility functions to accelerate the creation of ML inference pipelines.

PipelineCloud accepts HTTP requests to run your inference pipeline and supports the use of either CPU or GPU-based computing. You can send as many requests as you want per second and the load balancing/distribution is handled by PipelineCloud. We cache your pipeline as it runs so that future requests are faster. We're working on improving this to cache your pipeline before it's called!

Create an account

To get started with PipelineCloud, sign up via the dashboard. You'll be prompted to enter your payment details to continue. After adding a card, you'll be given a 30-day free trial during which you can use up to 10 hours of GPU compute free of charge. After the trial period ends, you will be charged a monthly subscription of $12.99, plus compute billed at a rate of $2 per hour. Compute is metered pay-as-you-go on a per-millisecond basis, so you'll only be billed for the time your pipelines actually run, not for whole hours. You can cancel your subscription at any time, even before the end of your free trial. From your billing settings page, you can navigate to the billing portal, where you can update your payment methods and subscription.
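To make the per-millisecond billing concrete, here is a small illustrative calculation. It assumes the $2/hour compute rate above is pro-rated linearly down to milliseconds; the run duration is a made-up example:

```python
# Illustrative cost calculation: the $2/hour compute rate pro-rated
# per millisecond (assumption based on the pricing described above).
RATE_PER_HOUR = 2.00          # dollars per hour of compute
MS_PER_HOUR = 60 * 60 * 1000  # milliseconds in an hour

def run_cost(duration_ms: float) -> float:
    """Cost in dollars of a run billed per millisecond."""
    return RATE_PER_HOUR / MS_PER_HOUR * duration_ms

# A hypothetical 450 ms inference run costs a fraction of a cent
print(f"${run_cost(450):.6f}")  # $0.000250
```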

Create an API token

In order to interact with the Pipeline Cloud API, you will need a valid API token. This allows our backend services to identify the account associated with an incoming request. To create an API token, navigate to the API-tokens settings page on the dashboard, click 'Create token', then give it a name (required) and an expiry date (optional). Leave the expiry date set to None if you never want the token to expire.

You can also deactivate and reactivate any of your API tokens. When a token is deactivated, any client using that token in a request will receive a 403 (Forbidden) HTTP status code in response.


The Pipeline library provides an API client, PipelineCloud, which interfaces directly with the main API gateway. It provides a number of useful methods, for instance for handling pipeline and file uploads. To let the API client use your authentication credentials, you have three options:

  • Authenticate using the pipeline CLI: Pipeline CLI authentication
  • Pass in your API token as a keyword argument when instantiating the client: PipelineCloud(token="pipeline_sk...")
  • Set the environment variable: PIPELINE_API_TOKEN=pipeline_sk....

Alternatively, if you are interfacing with the main API gateway without using the PipelineCloud client, include your API token as a bearer token in your authorization headers. For instance, using curl you can retrieve your user information as follows (the endpoint URL is shown as a placeholder; substitute the gateway's user endpoint):

curl -G '<USER_ENDPOINT_URL>' -H 'Authorization: Bearer pipeline_sk_...'

Deploy a public pipeline

In this section, we'll show you how to run one of our public pre-trained pipelines on PipelineCloud. If you're looking to run an off-the-shelf pre-trained pipeline without the need for much customisation, this could be all you need. You'll get an API endpoint that you can send HTTP requests to from any programming environment or client, and we'll handle all the scaling, distribution, and task retry logic for you.

To get started, head over to your dashboard homepage. If you haven't previously deployed any pipelines to PipelineCloud, you should see a quickstart guide. In the pre-trained models tab, click on "Begin quickstart", where you'll find all our published pipelines ready for you to use out of the box. In this example we'll select a Stable Diffusion pipeline, public/stable-diffusion-v2.0:v1.0-fp32, but choose whichever pipeline you prefer.

After clicking on "Continue", you will have deployed your first pipeline! To run the pipeline, simply copy the code from the Shell tab in the code snippet (a cURL command) and paste it into your terminal. After a short wait, you should get a response containing the inference result of your first successful run. That was easy!

Deploy a custom pipeline

If you need a more customised pipeline that isn't covered by our off-the-shelf public pipelines, you'll need to upload your own. Luckily, the Pipeline library is built exactly for this use case! Let's see how, by building out a simple pipeline.

What is a pipeline?

In Pipeline, a pipeline is simply a computational graph. In other words, given some input, a pipeline describes how that input is fed into certain operations, whose outputs are fed into further operations, and so on, to produce an overall output. We'll look at a basic example below that will make this less abstract. One of the neat features of Pipeline is that it allows you to package those instructions into a single deployable unit that can be run remotely in the cloud.

The Pipeline library includes several features that are directed towards ML-ops, such as running workflows on GPU.

Basic pipeline

When defining a pipeline, we build out its computational graph. The Pipeline library uses a series of decorators to change the default behaviour of functions when used inside a Pipeline context manager (the with Pipeline(...) statement used below). Within the context manager, calls to functions decorated with pipeline_function do not actually execute; instead, each call returns a reference to a variable that is stored in the pipeline. Outside the Pipeline context manager, functions wrapped in the pipeline_function decorator execute normally.
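This deferred-call behaviour can be pictured with a toy sketch in plain Python. This is only an illustration of the idea, not the Pipeline library's actual internals: a "call" records a node in a graph, and execution happens later when real inputs are supplied:

```python
# Toy sketch of deferred execution (illustration only, not the real library):
# instead of running immediately, a call is recorded as a graph node.
class Node:
    def __init__(self, fn, args):
        self.fn = fn
        self.args = args

    def evaluate(self, inputs):
        # Resolve each argument: nested nodes recurse, input names look up
        # values, literals pass through unchanged.
        resolved = []
        for arg in self.args:
            if isinstance(arg, Node):
                resolved.append(arg.evaluate(inputs))
            elif isinstance(arg, str):
                resolved.append(inputs[arg])
            else:
                resolved.append(arg)
        return self.fn(*resolved)

def multiply(a, b):
    return a * b

# Building the graph defers execution; evaluate() runs it with real inputs
graph = Node(multiply, ["x", "y"])
print(graph.evaluate({"x": 5.0, "y": 6.0}))  # 30.0
```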

Below we have a simple example of multiplying two numbers together and returning the result:

from pipeline import Pipeline, Variable, pipeline_function

PIPELINE_NAME = "maths-is-fun"

@pipeline_function
def multiply(a: float, b: float) -> float:
    return a * b

with Pipeline(PIPELINE_NAME) as builder:
    # Define the inputs used to feed data into the pipeline
    flt_1 = Variable(float, is_input=True)
    flt_2 = Variable(float, is_input=True)
    # Add the variables to the pipeline
    builder.add_variables(flt_1, flt_2)

    # Perform a computation on the inputs
    result = multiply(flt_1, flt_2)

    # Use the computation output as the output for the pipeline
    builder.output(result)

Running your pipeline locally

To run the pipeline locally, we 'get' the pipeline and call .run on it, passing in the arguments:

output_pipeline = Pipeline.get_pipeline(PIPELINE_NAME)
print(output_pipeline.run(5.0, 6.0))
# The output of this pipeline is 30.0

Running your pipeline in the cloud

To run the pipeline in PipelineCloud, we first need to create an API client, upload the pipeline and then run it:

output_pipeline = Pipeline.get_pipeline(PIPELINE_NAME)

# Connect to PipelineCloud 
from pipeline import PipelineCloud
api = PipelineCloud(token="YOUR_API_TOKEN")

# Upload the pipeline
uploaded_pipeline = api.upload_pipeline(output_pipeline)

# Run the pipeline remotely
run_result = api.run_pipeline(
    uploaded_pipeline,
    [5.0, 6.0],
)


Local vs Cloud .run input arguments

The inputs to a cloud run (api.run_pipeline) are the same values you would pass to a local run, wrapped in a list.

The returned outputs from pipelines also come in a list, so you may need to parse the data to get your expected output.
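As a concrete sketch of this wrapping convention (the values here are hypothetical, reusing the multiply example above):

```python
# Hypothetical illustration of the local-vs-cloud wrapping convention.
local_args = (5.0, 6.0)          # passed positionally to a local .run(5.0, 6.0)
cloud_inputs = list(local_args)  # wrapped in a single list for api.run_pipeline

# Cloud outputs also arrive wrapped in a list, so index in to unwrap them
cloud_output = [30.0]
result = cloud_output[0]
print(cloud_inputs, result)  # [5.0, 6.0] 30.0
```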