Create and deploy a spaCy pipeline
How to use the spacy_to_pipeline wrapper to create pipelines from spaCy models
Create a pipeline using the spaCy wrapper
spacy_to_pipeline is a convenience wrapper that lets you create a pipeline from any pretrained spaCy model.
Optional keyword arguments let you post-process the spaCy output within the pipeline and give the pipeline a name.
The function signature and docstring for spacy_to_pipeline are shown below:
from pipeline import spacy_to_pipeline

def spacy_to_pipeline(spacy_model: str, func: t.Optional[t.Callable] = None, name: str = "Spacy pipeline") -> Graph:
    """
    Create a pipeline using Spacy
        Parameters:
            spacy_model (str): tokenizer model name (trained Spacy "pipeline")
            func (Optional[Callable]): function to be called on spacy output
            name (str): Name to be given to this pipeline
        Returns:
            pipeline (Graph): Executable Pipeline Graph object
    """
    ...
Let's see a complete example of it in action running locally:
from pipeline import spacy_to_pipeline

# post-process the spaCy output: one [text, lemma, part-of-speech] entry per token
def func(doc):
    return [[token.text, token.lemma_, token.pos_] for token in doc]

spacy_pipeline = spacy_to_pipeline("en_core_web_sm", func=func, name="my-spacy-pipeline")

# run locally ("text" avoids shadowing the built-in input())
text = "Apple is looking at buying U.K. startup for $1 billion"
[output] = spacy_pipeline.run(text)
print(output)
Above, we created a pipeline from the spaCy model "en_core_web_sm", named it "my-spacy-pipeline", and post-processed the model output (the output tokens) with func().
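Because func is called on the spaCy output, it can return any view of the document, not only the token triples used above. As a rough sketch, here are two alternative post-processing functions. The names entities and content_lemmas are hypothetical and not part of the pipeline library; the attributes they read (doc.ents, ent.text, ent.label_, token.lemma_, token.is_stop, token.is_punct) are standard spaCy attributes.

```python
def entities(doc):
    """Return (text, label) pairs for each named entity in a spaCy Doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def content_lemmas(doc):
    """Return lowercased lemmas, skipping stop words and punctuation."""
    return [
        token.lemma_.lower()
        for token in doc
        if not (token.is_stop or token.is_punct)
    ]
```

Either function could be passed to spacy_to_pipeline as func=entities or func=content_lemmas in place of func.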
The spacy_pipeline created above is equivalent to the following spaCy code:
import spacy

def func(doc):
    return [[token.text, token.lemma_, token.pos_] for token in doc]

text = "Apple is looking at buying U.K. startup for $1 billion"
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
# note: pipelines return results in a list, so the result is wrapped and unpacked here
[output] = [func(doc)]
print(output)
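The comment above notes that pipelines return results in a list, which is why the examples unpack with [output] = .... The following sketch mirrors that flow in plain Python so the shape of the result is easy to see. It is not the library's implementation: run_like_pipeline is a hypothetical helper, the nlp argument stands in for a loaded spaCy model, and returning the raw document when func is None is an assumption.

```python
from typing import Any, Callable, List, Optional


def run_like_pipeline(
    nlp: Callable[[str], Any],
    text: str,
    func: Optional[Callable[[Any], Any]] = None,
) -> List[Any]:
    """Apply nlp to text, optionally post-process, and wrap the result in a list."""
    doc = nlp(text)
    # Assumption: with no func, the raw model output is passed through unchanged.
    result = func(doc) if func is not None else doc
    # A single result is returned as a one-element list.
    return [result]


# str.split stands in for a spaCy model so the sketch runs without spaCy installed.
[output] = run_like_pipeline(str.split, "Apple is looking", func=len)
print(output)
```

With a real model, nlp would be the object returned by spacy.load("en_core_web_sm").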
Run in PipelineCloud
The following runs the same example in PipelineCloud. Note that api.run_pipeline accepts the pipeline input as a list.
from pipeline import (
    PipelineCloud,
    spacy_to_pipeline
)

api = PipelineCloud(token="YOUR TOKEN HERE")

def func(doc):
    return [[token.text, token.lemma_, token.pos_] for token in doc]

spacy_pipeline = spacy_to_pipeline("en_core_web_sm", func=func, name="spacy-get-all")

# upload the pipeline, then run it remotely
uploaded_pipeline = api.upload_pipeline(spacy_pipeline)
print(f"Uploaded pipeline: {uploaded_pipeline.id}")

print("Run uploaded pipeline")
run_result = api.run_pipeline(
    uploaded_pipeline, ["Apple is looking at buying U.K. startup for $1 billion"]
)

# use the result preview if present, otherwise download the full result
try:
    result_preview = run_result.result_preview
    print("Run result:", result_preview)
except KeyError:
    print(api.download_result(run_result))
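The example above passes the API token as a string literal. One common alternative is to read it from the environment so it stays out of source control. This is a minimal sketch: the load_token helper and the PIPELINE_API_TOKEN variable name are hypothetical and are not defined by the pipeline library.

```python
import os


def load_token(env_var: str = "PIPELINE_API_TOKEN") -> str:
    """Read an API token from the environment, failing loudly if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return token
```

The returned value could then be passed as PipelineCloud(token=load_token()).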