Build a Python SDK with Streaming
This tutorial includes the following SDK languages and versions:
| TypeScript v1 | TypeScript v2 | Java | Python v1 | Python v2 | C# | Go | PHP |
|---------------|---------------|------|-----------|-----------|----|----|-----|
| ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
Streaming data is a common pattern when calling AI Large Language Models (LLMs) like ChatGPT, Claude, Llama, or Mistral.
In this post, we will explore how to seamlessly create an SDK for your favorite LLM using the liblab SDK generator.
The example in this tutorial uses Ollama to host an LLM locally on your computer; however, you can use the same principles to access any LLM API that provides an OpenAPI file.
The OpenAPI file describes the API: its paths, parameters, security schemes, and more.
Prerequisites
- A liblab account
- The liblab CLI installed and you are logged in
- Python ≥ 3.9
- Ollama installed and running
- A Large Language Model, such as Llama3.1
Steps
- Setting up Ollama and installing Llama 3.1
- Setting up the liblab CLI
- Generating the SDK
- Using the SDK
- How to enable streaming endpoints
- Conclusion
1. Setting up an example Llama API
First, go to the Ollama homepage to download and install the latest version.
Then, once Ollama is installed and running, execute the following command in a console to download the latest version of Llama 3.1:
ollama pull llama3.1
To verify that Llama 3.1 is installed, run the following command:
ollama run llama3.1
If all is well, you will be able to send prompts to the model and receive responses. For example:
ollama run llama3.1
>>> tell me a joke
Here's one:
What do you call a fake noodle?
(wait for it...)
An impasta!
Hope that made you laugh! Do you want to hear another?
>>>
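The SDK we are about to generate wraps Ollama's /api/generate endpoint. If you want to see the raw behavior the SDK will handle for us, you can call the endpoint directly with curl; Ollama streams the answer back as a sequence of JSON objects, one chunk at a time:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "tell me a joke"
}'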
2. Setting up the liblab CLI
First, ensure you have the liblab CLI installed. If not, you can install it via npm:
npm install -g @liblab/cli
Once installed, you need to log in to your liblab account. Run the following command and follow the prompts to log in:
liblab login
After logging in, you can configure the CLI for your project. We want a new directory for the SDK, so let's create one called streaming:
mkdir -p streaming
cd streaming
liblab init
This will generate a liblab.config.json file.
Before we edit the generated JSON file, let's create the API spec for which we will generate this SDK.
ℹ️ Usually we don't need to create the API spec ourselves, since most APIs provide one. However, we are using Ollama locally, and it does not provide an OpenAPI spec.
Create a new file called ollama-open-api.yaml
and paste the following content into it:
openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'
components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean
    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer
Now let's update the liblab.config.json file to use our new spec. Copy and paste the following JSON to overwrite the liblab.config.json file:
{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./ollama-open-api.yaml",
  "languages": ["python"],
  "auth": [],
  "customizations": {
    "baseURL": "http://localhost:11434",
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": "true"
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}
Enable streaming
To enable the SDK to receive streaming data, we can either add the streaming: true parameter to the endpointCustomizations section of the liblab.config.json, or add the x-liblab-streaming: true annotation to the OpenAPI spec file.
In this example, we've done both to illustrate the two options, but you only need to do one of them.
Note: streaming is enabled by default if an endpoint returns the text/event-stream content type. In that case, there is no need for any extra configuration.
3. Generate the SDK
Now that we have an OpenAPI Spec file and the liblab CLI, it is time to generate our SDK.
Execute the following command inside the streaming folder:
liblab build
The CLI will validate the liblab.config.json and ollama-open-api.yaml files and notify us about any issues. You should see something like:
✓ No issues detected in the liblab config file.
Created output/api-schema-validation.json with the full linting results
Detected 3 potential issues with the spec:
⚠ OpenAPI "servers" must be present and non-empty array.
⚠ Info object must have "contact" object.
⚠ Operation must have non-empty "tags" array.
? It is important to fix your spec before continuing with a build. Not fixing the spec may yield a subpar SDK and documentation. Would you like to attempt to build the SDK anyway?
We can go ahead and confirm by typing Y.
Next, we should see messages indicating that the build has started and, hopefully, finished:
Ignoring the spec errors and attempting to build with the spec
No hooks found, SDKs will be generated without hooks.
No custom plan modifiers found, SDKs will be generated without them.
Your SDKs are being generated. Visit the liblab portal (https://app.liblab.com/apis/ollama-api/builds/6770) to view more details on your build(s).
✓ Python built
Successfully generated SDKs downloaded. You can find them inside the /Users/felipe/Development/LibLab/cli-test-runner/output folder
Successfully generated SDK's for Python ♡
If we go inside the output directory, we will see our SDK.
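Based on the paths we use in the next step, the relevant part of the generated layout looks roughly like this (the full contents of your build may differ):
output/
  python/
    examples/
      install.sh
      sample.py
    ...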
Congratulations! You have successfully generated an SDK for Ollama with streaming capabilities.
4. Using the SDK
Now that we have generated our SDK, let's make a request to Ollama to test it.
To do this, go into the output/python/examples directory.
Now run the install.sh script and activate the generated virtual environment:
chmod u+x install.sh # this step may be necessary if the script lacks execute permissions
./install.sh
source .venv/bin/activate
Now copy and paste the following code into the sample.py file.
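The snippet below is a minimal sketch of what the sample could look like. The import path, client class name, and call signature are assumptions derived from the sdkName in our config; check the README and documentation generated alongside the SDK for the exact names in your build.
# sample.py -- a sketch only: the import, client class, and call signature
# below are assumptions; check the generated README for the exact API.
from ollama_sdk import OllamaSdk

sdk = OllamaSdk(base_url="http://localhost:11434")

# Because /api/generate is marked as a streaming endpoint, the generated
# method is expected to yield GenerateResponse chunks as they arrive.
stream = sdk.generate(request_body={
    "model": "llama3.1",
    "prompt": "tell me a joke",
    "stream": True,
})

for chunk in stream:
    # Each chunk carries a piece of the answer in its `response` field.
    print(chunk.response, end="", flush=True)

print()  # finish with a newline after the streamed answer
With streaming enabled, the joke should print chunk by chunk as Ollama produces it, rather than appearing all at once when the request completes.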
Execute the sample.py file with this command:
python sample.py
After the sample has run, you should see output similar to the following:
(.venv) ➜ examples git:(main) ✗ python sample.py
Here's one:
What do you call a fake noodle?
An impasta!
Hope that made you laugh! Do you want to hear another one?
5. How to enable streaming endpoints
There are 3 ways to enable streaming endpoints in your SDK:
1. Using the text/event-stream content type
Streaming is automatically enabled for any endpoint that returns a text/event-stream content type.
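For example, an endpoint declared like this (a hypothetical /api/events endpoint, shown only for illustration) would get a streaming method without any extra configuration:
paths:
  /api/events:
    get:
      description: Subscribe to server-sent events.
      operationId: listEvents
      responses:
        '200':
          description: OK
          content:
            text/event-stream:
              schema:
                type: string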
2. Using the liblab config
You can enable streaming for an endpoint by adding an endpoint customization to your liblab.config.json file, like we did in this tutorial:
{
  ...
  "customizations": {
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": "true"
        }
      }
    }
  }
  ...
}
3. Adding the x-liblab-streaming: true annotation to the OpenAPI spec
You can also enable streaming by adding the x-liblab-streaming: true annotation to the OpenAPI spec, like we did in this tutorial:
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        ...
6. Conclusion
Creating an SDK for your LLM application simplifies the development process. By following the steps outlined in this tutorial, you've learned how to use the liblab CLI to generate a robust Python SDK with streaming support.