Build Python and TypeScript SDKs with Streaming
This tutorial includes the following SDK languages and versions:
TypeScript ✅  Java ❌  Python ✅  C# ❌  Go ❌  PHP ❌
Streaming data is a common feature when working with AI Large Language Models (LLMs) like ChatGPT, Claude, Llama, or Mistral. In this tutorial, you'll learn how to create Python and TypeScript SDKs with streaming features using the liblab SDK generator.
For this tutorial, you'll use Ollama to host an LLM locally on your computer. The principles in this guide can be applied to any LLM API that provides an OpenAPI file.
Prerequisites
- A liblab account.
- liblab CLI installed and logged in.
- Python (version ≥ 3.9).
- Node.js (version ≥ 18).
- Ollama installed and running.
- A supported LLM, such as Llama 3.1 (8b).
Steps
- Set up Ollama and install Llama 3.1
- Configure the liblab CLI
- Initialize and configure the project
- Generate the SDKs
- Test the SDKs
1. Set up Ollama and install Llama 3.1
First, install Ollama on your machine to run the LLM locally. Follow these steps to install and run Ollama with Llama 3.1.
- Visit Ollama and download the latest version.
- Once Ollama is installed and running, execute the following command on your terminal to download the latest version of Llama 3.1 8b:
ollama pull llama3.1:8b
This tutorial uses the smaller Llama 3.1 model (8b) to ensure quicker response times. However, you can use any Llama 3.1 model if your machine can run it smoothly.
- To verify that Llama 3.1 is installed, run the following command:
ollama run llama3.1:8b
- If the setup is successful, you can send prompts to the model and receive responses. For example, ask Llama to tell you a joke:
ollama run llama3.1:8b
>>> tell me a joke
Here's one:
What do you call a fake noodle?
An impasta!
>>>
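Under the hood, Ollama's /api/generate endpoint streams its reply as a sequence of newline-delimited JSON objects, which is exactly what the SDKs you'll generate later will wrap. If you'd like to see that raw stream first, here is a minimal sketch using Python's requests library (an extra dependency, installed with pip install requests):
import json

import requests

# Ollama listens on localhost:11434 by default.
url = "http://localhost:11434/api/generate"
payload = {"model": "llama3.1:8b", "prompt": "Tell me a joke", "stream": True}

# With stream=True, requests exposes the body incrementally, and Ollama
# sends one JSON object per line until "done" is true.
with requests.post(url, json=payload, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk["response"], end="", flush=True)
        if chunk.get("done"):
            print()
            break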
2. Configure the liblab CLI
After installing and configuring Ollama on your machine, you can start working with liblab. First, install the liblab CLI by running the following command:
npm install -g @liblab/cli
Once installed, you need to log in to your liblab account. Run the following command and follow the prompts to log in:
liblab login
For more options and details on installing the liblab CLI, see the Install the CLI page.
3. Initialize and configure the project
With the CLI installed and authenticated, you can now create a new project. Run the following commands to create a new liblab project in the streaming directory:
mkdir -p streaming
cd streaming
liblab init
A new liblab.config.json file will be created, which contains all of the project's configuration.
3.1 Create the API spec
Before editing the generated liblab.config.json file, you need the API spec that liblab will use to generate the SDKs. For this tutorial, you'll create a new spec for the Llama model running locally.
In most cases, you don't need to write the API specification yourself, as the API provider typically supplies one. However, since we are running Ollama locally, no OpenAPI specification is available, so we will create one ourselves.
Inside the streaming directory, create a file named ollama-open-api.yaml with the following content:
openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
servers:
  - url: 'http://localhost:11434'
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'
components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean
    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer
We already added the liblab streaming feature to the above API spec by setting x-liblab-streaming: true on the /api/generate endpoint. In addition, we defined the server as http://localhost:11434, which is the default address for Ollama running locally.
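If you want to sanity-check the spec before generating the SDKs, a small sketch using PyYAML (an assumed extra dependency, installed with pip install pyyaml) can load the file and confirm the server URL and the streaming flag:
# Quick sanity check of the hand-written spec.
import yaml

with open("ollama-open-api.yaml") as f:
    spec = yaml.safe_load(f)

# Confirm the server points at the local Ollama instance
# and that streaming is flagged on the generate endpoint.
print(spec["servers"][0]["url"])                                      # http://localhost:11434
print(spec["paths"]["/api/generate"]["post"]["x-liblab-streaming"])   # True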
3.2 Configure the project
After creating the API spec file, you can start configuring the project. Use the following configuration in your liblab.config.json file:
{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./ollama-open-api.yaml",
  "languages": [
    "python",
    "typescript"
  ],
  "auth": [],
  "customizations": {
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": true
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    },
    "typescript": {
      "bundle": true,
      "exportClassDefault": false,
      "httpClient": "fetch",
      "npmName": "",
      "npmOrg": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2",
      "generateEnumAs": "union"
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}
The above file defines that SDKs will be generated for Python and TypeScript (lines 6-9), based on the API spec at ./ollama-open-api.yaml (line 5). In addition, the streaming feature is enabled for the /api/generate endpoint (lines 26-32).
Streaming is enabled by default for endpoints returning text/event-stream. For other endpoints, you can enable it with one of the following options:
- Edit the liblab.config.json: add streaming: true under endpointCustomizations for each endpoint you want to stream.
- Modify the OpenAPI spec: include the x-liblab-streaming: true annotation on the relevant endpoint.
In this example, both methods are demonstrated to illustrate how streaming can be enabled. However, you only need to implement one of these options.
4. Generate the SDKs
Now that you have the API spec and have configured liblab.config.json, it's time to generate the SDKs. Execute the following command in your terminal:
liblab build -y
liblab validates the liblab.config.json and ollama-open-api.yaml files and notifies you about any issues. You should see the build start and finish with messages like the following:
Your SDKs are being generated. Visit the liblab portal (https://app.liblab.com/apis/ollama-api/builds/8364) to view more details on your build(s).
✓ Python built
✓ TypeScript built
✓ Generate package-lock.json for TypeScript
Successfully generated SDKs for TypeScript, Python. ♡ You can find them inside: <path-to-the-project-directory>/streaming/output
The SDKs are available in the output directory.
5. Test the SDKs
To test the SDKs, you can use the examples created by liblab when generating the SDKs. The following sections describe how to test the Python and TypeScript SDKs using the examples available.
5.1 Testing the Python SDK
To test the Python SDK, follow these steps:
- From your terminal, open the output/python/examples directory.
- Run one of the following scripts, depending on your operating system:
Mac / Linux:
./install.sh
Windows:
./install.cmd
The commands to run the install scripts or activate the Python environment differ slightly between Windows and Mac/Linux (for example, on Mac/Linux the virtual environment is activated with source .venv/bin/activate).
- Activate the Python environment:
source .venv/Scripts/activate
- Open the sample.py file and change the model and prompt, as displayed below:
request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)
- Execute the SDK by running the following command in the terminal:
python sample.py
On your terminal, you will receive the streamed response:
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5467575Z',
response='Here',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5772503Z',
response="'s",
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5934289Z',
response=' one',
done=False
)
...
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.0848765Z',
response=' hear',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1016931Z',
response=' another',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1173826Z',
response=' one',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1327678Z',
response='?',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1484953Z',
response='',
done=True,
done_reason='stop',
context=[128006, 882, 128007, 271, 41551, 757, 264, 22380, 128009, 128006, 78191, 128007, 271, 8586, 596, 832, 1473, 3923, 656, 499, 1650, 264, 12700, 46895, 273, 1980, 65192, 369, 433, 62927, 2127, 3242, 14635, 2268, 40, 3987, 430, 1903, 499, 12835, 0, 3234, 499, 1390, 311, 6865, 2500, 832, 30],
total_duration=8415490600,
load_duration=7410168000,
prompt_eval_count=14,
prompt_eval_duration=400000000,
eval_count=37,
eval_duration=603000000
)
The SDK streams the response word by word through the response field until completion. A final message with done=True confirms that the stream has finished.
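The generated sample.py already shows the full setup; if you want to consume the stream from your own script, the shape is roughly as follows. The client class, service attribute, and import paths below (OllamaSdk, sdk.api) are assumptions for illustration only; check the generated sample.py for the exact names in your build.
# Sketch only: class and attribute names are assumptions; the generated
# sample.py shows the exact API surface of your build.
from ollama_sdk import OllamaSdk
from ollama_sdk.models import GenerateRequest

sdk = OllamaSdk()
request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)

# With streaming enabled, the generate operation yields GenerateResponse
# chunks as they arrive instead of a single response object.
for chunk in sdk.api.generate(request_body):
    print(chunk.response, end="", flush=True)
    if chunk.done:
        print()
        break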
5.2 Testing the TypeScript SDK
To test the TypeScript SDK, follow these steps:
- From your terminal, open the output/typescript/examples directory.
- Run the following command to install the SDK:
npm run setup
- Open the src/index.ts file and change the model and prompt, as presented in the following code snippet:
const generateRequest: GenerateRequest = {
model: 'llama3.1:8b',
prompt: 'Tell me a joke',
stream: true,
};
- Execute the SDK by running the following command in the terminal:
npm run start
On your terminal, you will receive the streamed response:
> ollama-sdk@1.0.0 start
> tsc && node dist/index.js
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:03.9792376Z',
response: 'Here',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
...
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.229332Z',
response: 'asta',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2435589Z',
response: '.',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2602098Z',
response: '',
done: true,
doneReason: 'stop',
context: [
128006, 882, 128007, 271, 41551,
757, 264, 22380, 128009, 128006,
78191, 128007, 271, 8586, 596,
832, 1473, 3923, 656, 499,
1650, 264, 12700, 46895, 273,
1980, 2127, 3242, 14635, 13
],
totalDuration: 3104377400,
loadDuration: 2724683300,
promptEvalCount: 14,
promptEvalDuration: 95000000,
evalCount: 18,
evalDuration: 283000000
}
The SDK streams the response word by word through the response field until completion. A final message with done: true confirms that the stream has finished.
Conclusion
Following this guide, you've learned how to create Python and TypeScript SDKs with streaming support using liblab. These SDKs simplify API integration and enhance real-time data handling. Explore further customization options in the liblab documentation.