
Build Python and TypeScript SDKs with Streaming


Streaming data is a common feature when working with AI Large Language Models (LLMs) like ChatGPT, Claude, Llama, or Mistral. In this tutorial, you'll learn how to create Python and TypeScript SDKs with streaming features using the liblab SDK generator.

For this tutorial, you'll use Ollama to host an LLM locally on your computer. The principles in this guide can be applied to any LLM API that provides an OpenAPI file.

Prerequisites

To follow this tutorial, you need Node.js and npm (to install the liblab CLI), a liblab account, and a machine capable of running Ollama with the Llama 3.1 8B model.

Steps

  1. Set up Ollama and install Llama 3.1
  2. Configure the liblab CLI
  3. Initialize and configure the project
  4. Generate the SDKs
  5. Test the SDKs

1. Set up Ollama and install Llama 3.1

First, install Ollama on your machine so you can run the LLM locally. Follow these steps to install and run Ollama with Llama 3.1.

  1. Visit Ollama and download the latest version.
  2. Once Ollama is installed and running, execute the following command on your terminal to download the latest version of Llama 3.1 8b:
ollama pull llama3.1:8b

This tutorial uses the smaller Llama 3.1 model (8b) to ensure quicker response times. However, you can use any Llama 3.1 model if your machine can run it smoothly.

  3. To verify that Llama 3.1 is installed, run the following command:
ollama run llama3.1:8b
  4. If the setup is successful, you can send prompts to the model and receive responses. For example, ask Llama to tell you a joke:
ollama run llama3.1:8b
>>> tell me a joke
Here's one:

What do you call a fake noodle?

An impasta!

>>>

2. Configure the liblab CLI

With Ollama installed and running on your machine, you can start working with liblab. First, install the liblab CLI by running the following command:

npm install -g @liblab/cli

Once installed, you need to log in to your liblab account. Run the following command and follow the prompts to log in:

liblab login
Installation options

For more options and details on installing the liblab CLI, see the Install the CLI page.

3. Initialize and configure the project

With the CLI installed and authenticated, you can now create a new project. Run the following commands to create a new liblab project in the streaming directory:

mkdir -p streaming
cd streaming
liblab init

A new liblab.config.json file will be created, which contains all the project's configurations.

3.1 Create the API spec

Before editing the generated liblab.config.json file, you need the API spec that liblab will use to generate the SDKs. For this tutorial, you'll create a new spec that targets the Llama model running locally.

note

In most cases, you don't need to create the API specification manually, as the API provider typically supplies it. However, since Ollama runs locally and doesn't ship an OpenAPI specification, you'll create one yourself.

Inside the streaming directory, create a file named ollama-open-api.yaml with the following content:

openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
servers:
  - url: 'http://localhost:11434'
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'

components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean

    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer

The API spec above already enables liblab's streaming feature for the /api/generate endpoint through the x-liblab-streaming: true annotation. It also defines the server as http://localhost:11434, the default address of a local Ollama instance.
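If you want to confirm what the raw endpoint returns before generating any SDK, you can call it directly. The following is a minimal sketch that uses the requests library (not part of this tutorial's tooling) and assumes Ollama is running locally with llama3.1:8b pulled; with stream set to true, Ollama returns one JSON object per line, matching the GenerateResponse schema above.

# stream_check.py - optional sanity check against the local Ollama server.
# Assumes Ollama is running on http://localhost:11434 and llama3.1:8b is pulled.
import json
import requests

payload = {"model": "llama3.1:8b", "prompt": "Tell me a joke", "stream": True}

with requests.post("http://localhost:11434/api/generate", json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)           # one GenerateResponse object per line
        print(chunk["response"], end="", flush=True)
        if chunk.get("done"):              # final chunk carries done=true plus stats
            print()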

3.2 Configure the project

After creating the API spec file, you can configure the project. Use the following configuration in your liblab.config.json file:

{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./ollama-open-api.yaml",
  "languages": [
    "python",
    "typescript"
  ],
  "auth": [],
  "customizations": {
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": true
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    },
    "typescript": {
      "bundle": true,
      "exportClassDefault": false,
      "httpClient": "fetch",
      "npmName": "",
      "npmOrg": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2",
      "generateEnumAs": "union"
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}

The above file defines that SDKs will be generated for Python and TypeScript (lines 6-9), based on the API spec at ./ollama-open-api.yaml (line 5). In addition, the streaming feature is enabled for the /api/generate endpoint (lines 26-32).

Enabling Streaming

Streaming is enabled by default for endpoints returning text/event-stream. For other endpoints, you can enable it with one of the following options:

  1. Edit the liblab.config.json: Add streaming: true under endpointCustomizations for each endpoint where you want to enable streaming.
  2. Modify the OpenAPI spec: Include the x-liblab-streaming: true annotation for the relevant endpoint.

In this example, both methods are demonstrated to illustrate how streaming can be enabled. However, you only need to implement one of these options.

4. Generate the SDKs

Now that you have the API spec and have configured liblab.config.json, it's time to generate the SDKs. Execute the following command in your terminal:

liblab build -y

liblab validates the liblab.config.json and ollama-open-api.yaml files and notifies you about any issues. Once the build starts, you should see messages like the following when it finishes:

Your SDKs are being generated. Visit the liblab portal (https://app.liblab.com/apis/ollama-api/builds/8364) to view more details on your build(s).
✓ Python built
✓ TypeScript built
✓ Generate package-lock.json for TypeScript
Successfully generated SDKs for TypeScript, Python. ♡ You can find them inside: <path-to-the-project-directory>/streaming/output

The SDKs are available in the output directory.
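The exact contents depend on the generator version, but you can expect a layout roughly like this:

output/
  python/
    examples/     # contains sample.py and the install scripts used in section 5.1
    ...           # generated Python SDK source
  typescript/
    examples/     # contains src/index.ts and the npm scripts used in section 5.2
    ...           # generated TypeScript SDK source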

5. Test the SDKs

To test the SDKs, you can use the examples created by liblab when generating the SDKs. The following sections describe how to test the Python and TypeScript SDKs using the examples available.

5.1 Testing the Python SDK

To test the Python SDK, follow the steps:

  1. In your terminal, navigate to the output/python/examples directory.
  2. Run the install script for your operating system:
./install.sh
note

The commands to run the install script or activate the Python environment may differ on macOS or Linux.

  3. Activate the Python environment:
$ source .venv/Scripts/activate
  4. Open the file sample.py and change the model and prompt, as displayed below:
  request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)
  5. Execute the example by running the following command in the terminal:
python sample.py 

In your terminal, you'll see the streamed response:

GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5467575Z',
response='Here',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5772503Z',
response="'s",
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:46.5934289Z',
response=' one',
done=False
)
...
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.0848765Z',
response=' hear',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1016931Z',
response=' another',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1173826Z',
response=' one',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1327678Z',
response='?',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2024-12-10T00:14:47.1484953Z',
response='',
done=True,
done_reason='stop',
context=[128006, 882, 128007, 271, 41551, 757, 264, 22380, 128009, 128006, 78191, 128007, 271, 8586, 596, 832, 1473, 3923, 656, 499, 1650, 264, 12700, 46895, 273, 1980, 65192, 369, 433, 62927, 2127, 3242, 14635, 2268, 40, 3987, 430, 1903, 499, 12835, 0, 3234, 499, 1390, 311, 6865, 2500, 832, 30],
total_duration=8415490600,
load_duration=7410168000,
prompt_eval_count=14,
prompt_eval_duration=400000000,
eval_count=37,
eval_duration=603000000
)

The SDK streams the response token by token through the response field until generation completes. A final message with done=True confirms that the stream has finished.
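For reference, the core of sample.py follows this pattern. The client class and method names below (OllamaSdk, sdk.generate) are assumptions for illustration and may differ from the generated code; check the README in output/python for the exact identifiers in your build.

# Rough sketch of sample.py. OllamaSdk and sdk.generate are assumed names;
# use the identifiers from your generated SDK.
from ollama_sdk import OllamaSdk
from ollama_sdk.models import GenerateRequest

sdk = OllamaSdk(base_url="http://localhost:11434")

request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)

# With streaming enabled, the call yields one GenerateResponse per chunk
# instead of returning a single object.
for chunk in sdk.generate(request_body=request_body):
    print(chunk.response, end="", flush=True)
    if chunk.done:
        break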

5.2 Testing the TypeScript SDK

To test the TypeScript SDK, follow the steps:

  1. In your terminal, navigate to the output/typescript/examples directory.
  2. Run the following command to install the SDK:
npm run setup
  3. Open the src/index.ts file and change the model and prompt as presented in the following code snippet:
const generateRequest: GenerateRequest = {
  model: 'llama3.1:8b',
  prompt: 'Tell me a joke',
  stream: true,
};
  4. Execute the example by running the following command in the terminal:
npm run start

In your terminal, you'll see the streamed response:

> ollama-sdk@1.0.0 start
> tsc && node dist/index.js

{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:03.9792376Z',
response: 'Here',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
...
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.229332Z',
response: 'asta',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2435589Z',
response: '.',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2602098Z',
response: '',
done: true,
doneReason: 'stop',
context: [
128006, 882, 128007, 271, 41551,
757, 264, 22380, 128009, 128006,
78191, 128007, 271, 8586, 596,
832, 1473, 3923, 656, 499,
1650, 264, 12700, 46895, 273,
1980, 2127, 3242, 14635, 13
],
totalDuration: 3104377400,
loadDuration: 2724683300,
promptEvalCount: 14,
promptEvalDuration: 95000000,
evalCount: 18,
evalDuration: 283000000
}

The SDK streams the response token by token through the response field until generation completes. A final message with done: true confirms that the stream has finished.

Conclusion

Following this guide, you've learned how to create Python and TypeScript SDKs with streaming support using liblab. These SDKs simplify API integration and enhance real-time data handling. Explore further customization options in the liblab documentation.