Build Python, TypeScript, and Java SDKs with Streaming
This tutorial includes the following SDK languages:
| TypeScript | Java | Python | C# | Go | PHP |
| --- | --- | --- | --- | --- | --- |
| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
Streaming data is a common feature when working with AI Large Language Models (LLMs) like ChatGPT, Claude, Llama, or Mistral. In this tutorial, you'll learn how to create Python, TypeScript, and Java SDKs with streaming features using the liblab SDK generator.
For this tutorial, you'll use Ollama to host an LLM locally on your computer. The principles in this guide can be applied to any LLM API that provides an OpenAPI file.
Before getting started, make sure you have a liblab account and the liblab CLI installed.
You'll also need Python, Node.js, and a JDK installed to run the generated SDK examples later in this tutorial.
Set up Ollama and install Llama 3.1
This tutorial uses the smaller Llama 3.1 model (8b) to ensure quicker response times. However, you can use any Llama 3.1 model if your machine can run it smoothly.
First, you'll need to install Ollama on your machine to run the LLM model.
- Visit Ollama, download the latest version, and run the Ollama program.
- Once Ollama is installed and running, execute the following command on your terminal to download the latest version of Llama 3.1 8b and confirm that it's working:
ollama run llama3.1:8b "tell me a joke"
If the setup was successful, you'll get a response like:
A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs
and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here
or not."
Initialize and configure liblab
After installing and configuring Ollama on your machine, you can create a new project with liblab. Run the following commands to create a new liblab project in the `streaming` directory:
mkdir -p streaming
cd streaming
liblab init
A new `liblab.config.json` file will be created, which contains all the project's configuration.
Enabling Streaming
Streaming is enabled by default for endpoints that return the `text/event-stream` content type. For other endpoints, you can enable it per endpoint with either of the following declarations:
- In your liblab config: set `streaming: true`
- In your OpenAPI spec: set `x-liblab-streaming: true`
You'll see both methods in full below. In practice, you only need to implement one of them.
Create the API spec
Since there is no official Ollama OpenAPI spec available, one is provided below. In most cases, API providers and frameworks supply these specs for you.
Inside the `streaming` directory, create a file named `openapi.yaml` with the following content:
openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
servers:
  - url: 'http://localhost:11434'
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'
components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean
    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer
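With `stream: true`, Ollama's `/api/generate` endpoint returns one JSON object per line until a final object with `done: true`. If you want to sanity-check the endpoint against this spec before generating the SDKs, the following is a minimal sketch using the third-party `requests` package (an assumption; it is not part of the liblab tooling):

# Optional sanity check of the raw Ollama endpoint; assumes `pip install requests`.
import json

import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1:8b", "prompt": "Tell me a joke", "stream": True},
    stream=True,  # keep the connection open and read chunks as they arrive
)
response.raise_for_status()

# Ollama streams newline-delimited JSON; print each token as it arrives.
for line in response.iter_lines():
    if line:
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
print()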
Configure the project
Now you can start configuring the project. Replace the `liblab.config.json` created earlier with the modified version below. Note in particular the `specFilePath`, the `languages` list, and the `endpointCustomizations` entry that enables streaming for the `/api/generate` endpoint:
{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./openapi.yaml",
  "languages": [
    "python",
    "typescript",
    "java"
  ],
  "auth": [],
  "customizations": {
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": true
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    },
    "typescript": {
      "bundle": true,
      "exportClassDefault": false,
      "httpClient": "fetch",
      "npmName": "",
      "npmOrg": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2",
      "generateEnumAs": "union"
    },
    "java": {
      "groupId": "com.swagger",
      "artifactId": "petstore",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2",
      "includeKotlinSnippets": true
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}
When generating SDKs using liblab, you can customize and fine-tune the SDK to meet specific client needs. Explore the Core SDK options and SDK customization options to discover all the available settings and enhancements.
Generate the SDKs
Now that you have the API spec and have configured the `liblab.config.json`, it's time to generate the SDKs. To do so, execute the following command in your terminal:
liblab build -y
liblab will validate the `liblab.config.json` and `openapi.yaml` files and notify you about any issues. You should see the build start, followed by messages like the following:
Your SDKs are being generated. Visit the liblab portal (https://app.liblab.com/apis/ollama-api/builds/8364) to view more details on your build(s).
✓ Java built
✓ Python built
✓ TypeScript built
✓ Generate package-lock.json for TypeScript
Successfully generated SDKs for TypeScript, Python, Java. ♡ You can find them inside: <path-to-the-project-directory>/streaming/output
The SDKs are available in the `output` directory, which should have the following structure:
output/
├── api-schema-validation.json
├── java/
├── typescript/
└── python/
Test the SDKs
To test the SDKs, you can use the examples created by liblab when generating the SDKs. The following sections describe how to test the Python, TypeScript, and Java SDKs.
Testing the Python SDK
To test the Python SDK, follow these steps:
From your terminal, open the `output/python/examples` directory:
cd output/python/examples
Run one of the following scripts, depending on your operating system, to set up and activate a Python virtual environment:
Mac / Linux:

chmod +x install.sh
./install.sh
source .venv/bin/activate

Windows:

./install.cmd
.venv/Scripts/Activate.ps1
The command to activate the virtual environment may vary depending on your operating system and shell. If you encounter any issues, refer to Python's venv documentation to determine the correct command.
- Open the file `sample.py` and change the model and prompt, as displayed below. Here you define the model you'll use and the prompt to send to the LLM (a fuller sketch of consuming the stream appears after the sample output below):
request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)
- Run the `sample.py` script to execute the `sdk.api.generate()` function:
python sample.py
On your terminal, you'll receive the streamed response:
GenerateResponse(
model='llama3.1:8b',
created_at='2025-01-05T14:25:31.57605Z',
response='Here',
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2025-01-05T14:25:31.6006105Z',
response="'s",
done=False
)
GenerateResponse(
model='llama3.1:8b',
created_at='2025-01-05T14:25:31.617061Z',
response=' one',
done=False
)
...
GenerateResponse(
model='llama3.1:8b',
created_at='2025-01-05T14:25:31.8601164Z',
response='',
done=True,
done_reason='stop',
context=[128006, 882, 128007, 271, 41551, 757, 264, 22380, 128009, 128006, 78191, 128007, 271, 8586, 596, 832, 1473, 3923, 656, 499, 1650, 264, 12700, 46895, 273, 1980, 2127, 3242, 14635, 0],
total_duration=7032005800,
load_duration=6623531800,
prompt_eval_count=14,
prompt_eval_duration=120000000,
eval_count=18,
eval_duration=287000000
)
The SDK streams responses token by token through the `response` field until completion. A final response with `done=True` confirms that the stream has finished.
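The generated example simply prints each chunk. If you want to assemble the full answer yourself, the following is a minimal sketch of how `sample.py` can consume the stream. The module and class names (`ollama_sdk`, `OllamaSdk`), the `base_url` argument, and the `request_body` parameter name are assumptions based on the configured `sdkName`; check the generated code in `output/python` for the exact names:

# A minimal sketch; the names below are assumptions, so verify them against the generated SDK.
from ollama_sdk import OllamaSdk
from ollama_sdk.models import GenerateRequest

sdk = OllamaSdk(base_url="http://localhost:11434")

request_body = GenerateRequest(model="llama3.1:8b", prompt="Tell me a joke", stream=True)

# With streaming enabled, generate() yields GenerateResponse chunks as they arrive.
full_text = ""
for chunk in sdk.api.generate(request_body=request_body):
    full_text += chunk.response
    if chunk.done:
        break

print(full_text)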
Testing the TypeScript SDK
To test the TypeScript SDK, follow these steps:
- Access the `output/typescript/examples` directory with your terminal.
- Run the following command to install the SDK:
npm run setup
- Open the `src/index.ts` file and change the `model` and `prompt` as presented in the following code snippet. Here you define the model you'll use and the prompt to send to the LLM:
const generateRequest: GenerateRequest = {
model: 'llama3.1:8b',
prompt: 'Tell me a joke',
stream: true,
};
- Execute the SDK by running the following command at the terminal:
npm run start
The SDK will use the `api.generate(generateRequest)` function to perform the API request.
On your terminal, you'll receive the streamed response:
> ollama-sdk@1.0.0 start
> tsc && node dist/index.js
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:03.9792376Z',
response: 'Here',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
...
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.229332Z',
response: 'asta',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2435589Z',
response: '.',
done: false,
doneReason: undefined,
context: undefined,
totalDuration: undefined,
loadDuration: undefined,
promptEvalCount: undefined,
promptEvalDuration: undefined,
evalCount: undefined,
evalDuration: undefined
}
{
model: 'llama3.1:8b',
createdAt: '2024-12-10T00:25:04.2602098Z',
response: '',
done: true,
doneReason: 'stop',
context: [
128006, 882, 128007, 271, 41551,
757, 264, 22380, 128009, 128006,
78191, 128007, 271, 8586, 596,
832, 1473, 3923, 656, 499,
1650, 264, 12700, 46895, 273,
1980, 2127, 3242, 14635, 13
],
totalDuration: 3104377400,
loadDuration: 2724683300,
promptEvalCount: 14,
promptEvalDuration: 95000000,
evalCount: 18,
evalDuration: 283000000
}
The SDK streams responses token by token through the `response` field until completion. A final response with `done: true` confirms that the stream has finished.
Testing the Java SDK
To test the Java SDK, follow these steps:
- Access the `output/java/example` directory with your terminal.
- Open the `/src/main/java/com/example/Main.java` file and update the `generateRequest`'s `model` and `prompt` arguments:
GenerateRequest generateRequest = GenerateRequest
.builder()
.model("llama3.1:8b")
.prompt("Tell me a joke")
.stream(true)
.build();
- Run the following command to compile and run the SDK, which will use `api.generate(generateRequest)` to perform the API request:
chmod +x run.sh
./run.sh
On your terminal, you'll receive a streamed response that looks similar to the following:
GenerateResponse(
model=llama3.1:8b,
createdAt=2025-01-05T13:53:13.6047604Z,
response=Here,
done=false,
doneReason=null,
context=null,
totalDuration=null,
loadDuration=null,
promptEvalCount=null,
promptEvalDuration=null,
evalCount=null,
evalDuration=null
)
GenerateResponse(
model=llama3.1:8b,
createdAt=2025-01-05T13:53:13.6298966Z,
response='s,
done=false,
doneReason=null,
context=null,
totalDuration=null,
loadDuration=null,
promptEvalCount=null,
promptEvalDuration=null,
evalCount=null,
evalDuration=null
)
GenerateResponse(
model=llama3.1:8b,
createdAt=2025-01-05T13:53:13.6449122Z,
response= one,
done=false,
doneReason=null,
context=null,
totalDuration=null,
loadDuration=null,
promptEvalCount=null,
promptEvalDuration=null,
evalCount=null,
evalDuration=null
)
...
GenerateResponse(
model=llama3.1:8b,
createdAt=2025-01-05T13:53:14.1793566Z,
response=,
done=true,
doneReason=stop,
context=[128006, 882, 128007, 271, 41551, 757, 264, 22380, 128009, 128006, 78191, 128007, 271, 8586, 596, 832, 1473, 3923, 656, 499, 1650, 264, 12700, 46895, 273, 1980, 65192, 369, 433, 62927, 2127, 3242, 14635, 2268, 39115, 430, 1903, 499, 12835, 0, 3234, 499, 1390, 311, 6865, 2500, 832, 30],
totalDuration=6124368300,
loadDuration=5308640200,
promptEvalCount=14,
promptEvalDuration=231000000,
evalCount=36,
evalDuration=574000000
)
The SDK streams responses token by token through the `response` field until completion. A final response with `done=true` confirms that the stream has finished.
Next Steps
Now that you've generated your SDKs, you can learn how to integrate them with your CI/CD pipeline and publish them to their respective package manager repositories.
We currently have guides for: