
Build a Python SDK with Streaming

This tutorial includes the following SDK languages and versions:

TypeScript v1, TypeScript v2, Java, Python v1, Python v2, C#, Go, PHP

Streaming responses is a common pattern when calling large language models (LLMs) such as ChatGPT, Claude, Llama, or Mistral.

In this post we will explore how to seamlessly create an SDK for your favorite LLM using the liblab SDK generator.

The example in this tutorial uses Ollama to host an LLM locally on your computer. However, you can apply the same principles to any LLM API that provides an OpenAPI file.

The OpenAPI file describes the API: its paths, parameters, security schemes, and so on.



  1. Setting up Ollama and installing Llama 3.1
  2. Setting up the liblab CLI
  3. Generating the SDK
  4. Using the SDK
  5. How to enable streaming endpoints
  6. Conclusion

1. Setting up an example Llama API

First, go to the Ollama home page, then download and install the latest version.

Then, once Ollama is installed and running, run the following command in a terminal to download the latest version of Llama 3.1:

ollama pull llama3.1

To verify that Llama 3.1 is installed, run the following command:

ollama run llama3.1

If all is well, you will be able to send prompts to the model and receive responses. For example:

ollama run llama3.1
>>> tell me a joke
Here's one:

What do you call a fake noodle?

(wait for it...)

An impasta!

Hope that made you laugh! Do you want to hear another?
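The same check can be scripted. The sketch below assumes that Ollama's local HTTP API is listening on its default port 11434 and that GET /api/tags returns a JSON object with a models list whose entries carry a name such as llama3.1:latest. The helper names has_model and fetch_tags are illustrative only.

```python
import json
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"  # assumed default Ollama address


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Return True if a model named `wanted` (with any tag) is in the payload."""
    for model in tags_payload.get("models", []):
        name = model.get("name", "")
        # Match "llama3.1" itself or a tagged variant such as "llama3.1:latest".
        if name == wanted or name.startswith(wanted + ":"):
            return True
    return False


def fetch_tags(url: str = TAGS_URL) -> dict:
    """Fetch the list of locally installed models from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


# Usage, with Ollama running locally:
#   print(has_model(fetch_tags(), "llama3.1"))
```

Keeping the matching logic separate from the network call makes it easy to test against a saved response.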


2. Setting up the liblab CLI

First, ensure you have the liblab CLI installed. If not, you can install it via npm:

npm install -g @liblab/cli

Once installed, you need to log in to your liblab account. Run the following command and follow the prompts to log in:

liblab login

After logging in, you can configure the CLI for your project. We want a new directory for the SDK, so let's create one called streaming:

mkdir -p streaming
cd streaming
liblab init

This will generate a liblab.config.json.

Before we edit the generated JSON file, let’s create the API spec for which we will generate this SDK.

ℹ️ Usually we don’t need to create the API spec ourselves, since most APIs provide one. However, we are using Ollama locally, and it does not provide an OpenAPI spec.

Create a new file called ollama-open-api.yaml and paste the following content into it:

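openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'
components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean
    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer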

Now let's update the liblab.config.json file to use our new spec. Copy and paste the following JSON to overwrite the contents of liblab.config.json:

{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./ollama-open-api.yaml",
  "languages": ["python"],
  "auth": [],
  "customizations": {
    "baseURL": "http://localhost:11434",
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": "true"
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}

Enable streaming

To enable the SDK to receive streaming data, we can either set streaming to "true" for the endpoint under endpointCustomizations in liblab.config.json, or add the x-liblab-streaming: true annotation to the operation in the OpenAPI spec file.

In this example, we've done both, to illustrate how to enable streaming, but you only need to do one of these.
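As a quick sanity check, a few lines of Python can list which endpoints a liblab.config.json marks for streaming. This is a sketch based only on the config layout used in this tutorial. The helper streaming_endpoints is hypothetical, and treating both the string "true" and a boolean true as enabled is an assumption.

```python
import json


def streaming_endpoints(config: dict) -> list:
    """Return (path, method) pairs whose endpoint customization enables streaming."""
    endpoints = config.get("customizations", {}).get("endpointCustomizations", {})
    enabled = []
    for path, methods in endpoints.items():
        for method, options in methods.items():
            # Assumption: accept the string "true" (as in this tutorial) or a boolean.
            if str(options.get("streaming", "")).lower() == "true":
                enabled.append((path, method))
    return enabled


# A cut-down config with the same nesting as the one in this tutorial.
sample = json.loads("""
{
  "customizations": {
    "endpointCustomizations": {
      "/api/generate": {"post": {"streaming": "true"}}
    }
  }
}
""")
print(streaming_endpoints(sample))  # [('/api/generate', 'post')]
```

To check a real file, load it with json.load and pass the resulting dictionary to the same function.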

Note: streaming is enabled by default if an endpoint returns the text/event-stream content type. In that case, no extra configuration is needed.

3. Generate the SDK

Now that we have an OpenAPI Spec file and the liblab CLI, it is time to generate our SDK.

Execute the following command inside the streaming folder:

liblab build

The CLI validates the liblab.config.json and ollama-open-api.yaml files and reports any issues it finds. Expect output similar to the following:

✓ No issues detected in the liblab config file.

Created output/api-schema-validation.json with the full linting results

Detected 3 potential issues with the spec:

⚠ OpenAPI "servers" must be present and non-empty array.
⚠ Info object must have "contact" object.
⚠ Operation must have non-empty "tags" array.
? It is important to fix your spec before continuing with a build. Not fixing the spec may yield a subpar SDK and documentation. Would you like to attempt to build the SDK anyway?

We can go ahead and confirm by typing Y.

Next, we should see messages showing that the build has started and then finished:

Ignoring the spec errors and attempting to build with the spec

No hooks found, SDKs will be generated without hooks.

No custom plan modifiers found, SDKs will be generated without them.
Your SDKs are being generated. Visit the liblab portal to view more details on your build(s).
✓ Python built
Successfully generated SDKs downloaded. You can find them inside the /Users/felipe/Development/LibLab/cli-test-runner/output folder
Successfully generated SDK's for Python ♡

If we go inside the output directory, we will see our SDK.

Congratulations! You have successfully generated an SDK for Ollama with streaming capabilities.

4. Using the SDK

Now that we have generated our SDK, let's make a request to Ollama to test it.

To do this, go into the output/python/examples directory.

Now run the script, and activate the generated virtual environment.

chmod u+x # this step may be necessary if the file requires execute permissions.
source .venv/bin/activate

Now copy and paste the following code in the file.

Execute the file with this command:


After the sample has run, you should see output like the following:

(.venv) ➜  examples git:(main) ✗ python
Here's one:

What do you call a fake noodle?

An impasta!

Hope that made you laugh! Do you want to hear another one?
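For comparison, this is roughly what happens on the wire. The sketch below bypasses the generated SDK and calls Ollama's /api/generate endpoint with the Python standard library. It assumes Ollama's documented streaming format of one JSON object per line, each carrying a response text fragment and a done flag. The function names are illustrative and are not part of the generated SDK.

```python
import json
import urllib.request
from typing import Iterable, Iterator

OLLAMA_URL = "http://localhost:11434/api/generate"  # baseURL from this tutorial's config


def iter_response_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text fragment from each newline-delimited JSON chunk.

    Hypothetical helper: assumes each non-empty line is a JSON object with a
    "response" string and a "done" boolean.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        yield chunk.get("response", "")
        if chunk.get("done"):
            break  # the final chunk is marked done=true


def stream_prompt(prompt: str, model: str = "llama3.1") -> Iterator[str]:
    """POST a prompt and yield text fragments as they arrive."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    # The HTTP response object can be iterated line by line, so each chunk
    # is handled as soon as the server sends it.
    with urllib.request.urlopen(request) as response:
        yield from iter_response_text(response)


# Usage, with Ollama running locally:
#   for fragment in stream_prompt("tell me a joke"):
#       print(fragment, end="", flush=True)
```

A generated SDK wraps this kind of loop behind a typed method, so application code only iterates over response objects.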

5. How to enable streaming endpoints

There are three ways to enable streaming endpoints in your SDK:

1. Using the text/event-stream content type

Streaming is automatically enabled for any endpoint that returns a text/event-stream content type.
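For reference, a text/event-stream body is a sequence of events separated by blank lines, where each payload line starts with data:. The sketch below parses that framing by hand to show what a streaming client has to handle. It is a simplified illustration that ignores the event, id, and retry fields, not the code liblab generates, and iter_sse_data is a hypothetical name.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event.

    Simplified: only "data" fields are collected. An event that is not
    terminated by a blank line before the stream ends is discarded.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)


events = ["data: Here's\n", "\n", "data: one:\n", "\n"]
print(list(iter_sse_data(events)))  # ["Here's", 'one:']
```

Multiple data lines within one event are joined with newlines, which is how the event stream format represents multi-line payloads.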

2. Using the liblab config

You can enable streaming for an endpoint by adding an endpoint customization to your liblab.config.json file, like we did in this tutorial:

"customizations": {
  "endpointCustomizations": {
    "/api/generate": {
      "post": {
        "streaming": "true"
      }
    }
  }
}

3. Adding the x-liblab-streaming: true annotation to the OpenAPI spec

You can also enable streaming by adding the x-liblab-streaming: true annotation to the OpenAPI spec, like we did in this tutorial:

paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true

6. Conclusion

In conclusion, creating an SDK for your LLM application simplifies the development process. By following the steps outlined in this tutorial, you’ve learned how to utilize the liblab CLI to generate a robust SDK.