Staso Docs
Quickstart

OpenAI

One patch call. Every chat.completions.create call becomes an LLM span — sync, async, streaming.

1. Get an API key

Sign in to Staso and create a key under Settings -> API Keys. Export it:

export STASO_API_KEY=ak_...
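If exporting a shell variable is awkward (notebooks, some CI runners), you can set it from Python before initializing the SDK. A minimal sketch, assuming Staso reads STASO_API_KEY from the process environment as the export above implies:

import os

# Assumption: staso picks up STASO_API_KEY from the environment at init time,
# mirroring the `export` above. Prefer a secrets manager over hard-coding keys.
os.environ.setdefault("STASO_API_KEY", "ak_...")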

2. Install

pip install "staso[openai]"

Python 3.11+.

3. Patch and go

import staso as st
from staso.integrations import patch_openai
from openai import OpenAI

st.init(agent_name="my-agent")
patch_openai()

client = OpenAI()
client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "hi"}],
)

Open the Staso dashboard — the call shows up as an LLM span with model, tokens, and latency.
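The async client goes through the same patch. A minimal sketch, assuming patch_openai() also wraps AsyncOpenAI (per the sync/async/streaming claim above):

import asyncio

from openai import AsyncOpenAI

async def main() -> None:
    client = AsyncOpenAI()
    # Assumption: the patch from step 3 covers the async client too, so this
    # call is recorded as an LLM span just like the sync one.
    await client.chat.completions.create(
        model="gpt-5.4",
        messages=[{"role": "user", "content": "hi"}],
    )

asyncio.run(main())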

What it captures

  • Tokens (prompt, completion, total, and reasoning tokens for o-series models).
  • Request params (model, temperature, max_tokens, top_p, frequency_penalty, presence_penalty, seed, tool_choice, response_format, service_tier).
  • Response (content, tool_calls, finish_reason, response_id).
  • Streaming (stream=True wrapped end-to-end; see the sketch below).
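Streaming looks the same from your side; a minimal sketch reusing the client from step 3 (the span is assumed to be finalized once the stream is consumed, per the end-to-end wrapping above):

stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries a delta; printing it here is only for illustration.
    print(chunk.choices[0].delta.content or "", end="")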

Guard

Guard evaluates every tool_call in the response. Blocked tools raise staso.GuardBlocked; modified tools have their function.arguments rewritten in place. See error handling.
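A minimal handling sketch, reusing the patched client from step 3. Only staso.GuardBlocked comes from the SDK; the tool definition and the fallback are illustrative:

tools = [{
    "type": "function",
    "function": {
        "name": "cancel_order",
        "parameters": {"type": "object", "properties": {}},
    },
}]

try:
    response = client.chat.completions.create(
        model="gpt-5.4",
        messages=[{"role": "user", "content": "cancel my order"}],
        tools=tools,
    )
except st.GuardBlocked as exc:
    # A blocked tool_call raises before your code sees the response; log it
    # and degrade gracefully instead of letting the request fail.
    print(f"Guard blocked a tool call: {exc}")
    response = None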

Tool spans

Wrap your tools with @st.tool to capture their inputs and outputs as child spans:

@st.tool
def search_faq(query: str) -> str:
    # `db` stands in for your own search client; the query and the returned
    # string are recorded on the child span.
    return db.search(query)
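A slightly fuller sketch wires the tool into a tool-calling round trip with the patched client from step 3. The schema and dispatch below are ordinary OpenAI plumbing, not part of Staso; the LLM span comes from patch_openai() and the child tool span from @st.tool:

import json

tools = [{
    "type": "function",
    "function": {
        "name": "search_faq",
        "description": "Search the FAQ.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "How do refunds work?"}],
    tools=tools,
)

for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "search_faq":
        # Guard has already vetted (and possibly rewritten) call.function.arguments.
        args = json.loads(call.function.arguments)
        search_faq(**args)  # recorded as a child span under the LLM span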

See Python SDK quickstart for the full set of decorators.

Next