# OpenAI Integration

Auto-trace every OpenAI chat completion call, with no changes to your existing OpenAI calls.

## Setup
```bash
pip install "staso[openai]"
```

```python
import staso as st

st.init(api_key="ak_...", agent_id="my-agent")
st.integrations.patch_openai()
```

Done. Every call made through the OpenAI SDK is now traced automatically.
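Under the hood, integrations like this typically monkey-patch the SDK's request method so a span is recorded around every call. A minimal, self-contained sketch of that pattern — the names `FakeCompletions`, `traced_calls`, and `install_patch` are illustrative stand-ins, not staso or OpenAI APIs:

```python
import functools

class FakeCompletions:
    """Illustrative stand-in for an SDK client class (not the real OpenAI SDK)."""
    def create(self, **kwargs):
        return {"content": "hi", "model": kwargs.get("model")}

traced_calls = []  # where a real integration would emit spans instead

def install_patch(cls):
    """Wrap cls.create so every call is recorded before being delegated."""
    original = cls.create

    @functools.wraps(original)
    def wrapper(self, **kwargs):
        response = original(self, **kwargs)
        traced_calls.append({
            "span": "chat.completions.create",
            "model": kwargs.get("model"),
        })
        return response

    cls.create = wrapper

install_patch(FakeCompletions)
result = FakeCompletions().create(model="gpt-4o", messages=[])
```

Because the wrapper replaces the method on the class itself, every client instance created afterwards is traced without touching call sites — which is why `patch_openai()` requires no code changes.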
## What Shows Up on the Dashboard
| Field | Example |
|---|---|
| Span name | `openai.chat.completions.create` |
| Model | `gpt-4o` |
| Input tokens | 245 |
| Output tokens | 89 |
| Total tokens | 334 |
| Latency | 1,230 ms |
| Status | `ok` / `error` |
## Example
```python
from openai import OpenAI
import staso as st

st.init(api_key="ak_...", agent_id="my-agent")
st.integrations.patch_openai()

client = OpenAI()

@st.agent(name="chat-agent")
def chat(message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        max_tokens=1024,
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content

with st.conversation("demo"):
    chat("What is observability?")

st.shutdown()
```

Dashboard:

```
chat-agent (agent)
└── openai.chat.completions.create (llm) — gpt-4o, 334 tokens, 1.2s
```

## Streaming
Streaming is fully supported. The SDK wraps the stream to capture tokens and latency as chunks arrive:
```python
@st.agent(name="streaming-agent")
def chat(message: str) -> str:
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
        stream=True,
    )
    chunks = []
    for chunk in stream:
        if chunk.choices[0].delta.content:
            chunks.append(chunk.choices[0].delta.content)
    return "".join(chunks)
```

The span captures the full response, token usage, and time-to-first-token.
## Async
Both sync and async clients are instrumented:
```python
from openai import AsyncOpenAI

client = AsyncOpenAI()

@st.agent(name="async-agent")
async def chat(message: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content
```

## What Gets Captured
Request parameters (stored in span metadata):
- `temperature`, `max_tokens`, `top_p`
- `frequency_penalty`, `presence_penalty`, `seed`
- `tool_choice`, `response_format`, `service_tier`
Response data:
- Content text, role, tool calls
- Finish reason
- Token usage (input, output, total, reasoning tokens)
Messages are captured by default. To disable, set `capture_messages=False` in `st.init()`.
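For example, the setup call from above with message capture turned off (token counts, latency, and the other span metadata are still recorded):

```python
import staso as st

# Disable prompt/response message capture; all other span data is unaffected.
st.init(api_key="ak_...", agent_id="my-agent", capture_messages=False)
```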
## Without the Integration
If you want manual control:
```python
@st.trace(name="llm_call", kind="llm")
def call_llm(prompt: str) -> str:
    response = client.chat.completions.create(...)
    return response.choices[0].message.content
```

You get a span with timing and error tracking, but no automatic token/model extraction.