Datasets
Curate from traces
from staso.dataset import TraceColumnMapping

ds = st.dataset.from_traces(
    name="refund-failures-may",
    trace_ids=["trace_abc", "trace_def"],
    mapping=[
        TraceColumnMapping(source="root_span.input.query", target="input"),
        TraceColumnMapping(source="root_span.output.result", target="expected"),
    ],
    description="Real failed refund turns from production",
)

Mapping
mapping is a list of TraceColumnMapping(source, target).
- source: dotted path into the trace (e.g. root_span.input.query, request.body).
- target: dataset column name to write into.
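To make the dotted-path idea concrete, here is a minimal sketch of how a source path such as root_span.input.query could be resolved against a trace represented as nested dicts. The resolve_path helper and the sample trace shape are assumptions made for illustration and are not part of staso; the library's actual resolution rules (for example, how it treats lists or missing keys) may differ.

```python
from typing import Any


def resolve_path(trace: dict[str, Any], path: str) -> Any:
    """Walk a nested dict along a dotted path.

    Hypothetical helper for illustration only, not a staso API.
    Returns None if any segment of the path is missing.
    """
    current: Any = trace
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical trace payload; real traces may be shaped differently.
sample_trace = {
    "root_span": {
        "input": {"query": "Where is my refund?"},
        "output": {"result": "Refund issued on May 3"},
    }
}

# (source path, target column) pairs, mirroring TraceColumnMapping(source, target).
pairs = [
    ("root_span.input.query", "input"),
    ("root_span.output.result", "expected"),
]

# Build one dataset row by resolving each source path into its target column.
row = {target: resolve_path(sample_trace, source) for source, target in pairs}
```

Under these assumptions, each trace yields one row whose columns are named by target and filled from the value found at source.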
A bare dict[str, str] is accepted for backwards compatibility but emits a DeprecationWarning.
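If existing code still passes a dict, converting it to the list form is mechanical. The sketch below uses a stand-in dataclass with the same two fields so that it runs without staso installed; in real code, import TraceColumnMapping from staso.dataset instead. It assumes the legacy dict maps source paths to target columns, which this page does not state explicitly, and to_mappings is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceColumnMapping:
    """Stand-in for staso.dataset.TraceColumnMapping (same two fields)."""

    source: str
    target: str


def to_mappings(legacy: dict[str, str]) -> list[TraceColumnMapping]:
    """Convert a legacy dict mapping into a list of TraceColumnMapping.

    Assumption: keys are source paths and values are target column names.
    """
    return [
        TraceColumnMapping(source=source, target=target)
        for source, target in legacy.items()
    ]


legacy_mapping = {
    "root_span.input.query": "input",
    "root_span.output.result": "expected",
}
mappings = to_mappings(legacy_mapping)
```

The resulting list can be passed as the mapping argument, which avoids the DeprecationWarning.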
Workflow
- Find failing traces in the dashboard (filter by status, latency, or guard verdict).
- Copy the trace IDs you want to freeze.
- Call from_traces(...) with a list of TraceColumnMapping entries.
- Attach scorers and re-run evaluate(...) every time you ship a fix.
Entries created this way have source_type set, so you can always trace a row back to its originating span.