Datasets

Curate from traces

from staso.dataset import TraceColumnMapping

ds = st.dataset.from_traces(
    name="refund-failures-may",
    trace_ids=["trace_abc", "trace_def"],
    mapping=[
        TraceColumnMapping(source="root_span.input.query", target="input"),
        TraceColumnMapping(source="root_span.output.result", target="expected"),
    ],
    description="Real failed refund turns from production",
)

Mapping

mapping is a list of TraceColumnMapping(source, target).

source — dotted path into the trace (e.g. root_span.input.query, request.body).
target — dataset column name to write into.

A bare dict[str, str] is accepted for backwards compatibility but emits a DeprecationWarning.

Workflow

Find failing traces in the dashboard (filter by status, latency, or guard verdict).
Copy the trace IDs you want to freeze.
Call from_traces(...) with a TraceColumnMapping.
Attach scorers and re-run evaluate(...) every time you ship a fix.

Entries created this way have source_type set, so you can always trace a row back to its originating span.

Next

Manage

CRUD for datasets, entries, and folders. All top-level functions on st.dataset.

Evaluate

Run any Python function against every entry. Score with custom scorers.

On this page

Curate from traces Mapping Workflow Next