Create and Manage Datasets

Every dataset operation is a top-level function on staso.dataset. No classes to instantiate.

import staso as st

st.init(api_key="...", workspace_slug="...")

ds = st.dataset.create(
    name="eval-refunds",
    description="Refund-flow regressions",
    columns=[
        {"name": "input", "column_type": "input"},
        {"name": "expected", "column_type": "expected_output"},
    ],
)
st.dataset.add_entry(ds.id, {"input": "...", "expected": "refund_issued"}, split="test")

Dataset CRUD

ds = st.dataset.create(
    name="eval-refunds",
    description="Refund-flow regressions",
    columns=[{"name": "input", "column_type": "input"}],
    folder_id=None,
)

ds = st.dataset.get(dataset_id)

datasets = st.dataset.list(folder_id=None, limit=100, offset=0)

ds = st.dataset.update(dataset_id, name="eval-refunds-v2", description="New description")

st.dataset.delete(dataset_id)

create returns a Dataset with id, name, description, folder_id, columns, entry_count, created_at, updated_at.
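
Because list takes limit and offset, walking through every dataset means paging. The following is a minimal sketch of that loop, not part of the SDK: iter_pages and fake_list are hypothetical names, and the sketch assumes the call returns a plain list that is shorter than limit on the last page. A stand-in fetch function replaces the real call so the example runs offline.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch: Callable[[int, int], List[T]], page_size: int = 100) -> Iterator[T]:
    """Yield every item from a limit/offset style call, one page at a time.

    `fetch(limit, offset)` is a hypothetical adapter around the real call.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch(page_size, offset)
        yield from page
        # Assumption: a short (or empty) page means nothing is left to fetch.
        if len(page) < page_size:
            return
        offset += page_size


# Stand-in data source so the sketch runs without the SDK or a network call.
_FAKE_DATASETS = [f"dataset-{i}" for i in range(250)]


def fake_list(limit: int, offset: int) -> List[str]:
    return _FAKE_DATASETS[offset:offset + limit]


all_names = list(iter_pages(fake_list, page_size=100))
```

With the SDK, the adapter could plausibly be lambda limit, offset: st.dataset.list(limit=limit, offset=offset), provided the return value behaves like a list. The same loop would apply to list_entries.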

Entries

entry = st.dataset.add_entry(
    ds.id,
    {"input": "I want a refund", "expected": "refund_issued"},
    split="train",
)

entries = st.dataset.add_entries(
    ds.id,
    [
        {"input": "case a"},
        {"input": "case b"},
    ],
    split="test",
)

rows = st.dataset.list_entries(ds.id, split=None, limit=100, offset=0)

st.dataset.update_entry(ds.id, entry.id, data={"input": "updated text"})

st.dataset.delete_entry(ds.id, entry.id)

Each DatasetEntry has id, dataset_id, data, split, source_type, and created_at. source_type is set automatically when the entry was created from a trace; see /docs/datasets/curate-from-traces.
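
Entry payloads are plain dicts keyed by column name, so a misspelled key is easy to miss. The sketch below is a client-side check you could run before calling add_entries. It is an illustration only: unknown_keys is a hypothetical helper, and this page does not say how the service itself treats keys that match no column.

```python
from typing import Dict, List, Mapping, Sequence, Tuple


def unknown_keys(
    columns: Sequence[Mapping[str, str]],
    entries: Sequence[Mapping[str, object]],
) -> List[Tuple[int, List[str]]]:
    """Return (row_index, unknown_keys) for each entry that uses undeclared keys."""
    declared = {col["name"] for col in columns}
    problems: List[Tuple[int, List[str]]] = []
    for index, entry in enumerate(entries):
        extra = sorted(set(entry) - declared)
        if extra:
            problems.append((index, extra))
    return problems


columns: List[Dict[str, str]] = [
    {"name": "input", "column_type": "input"},
    {"name": "expected", "column_type": "expected_output"},
]
entries = [
    {"input": "case a"},
    {"input": "case b", "expectd": "refund_issued"},  # deliberate typo in the key
]

problems = unknown_keys(columns, entries)
```

Here problems holds one item that points at the second entry and its misspelled key; an empty list means every key matches a declared column.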

Folders

Group datasets by workstream, team, or feature.

folder = st.dataset.create_folder("evals/refunds")

folders = st.dataset.list_folders(parent_id=None)

st.dataset.delete_folder(folder.id)

ColumnType

Column types drive how the dashboard and evaluator interpret each field.

Value                  Purpose
variable               Free-form variable (default)
input                  Agent input
output                 Agent output
expected_output        Ground truth for scoring
expected_tool_calls    Ground-truth tool trajectory
scenario               Scenario description
expected_steps         Ground-truth step list
conversation_history   Prior turns
files                  Attached files
custom                 Escape hatch

from staso.dataset import ColumnType

ColumnType.INPUT           # "input"
ColumnType.EXPECTED_OUTPUT # "expected_output"
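
Since columns are passed to create as dicts with name and column_type, a small builder can catch a mistyped type string early. This is a sketch under assumptions: build_columns and VALID_COLUMN_TYPES are hypothetical names, and the set of values is copied from the ColumnType table above rather than read from the SDK.

```python
from typing import Dict, List, Mapping

# Values copied from the ColumnType table on this page (not imported from the SDK).
VALID_COLUMN_TYPES = frozenset({
    "variable",
    "input",
    "output",
    "expected_output",
    "expected_tool_calls",
    "scenario",
    "expected_steps",
    "conversation_history",
    "files",
    "custom",
})


def build_columns(spec: Mapping[str, str]) -> List[Dict[str, str]]:
    """Turn {column_name: column_type} into the list-of-dicts shape shown for create."""
    columns: List[Dict[str, str]] = []
    for name, column_type in spec.items():
        if column_type not in VALID_COLUMN_TYPES:
            raise ValueError(f"unknown column_type {column_type!r} for column {name!r}")
        columns.append({"name": name, "column_type": column_type})
    return columns


columns = build_columns({"input": "input", "expected": "expected_output"})
```

The result has the same shape as the columns argument in the create examples, so it could be passed straight through.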

SplitType

Value        Use
train        Training split
test         Held-out eval split
validation   Tuning split

from staso.dataset import SplitType

st.dataset.add_entry(ds.id, {"input": "x"}, split=SplitType.TEST)

Strings ("train", "test", "validation") are accepted anywhere SplitType is.
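
One common way to make an enum and its string values interchangeable in Python is a str-backed Enum. The sketch below illustrates that pattern with a local class named Split. Whether staso implements SplitType this way is an assumption, and normalize_split is a hypothetical helper for validating a split value in your own code.

```python
from enum import Enum
from typing import Optional, Union


class Split(str, Enum):
    """Local stand-in for illustration; not the SDK's SplitType."""

    TRAIN = "train"
    TEST = "test"
    VALIDATION = "validation"


def normalize_split(value: Union[str, Split, None]) -> Optional[str]:
    """Return the plain string for a split, or None when no split is given.

    Raises ValueError for anything outside the three documented values.
    """
    if value is None:
        return None
    return Split(value).value


from_enum = normalize_split(Split.TEST)
from_string = normalize_split("test")
```

Both calls produce the same plain string, and a value such as "dev" raises ValueError instead of being passed along silently.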
