Stop the $4,200 refund before it executes.
Every tool call evaluated by static and LLM judges. Block, audit, or allow — synchronously.
One intercept. Three outcomes.
Guard runs inside the same process as your agent. No proxy, no queue. Evaluation completes before the tool executes.
Tool call requested
Your agent picks a tool. We intercept before the side-effect fires.
Evaluate
Static rules run in milliseconds. LLM judges run in parallel where a rule needs semantic reasoning.
Allow · Audit · Block
Deterministic outcome back to the agent. Full trace written whether the call ran or not.
Opinionated by default. Yours where it matters.
Ship with a library of proprietary detection rules. Layer your own on top for business logic nobody else can know.
llm judges, scoped.
Write a rule in plain english. Attach it to an agent, a tool, or a whole workspace. Run it on the tool call, the full trace, or both.
rule-as-code, in the repo.
YAML or python — version rules alongside the agent. Diff them like any other change. Roll back in one commit.
audit first. enforce later.
Every rule starts in observe mode. Watch what it would have blocked. Flip to enforce when the false-positive rate is acceptable.
evidence cache.
Every decision is validated against prior tool outputs and conversation state. Agents cannot act on hallucinated data.
2 lines. Synchronous. In-process.
guard() returns a deterministic decision and writes a full audit record. Wire it into your tool dispatcher once.
01from staso.guard import guard0203decision = guard(tool_name, tool_input, context=trace)04# decision.action → "allow" | "audit" | "block"
Inbound today. Outbound next.
Current guards cover prompt injection, PII, dangerous tool calls, and cost escalation. What we're building toward.
Outbound guards · hallucination
SoonCatch confident but invented claims in agent responses. Trigger a block or a flag before the user sees the answer.
Outbound guards · false completion
SoonDetect when the agent says done but the tool output disagrees. No more shipped bugs that read like success.
Outbound guards · quality drift
SoonPer-agent baseline on response quality and reasoning depth. Alert when the current run regresses.
Closed loop with self-heal
SoonA guard violation auto-triggers a diagnose run. The resolved root cause seeds a new rule. Detection sharpens per incident.
Your agents are making decisions right now.
Add guard to one tool. See what it catches. Expand from there.