Rules and policies

Rules and policies live in the dashboard under Settings -> Guard. There is no SDK configuration: the integrations evaluate whatever rules and policies are attached to your org, agents, or environments.

Static rules

Fast, deterministic, and zero-config. A curated library covers common blast radii:

Dangerous shell (rm -rf /, dd, fork bombs).
Destructive SQL (DROP TABLE, unscoped DELETE, mass UPDATE).
PII leakage in tool outputs.
File-system writes to protected paths.

Sub-millisecond. No LLM calls.
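
To give a feel for why static rules are cheap, here is a minimal sketch of a pattern-based shell check. The patterns and the function name `is_dangerous_shell` are hypothetical and written for illustration; they are not the shipped rule library, which is likely broader and more careful.

```python
import re

# Hypothetical patterns for illustration only; not the shipped rule library.
DANGEROUS_SHELL = [
    # rm with both -r and -f flags aimed at the filesystem root
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+/(\s|$)"),
    # dd writing directly onto a device node
    re.compile(r"\bdd\s+.*\bof=/dev/"),
    # the classic bash fork bomb
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\};\s*:"),
]


def is_dangerous_shell(command: str) -> bool:
    """Return True if any pattern matches. Pure regex, no model call."""
    return any(pattern.search(command) for pattern in DANGEROUS_SHELL)
```

Because the check is a handful of precompiled regular expressions, the same input always gives the same verdict, which is what makes this class of rule deterministic.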

LLM-judge rules

Model-based checks for semantic decisions:

Intent drift.
Prompt injection / jailbreak detection.
Hallucinated facts.
Sensitive-topic guardrails.

Each call adds a few hundred milliseconds, so reserve these rules for high-stakes tools.
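
The latency trade-off can be pictured as a two-tier check: every call goes through the cheap static tier, and only high-stakes tools pay for the judge. This is a conceptual sketch, not how the integrations are implemented. The tool names, `check_call`, and the `static_check` and `judge` callables are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical tool names chosen for the example.
HIGH_STAKES_TOOLS = {"payments.refund", "db.execute"}

Check = Callable[[str, dict], bool]  # returns True when the call looks safe


def check_call(tool: str, args: dict, static_check: Check, judge: Check) -> bool:
    """Run the cheap static tier first; consult the judge only for high-stakes tools."""
    if not static_check(tool, args):
        return False  # rejected by a static rule, no model call needed
    if tool in HIGH_STAKES_TOOLS:
        return judge(tool, args)  # slow path: a model-based decision
    return True
```

In the product, which rules run for which tool is decided by what you attach in the dashboard; the sketch only shows why it makes sense to scope judge rules narrowly.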

Custom rules

Define your own. Each rule specifies: matched tool names, evaluation logic (static or LLM judge), mode (audit or enforce), and action (block, modify, escalate).
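
The four parts of a rule can be pictured as a small record. The sketch below is only a mental model: rules are defined in the dashboard, and the class, field names, and glob-style tool matching shown here are assumptions, not a product API.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Evaluator(Enum):
    STATIC = "static"
    LLM_JUDGE = "llm_judge"


class Mode(Enum):
    AUDIT = "audit"
    ENFORCE = "enforce"


class Action(Enum):
    BLOCK = "block"
    MODIFY = "modify"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class CustomRule:
    name: str
    tool_patterns: tuple[str, ...]  # assumption: tools are matched by glob pattern
    evaluator: Evaluator
    mode: Mode
    action: Action

    def applies_to(self, tool_name: str) -> bool:
        """True if the tool name matches any of the rule's patterns."""
        return any(fnmatchcase(tool_name, p) for p in self.tool_patterns)
```

For example, `CustomRule("no-prod-sql", ("sql.*",), Evaluator.STATIC, Mode.AUDIT, Action.BLOCK)` would apply to a tool named `sql.execute` but not to `shell.run`.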

Custom rules share the LLM-judge counter.

Policies

A bundle of rules attached to specific agents, environments, or the whole org. Roll out changes by flipping a policy instead of editing rules.
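
One way to think about attachment is as a lookup: given the agent and environment of a call, collect the rules from every policy attached to that context. The sketch below assumes that applicable policies combine as a simple union. That is an assumption for illustration, as are the `Policy` fields and the `active_rules` function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    rules: tuple[str, ...]  # names of the rules bundled in this policy
    scope: str              # "org", "agent", or "environment"
    target: str = ""        # agent or environment name; unused for "org"


def active_rules(policies: list[Policy], agent: str, environment: str) -> list[str]:
    """Union of rule names from every attached policy, in first-seen order."""
    seen: dict[str, None] = {}
    for policy in policies:
        attached = (
            policy.scope == "org"
            or (policy.scope == "agent" and policy.target == agent)
            or (policy.scope == "environment" and policy.target == environment)
        )
        if attached:
            for rule in policy.rules:
                seen.setdefault(rule, None)  # dict keys keep insertion order
    return list(seen)
```

Under this model, detaching a policy from an environment removes its rules from that environment's calls without touching the rule definitions themselves.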

Audit vs enforce

Two modes that matter in practice:

Audit — the rule fires; the call still proceeds. A guard:would-block:* span is added so you see what would have been blocked.
Enforce — the rule fires; the call is blocked, modified, or escalated.

Start new rules in audit. Watch for false positives. Flip to enforce.
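
The difference between the two modes comes down to what happens after a rule fires. The sketch below is a simplified model with hypothetical names (`Outcome`, `apply_rule`). It assumes the `*` in the span name stands for the rule name, and it models enforce as a plain block, leaving out modify and escalate.

```python
from dataclasses import dataclass, field


@dataclass
class Outcome:
    allowed: bool
    spans: list[str] = field(default_factory=list)


def apply_rule(rule_name: str, mode: str, fired: bool) -> Outcome:
    """Decide whether a call proceeds, given the rule's mode and whether it fired."""
    if not fired:
        return Outcome(allowed=True)
    if mode == "audit":
        # The call proceeds, but a span records what would have been blocked.
        return Outcome(allowed=True, spans=[f"guard:would-block:{rule_name}"])
    if mode == "enforce":
        # Simplification: treat every enforce action as a block.
        return Outcome(allowed=False)
    raise ValueError(f"unknown mode: {mode!r}")
```

While a rule is in audit, false positives show up as would-block spans on calls that should have been allowed, which gives you something concrete to review before flipping it to enforce.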

Permissions

Who can edit rules and policies is controlled by team roles (manage_guard_rules, manage_guard_policies).

Rules catalog — every shipping rule and judge with severity and surface.
Actions and escalation
Manual checks