Rules and policies
Rules and policies live in the dashboard under Settings -> Guard. There is no SDK configuration — the integrations evaluate whatever you have attached to your org, agents, or environments.
Static rules
Fast, deterministic, zero-config. Curated library covering common blast radii:
- Dangerous shell (rm -rf /, dd, forkbombs).
- Destructive SQL (DROP TABLE, unscoped DELETE, mass UPDATE).
- PII leakage in tool outputs.
- File-system writes to protected paths.
Sub-millisecond. No LLM calls.
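To make "deterministic, no LLM calls" concrete, here is a minimal sketch of what a static check could look like: precompiled patterns run against the tool-call payload. The rule names, the regular expressions, and the match_static_rules function are all hypothetical and are not the curated library itself. The real rules are managed in the dashboard.

```python
import re

# Hypothetical patterns for illustration only; the shipped rule library
# is not reproduced here and is likely far more thorough.
STATIC_RULES = {
    # "rm -rf /" aimed at the filesystem root (not at a subdirectory).
    "dangerous-shell": re.compile(r"\brm\s+-rf\s+/(?:\s|$)"),
    # DROP TABLE anywhere, or a DELETE FROM <table> with nothing after it.
    "destructive-sql": re.compile(
        r"\bdrop\s+table\b|\bdelete\s+from\s+\w+\s*;?\s*$",
        re.IGNORECASE,
    ),
}


def match_static_rules(payload: str) -> list[str]:
    """Return the names of every static rule whose pattern matches the payload."""
    return [name for name, pattern in STATIC_RULES.items() if pattern.search(payload)]
```

Because the patterns are compiled once and matching is plain string scanning, the same input always yields the same result, which is the property the section describes.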
LLM-judge rules
Model-based checks for semantic decisions:
- Intent drift.
- Prompt injection / jailbreak detection.
- Hallucinated facts.
- Sensitive-topic guardrails.
A few hundred ms per call. Use for high-stakes tools.
Custom rules
Define your own. Each rule specifies: matched tool names, evaluation logic (static or LLM judge), mode (audit or enforce), and action (block, modify, escalate).
Custom rules share the LLM-judge counter.
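As a mental model only, the four parts of a custom rule can be written down as a small data structure. Rules are defined in the dashboard, not in code, so every class and field name below (CustomRule, tool_names, and so on) is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Evaluation(Enum):
    STATIC = "static"
    LLM_JUDGE = "llm_judge"


class Mode(Enum):
    AUDIT = "audit"
    ENFORCE = "enforce"


class Action(Enum):
    BLOCK = "block"
    MODIFY = "modify"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class CustomRule:
    """Hypothetical shape of a custom rule; mirrors the four fields listed above."""

    name: str
    tool_names: tuple[str, ...]  # tools this rule is matched against
    evaluation: Evaluation       # static or LLM judge
    mode: Mode                   # audit or enforce
    action: Action               # block, modify, or escalate

    def applies_to(self, tool_name: str) -> bool:
        """True when the rule is configured for the given tool."""
        return tool_name in self.tool_names
```

Modeling mode and action as separate fields reflects the text: the action is chosen up front, and the mode decides whether it is actually applied.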
Policies
A bundle of rules attached to specific agents, environments, or the whole org. Roll out changes by flipping a policy instead of editing rules.
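The sketch below shows one way to think about policy attachment: gather the rules from every policy whose scope covers the current call. It assumes that org, environment, and agent policies are simply combined. This document does not specify precedence or conflict handling, and the Policy type and rules_for_call function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a named bundle of rules attached to one scope."""

    name: str
    rule_names: tuple[str, ...]
    scope: str                    # "org", "agent", or "environment"
    target: Optional[str] = None  # agent or environment name; None for org-wide


def rules_for_call(policies: list[Policy], agent: str, environment: str) -> list[str]:
    """Collect rule names from every policy attached to this call.

    Assumption: matching policies are unioned, keeping first-seen order.
    """
    selected: list[str] = []
    for policy in policies:
        attached = (
            policy.scope == "org"
            or (policy.scope == "agent" and policy.target == agent)
            or (policy.scope == "environment" and policy.target == environment)
        )
        if not attached:
            continue
        for rule_name in policy.rule_names:
            if rule_name not in selected:
                selected.append(rule_name)
    return selected
```

Under this model, detaching a policy removes all of its rules from the affected calls in one step, without touching the rules themselves.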
Audit vs enforce
Two modes that matter in practice:
- Audit: the rule fires; the call still proceeds. A guard:would-block:* span is added so you see what would have been blocked.
- Enforce: the rule fires; the call is blocked, modified, or escalated.
Start new rules in audit. Watch for false positives. Flip to enforce.
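The difference between the two modes can be sketched as a single decision function. This is an illustrative model, not the integration's code: the Outcome type and decide function are hypothetical, and using the rule id as the suffix in guard:would-block:* is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    applied_action: Optional[str]  # None means the call proceeds untouched
    spans: list[str]               # span names recorded for this evaluation


def decide(rule_id: str, mode: str, action: str, fired: bool) -> Outcome:
    """Model how a rule's mode changes what happens when the rule fires."""
    if mode not in ("audit", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not fired:
        return Outcome(applied_action=None, spans=[])
    if mode == "audit":
        # Audit: record what would have happened, but let the call through.
        # Assumption: the wildcard in the span name is filled with the rule id.
        return Outcome(applied_action=None, spans=[f"guard:would-block:{rule_id}"])
    # Enforce: the configured action (block, modify, or escalate) is applied.
    return Outcome(applied_action=action, spans=[])
```

In this model, moving a rule from audit to enforce changes only the mode argument, which is why a rule can be observed for false positives first and then switched over without being redefined.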
Permissions
Who can edit rules and policies is controlled by team roles (manage_guard_rules, manage_guard_policies).
Next
- Rules catalog — every shipping rule and judge with severity and surface.
- Actions and escalation
- Manual checks