Guardrails
Guardrails let you define rules that filter agent input and output in real time. Every message passes through guardrail checks before reaching the user or the model.
How it works
- Define rules via POST /api/guardrails/rules (regex, keyword, or custom patterns).
- Each rule specifies an action: block, warn, or log.
- When a message matches a rule, the system takes the configured action and records a violation.
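
The flow above can be modeled locally in a few lines. This is an illustrative sketch, not the service's implementation: the `Rule` and `Violation` classes and the `evaluate` function are hypothetical names, and the sketch assumes that only a `block` action stops a message while `warn` and `log` merely record a violation.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str     # hypothetical identifier for the rule
    pattern: str  # regular expression searched for in the message
    action: str   # one of "block", "warn", "log"


@dataclass
class Violation:
    rule: str
    action: str


def evaluate(content: str, rules: list[Rule]) -> tuple[bool, list[Violation]]:
    """Return (allowed, violations) for a single message."""
    violations = [
        Violation(rule=r.name, action=r.action)
        for r in rules
        if re.search(r.pattern, content)
    ]
    # Assumption: a message is rejected only when a matching rule says "block".
    allowed = all(v.action != "block" for v in violations)
    return allowed, violations


rules = [
    Rule("no-ssn", r"\b\d{3}-\d{2}-\d{4}\b", "block"),
    Rule("mentions-refund", r"refund", "log"),
]

print(evaluate("I want a refund", rules))
print(evaluate("My SSN is 123-45-6789", rules))
```

Under these assumptions the first message is allowed but still produces a logged violation, while the second is rejected because a `block` rule matched.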
Rule types
| Type | Description |
|---|---|
| regex | Match content against a regular expression |
| keyword | Block or flag specific words or phrases |
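
To make the difference between the two rule types concrete, here is a minimal sketch of how each might be matched. The `matches` helper is hypothetical, and the keyword semantics shown (case-insensitive, whole-word or whole-phrase matching) are an assumption; the service may match keywords differently.

```python
import re


def matches(rule_type: str, pattern: str, content: str) -> bool:
    """Return True if content triggers a rule of the given type."""
    if rule_type == "regex":
        # The pattern is interpreted as a regular expression.
        return re.search(pattern, content) is not None
    if rule_type == "keyword":
        # Assumption: the pattern is literal text, matched on word
        # boundaries and without regard to case.
        literal = r"\b" + re.escape(pattern) + r"\b"
        return re.search(literal, content, re.IGNORECASE) is not None
    raise ValueError(f"unknown rule type: {rule_type}")


print(matches("regex", r"\d{16}", "card 4111111111111111"))
print(matches("keyword", "password", "My Password is hunter2"))
print(matches("keyword", "pass", "My password is hunter2"))
```

A regex rule suits structured data such as card or account numbers, while a keyword rule is simpler to maintain for fixed words and phrases.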
Checking content
Use POST /api/guardrails/check to evaluate content against all active rules:
curl -X POST https://your-deployment.convex.site/api/guardrails/check \
  -H 'Authorization: Bearer SESSION_TOKEN' \
  -d '{"sessionId": "...", "content": "user message", "direction": "input"}'

The response includes allowed: boolean and any violations triggered.
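
The same request can be issued from Python with only the standard library. This is a sketch under a few assumptions: `build_check_request` and `check_content` are hypothetical helper names, the `Content-Type: application/json` header is added on the assumption that the endpoint expects JSON, and the base URL is the placeholder from the curl example.

```python
import json
import urllib.request

# Placeholder deployment URL, taken from the curl example.
BASE_URL = "https://your-deployment.convex.site"


def build_check_request(token, session_id, content, direction="input",
                        base_url=BASE_URL):
    """Build (but do not send) a POST request to /api/guardrails/check."""
    body = json.dumps({
        "sessionId": session_id,
        "content": content,
        "direction": direction,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/guardrails/check",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the endpoint expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def check_content(token, session_id, content, direction="input"):
    """Send the check request and return the decoded JSON response."""
    req = build_check_request(token, session_id, content, direction)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A caller would then read the `allowed` field of the returned object, for example `if not check_content(token, session_id, text)["allowed"]: ...`, and inspect whatever violation details the response carries.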
Monitoring violations
View all violations in the dashboard under Guardrails, or query them via GET /api/guardrails/violations.