Guardrails

Guardrails let you define rules that filter agent input and output in real time. Every message passes through guardrail checks before it reaches the model (input) or the user (output).

How it works

Define rules via POST /api/guardrails/rules (regex, keyword, or custom patterns).
Each rule specifies an action: block, warn, or log.
When a message matches a rule, the system takes the configured action and records a violation.
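The flow above can be modelled locally in a few lines. This is a minimal sketch for reasoning about rule behavior, not the service's implementation: the `Rule` fields, the case-insensitive keyword match, and the choice that only `block` makes content disallowed are all assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    # Field names are illustrative, not the service's actual schema.
    name: str
    type: str      # "regex" or "keyword"
    pattern: str
    action: str    # "block", "warn", or "log"


def matches(rule: Rule, content: str) -> bool:
    """Return True if the content triggers the rule."""
    if rule.type == "regex":
        return re.search(rule.pattern, content) is not None
    if rule.type == "keyword":
        # Assumption: keyword matching is a case-insensitive substring test.
        return rule.pattern.lower() in content.lower()
    raise ValueError(f"unknown rule type: {rule.type}")


def evaluate(content: str, rules: list[Rule]) -> dict:
    """Check content against every rule and collect the violations."""
    violations = [
        {"rule": r.name, "action": r.action}
        for r in rules
        if matches(r, content)
    ]
    # Assumption: only a "block" action makes the content disallowed;
    # "warn" and "log" record a violation but let the message through.
    allowed = all(v["action"] != "block" for v in violations)
    return {"allowed": allowed, "violations": violations}


rules = [
    Rule("no-ssn", "regex", r"\b\d{3}-\d{2}-\d{4}\b", "block"),
    Rule("mentions-refund", "keyword", "refund", "log"),
]
result = evaluate("I want a refund", rules)
```

Under these assumptions, `result` records one `log` violation and leaves the message allowed, while content matching the `no-ssn` rule would come back with `allowed` set to false.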

Rule types

Type	Description
`regex`	Match content against a regular expression
`keyword`	Match specific words or phrases

Checking content

Use POST /api/guardrails/check to evaluate content against all active rules:

curl -X POST https://your-deployment.convex.site/api/guardrails/check \
  -H 'Authorization: Bearer SESSION_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"sessionId": "...", "content": "user message", "direction": "input"}'

The response includes an `allowed` boolean and any violations that were triggered.
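For callers that prefer a script over curl, the same request could be sketched in Python with only the standard library. The URL, token, and body fields mirror the curl example; the helper names, the `Content-Type` header, and the fail-closed handling of a missing `allowed` field are assumptions made for illustration.

```python
import json
import urllib.request

# Placeholder values copied from the curl example; replace with your own.
BASE_URL = "https://your-deployment.convex.site"
SESSION_TOKEN = "SESSION_TOKEN"


def build_check_request(
    session_id: str, content: str, direction: str = "input"
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/guardrails/check request."""
    body = json.dumps(
        {"sessionId": session_id, "content": content, "direction": direction}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/guardrails/check",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {SESSION_TOKEN}",
            "Content-Type": "application/json",
        },
    )


def is_allowed(response_body: bytes) -> bool:
    """Read the documented `allowed` flag; treat a missing flag as not allowed."""
    payload = json.loads(response_body)
    return payload.get("allowed") is True


def check_content(session_id: str, content: str) -> bool:
    """Send the request and return whether the content may proceed."""
    request = build_check_request(session_id, content)
    with urllib.request.urlopen(request, timeout=10) as response:
        return is_allowed(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network. Treating a missing `allowed` field as a refusal is a conservative choice for a filter, not documented behavior.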

Monitoring violations

View all violations in the dashboard under Guardrails, or query them via GET /api/guardrails/violations.
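Querying violations from a script might look like the sketch below. Only the endpoint path comes from this page; the Bearer-token authentication is assumed to match the check endpoint, and because the response shape is not documented here, the body is returned undecoded beyond JSON parsing.

```python
import json
import urllib.request

BASE_URL = "https://your-deployment.convex.site"  # placeholder deployment URL


def build_violations_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /api/guardrails/violations request."""
    # Assumption: this endpoint accepts the same Bearer token as /check.
    return urllib.request.Request(
        f"{BASE_URL}/api/guardrails/violations",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_violations(token: str):
    """Send the request and return the decoded JSON body as-is."""
    request = build_violations_request(token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```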
