Governance Gateway — Policy-as-Code
A pre-execution policy engine that evaluates rules before an agent operation is permitted. Every request passes through the gateway, which checks it against the tenant's active policies and then allows it, denies it, or escalates it to a human-in-the-loop (HITL) gate.
Policy types
| Type | Description | Example Config |
|---|---|---|
| `rate_limit` | Max operations per time window | `{"max_requests": 600, "window_seconds": 60}` |
| `model_allowlist` | Restrict which LLM models can be used | `{"models": ["gpt-4o", "claude-3.5-sonnet"]}` |
| `content_filter` | Block queries/outputs matching regex patterns | `{"patterns": ["password", "secret"], "check_fields": ["query", "output"]}` |
| `data_scope` | Restrict which stores can be accessed | `{"allowed_stores": ["default", "production"]}` |
| `token_budget` | Cap tokens per request | `{"max_tokens_per_request": 10000}` |
| `hitl_required` | Require human approval for operations | `{"operations": ["delete", "inference"]}` |
| `pii_guard` | Detect PII patterns in outputs | `{"patterns": ["\\b\\d{3}-\\d{2}-\\d{4}\\b"]}` |
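
To make the config shapes above concrete, here is a minimal Python sketch of how three of the policy types might be checked against a request context. The function names are hypothetical, and the context keys (`model_id`, `query`, `tokens`) are borrowed from the example request in this document; the gateway's actual implementation is not documented here and may differ.

```python
import re

# Hypothetical helpers: each returns (passed, detail) for one policy config.

def check_model_allowlist(config, context):
    """Pass only when the requested model is listed in config["models"]."""
    model = context.get("model_id")
    if model in config["models"]:
        return True, ""
    return False, f"Model '{model}' not in allowlist"

def check_content_filter(config, context):
    """Fail on the first regex pattern found in any checked field."""
    # Assumption: fall back to checking "query" when check_fields is absent.
    for field in config.get("check_fields", ["query"]):
        text = context.get(field) or ""
        for pattern in config["patterns"]:
            if re.search(pattern, text):
                return False, f"Content filter match in '{field}': pattern '{pattern}'"
    return True, ""

def check_token_budget(config, context):
    """Fail when the request's token count exceeds the per-request cap."""
    tokens = context.get("tokens", 0)
    limit = config["max_tokens_per_request"]
    if tokens > limit:
        return False, f"{tokens} tokens exceeds limit of {limit}"
    return True, ""

context = {"model_id": "gpt-3.5-turbo", "query": "What is the password?", "tokens": 500}
allowlist_result = check_model_allowlist({"models": ["gpt-4o", "claude-3.5-sonnet"]}, context)
filter_result = check_content_filter(
    {"patterns": ["password", "secret"], "check_fields": ["query", "output"]}, context
)
budget_result = check_token_budget({"max_tokens_per_request": 10000}, context)
print(allowlist_result, filter_result, budget_result)
```

Returning a `(passed, detail)` pair keeps each check independent of the configured action, so the same check can back a `deny`, `escalate`, or `log_only` policy.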
Actions
| Action | Behavior |
|---|---|
| `deny` | Block the request. Return 403. |
| `escalate` | Pause for human-in-the-loop approval. |
| `log_only` | Allow the request but log a warning. |
| `allow` | Explicitly allow (overrides lower-priority denials). |
Evaluation flow
- Request arrives with `operation` (e.g. `write`, `retrieve`, `inference`) and `context`
- All enabled policies are loaded, ordered by priority (lower number = higher priority)
- Each policy is evaluated against the context
- Per-policy result is logged to `governance_evaluation_log`
- If any policy returns `denied` or `escalated`, the request is blocked
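
The flow above can be sketched in Python as follows. This is an illustration under stated assumptions, not the gateway's code: the `Policy` dataclass, the `fires` callback, and the in-memory `log` list (standing in for `governance_evaluation_log`) are all hypothetical. The sketch also does not model the override behavior of the `allow` action, since the source does not specify how that interacts with the blocking rule.

```python
from dataclasses import dataclass
from typing import Callable

# Result label produced when a policy with the given action fires.
RESULT_FOR_ACTION = {
    "deny": "denied",
    "escalate": "escalated",
    "log_only": "logged",
    "allow": "allowed",
}

@dataclass
class Policy:
    name: str
    priority: int                    # lower number = higher priority
    action: str                      # deny | escalate | log_only | allow
    enabled: bool
    fires: Callable[[dict], bool]    # hypothetical: True when the rule matches

def evaluate(policies, operation, context, log):
    """Evaluate enabled policies in priority order and log each result."""
    active = sorted((p for p in policies if p.enabled), key=lambda p: p.priority)
    evaluations = []
    for policy in active:
        result = RESULT_FOR_ACTION[policy.action] if policy.fires(context) else "allowed"
        entry = {"policy_name": policy.name, "operation": operation, "result": result}
        log.append(entry)            # stand-in for governance_evaluation_log
        evaluations.append(entry)
    blocked = any(e["result"] in ("denied", "escalated") for e in evaluations)
    return {"allowed": not blocked, "evaluations": evaluations}

policies = [
    Policy("Content Filter", 20, "deny", True, lambda c: "password" in c.get("query", "")),
    Policy("Model Allowlist", 10, "deny", True, lambda c: c.get("model_id") != "gpt-4o"),
    Policy("PII Output Guard", 30, "log_only", True, lambda c: False),
    Policy("Disabled Rule", 1, "deny", False, lambda c: True),
]
log = []
outcome = evaluate(
    policies, "inference",
    {"model_id": "gpt-3.5-turbo", "query": "What is the password?"}, log,
)
print(outcome["allowed"], [e["policy_name"] for e in outcome["evaluations"]])
```

Every enabled policy is evaluated even after the first denial, which matches the per-policy logging step: the log then shows all rules that fired, not just the first one.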
```bash
# Evaluate policies against a simulated request
curl -X POST https://cloud.grafomem.com/v1/governance/evaluate \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "operation": "inference",
    "context": {
      "model_id": "gpt-3.5-turbo",
      "query": "What is the password?",
      "tokens": 500
    }
  }'
```
Response:
```json
{
  "allowed": false,
  "evaluations": [
    {"policy_name": "Model Allowlist", "result": "denied", "detail": "Model 'gpt-3.5-turbo' not in allowlist"},
    {"policy_name": "Content Filter", "result": "denied", "detail": "Content filter match in 'query': pattern 'password'"}
  ],
  "summary": {"total": 3, "allowed": 1, "denied": 2, "escalated": 0, "logged": 0}
}
```
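
A client will usually want to know which policies blocked a request. The Python sketch below pulls those out of a response body shaped like the one above; the helper name `blocking_policies` is hypothetical, and the field names are taken from the sample response only.

```python
import json

def blocking_policies(response_body: str):
    """Return (policy_name, detail) for every denied or escalated evaluation."""
    data = json.loads(response_body)
    return [
        (e["policy_name"], e.get("detail", ""))
        for e in data["evaluations"]
        if e["result"] in ("denied", "escalated")
    ]

sample = """
{
  "allowed": false,
  "evaluations": [
    {"policy_name": "Model Allowlist", "result": "denied",
     "detail": "Model 'gpt-3.5-turbo' not in allowlist"},
    {"policy_name": "Content Filter", "result": "denied",
     "detail": "Content filter match in 'query': pattern 'password'"}
  ],
  "summary": {"total": 3, "allowed": 1, "denied": 2, "escalated": 0, "logged": 0}
}
"""
blockers = blocking_policies(sample)
for name, detail in blockers:
    print(f"{name}: {detail}")
```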
API reference
Policy CRUD
| Method | Path | Description |
|---|---|---|
| POST | `/v1/governance/policies` | Create a new policy |
| GET | `/v1/governance/policies` | List all policies |
| GET | `/v1/governance/policies/{id}` | Get a single policy |
| PUT | `/v1/governance/policies/{id}` | Update a policy |
| DELETE | `/v1/governance/policies/{id}` | Delete a policy |
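
As a rough illustration of the create endpoint, the Python sketch below builds (but does not send) a `POST /v1/governance/policies` request with the standard library. The request body schema is not documented in this section: the field names `name`, `type`, `action`, `config`, and `priority` are assumptions inferred from the portal's create-policy form and the evaluation flow, so check them against the real API before use.

```python
import json
import urllib.request

BASE_URL = "https://cloud.grafomem.com"

def build_create_policy_request(api_key, name, policy_type, action, config, priority=100):
    """Build a POST request for /v1/governance/policies without sending it."""
    # Assumed body shape; the actual schema may use different field names.
    body = {
        "name": name,
        "type": policy_type,
        "action": action,
        "config": config,
        "priority": priority,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/governance/policies",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_create_policy_request(
    "example-key", "Token Budget", "token_budget", "deny",
    {"max_tokens_per_request": 10000},
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```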
Evaluation & Monitoring
| Method | Path | Description |
|---|---|---|
| POST | `/v1/governance/evaluate` | Evaluate all policies against a request |
| POST | `/v1/governance/seed-defaults` | Seed default policies (rate limit + PII guard) |
| GET | `/v1/governance/stats` | Summary statistics |
| GET | `/v1/governance/policy-types` | List available policy types with config schemas |
| GET | `/v1/governance/logs` | Evaluation log (filterable by `policy_id`, `result`) |
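
The log endpoint's filters can be illustrated with a small URL builder in Python. It assumes the filters are passed as query parameters named `policy_id` and `result`, which the table suggests but does not state outright; the helper name `logs_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://cloud.grafomem.com"

def logs_url(policy_id=None, result=None):
    """Build the /v1/governance/logs URL, including only the filters that are set."""
    # Assumption: filters are plain query parameters with these names.
    params = {}
    if policy_id is not None:
        params["policy_id"] = policy_id
    if result is not None:
        params["result"] = result
    query = urlencode(params)
    url = f"{BASE_URL}/v1/governance/logs"
    return f"{url}?{query}" if query else url

print(logs_url(result="denied"))
print(logs_url(policy_id="abc123", result="escalated"))
```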
Default policies
Calling `POST /v1/governance/seed-defaults` creates two policies:
- Default Rate Limit: 600 requests per minute (action: `deny`)
- PII Output Guard: detects SSN, credit card, and IBAN patterns in outputs (action: `log_only`)
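
To show what the two defaults amount to, the Python sketch below pairs a fixed-window rate limiter with an SSN check. Both are assumptions for illustration: the source does not say whether the gateway counts requests in fixed or sliding windows, and only the SSN regex (from the `pii_guard` example config) appears in this document, so the credit card and IBAN patterns are left out.

```python
import re

DEFAULT_RATE_LIMIT = {"max_requests": 600, "window_seconds": 60}
SSN_PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"   # from the pii_guard example config

class FixedWindowRateLimiter:
    """Counts requests per window; the window resets once it has elapsed."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now):
        """Return True if a request at time `now` (seconds) fits the budget."""
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count >= self.max_requests:
            return False             # a `deny` policy would block here
        self.count += 1
        return True

def contains_ssn(output_text):
    """True when the output contains something shaped like a US SSN."""
    return re.search(SSN_PATTERN, output_text) is not None

limiter = FixedWindowRateLimiter(**DEFAULT_RATE_LIMIT)
results = [limiter.allow(now=0.0) for _ in range(601)]
print(results.count(True), results[-1], limiter.allow(now=60.0))
print(contains_ssn("SSN 123-45-6789 on file"))
```

Because the PII guard's action is `log_only`, a match like the one above would be recorded as a warning while the request itself is still allowed.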
Portal UI
The Governance tab in the Cloud Portal provides:
- Stats dashboard — active policies, total evaluations, denied, escalated
- Create policy form — name, type picker, action selector, JSON config editor
- Policy table — with ✓ON/✕OFF toggles and delete buttons
- Test evaluation panel — simulate requests and see which policies fire, color-coded results
- Evaluation logs — time, policy, operation, result, detail