The pipeline and the verdict model¶
Every request flows through a staged spine — ingress guards → route → execute
→ egress guards → finalize — compiled per route at startup from ordered node
lists in aegis.yaml. A code-level escape hatch accepts a fully custom
StateGraph for the cases configuration cannot express.
flowchart LR
REQ([Request]) --> AUTH{Auth}
AUTH -- no principal --> R401([401])
AUTH --> IN[Ingress guards]
IN -- block --> REF([Refused + audit])
IN -- require_approval --> HOLD([Paused — HITL])
HOLD -- approved --> RT
IN -- allow / sanitize --> RT[Route]
RT -- no compliant provider --> REF
RT --> EX[Execute]
EX -- tool call --> TG[Tool-call guard] --> MCP[MCP tool] --> TR[Tool-result guard] --> EX
EX --> EG[Egress guards]
EG -- block --> REF
EG -- allow / sanitize --> RESP([Response + audit])
RunState: the contract between nodes¶
Nodes communicate only through a typed RunState: run id, Principal,
messages, free-form labels (how packs talk without importing each other —
the classifier writes labels.classification, the residency pack reads it),
the mask map (a channel never serialized into model-visible messages), the
selected route, an append-only event log, and usage/cost accumulators. A
third-party node and a built-in node are indistinguishable to the runtime.
Four verdicts¶
Every guard returns exactly one of:
- allow — continue unchanged.
- sanitize — continue with mutated state (e.g. masked text). Sanitize deltas compose in configured order.
- block — terminal. The stage short-circuits; the client gets a refusal; the audit log gets the reason.
- require_approval — the run checkpoints and pauses via a LangGraph
interrupt; a human resumes it with allow or block. Approval authority is
itself policy: the approver's
Principalis checked againstapprovers:.
Every verdict — including allow — is an audit event. "Which guard let this
through?" is always answerable.
Tool traffic is governed in both directions¶
A tool call is model output (it can exfiltrate masked data or take a dangerous action) and a tool result is untrusted input (the prime prompt-injection vector). So the execute stage splices the same guardrail contract into both directions: a tool-call guard (argument scan, exfiltration check, per-tool approval) and a tool-result guard (injection scan). RAG-retrieved content is untrusted for the same reason and flows through the same tool-result chain.