Core Concepts
AgentCop's security model spans three layers: static analysis before deploy, behavioral monitoring at runtime, and execution gating before every tool call.
Three-Layer Architecture
No single security control is sufficient against adversarial agents. AgentCop stacks three independent layers, so an attacker who defeats one still faces two more.
┌─────────────────────────────────────────────────┐
│ Layer 1: Scanner (Pre-Deploy)                   │
│ AST analysis · OWASP LLM Top 10 · Trust Score   │
├─────────────────────────────────────────────────┤
│ Layer 2: Monitor (Runtime)                      │
│ Behavioral analysis · Anomaly detection         │
├─────────────────────────────────────────────────┤
│ Layer 3: Gate (Execution)                       │
│ Permission layer · Approval boundaries          │
└─────────────────────────────────────────────────┘
Concept Pages
Each layer and supporting component of the security model has a dedicated concept page. Start with the Scanner Engine and work down.
Scanner Engine
AST-based static analysis that maps your agent code to the OWASP LLM Top 10.
Runtime Monitor
Behavioral analysis that watches what your agent does, not just what it says.
Execution Gate
The enforcement layer that blocks unauthorized tool calls before they run.
Permission Layer
Declarative rules that define what your agent is and isn't allowed to do.
Sandbox
Isolated execution environment that limits blast radius of agent actions.
Approval Boundaries
Human-in-the-loop checkpoints for high-risk operations.
Badge System
Embeddable trust scores that signal your agent's security posture.
Why Three Layers?
Each layer catches a different class of threat. None of them is complete on its own.
- Layer 1 (Scanner) catches insecure code patterns at write-time — before the agent ships. It finds hardcoded secrets, injection-prone prompt templates, and dangerous function calls. But it cannot see what inputs arrive at runtime.
- Layer 2 (Monitor) watches behavior during execution. A prompt injection attack looks like normal code until it runs. The monitor detects behavioral anomalies — unusual tool sequences, unexpected outbound data, sudden scope changes — that static analysis is blind to.
- Layer 3 (Gate) enforces permissions at the moment of action. Even if an attacker bypasses Layers 1 and 2, the Gate prevents unauthorized tool calls from completing. It is the last mechanical barrier before consequences occur.
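To make the Layer 1 idea concrete, here is a minimal sketch of AST-based scanning built on Python's standard `ast` module. It is not AgentCop's scanner: the rule sets `DANGEROUS_CALLS` and `SECRET_HINTS`, the helper names, and the sample agent source are all assumptions chosen for illustration.

```python
import ast

# Hypothetical rule sets for illustration; not AgentCop's actual rules.
DANGEROUS_CALLS = {"eval", "exec", "os.system"}
SECRET_HINTS = ("key", "secret", "token", "password")


def call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.system' for a call node, or ''."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def scan(source: str) -> list[str]:
    """Walk the AST and report dangerous calls and hardcoded string secrets."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and call_name(node) in DANGEROUS_CALLS:
            findings.append(f"line {node.lineno}: dangerous call {call_name(node)}")
        elif isinstance(node, ast.Assign):
            # Only string literals assigned to secret-looking names are flagged.
            is_literal = isinstance(node.value, ast.Constant) and isinstance(
                node.value.value, str
            )
            for target in node.targets:
                if is_literal and isinstance(target, ast.Name) and any(
                    hint in target.id.lower() for hint in SECRET_HINTS
                ):
                    findings.append(f"line {node.lineno}: hardcoded secret {target.id}")
    return findings


# Sample agent code with one hardcoded secret and one dangerous call.
AGENT_SOURCE = '''
API_KEY = "sk-test-123"
def run(user_input):
    return eval(user_input)
'''

findings = scan(AGENT_SOURCE)
for finding in findings:
    print(finding)
```

The sketch flags the `API_KEY` literal and the `eval` call, but it has no way of knowing what `user_input` will contain at runtime. That is the gap the Monitor and the Gate are meant to cover.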
An attacker who successfully injects a malicious prompt still faces the Gate's permission rules and the Sandbox's isolation. Compromise at one layer does not cascade automatically — which is what defense in depth means in practice.
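The interplay between monitoring and gating can be sketched in a few lines of Python. Everything here is hypothetical: the `POLICY` format, the `Monitor` class, and the tool names are assumptions for illustration and do not reflect AgentCop's actual API or rule syntax.

```python
from dataclasses import dataclass, field

# Hypothetical declarative policy; unknown tools are denied by default.
POLICY = {
    "search_docs": {"allow": True},
    "send_email": {"allow": True, "needs_approval": True},
    "delete_file": {"allow": False},
}


@dataclass
class Monitor:
    """Layer 2 stand-in: flags tool transitions absent from a baseline."""
    baseline: set[tuple[str, str]]
    previous: str = "START"
    alerts: list[str] = field(default_factory=list)

    def observe(self, tool: str) -> None:
        step = (self.previous, tool)
        if step not in self.baseline:
            self.alerts.append(f"unusual sequence: {self.previous} -> {tool}")
        self.previous = tool


def gate(tool: str, approved: bool = False) -> None:
    """Layer 3 stand-in: raise unless the policy permits this call."""
    rule = POLICY.get(tool, {"allow": False})
    if not rule["allow"]:
        raise PermissionError(f"{tool} is not permitted")
    if rule.get("needs_approval") and not approved:
        raise PermissionError(f"{tool} requires human approval")


def call_tool(monitor: Monitor, tool: str, approved: bool = False) -> str:
    monitor.observe(tool)  # the attempt is recorded whether or not it runs
    gate(tool, approved)   # the gate decides whether it runs at all
    return f"{tool} executed"


monitor = Monitor(baseline={("START", "search_docs"), ("search_docs", "send_email")})
results = [call_tool(monitor, "search_docs")]

# Simulate an injected instruction: an unexpected delete, then an unapproved email.
for tool in ("delete_file", "send_email"):
    try:
        results.append(call_tool(monitor, tool))
    except PermissionError as err:
        results.append(f"blocked: {err}")

print(results)
print(monitor.alerts)
```

In this sketch the monitor and the gate fail independently. The monitor raises alerts on the unfamiliar sequence without stopping anything, while the gate blocks both calls without needing to know why they were attempted.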
Defense in depth isn't paranoia. It's the only thing that works against adversarial prompts.