Sandbox

The AgentCop Sandbox isolates agent execution, limiting the blast radius of a compromised or misbehaving agent.

What Is a Sandbox?

A sandbox is an isolated execution environment. The agent runs inside it; the host system runs outside. What happens inside stays inside.

The sandbox constrains three things: file system access, network access, and process spawning. If an agent is compromised — through a prompt injection, a malicious tool response, or a vulnerability in a dependency — the sandbox prevents it from using that compromise to damage the host system, exfiltrate data to arbitrary endpoints, or persist across runs.

In container mode, the container is destroyed after every run, so the next run starts clean.
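The "destroyed after every run" property can be illustrated with a small analogy. This sketch does not use AgentCop or Docker; it uses a temporary directory from the Python standard library as a stand-in for the container, and the names `ephemeral_workspace` and `run_once` are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def ephemeral_workspace():
    """Yield a fresh scratch directory and delete it when the run ends."""
    with tempfile.TemporaryDirectory(prefix="agent-run-") as path:
        yield path


def run_once(write_marker: bool) -> tuple[str, list[str]]:
    """Simulate one run; return the workspace path and what the run found in it."""
    with ephemeral_workspace() as workspace:
        seen = os.listdir(workspace)  # state visible at the start of the run
        if write_marker:
            # Stand-in for a compromised run trying to leave something behind.
            with open(os.path.join(workspace, "foothold.txt"), "w") as f:
                f.write("left behind by a compromised run")
        return workspace, seen


first_path, _ = run_once(write_marker=True)
second_path, seen_by_second = run_once(write_marker=False)

print(os.path.exists(first_path))  # the first workspace is gone
print(seen_by_second)              # the second run starts empty
```

Whatever the first run wrote is deleted with its workspace, so the second run has nothing to inherit. A real container adds process and network isolation on top of this; the sketch only shows the lifecycle.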

Sandbox Modes

Mode        Isolation            Use case
none        No isolation         Development only
process     Process isolation    Low-risk agents
container   Docker container     Production agents
vm          Full VM isolation    High-risk / untrusted code
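One way to keep mode selection consistent across a codebase is to derive it from a declared risk level instead of hard-coding strings. The sketch below encodes the table above; the `Risk` enum and `choose_sandbox_mode` helper are hypothetical and not part of AgentCop. Only the four mode strings come from this page.

```python
from enum import Enum


class Risk(Enum):
    """Hypothetical risk levels, one per row of the modes table."""
    DEVELOPMENT = "development"
    LOW = "low"
    PRODUCTION = "production"
    UNTRUSTED = "untrusted"


# Mode strings taken from the table; the mapping keys are an assumption.
MODE_FOR_RISK = {
    Risk.DEVELOPMENT: "none",
    Risk.LOW: "process",
    Risk.PRODUCTION: "container",
    Risk.UNTRUSTED: "vm",
}


def choose_sandbox_mode(risk: Risk) -> str:
    """Return the sandbox mode string for a given risk level."""
    return MODE_FOR_RISK[risk]


print(choose_sandbox_mode(Risk.PRODUCTION))
```

The returned string could then be passed as the `mode` argument when constructing a sandbox.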

Container Sandbox

The container mode is the recommended configuration for production deployments. It provides Docker-level isolation with granular control over filesystem access, network egress, and compute resources.

python
from agentcop import Sandbox

sandbox = Sandbox(
    mode="container",
    image="agentcop/agent-sandbox:latest",

    filesystem={
        "read_only": ["/workspace/knowledge"],
        "read_write": ["/workspace/output"],
        "blocked": ["/etc", "/var", "/root", "/home"]
    },

    network={
        "allow": ["api.openai.com", "api.company.com"],
        "block_outbound": True,  # Block everything else
    },

    resources={
        "cpu": "0.5",
        "memory": "512m",
        "timeout": 30
    }
)

# `agent` and `task` are assumed to be defined elsewhere in your application.
with sandbox:
    # The agent executes inside the container, isolated from the host.
    result = agent.run(task)
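To make the intent of the `network` block concrete, here is a minimal sketch of how an allowlist with `block_outbound` might be evaluated. It is not AgentCop's implementation: the `is_egress_allowed` function is hypothetical, and exact hostname matching is an assumption (this page does not say whether subdomains or wildcards are supported).

```python
def is_egress_allowed(host: str, network: dict) -> bool:
    """Decide whether an outbound connection to `host` would be permitted.

    Assumes exact, case-insensitive hostname matching against `allow`.
    """
    normalized = host.strip().lower().rstrip(".")
    allowed = {entry.lower() for entry in network.get("allow", [])}
    if normalized in allowed:
        return True
    # Hosts not on the allowlist pass only when block_outbound is off.
    return not network.get("block_outbound", False)


# Same values as the `network` block in the configuration above.
network_policy = {
    "allow": ["api.openai.com", "api.company.com"],
    "block_outbound": True,
}

for candidate in ["api.openai.com", "attacker.example", "api.openai.com.attacker.example"]:
    print(candidate, is_egress_allowed(candidate, network_policy))
```

Matching on the full hostname, not on a prefix or substring, is what keeps a lookalike such as `api.openai.com.attacker.example` from slipping through.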

What the Sandbox Blocks

  • File system traversal attacks — the agent cannot read or write outside its declared paths
  • Exfiltration via arbitrary network calls — outbound connections are restricted to an explicit allowlist
  • Privilege escalation via subprocess — process spawning is blocked by default in container mode
  • Persistent compromise — the container is destroyed after each run; there is no persistent foothold
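The first item in the list, path traversal, comes down to normalizing a requested path before comparing it with the declared roots. The following is a sketch under stated assumptions, not AgentCop's enforcement code: `classify_access` is a hypothetical helper, it denies by default, and it only normalizes path strings. A real sandbox would also need to account for symlinks, which the container's mount configuration typically handles.

```python
import posixpath


def _is_within(path: str, root: str) -> bool:
    """True if `path` is `root` itself or lies beneath it."""
    root = root.rstrip("/")
    return path == root or path.startswith(root + "/")


def classify_access(path: str, filesystem: dict) -> str:
    """Return "read_write", "read_only", or "denied" for a requested path."""
    # Collapse "." and ".." so traversal sequences cannot hide the real target.
    normalized = posixpath.normpath(path)
    if not normalized.startswith("/"):
        return "denied"  # relative paths are rejected in this sketch
    if any(_is_within(normalized, root) for root in filesystem.get("blocked", [])):
        return "denied"
    if any(_is_within(normalized, root) for root in filesystem.get("read_write", [])):
        return "read_write"
    if any(_is_within(normalized, root) for root in filesystem.get("read_only", [])):
        return "read_only"
    return "denied"  # anything undeclared is denied by default


# Same values as the `filesystem` block in the container configuration.
filesystem_policy = {
    "read_only": ["/workspace/knowledge"],
    "read_write": ["/workspace/output"],
    "blocked": ["/etc", "/var", "/root", "/home"],
}

print(classify_access("/workspace/output/report.txt", filesystem_policy))
print(classify_access("/workspace/output/../../etc/passwd", filesystem_policy))
```

The second call normalizes to `/etc/passwd`, which falls under a blocked root even though the raw string starts with a writable path.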

Note

The sandbox is the last line of defense. It doesn't prevent an agent from making bad decisions — but it limits the consequences when it does.