Sandbox

The AgentCop Sandbox isolates agent execution, limiting the blast radius of a compromised or misbehaving agent.

What Is a Sandbox?

A sandbox is an isolated execution environment. The agent runs inside it; the host system runs outside. What happens inside stays inside.

The sandbox constrains three things: file system access, network access, and process spawning. If an agent is compromised — through a prompt injection, a malicious tool response, or a vulnerability in a dependency — the sandbox prevents it from using that compromise to damage the host system, exfiltrate data to arbitrary endpoints, or persist across runs.

In container mode, the container is destroyed after every run, so the next run starts clean.
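The "destroyed after every run" behaviour can be illustrated with a small analogy using only the Python standard library. This is a sketch of the idea, not how AgentCop implements it: `ephemeral_workspace` and `run_once` are hypothetical names, and a temporary directory stands in for the container.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace():
    """Yield a fresh scratch directory and delete it when the run ends."""
    with tempfile.TemporaryDirectory(prefix="agent-run-") as tmp:
        yield Path(tmp)
    # Leaving the inner `with` removes the directory and everything in it.


def run_once(marker_name: str) -> tuple[bool, Path]:
    """Simulate one run: check for leftovers from earlier runs, then leave a file behind."""
    with ephemeral_workspace() as workspace:
        marker = workspace / marker_name
        saw_leftover = marker.exists()
        marker.write_text("state left behind by a possibly compromised run")
        return saw_leftover, workspace


first_saw, first_ws = run_once("foothold.txt")
second_saw, second_ws = run_once("foothold.txt")

print(first_saw, second_saw)  # False False: the second run never sees the first run's file
print(first_ws.exists())      # False: the first workspace was deleted when its run ended
```

Anything the first run writes is gone before the second run starts, which is the property that denies a compromised agent a persistent foothold.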

Sandbox Modes

Mode	Isolation	Use case
`none`	No isolation	Development only
`process`	Process isolation	Low-risk agents
`container`	Docker container	Production agents
`vm`	Full VM isolation	High-risk / untrusted code
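
One way to use the table is as a guard that rejects a mode that is too weak for the deployment. The sketch below is illustrative only: the mode names come from the table, but `MODE_STRENGTH`, `check_mode`, and the assumption that the rows are ordered from weakest to strongest isolation are not part of the documented AgentCop API.

```python
# Assumption: the table lists modes from weakest to strongest isolation.
MODE_STRENGTH = {"none": 0, "process": 1, "container": 2, "vm": 3}


def check_mode(mode: str, *, production: bool, untrusted_code: bool = False) -> str:
    """Return `mode` if it meets the minimum for this deployment, else raise ValueError."""
    if untrusted_code:
        minimum = "vm"          # High-risk / untrusted code
    elif production:
        minimum = "container"   # Production agents
    else:
        minimum = "none"        # Development only
    if MODE_STRENGTH[mode] < MODE_STRENGTH[minimum]:
        raise ValueError(f"mode {mode!r} is weaker than required minimum {minimum!r}")
    return mode


print(check_mode("container", production=True))  # container
print(check_mode("none", production=False))      # none
```

Calling `check_mode("none", production=True)` raises `ValueError`, which makes it harder to ship a development configuration to production by accident.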

Container Sandbox

The container mode is the recommended configuration for production deployments. It provides Docker-level isolation with granular control over filesystem access, network egress, and compute resources.

```python
from agentcop import Sandbox

sandbox = Sandbox(
    mode="container",
    image="agentcop/agent-sandbox:latest",

    # Paths the agent may read, may write, and must never touch
    filesystem={
        "read_only": ["/workspace/knowledge"],
        "read_write": ["/workspace/output"],
        "blocked": ["/etc", "/var", "/root", "/home"]
    },

    # Outbound traffic is limited to the hosts listed under "allow"
    network={
        "allow": ["api.openai.com", "api.company.com"],
        "block_outbound": True,  # Block everything else
    },

    # Compute limits for the run
    resources={
        "cpu": "0.5",
        "memory": "512m",
        "timeout": 30
    }
)

# `agent` and `task` are assumed to be defined elsewhere in your application
with sandbox:
    result = agent.run(task)
    # The agent runs inside the container, isolated from the host
```
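
To make the `network` allowlist concrete, here is a minimal sketch of the kind of check an egress filter performs. It uses only the standard library and is not AgentCop's implementation: `is_egress_allowed` is a hypothetical helper, and exact-hostname matching is an assumption about how the `allow` list is interpreted.

```python
from urllib.parse import urlsplit

# Hosts taken from the "allow" list in the configuration above.
ALLOWED_HOSTS = frozenset({"api.openai.com", "api.company.com"})


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow a request only if its hostname exactly matches an allowlisted host."""
    # urlsplit().hostname is lowercased and excludes any port or user-info part.
    host = urlsplit(url).hostname
    if host is None:
        return False  # No parseable host (for example, a missing scheme): deny
    return host.rstrip(".") in allowed_hosts


print(is_egress_allowed("https://api.openai.com/v1/chat"))          # True
print(is_egress_allowed("https://api.openai.com.evil.example/x"))   # False
print(is_egress_allowed("https://api.openai.com@evil.example/x"))   # False
```

Matching the parsed hostname exactly, rather than searching the URL string for an allowed name, is what defeats the look-alike URLs in the last two calls.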

What the Sandbox Blocks

File system traversal attacks — the agent cannot read or write outside its declared paths
Exfiltration via arbitrary network calls — outbound connections are restricted to an explicit allowlist
Privilege escalation via subprocess — process spawning is blocked by default in container mode
Persistent compromise — the container is destroyed after each run; there is no persistent foothold
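
The file system rule above can be sketched as a default-deny path check. This is an illustration of the idea under stated assumptions, not AgentCop's enforcement mechanism: `is_access_allowed` is a hypothetical helper, relative paths are assumed to resolve against `/workspace`, and the normalization is purely lexical, so it does not follow symlinks the way real container mounts would.

```python
import posixpath
from pathlib import PurePosixPath

# Paths taken from the example configuration in the Container Sandbox section.
READ_ONLY = ["/workspace/knowledge"]
READ_WRITE = ["/workspace/output"]


def _within(path: PurePosixPath, roots) -> bool:
    """True if `path` is one of `roots` or sits underneath one of them."""
    return any(path == PurePosixPath(r) or PurePosixPath(r) in path.parents for r in roots)


def is_access_allowed(requested: str, write: bool = False) -> bool:
    """Default deny: allow only declared paths, after collapsing '.' and '..' segments."""
    normalized = PurePosixPath(posixpath.normpath(posixpath.join("/workspace", requested)))
    if _within(normalized, READ_WRITE):
        return True
    if not write and _within(normalized, READ_ONLY):
        return True
    return False  # Anything undeclared, including /etc or /root, is denied


print(is_access_allowed("/workspace/knowledge/faq.md"))              # True (read)
print(is_access_allowed("/workspace/knowledge/faq.md", write=True))  # False
print(is_access_allowed("output/../../etc/passwd"))                  # False
```

Comparing path components instead of string prefixes also rejects sibling directories such as `/workspace/output-evil`, which a naive `startswith` check would accept.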

Note

The sandbox is the last line of defense. It doesn't prevent an agent from making bad decisions, but it limits the consequences when the agent makes one.