Getting Started

Your First Scan

Walk through scanning a real AI agent, understanding the results, and making your first fix.

The patient: AgentBob

Meet AgentBob — a LangChain ReAct agent we use throughout the docs to demonstrate what not to do. He's deliberately written with four common security vulnerabilities that AgentCop catches. Save the code below as agentbob.py and follow along.

⚠

AgentBob is an intentionally insecure example. Do not deploy or run this code in a production environment. It exists solely to demonstrate vulnerability patterns.

python

# agentbob.py — our cautionary tale
# AgentBob is a LangChain ReAct agent with several security issues
# We'll use him throughout the docs to show what NOT to do

import os
from langchain.agents import initialize_agent, AgentType
from langchain.tools import ShellTool, DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
tools = [ShellTool(), DuckDuckGoSearchRun()]

# Issue 1: No execution gate on shell tool
# Issue 2: API key hardcoded (visible in plaintext)
OPENAI_API_KEY = "sk-proj-abc123..."  # LLM06: Sensitive data exposure

def run_agent(user_input: str):
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

    # Issue 3: Prompt injection — user input concatenated directly into prompt
    prompt = f"You are a helpful assistant. User request: {user_input}"

    # Issue 4: eval() on agent output — LLM02
    result = agent.run(prompt)
    return eval(result)  # 💀 never do this

The four issues AgentBob has — and the OWASP LLM categories they map to — are:

LLM06 — Sensitive Information Disclosure: the OpenAI API key is hardcoded in plaintext. Anyone with read access to the file or repo can steal it.
LLM01 — Prompt Injection: user_input is interpolated directly into the prompt string with an f-string. A malicious user can inject instructions that override the system prompt.
LLM02 — Insecure Output Handling: the agent's output is passed directly to eval(). If the LLM returns malicious Python, it will be executed.
LLM08 — Unreviewed External Actions: ShellTool is registered with no execution gate, meaning the agent can run arbitrary shell commands with no human approval step.

Running the scan

With AgentBob saved locally, run AgentCop against him:

terminal

$ agentcop scan agentbob.py

scanning agentbob.py...

Trust Score: 23/100 [CRITICAL RISK]

4 issues found:

[CRITICAL] LLM06 · Sensitive Information Disclosure

Line 14: Hardcoded API key detected

Pattern: sk-proj-[a-zA-Z0-9]{20,}

Fix: Use environment variables — os.getenv('OPENAI_API_KEY')

[HIGH] LLM01 · Prompt Injection

Line 19: f-string interpolation injects user_input into prompt

Fix: Use parameterized templates, sanitize user input

[HIGH] LLM02 · Insecure Output Handling

Line 22: eval() called on LLM-generated output

Fix: Never eval() LLM output. Parse/validate explicitly.

[MEDIUM] LLM08 · Unreviewed External Actions

Line 8: ShellTool used without execution gate

Fix: Wrap with AgentCop ExecutionGate

Full report: https://agentcop.live/scan/agentbob-demo

Reading the results

Trust Score

The Trust Score is a 0–100 integer that summarises the overall security posture of your agent. It is calculated by subtracting weighted penalties for each finding from a baseline of 100.

Range	Rating	Meaning
80–100	Low Risk	Safe for production with standard monitoring
60–79	Moderate Risk	Address high-severity findings before wide deployment
40–59	High Risk	Multiple serious vulnerabilities — do not deploy
0–39	Critical Risk	Immediate exploits possible — do not run in any shared environment

Severity levels

Each issue is tagged with one of five severity levels:

CRITICAL: Immediate exploit risk. Hard-stops CI with --min-score. Fix before any execution.
HIGH: Significant security gap. Fix before any shared or production use.
MEDIUM: Notable weakness. Fix before public or internet-facing deployment.
LOW: Minor concern or a defence-in-depth improvement. Address when convenient.
INFO: Informational observation. No immediate action required.

OWASP LLM categories

Every finding is mapped to the OWASP LLM Top 10 — the industry-standard taxonomy for AI/LLM security risks. The prefix (LLM01, LLM06, etc.) tells you which category a finding falls under so you can look up deeper context, CVEs, and mitigation patterns. See the full OWASP LLM Top 10 reference and the CWE Mapping for cross-framework coverage.

Fixing AgentBob

Here is the patched version of AgentBob with all four issues resolved. Each fix is annotated with what changed and why.

python

# agentbob_fixed.py — AgentBob learns from his mistakes
import os
from langchain.agents import initialize_agent, AgentType
from langchain.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
# from agentcop import ExecutionGate, PermissionLayer  # Coming in runtime release

llm = ChatOpenAI(model="gpt-4")  # Uses OPENAI_API_KEY from env

# Fix 1: No shell tool — remove dangerous capabilities
# Fix 2: API key from environment
tools = [DuckDuckGoSearchRun()]

def run_agent(user_input: str):
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

    # Fix 3: Parameterized prompt — user input is data, not instructions
    prompt = "Answer the following question: {question}"

    result = agent.run(prompt.format(question=user_input))

    # Fix 4: No eval() — return result directly
    return result

A summary of the four fixes applied:

Remove ShellTool — if your agent does not need shell access, do not give it shell access. Principle of least privilege.
API key from environment — ChatOpenAI reads OPENAI_API_KEY from the environment by default. Never commit credentials to source control.
Parameterized prompt — by placing user input into a named slot ({question}), it is treated as data rather than instructions. This prevents prompt injection attacks from hijacking the agent's behaviour.
Remove eval() — return the result string directly. If you need to parse structured output from an LLM, use json.loads() with explicit validation against a schema, never eval().

Re-scanning after fixes

Save the fixed file and re-run the scan to verify your changes improved the score:

terminal

$ agentcop scan agentbob_fixed.py

scanning agentbob_fixed.py...

Trust Score: 85/100 [LOW RISK]

1 issue found:

[INFO] LLM08 · Unreviewed External Actions

Line 8: DuckDuckGoSearchRun makes external network requests

Fix: Consider wrapping with ApprovalBoundary for sensitive deployments

Full report: https://agentcop.live/scan/agentbob-fixed-demo

Score jumped from 23 to 85. The remaining INFO-level finding flags the search tool making external network requests — a reasonable observation to note in a threat model, but not a blocker for most deployments. Adding an Approval Boundary around the tool would eliminate it entirely.

Embed the badge

Share your agent's Trust Score by embedding the AgentCop badge in your README. The badge updates automatically as you re-scan.

markdown

[![AgentCop Score](https://agentcop.live/api/badge/YOUR_SCAN_ID)](https://agentcop.live/scan/YOUR_SCAN_ID)

Replace YOUR_SCAN_ID with the scan ID shown at the end of your CLI output or in the report URL. The badge renders as an SVG with the score and a colour-coded risk rating. See the Badge System docs for dynamic badge options, CI integration, and badge API reference.