Built | Python | llama.cpp | FastAPI | WebSockets | OSV API | Claude Code Skills

Security Agent

A lesson in knowing when NOT to build an agent.

$ cat story.md

This started when I realized we'd built this site on a vulnerable version of Next.js. As a cybersecurity defender, I wanted to understand how attackers might use AI agents.

I built it all: a ReAct loop, 12 security tools, a real-time dashboard, and CVE lookups via the OSV API. The agent ran, called tools, and found real CVEs. It also reported 100+ hallucinated findings about files that didn't exist.
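For context, a CVE lookup against the OSV API can be sketched in a few lines. This is a minimal illustration, not the project's actual tool: the helper names (`build_osv_query`, `extract_vuln_ids`, `query_osv`) are hypothetical, and the default `npm` ecosystem is an assumption based on the Next.js dependency mentioned above. The endpoint and request shape follow OSV's public `v1/query` API.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version, ecosystem="npm"):
    """Build the JSON body for an OSV query on one package version."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def extract_vuln_ids(response):
    """Return the advisory IDs from a decoded OSV response.

    OSV omits the "vulns" key when nothing matches, so default to [].
    """
    return [vuln["id"] for vuln in response.get("vulns", [])]


def query_osv(name, version, ecosystem="npm", timeout=10.0):
    """POST the query to OSV and return the matching advisory IDs."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_vuln_ids(json.load(resp))
```

Because the lookup is a plain HTTP call with a fixed request shape, it needs no model in the loop, which is part of why this piece survived the pivot.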

After fighting context limits, hallucinations, and duplicate findings, I asked: could a deterministic script do this? Could Claude Code do this as a skill? Yes to both. The 'agent' added complexity without adding value.
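Two of those problems, findings about files that don't exist and duplicate findings, are exactly what a deterministic check handles well. The sketch below assumes a hypothetical finding format (a dict with `file` and `rule` keys) and a hypothetical `filter_findings` helper; it is not code from the project.

```python
from pathlib import Path


def filter_findings(findings, repo_root):
    """Drop findings for nonexistent files and repeated (file, rule) pairs."""
    root = Path(repo_root)
    seen = set()
    kept = []
    for finding in findings:
        key = (finding["file"], finding["rule"])
        if key in seen:
            continue  # duplicate of a finding already kept
        if not (root / finding["file"]).is_file():
            continue  # the reported file is not in the repository
        seen.add(key)
        kept.append(finding)
    return kept
```

A filter like this only removes findings that are provably invalid. It cannot tell whether a finding about a real file is itself correct.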

Final output: a simple Claude Code skill that does dependency audits and code pattern scanning. No ReAct loop, no local LLM, no dashboard. It just works.
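Code pattern scanning of this kind can be as simple as running a fixed set of regular expressions over each line. The rules and names below (`PATTERNS`, `scan_text`) are illustrative assumptions, not the skill's actual rule set.

```python
import re

# Hypothetical example rules; a real scanner would carry a vetted rule set.
PATTERNS = {
    "eval-call": re.compile(r"\beval\s*\("),
    "hardcoded-secret": re.compile(
        r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
}


def scan_text(text, filename="<memory>"):
    """Return one finding per (line, rule) match, in line order."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"file": filename, "line": lineno, "rule": rule})
    return findings
```

Every finding points at a line that was actually read, so the scanner cannot report a file that doesn't exist. It can still produce false positives when a pattern is too broad.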

$ git status

Project complete. Built a working security agent with a ReAct loop, 12 tools, and a real-time dashboard. The local 7B model hallucinated too many findings to be useful, so I pivoted to a Claude Code skill that does the job reliably. Key lesson: agents aren't always the answer.

$ ls ./components

Tool Framework

ReAct Loop

Safety Layer

CVE Monitoring

Web Dashboard

Claude Code Skill

$ cat learning-goals.txt

- ReAct pattern implementation from scratch
- Tool-augmented agent design
- FastAPI + WebSockets for real-time UIs
- When to use agents vs deterministic tools
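
For readers unfamiliar with the ReAct pattern named in the first goal, the loop alternates between a model step that picks an action and a tool step that produces an observation. The sketch below is a generic illustration under stated assumptions, not the project's implementation: `react_loop` is a hypothetical name, `model` is any callable that maps a prompt string to a JSON string, and the JSON action format is invented for this example.

```python
import json


def react_loop(model, tools, task, max_steps=5):
    """Run a reason/act/observe loop until the model gives a final answer.

    Assumed reply format (hypothetical):
      {"type": "tool", "tool": "<name>", "args": {...}}
      {"type": "final", "answer": "<text>"}
    """
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        reply = model("\n".join(transcript))
        action = json.loads(reply)
        if action["type"] == "final":
            return action["answer"]
        tool = tools.get(action["tool"])
        if tool is None:
            observation = f"unknown tool: {action['tool']}"
        else:
            observation = tool(**action.get("args", {}))
        # Feed the action and its result back so the next step can use them.
        transcript.append(f"Action: {reply}")
        transcript.append(f"Observation: {observation}")
    return None  # step budget exhausted without a final answer
```

The transcript grows with every step, which is one way a loop like this runs into the context limits described in the story above.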

$ ls ./blog/ --project="Security Agent"

I Built an AI Security Agent That Hallucinated 100 Vulnerabilities

Jan 16, 2026

Built a ReAct agent with 12 tools in 3-4 hours. It found real CVEs but invented 100 fake vulnerabilities. A Claude Code skill does the job better.

