Security Agent
A lesson in knowing when NOT to build an agent.
$ cat story.md
This started when I realized we'd built this site on a vulnerable version of Next.js. As a cybersecurity defender, I wanted to understand how attackers might use AI agents.
I built it all: a ReAct loop, 12 security tools, a real-time dashboard, and CVE lookups via the OSV API. The agent ran, called tools, and found real CVEs. It also reported more than 100 hallucinated findings about files that didn't exist.
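For context, a CVE lookup against OSV can be quite small. The sketch below is not the project's actual tool code; it assumes the public OSV `v1/query` endpoint and the `npm` ecosystem, and the function names (`build_osv_query`, `extract_vuln_ids`, `query_osv`) are hypothetical. The request building and response parsing are kept separate from the network call so they can be checked offline.

```python
import json
import urllib.request

# Public OSV query endpoint (assumed here; the project's own wrapper is not shown).
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version, ecosystem="npm"):
    """Build the JSON payload for a single package/version lookup."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def extract_vuln_ids(response):
    """Return the sorted advisory IDs from an OSV response dict.

    OSV omits the "vulns" key when nothing matches, so default to an empty list.
    """
    return sorted(v["id"] for v in response.get("vulns", []))


def query_osv(name, version, ecosystem="npm", timeout=10.0):
    """POST the query to OSV and return the matching advisory IDs."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_vuln_ids(json.load(resp))
```

Calling `query_osv("next", installed_version)` with the version from the lockfile would return whatever advisory IDs OSV lists for that release. Nothing in this path requires a model, which is part of why the lookup itself was never the source of the hallucinations.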
After fighting context limits, hallucinations, and duplicate findings, I asked: could a deterministic script do this? Could Claude Code do this as a skill? Yes to both. The 'agent' added complexity without adding value.
Final output: a simple Claude Code skill that does dependency audits and code pattern scanning. No ReAct loop, no local LLM, no dashboard. It just works.
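To illustrate why a deterministic approach avoids the phantom-file problem, here is a minimal sketch of code pattern scanning in Python. It is not the skill itself: the rule names, the regular expressions, and the file suffixes are all assumptions chosen for a Next.js-style codebase. The point is that every finding comes from a line that was actually read from disk.

```python
import re
from pathlib import Path

# Hypothetical rule set; a real audit would need a broader, tuned list.
PATTERNS = {
    "eval-call": re.compile(r"\beval\s*\("),
    "dangerous-inner-html": re.compile(r"dangerouslySetInnerHTML"),
    "hardcoded-secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(text, path="<memory>"):
    """Return (path, line_number, rule) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((path, lineno, rule))
    return findings


def scan_tree(root, suffixes=(".js", ".jsx", ".ts", ".tsx")):
    """Scan source files under root, skipping anything inside node_modules."""
    findings = []
    for file in sorted(Path(root).rglob("*")):
        if not file.is_file() or file.suffix not in suffixes:
            continue
        if "node_modules" in file.parts:
            continue
        text = file.read_text(encoding="utf-8", errors="replace")
        findings.extend(scan_text(text, str(file)))
    return findings
```

Running `scan_tree(".")` from the project root yields a list of tuples that can be printed or handed to Claude Code for triage. Regex rules like these will produce false positives, but they cannot report a file that does not exist.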
$ git status
Project complete. Built a working security agent with ReAct loop, 12 tools, and real-time dashboard. The 7B model hallucinated too many findings to be useful. Pivoted to a Claude Code skill that does the job reliably. Key lesson: agents aren't always the answer.
$ ls ./components
- ReAct pattern implementation from scratch
- Tool-augmented agent design
- FastAPI + WebSockets for real-time UIs
- When to use agents vs deterministic tools