Findings
Standalone notes from projects and experiments. Some are stable references, others are living docs I update as I learn more. The yellow evolving badge marks the living ones.
Autoresearch Harness Log
[evolving] Working notes on how I use the autoresearch harness to probe agent workflows, find design holes, and decide what to experiment with next.
Agent Trace Telemetry
[evolving] What to measure about an agentic investigation loop, and how a trace explorer turns raw run data into evidence for the next prompt or harness change.
Debugging Experiment Loops
[evolving] Running observations from debugging autonomous experiment loops. What I find when I stop guessing from aggregates and trace through scoring code and spans.
Three Local Models Compared on One Investigation
Running Hermes 4 70B, Nemotron Cascade 30B, and GPT-OSS 20B against the same security investigation exposes a speed-vs-depth tradeoff that shows up clearly when tools are fast.
Agent Investigation With Query Tools
Giving a 7B model two query tools and a 5 W's output format is enough to find attacks in a raw auth.log. The architecture beats dumping the logs into the prompt.
Entity Profiling Over Anomaly Flagging
Message-centric anomaly detection flags 269,000 'rare' events on 86,000 auth.log records. Entity profiling asks a different question and produces actionable intelligence instead.
Deterministic Validation for LLM Output
Schema-based validation catches the variance an LLM data cleaner produces between runs. Pattern: deterministic where you can, LLM where you must.
Local LLM Security Agent on Consumer Hardware
Running a security investigation agent on a 16GB consumer GPU with llama.cpp, the OpenAI-compatible API, and a small 7B model.