January 9, 2026 · Agent-Driven Discovery

Five Personas, One Dataset: How Different Agents Find Different Insights

Same data, different perspectives. We built 5 Explorer personas and watched them find completely different insights from identical datasets.

multi-agent · llm · prompt-engineering · local-llm · personas

I gave the same dataset to five AI analysts with different personalities: a Statistician, a Storyteller, a Detective, a Contrarian, and a generalist. They found completely different insights from identical data. This post shows what each one surfaced and why persona design matters more than I expected.

The Personas

I created 5 distinct Explorer personalities, each with their own definition of what makes a "good" insight:

Persona      | Focus                 | What They Look For
Default      | General exploration   | Non-obvious patterns, interesting correlations
Statistician | Statistical rigor     | Strong correlations, p-values, sample sizes
Storyteller  | Human narratives      | The "so what?", who it affects, memorable findings
Detective    | Anomalies             | Outliers, suspicious patterns, things that don't fit
Contrarian   | Challenge assumptions | Counter-examples, exceptions to obvious rules

Each persona has distinct personality traits and exploration tips. The Statistician "demands statistical evidence, not just patterns." The Detective is "suspicious - something is always off, you just need to find it."

The Architecture

The implementation is a config dict + template function rather than 5 separate prompts:

EXPLORER_PERSONAS = {
    "statistician": {
        "name": "The Statistician",
        "description": "a rigorous data analyst obsessed with statistical significance",
        "personality": [
            "Rigorous: You demand statistical evidence, not just patterns",
            "Skeptical: Correlation is not causation, and you never forget it",
            # ...
        ],
        "good_insight": [
            "Statistically significant: Strong correlations (>0.5), clear distributions",
            "Quantified: Exact percentages, means, standard deviations",
            # ...
        ],
        # ...
    },
    # ... other personas
}

def get_explorer_system_prompt(persona: str) -> str:
    config = EXPLORER_PERSONAS.get(persona)
    # Assemble the prompt from the config fields (simplified sketch of the template)
    parts = [f"You are {config['name']}, {config['description']}."]
    parts.append("Personality:\n" + "\n".join(f"- {t}" for t in config["personality"]))
    parts.append("What makes a good insight:\n" + "\n".join(f"- {c}" for c in config["good_insight"]))
    return "\n\n".join(parts)

This keeps things DRY and makes adding new personas easy.
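
As a quick illustration, adding a persona is just another config entry. The "historian" persona below is hypothetical, not one of the five:

# Hypothetical example: a new persona is just another dict entry
EXPLORER_PERSONAS["historian"] = {
    "name": "The Historian",
    "description": "an analyst obsessed with trends and change over time",
    "personality": ["Longitudinal: You always ask how a pattern evolved"],
    "good_insight": ["Temporal: Clear trends, turning points, before/after contrasts"],
}

print(get_explorer_system_prompt("historian"))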

Same Data, Different Insights

I ran all 5 personas on the TMDB movies dataset. Same 4,803 movies, same tools, same Skeptic reviewing them. Here's what each one found:

Default (the original Explorer)

"The correlation between budget and revenue varies by genre, with Action and Adventure movies showing stronger correlations compared to Animation."

Generic correlation analysis. Finds something, but nothing surprising.

Statistician

"For movies produced by different companies, those with higher budgets tend to generate higher revenues, with some exceptions based on the company's past success rate. Correlation between budget and revenue is 0.731 (very strong positive correlation) across all movies."

Leads with the exact correlation coefficient. Notes the sample size (4,803). Methodical.

Storyteller

"The correlation between budget and revenue varies by production company and country, indicating that some entities are better at generating revenue from their investments."

Frames it in terms of who succeeds and why. Less about the numbers, more about the implications.

Detective

"High-budget movies from larger production companies tend to generate higher revenue compared to lower-budget films, while smaller companies show less correlation between budget and revenue, suggesting other factors contribute to their success."

Notes the anomaly: why don't smaller companies follow the same pattern? Opens more questions.

Contrarian

"High revenue movies do not always correlate with high budgets, particularly in documentaries, music films, and movies produced by Blumhouse Productions. Documentaries have an average revenue-to-budget ratio of 1.31, indicating low budgets for high revenue."

This one is the most interesting. While everyone else confirmed budget correlates with revenue, the Contrarian found the exceptions: genres and studios that succeed without big budgets.

Efficiency: Focused Beats General

Here's what surprised me. The specialized personas were consistently more efficient:

Persona      | First-Round Approval | Avg Tool Calls | Avg Time
Statistician | 100%                 | 3.8            | 25.7s
Storyteller  | 100%                 | 4.3            | 25.6s
Contrarian   | 100%                 | 5.5            | 26.4s
Detective    | 50%                  | 6.0            | 33.9s
Default      | 50%                  | 15.0           | 69.4s

The default Explorer uses ~4x more tool calls on average. Why? It's exploring more broadly without clear direction. The specialized personas know what they're looking for and converge faster.

This matches intuition: a clear goal helps focus investigation.

Cross-Dataset: Earthquake Data

To verify this wasn't dataset-specific, I ran the same comparison on earthquake data (1826-2026, global seismic events).

Statistician

"Deep earthquakes along the Pacific Ring of Fire are predominantly found in the southern hemisphere, particularly in the subduction zones between -50 and -15 latitude."

Geographic distribution with specific coordinates.

Detective

"The distribution of outliers in the depth column, which represent 12.56% of the total depth values, shows that these unusual earthquake depths have a wider range compared to the main distribution."

Found the anomalies and quantified them.

Contrarian

"For earthquakes, the negative correlation between magnitude and frequency varies significantly based on the earthquake type, with tectonic earthquakes showing a stronger correlation than aseismic earthquakes."

Challenged the assumption that magnitude-frequency relationships are universal.

Different data, same pattern: each persona surfaces something the others miss.

The Hallucination Problem (Again)

The 7B model sometimes hallucinates insights from the wrong dataset. I'd added column validation to the Skeptic prompt to catch these, and it worked for most cases.
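
The idea behind that check is to hand the Skeptic the dataset's actual column list so anything else can be flagged. A minimal sketch of the shape (the function name and wording here are assumptions, not the real prompt):

def build_skeptic_column_check(columns: list[str]) -> str:
    # Sketch: inject the real column list into the Skeptic prompt so it can
    # reject insights that reference fields the dataset doesn't have
    return (
        "The dataset contains ONLY these columns: " + ", ".join(columns) + ". "
        "Reject any insight that references a column not in this list as a hallucination."
    )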

This round exposed a new edge case. Here's the Detective analyzing earthquake data:

EXPLORER proposes: "The presence of a significant number of earthquakes
                    with depths greater than 100km..."
SKEPTIC asks: "What is the expected range of depth values?"
EXPLORER proposes: "The positive correlation between budget and revenue
                    suggests that investing..."
SKEPTIC REJECTS: "Hallucination - references non-existent data"

Mid-exploration, the Detective started talking about budget and revenue. On earthquake data.

The mandatory questioning architecture saved us. The Skeptic caught it and rejected. The Explorer recovered:

EXPLORER proposes: "The distribution of outliers in the depth column,
                    representing 12.56% of total depth values..."
SKEPTIC asks: "What is the distribution of depth values for the outliers?"
EXPLORER expands with specific evidence
SKEPTIC approves

Without the mandatory question-answer round, this hallucinated insight might have been approved. The Skeptic refactor continues to pay dividends.
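
For reference, the shape of that round, as a minimal sketch with hypothetical names (not the actual implementation): the Skeptic cannot issue a verdict until it has asked a question and heard the answer.

# Minimal sketch of the mandatory question-answer round (hypothetical names)
def review(explorer, skeptic, proposal: str) -> bool:
    question = skeptic.ask(proposal)    # Skeptic must question before any verdict
    answer = explorer.answer(question)  # Explorer must defend with evidence
    return skeptic.judge(proposal, answer) == "approve"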

Running Personas

The CLI supports persona selection and comparison:

# Single persona
python run.py --dataset movies.csv --persona statistician

# Compare specific personas
python run.py --dataset movies.csv --persona statistician contrarian

# Run all 5 for full comparison
python run.py --dataset movies.csv --persona all
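
Under the hood this is ordinary argument parsing; a sketch of how the flag might be wired (assumed, not the actual run.py):

import argparse

# Sketch of the persona flag (assumed wiring, not the actual run.py)
parser = argparse.ArgumentParser()
parser.add_argument("--dataset", required=True)
parser.add_argument("--persona", nargs="+", default=["default"],
                    help="one or more persona names, or 'all'")
args = parser.parse_args()

personas = list(EXPLORER_PERSONAS) if args.persona == ["all"] else args.persona
for persona in personas:
    run_exploration(args.dataset, persona)  # hypothetical entry point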

Each persona gets its own run ID (2026-01-09-100445-statistician) and the analysis script breaks down metrics by persona.
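
The per-persona breakdown is straightforward once the persona is encoded in the run ID; a rough sketch under an assumed file layout and field names:

import json
from collections import defaultdict
from pathlib import Path

# Sketch: group per-run metrics by the persona suffix of the run ID
# (directory layout and field names are assumptions)
by_persona = defaultdict(list)
for metrics_file in Path("runs").glob("*/metrics.json"):
    run_id = metrics_file.parent.name  # e.g. 2026-01-09-100445-statistician
    persona = run_id.split("-")[-1]
    by_persona[persona].append(json.loads(metrics_file.read_text()))

for persona, runs in by_persona.items():
    avg_calls = sum(r["tool_calls"] for r in runs) / len(runs)
    print(f"{persona}: {avg_calls:.1f} avg tool calls across {len(runs)} runs")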

What I Noticed

1. Focused prompts seemed to outperform general ones

The specialized personas consistently used fewer tool calls than the default. A clear "what makes a good insight" definition for each persona type seemed to help the model converge faster. I'm not sure if this generalizes, but it matched what I expected.

2. Different perspectives surfaced different insights

This seems obvious in retrospect. Of course a Contrarian would find budget-doesn't-matter cases while everyone else confirms budget-matters. But seeing it empirically was satisfying.

3. Multiple layers of hallucination detection helped

Column validation in the Skeptic caught most hallucinations. But the mandatory questioning architecture provided a safety net when hallucinations slipped through the first check.

4. The 4x efficiency gap surprised me

The tool call difference between Default and specialized personas was larger than I expected. Broad exploration seems expensive. If the persona knows what it's looking for, it finds it faster.

The Takeaway

Running the same data through 5 different lenses and comparing results turned out to be useful. Different questions lead to different answers. I'm curious what would happen if the personas could see each other's insights and build on them. I explored that idea in Building a Self-Correcting Multi-Agent System.