Specialist agents

Twelve agents, each with one job, a narrow tool whitelist, and a short role prompt. This separation is the single most important anti-hallucination lever in the system.

Every agent lives as a markdown file in .claude/agents/ with YAML frontmatter declaring its model, tools, and color. Agents are spawned per-task with injected context; they don't see the full conversation, only what they need.
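As a rough illustration of that layout, the sketch below parses a hypothetical agent file into its frontmatter fields and role prompt. The file contents, the value formats (`haiku`, `yellow`, a comma-separated tools string), and the `parse_agent_file` helper are assumptions made for this example, not the project's actual loader. Only the three declared fields (model, tools, color) come from the description above.

```python
# Hypothetical contents of an agent file such as .claude/agents/triage-agent.md.
# The values below are illustrative assumptions, not copied from the project.
AGENT_FILE = """\
---
model: haiku
tools: Read, Write, Glob, Grep
color: yellow
---
You are the triage agent. Score each surface as promote, defer, or kill.
"""


def parse_agent_file(text: str) -> dict:
    """Split the frontmatter from the role prompt with a naive 'key: value' reader.

    This is not a full YAML parser; it only handles flat scalar fields.
    """
    # Layout expected: '---', frontmatter lines, '---', then the prompt body.
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return {
        "model": meta["model"],
        "tools": [tool.strip() for tool in meta["tools"].split(",")],
        "color": meta["color"],
        "prompt": body.strip(),
    }


agent = parse_agent_file(AGENT_FILE)
print(agent["model"], agent["tools"])
```

A real loader would more likely use a proper YAML library; the hand-rolled reader here just keeps the example free of dependencies.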

The catalog

| Agent | Mantis role | Default model | Tools |
| --- | --- | --- | --- |
| recon-agent | DISCOVER · asset discovery, fingerprinting, JS extraction, nuclei | Sonnet | Bash, Read, Write, Glob, Grep |
| triage-agent | DISCOVER · Haiku-grade surface scorer (promote / defer / kill) | Haiku | Read, Write, Glob, Grep |
| hunter-agent | REASON + TEST · specialist hunter (webapp / api / identity / network per brief) | Opus | Bash, Read, Grep, Glob, MCP |
| chain-builder | REASON · A→B kill-chain analysis | Opus | Read, Write, Bash, MCP |
| brutalist-verifier | TEST round 1 · maximum skepticism | Opus | Bash, Read, MCP |
| balanced-verifier | TEST round 2 · catch false negatives | Opus | Bash, Read, MCP |
| final-verifier | TEST round 3 · fresh PoC confirmation | Opus | Bash, MCP |
| grader | LEARN · 5-axis scoring + SUBMIT/HOLD/SKIP | Sonnet | MCP |
| report-writer | LEARN · submission-ready report under 600 words | Sonnet | Write, MCP |
| patch-writer | LEARN · suggested code-level fix per finding (advisory) | Sonnet | Read, Write, MCP |
| disclosure-sender | LEARN · gated email send to verified security contact | Sonnet | Read, Write, Bash, Gmail MCP |
| (orchestrator) | FSM driver, never tests itself | Opus | Agent spawning, MCP, Bash (whitelisted) |
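One way to picture the tool whitelists in the catalog is as a deny-by-default lookup. The sketch below copies the tool lists for the eleven spawned agents from the table; the `ALLOWED_TOOLS` mapping and the `is_tool_allowed` helper are hypothetical names introduced for illustration, and the system may enforce whitelists quite differently (for example, through the frontmatter alone).

```python
# Tool whitelists transcribed from the catalog (orchestrator omitted).
# The dict and helper names are assumptions made for this sketch.
ALLOWED_TOOLS = {
    "recon-agent": {"Bash", "Read", "Write", "Glob", "Grep"},
    "triage-agent": {"Read", "Write", "Glob", "Grep"},
    "hunter-agent": {"Bash", "Read", "Grep", "Glob", "MCP"},
    "chain-builder": {"Read", "Write", "Bash", "MCP"},
    "brutalist-verifier": {"Bash", "Read", "MCP"},
    "balanced-verifier": {"Bash", "Read", "MCP"},
    "final-verifier": {"Bash", "MCP"},
    "grader": {"MCP"},
    "report-writer": {"Write", "MCP"},
    "patch-writer": {"Read", "Write", "MCP"},
    "disclosure-sender": {"Read", "Write", "Bash", "Gmail MCP"},
}


def is_tool_allowed(agent: str, tool: str) -> bool:
    """Return True only if the tool is on the agent's whitelist.

    Unknown agents get an empty whitelist, so everything is denied by default.
    """
    return tool in ALLOWED_TOOLS.get(agent, set())


print(is_tool_allowed("recon-agent", "Bash"))   # True
print(is_tool_allowed("grader", "Bash"))        # False
print(is_tool_allowed("unknown-agent", "Read")) # False
```

Denying unknown agents outright keeps a misspelled agent name from silently gaining tool access.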

The hunter's four specialist modes

The hunter-agent reads tech_stack from its brief and adopts the matching specialist persona before testing.

| Specialist | Triggers (in tech_stack) | Focus |
| --- | --- | --- |
| webapp (default) | next, react, vue, angular, wordpress, rails, django, laravel | OWASP Top 10, IDOR, SQLi, XSS, SSRF, business logic, file upload, auth flaws |
| api | graphql, rest-api, grpc, swagger, openapi, websocket | GraphQL introspection chains, REST IDOR/auth, gRPC reflection, WebSocket origin checks |
| identity | oauth, oidc, saml, sso, jwt, auth0, okta, keycloak | SAML XSW, OAuth flow flaws, JWT alg confusion, OIDC state reuse, SSO bypass |
| network | nmap, raw IPs (rare in HTTP scope) | Service enum, CVE correlation, only when nmap is in-scope |
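The persona selection described above could be sketched roughly as follows. The trigger sets come from the table, but the `pick_specialist` function, the precedence order when several specialists match (identity, then api, then network), and the use of `ipaddress` to detect raw IPs are all assumptions made for this example. The source only states that webapp is the default.

```python
import ipaddress

# Trigger keywords transcribed from the table. webapp needs no entry because
# it is the fallback when nothing else matches.
TRIGGERS = {
    "identity": {"oauth", "oidc", "saml", "sso", "jwt", "auth0", "okta", "keycloak"},
    "api": {"graphql", "rest-api", "grpc", "swagger", "openapi", "websocket"},
    "network": {"nmap"},
}


def _is_raw_ip(entry: str) -> bool:
    """Return True if the entry parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(entry)
        return True
    except ValueError:
        return False


def pick_specialist(tech_stack: list[str]) -> str:
    """Choose a specialist persona from the brief's tech_stack entries.

    Assumed precedence: identity, api, network, then the webapp default.
    """
    entries = {entry.strip().lower() for entry in tech_stack}
    for name in ("identity", "api", "network"):
        if entries & TRIGGERS[name]:
            return name
    # Raw IPs are treated as a network trigger (an assumption of this sketch).
    if any(_is_raw_ip(entry) for entry in entries):
        return "network"
    return "webapp"


print(pick_specialist(["react", "next"]))      # webapp
print(pick_specialist(["keycloak", "react"]))  # identity
print(pick_specialist(["10.0.0.5"]))           # network
```

A fixed precedence is the simplest tie-breaker. A brief that lists both `react` and `graphql` resolves to api here, which may or may not match how the real hunter-agent decides.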

Why the separation matters

A single long-running agent will happily invent findings, inflate severity, and forget what it already tested. Splitting the work across narrow agents with tool whitelists is the single highest-leverage anti-hallucination lever in the system: