# Cookbook
Working recipes for the most common Mantis flows.
## Resume an interrupted run
Every Mantis phase persists state on disk. If a run halts mid-flight (Ctrl-C, context limit, system crash), just resume:
```sh
# Claude Code
/mantis resume target.com

# OpenCode
@mantis-orchestrator resume target.com
```
The orchestrator reads `state.json`, picks up at the last completed phase, and continues. If a hunter wave was in flight when the run stopped, you can force a reconciliation:
```sh
/mantis resume target.com force-merge
```
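Before resuming, it can help to peek at where the run stopped. A minimal sketch, assuming `state.json` is flat JSON with a top-level `phase` field — the real schema and session path may differ, so check your own session directory:

```sh
# Inlined sample of a (hypothetical) state.json checkpoint.
state='{"phase": "HUNT", "wave": 2, "completed": ["RECON", "MAP"]}'

# Pull out the last recorded phase with sed.
phase=$(printf '%s' "$state" | sed -n 's/.*"phase": *"\([A-Z_]*\)".*/\1/p')
echo "last recorded phase: $phase"
```

If the phase looks wrong or stale, that is a hint a hunter wave was mid-flight and `force-merge` is worth trying.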
## Run without auth
Skip the AUTH phase entirely. Hunters test unauthenticated only.
```sh
/mantis target.com --no-auth
```
## Triage a new target cheaply
You found a new bug-bounty program and want to know if it's worth a full run.
```sh
/mantis-fast https://target.com
```
Expect roughly $3-10 and about 10 minutes. If the fast tier finds something, escalate to a standard or ultra run.
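That escalation decision can be scripted. A hedged sketch, assuming fast runs leave a JSON findings array in the session directory (the filename, path, and schema here are guesses — adapt to your layout):

```sh
# Inlined stand-in for a (hypothetical) findings.json from a fast run.
findings='[{"id": 1, "severity": "high"}, {"id": 2, "severity": "low"}]'

# Crude count: one "id" key per finding.
count=$(printf '%s' "$findings" | grep -o '"id"' | wc -l | tr -d ' ')

decision="stop"
[ "$count" -gt 0 ] && decision="escalate to standard or ultra"
echo "$decision"
```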
## Long-mission rare-bug hunt
Some bugs only show up after hours of exploration. Loop mode runs the full FSM, then EXPLORE, then the full FSM again, until either the findings budget or the time budget is hit.
```sh
/mantis-loop https://target.com --findings 3 --budget-min 240
```
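The stopping rule can be sketched in shell. This is illustrative only — the orchestrator's real loop is not shown here, and the two stub functions just simulate progress so the control flow is visible:

```sh
# Budgets matching the command above.
target_findings=3
budget_min=240

findings=0
elapsed_min=0

run_fsm()     { findings=$((findings + 1)); }          # stub: one full FSM pass
run_explore() { elapsed_min=$((elapsed_min + 60)); }   # stub: one EXPLORE pass

# Keep cycling FSM -> EXPLORE until either budget trips.
while [ "$findings" -lt "$target_findings" ] && [ "$elapsed_min" -lt "$budget_min" ]; do
  run_fsm
  run_explore
done
echo "stopped with $findings findings after ${elapsed_min}m"
```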
## Concurrent multi-target hunts
Each Mantis run owns one session directory and one MCP server process. To hunt two targets at once, use the worktree helper:
```sh
./scripts/mantis-worktree.sh target1.com   # creates ~/mantis-worktrees/target1.com on its own branch
./scripts/mantis-worktree.sh target2.com   # second worktree, independent server, independent state

# Open Claude Code or OpenCode in each worktree separately
cd ~/mantis-worktrees/target1.com && claude

# in another terminal:
cd ~/mantis-worktrees/target2.com && claude
```
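Under the hood this is plain `git worktree`: one branch and one checkout directory per target, so each run gets independent files and state. A simplified sketch against a throwaway repo (the branch naming and directory layout here are assumptions — `scripts/mantis-worktree.sh` owns the real flags):

```sh
# Throwaway repo so the sketch is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

target="target1.com"
# One new branch + one checkout dir per target (names are illustrative).
git worktree add -b "hunt/$target" "../wt-$target" >/dev/null 2>&1
git worktree list
```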
## Grading came back HOLD: now what?
HOLD means the grader saw enough proof to be intriguing but not enough to SUBMIT. The `feedback` field in `grade.json` tells you what's missing. The orchestrator automatically re-runs HUNT with the grader's feedback injected, then re-runs CHAIN + VERIFY before re-grading. It allows at most two HOLD loops; after that it escalates to you.
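The re-grade loop described above can be sketched as a small state machine. Purely illustrative — the verdicts and the two-loop cap come from the text, but the stubbed phase calls stand in for the real orchestrator:

```sh
verdict="HOLD"
holds=0

# Re-run HUNT (with grader feedback), CHAIN, VERIFY, then re-grade,
# at most twice before escalating to the human.
while [ "$verdict" = "HOLD" ] && [ "$holds" -lt 2 ]; do
  holds=$((holds + 1))
  # ... re-run HUNT / CHAIN / VERIFY here ...
  # Stub: pretend the second pass produces enough proof to SUBMIT.
  if [ "$holds" -eq 2 ]; then verdict="SUBMIT"; fi
done

if [ "$verdict" = "HOLD" ]; then
  echo "escalate to the human"
else
  echo "verdict: $verdict"
fi
```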
## Use GPT-5 for hunters, Opus for verifiers
Mixed-provider runs concentrate top-tier tokens on the highest-leverage roles.
```jsonc
// opencode.json
{
  "model": "openai/gpt-5",
  "agent": {
    "hunter-agent": { "model": "openai/gpt-5" },
    "chain-builder": { "model": "anthropic/claude-opus-4-5" },
    "brutalist-verifier": { "model": "anthropic/claude-opus-4-5" },
    "balanced-verifier": { "model": "anthropic/claude-opus-4-5" },
    "final-verifier": { "model": "anthropic/claude-opus-4-5" },
    "grader": { "model": "openai/gpt-5-mini" },
    "report-writer": { "model": "openai/gpt-5-mini" },
    "triage-agent": { "model": "openai/gpt-5-nano" }
  }
}
```
## Run entirely on local Ollama
If you want zero data leaving your machine (slower, lower quality):
```jsonc
// opencode.json
{
  "model": "ollama/llama3.3:70b",
  "provider": {
    "ollama": { "options": { "baseURL": "http://localhost:11434" } }
  }
}
```
Llama 3.3 70B passes our smoke tests on the hunter and recon roles. For the verifiers and chain-builder, you'll see meaningfully worse evidence quality vs. a frontier model.
## Use Mantis as an investigation tool in VS Code (Cline)
You've found a specific suspicious endpoint and want a structured investigation without running the full FSM:
- Install Mantis with `--harness=opencode` (which also drops `.claude/agents/`).
- In Cline, register the Mantis MCP server (see `adapters/cline.md`).
- Paste the body of `.claude/agents/hunter-agent.md` into the Cline chat as a system prompt.
- Tell Cline the surface and target. It will use the typed MCP tools to record findings, but you drive the conversation.
## Auto-disclose with a suggested patch
When you find something and want to do the right thing:
```sh
/mantis-fullsend target.com
```
Runs the standard pipeline, plus a patch-writer phase that generates a suggested code-level diff, plus a disclosure-sender phase that finds the security contact (security.txt / disclosure URL / well-known emails) and drafts an email through Gmail MCP. Hard-gated: it never sends without your final review.
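The contact lookup leans on `security.txt` (RFC 9116), which lives at `/.well-known/security.txt` and lists `Contact:` fields. A hedged sketch of pulling the first contact out of a fetched file — the real disclosure-sender's logic may differ, and the file content here is an inlined example:

```sh
# Inlined sample security.txt (in practice, fetched from the target).
securitytxt='Contact: mailto:security@example.com
Expires: 2026-12-31T23:59:59Z
Policy: https://example.com/security-policy'

# Take the first Contact: field, per RFC 9116 ordering.
contact=$(printf '%s\n' "$securitytxt" | sed -n 's/^Contact: *//p' | head -n 1)
echo "disclosure contact: $contact"
```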