Cookbook

Working recipes for the most common Mantis flows.

Resume an interrupted run

Every Mantis phase persists state on disk. If a run halts mid-flight (Ctrl-C, context limit, system crash), just resume:

# Claude Code
/mantis resume target.com

# OpenCode
@mantis-orchestrator resume target.com

The orchestrator reads state.json, picks up at the last completed phase, and continues. If a hunter wave was in flight when the run stopped, you can force a reconciliation:

/mantis resume target.com force-merge
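You can sanity-check what the orchestrator will see before resuming. A minimal sketch; the state.json fields shown here are assumptions for the demo, not a documented schema — check your actual session directory:

```shell
# Hypothetical state.json shape (field names are assumptions, not a schema)
mkdir -p /tmp/mantis-demo
cat > /tmp/mantis-demo/state.json <<'EOF'
{"phase": "HUNT", "completed": ["RECON", "AUTH"]}
EOF

# Peek at the last recorded phase before resuming
grep -o '"phase": "[A-Z]*"' /tmp/mantis-demo/state.json
```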

Run without auth

Skip the AUTH phase entirely. Hunters test unauthenticated only.

/mantis target.com --no-auth

Triage a new target cheaply

You found a new bug-bounty program and want to know if it's worth a full run.

/mantis-fast https://target.com

Expect roughly $3-10 and about 10 minutes. If the fast tier finds something, escalate to standard or ultra.

Long-mission rare-bug hunt

Some bugs only show up after hours of exploration. Loop mode runs the full FSM, then EXPLORE, then the full FSM again, until either a findings budget or a time budget is hit.

/mantis-loop https://target.com --findings 3 --budget-min 240

Concurrent multi-target hunts

Each Mantis run owns one session directory and one MCP server process. To hunt two targets at once, use the worktree helper:

./scripts/mantis-worktree.sh target1.com   # creates ~/mantis-worktrees/target1.com on its own branch
./scripts/mantis-worktree.sh target2.com   # second worktree, independent server, independent state

# Open Claude Code or OpenCode in each worktree separately
cd ~/mantis-worktrees/target1.com && claude
# in another terminal:
cd ~/mantis-worktrees/target2.com && claude
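The helper is a thin wrapper over git worktree, which is what gives each target its own checkout and state. A sketch of the underlying mechanics in a throwaway repo; the branch and path names here are invented for the demo, not the script's real defaults:

```shell
# Demonstrate the git-worktree isolation the helper relies on
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m init

# Each worktree is an independent checkout on its own branch
git worktree add -b hunt/target1.com ../wt-target1.com
n=$(git worktree list | wc -l)
echo "$n checkouts"                      # main checkout + one worktree

# Cleanup once a hunt finishes
git worktree remove ../wt-target1.com
```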

Grading came back HOLD: now what

HOLD means the grader saw enough proof to be intriguing but not enough to SUBMIT. The feedback field in grade.json tells you what's missing. The orchestrator automatically re-runs HUNT with the grader's feedback injected, then re-runs CHAIN and VERIFY before re-grading. After two HOLD loops it stops retrying and escalates to you.
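To see what the grader wants before (or instead of) the automatic loop, read the feedback field directly. The file shape below is a demo assumption; only the feedback field is known from the docs:

```shell
# Hypothetical grade.json; everything besides "feedback" is invented
cat > /tmp/grade.json <<'EOF'
{"verdict": "HOLD", "feedback": "PoC needs a second account to prove impact."}
EOF

# Pull out just the grader's feedback
grep -o '"feedback": "[^"]*"' /tmp/grade.json
```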

Use GPT-5 for hunters, Opus for verifiers

Mixed-provider runs concentrate top-tier tokens on the highest-leverage roles.

// opencode.json
{
  "model": "openai/gpt-5",
  "agent": {
    "hunter-agent":       { "model": "openai/gpt-5" },
    "chain-builder":      { "model": "anthropic/claude-opus-4-5" },
    "brutalist-verifier": { "model": "anthropic/claude-opus-4-5" },
    "balanced-verifier":  { "model": "anthropic/claude-opus-4-5" },
    "final-verifier":     { "model": "anthropic/claude-opus-4-5" },
    "grader":             { "model": "openai/gpt-5-mini" },
    "report-writer":      { "model": "openai/gpt-5-mini" },
    "triage-agent":       { "model": "openai/gpt-5-nano" }
  }
}

Run entirely on local Ollama

If you want zero data leaving your machine (slower, lower quality):

// opencode.json
{
  "model": "ollama/llama3.3:70b",
  "provider": {
    "ollama": { "options": { "baseURL": "http://localhost:11434" } }
  }
}

Llama 3.3 70B passes our smoke tests on the hunter and recon roles. For the verifiers and chain-builder, expect meaningfully worse evidence quality than a frontier model.

Use Mantis as an investigation tool in VS Code (Cline)

You found a specific suspicious endpoint and want a structured investigation without running the full FSM:

  1. Install Mantis with --harness=opencode (which also drops .claude/agents/).
  2. In Cline, register the Mantis MCP server (see adapters/cline.md).
  3. Paste the body of .claude/agents/hunter-agent.md into the Cline chat as a system prompt.
  4. Tell Cline the surface and target. It will use the typed MCP tools to record findings, but you drive the conversation.
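Agent files typically carry YAML front matter that Cline doesn't need. A sketch of stripping it before pasting; the exact layout of hunter-agent.md is an assumption, shown here with a stand-in file:

```shell
# Stand-in agent file; real front-matter keys may differ
cat > /tmp/hunter-agent.md <<'EOF'
---
name: hunter-agent
---
You are a hunter. Probe the surface methodically.
EOF

# Delete from line 1 through the closing front-matter fence, keep the body
sed '1,/^---$/d' /tmp/hunter-agent.md
```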

Auto-disclose with a suggested patch

When you find something and want to do the right thing:

/mantis-fullsend target.com

A standard run plus two extra phases: a patch-writer that generates a suggested code-level diff, and a disclosure-sender that finds the security contact (security.txt / disclosure URL / well-known emails) and drafts an email through Gmail MCP. Hard-gated: it never sends without your final review.
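security.txt is the RFC 9116 convention, served from /.well-known/security.txt on the target. An offline sketch of the contact lookup the disclosure-sender performs (a live target would be fetched over HTTPS instead):

```shell
# Local stand-in for https://target.com/.well-known/security.txt (RFC 9116)
cat > /tmp/security.txt <<'EOF'
Contact: mailto:security@target.com
Expires: 2027-01-01T00:00:00.000Z
EOF

# Extract the disclosure contact line
grep -i '^Contact:' /tmp/security.txt
```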