
The Eight Deliberation Modes, Explained

Saad Kadri · March 5, 2026 · 6 min read

Consilium ships eight deliberation modes. They're not marketing skins on the same logic — each one is a distinct state machine in the deliberation graph with a different round count, transition shape, and judge behavior. Picking the right one matters. Picking the wrong one wastes spend or produces a thinner answer than the topic deserves. Here's what each mode actually does, when to reach for it, and what it costs.

The lineup

| Mode | Rounds | Median latency | Best for |
|------|--------|----------------|----------|
| quick | 1 | ~15s | Fast lookup-style questions |
| council | 3 | ~45s | Default for most reasoning tasks |
| deep | 5 | ~90s | Architecture-level decisions |
| blind | 3 | ~45s | Reducing model-anchor bias |
| redteam | 1 (attack/defend cycle) | ~60–120s | Security & vulnerability review |
| jury | 3 | ~60s | Ranked-choice on multiple options |
| market | 5 | ~90s | Forecasting / probability claims |
| auto | varies | depends on routed mode | When you don't want to choose |

Round counts are the actual values in MAX_ROUNDS_BY_MODE in apps/agents/src/features/deliberation/deliberation_graph.py.
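The mapping itself is small. A sketch of what it plausibly looks like — the constant name and the values come from the table above, but the exact shape of the dict and the `max_rounds` helper are assumptions, not the real source:

```python
# Round caps per mode, as documented in the table above. "auto" is absent
# on purpose: it routes to one of these modes before any rounds run.
MAX_ROUNDS_BY_MODE = {
    "quick": 1,
    "council": 3,
    "deep": 5,
    "blind": 3,
    "redteam": 1,  # one attack/defend cycle
    "jury": 3,
    "market": 5,
}

def max_rounds(mode: str) -> int:
    """Look up the round cap, failing loudly on unknown or unrouted modes."""
    try:
        return MAX_ROUNDS_BY_MODE[mode]
    except KeyError:
        raise ValueError(f"unknown or unrouted mode: {mode!r}") from None
```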

quick — single round, no debate

Each model produces one independent answer; the judge picks the best one. There is no cross-examination round. This is the cheapest mode and the only mode where you are essentially getting a ranked best-of-N from your council. Use it when you want multiple model perspectives but the question is simple enough that none of them is going to challenge another's answer productively.
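With no cross-examination, the judge's job reduces to an argmax over independent answers. A minimal sketch of that best-of-N selection, where `score` is a stand-in for whatever scoring the judge actually applies (the function name and signature are assumptions):

```python
from typing import Callable

def judge_quick(
    answers: dict[str, str],
    score: Callable[[str], float],
) -> tuple[str, str]:
    """Pick the best of N independent answers; returns (model, answer).

    No debate rounds happen here -- each answer is scored in isolation.
    """
    best = max(answers, key=lambda model: score(answers[model]))
    return best, answers[best]
```

Usage with a toy scorer (length, purely for illustration):

```python
answers = {"model_a": "short", "model_b": "a much more thorough answer"}
model, answer = judge_quick(answers, score=len)  # picks model_b
```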

council — the default

Three rounds: independent analysis, cross-examination, rebuttal-and-refinement. The judge runs the 5-phase synthesis on the round-3 outputs. This is what we recommend for most non-trivial reasoning tasks: long enough to surface disagreement, short enough not to drag.

deep — five rounds with sub-agent research

Adds two more rounds beyond council, plus optional sub-agent research for context-heavy questions. Worth the extra spend when the topic is genuinely contested and the room is split after round 3. Convergence detection still applies, so a deep-mode debate that locks in early ends early — you only pay for the rounds you needed.
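Convergence detection can be as simple as checking whether every pair of model outputs has drifted close enough together. A toy sketch using Jaccard similarity over token sets — the real implementation almost certainly uses something richer, so treat the metric and threshold as illustrative assumptions:

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two outputs, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def converged(outputs: list[str], threshold: float = 0.8) -> bool:
    """True when every pair of outputs clears the similarity bar,
    meaning the debate has locked in and later rounds can be skipped."""
    return all(jaccard(a, b) >= threshold for a, b in combinations(outputs, 2))
```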

blind — names hidden until scored

Same shape as council, but model identities are stripped before each round so models judging round-1 outputs in round 2 don't know which one came from Claude vs. GPT vs. Gemini. Useful when you suspect anchoring to a particular brand — for example when ranking responses to a prompt where the "safe" answer is the one a more cautious model wrote.

redteam — attack/defend cycle

Designed for security review. The flow is asymmetric: one subset of models attacks (generates exploits / failure modes / adversarial inputs), another defends, and the judge categorizes findings into five severity-ranked dimensions (security, bugs, performance, quality, edge cases). Despite the single-round count, redteam runs longer than council in wall-clock time because the attack and defend phases are sequential and each pulls more tokens than a normal round-1 generation.

jury — ranked-choice voting

Three rounds. After round 3, every model casts a ranked ballot over the candidate answers. We aggregate using Borda count for the headline score and run a Condorcet check for cycle detection. When the room produces a consistent preference order, jury surfaces it; when there's a cycle (A beats B beats C beats A), the judge produces a weighted synthesis that the dissent report explicitly flags. Use jury when there are clearly multiple defensible answers and you want the council's collective ranking instead of a single verdict.

market — prediction-market aggregation

Five rounds with confidence-weighted voting. Each model produces a probability claim and a justification; the judge aggregates the probabilities the way a prediction market would — weighted by each model's calibrated confidence on similar past questions. This is the right mode for forecasting ("will WebAssembly replace Docker for serverless within 3 years?") where the deliverable is a probability with reasoning, not a yes/no.
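The aggregation is, at its core, a calibration-weighted average of probability claims. A minimal sketch — the weight values would come from each model's track record on similar past questions, which is not modeled here:

```python
def aggregate_market(claims: list[tuple[float, float]]) -> float:
    """Combine (probability, calibration_weight) claims into one estimate.

    Better-calibrated models pull the aggregate harder, the way heavier
    traders move a prediction market's price.
    """
    total_weight = sum(w for _, w in claims)
    if total_weight <= 0:
        raise ValueError("no calibration weight to aggregate")
    return sum(p * w for p, w in claims) / total_weight
```

For example, a model claiming 0.8 with twice the calibration weight of a model claiming 0.2 yields an aggregate of 0.6, not the unweighted 0.5.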

auto — let the engine pick

Auto runs a small classifier on the topic before round 1 and routes to one of the seven explicit modes. The decision is surfaced in the SSE stream as a routing:decided event so you can see what it chose and why — auto is not a black box, it's a default with an audit trail. If the classifier is uncertain it falls back to council, which is the safest non-trivial mode.
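The routing logic with its fallback can be sketched in a few lines. The `classify` callable, its `(mode, confidence)` return shape, and the 0.5 threshold are all assumptions standing in for the real classifier:

```python
from typing import Callable

EXPLICIT_MODES = {"quick", "council", "deep", "blind", "redteam", "jury", "market"}

def route(topic: str, classify: Callable[[str], tuple[str, float]]) -> str:
    """Route a topic to an explicit mode, falling back to council when
    the classifier is uncertain or emits something unknown."""
    mode, confidence = classify(topic)
    if mode not in EXPLICIT_MODES or confidence < 0.5:
        return "council"  # safest non-trivial default
    return mode
```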

How to choose

Three rules of thumb that work in practice:

  • If the topic has a fact-of-the-matter answer, start with quick. You don't need debate when one round of best-of-N is enough.
  • If the topic has tradeoffs, council. The cross-examination round is where tradeoff analysis actually happens.
  • If the deliverable is structured (a vote, a probability, a security report), pick the matching specialized mode — jury, market, or redteam respectively.

And if you genuinely don't know, auto is fine. It's what we use as the default for unclassified incoming traffic.