Structured Deliberation Between AI Models

Not another orchestration tool. Consilium makes AI models argue, challenge, and synthesize - producing answers with tracked confidence, dissent, and audit trails.

How It Works

A structured 6-phase deliberation process inspired by academic debate and jury systems.

1. Propose

Each model independently analyzes the problem and presents its initial position.

2. Challenge

Models cross-examine each other, probing assumptions and identifying weaknesses.

3. Rebut

Models refine their positions based on challenges, strengthening or revising arguments.

4. Evaluate

A judge model assesses argument quality, evidence strength, and logical consistency.

5. Vote

Models cast confidence-weighted votes on the strongest positions.

6. Synthesize

A final synthesis integrates the best arguments into a single, rigorous answer.
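As a rough illustration of the Vote phase, confidence-weighted voting with dissent tracking could be sketched as below. This is a minimal sketch, not Consilium's implementation: the `Vote` class, the `tally` function, the model labels, and the rule of summing confidences per position are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    model: str         # hypothetical model label
    position: str      # the position this model backs
    confidence: float  # self-reported confidence in [0, 1]


def tally(votes: list[Vote]) -> tuple[str, dict[str, float], list[str]]:
    """Return (winning position, weight share per position, dissenting models)."""
    if not votes:
        raise ValueError("at least one vote is required")
    weights: dict[str, float] = defaultdict(float)
    for vote in votes:
        if not 0.0 <= vote.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {vote.confidence}")
        # Assumed rule: each vote counts in proportion to its confidence.
        weights[vote.position] += vote.confidence
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all votes have zero confidence")
    shares = {position: weight / total for position, weight in weights.items()}
    winner = max(shares, key=shares.get)
    # Dissent is kept rather than discarded: list every model that backed another position.
    dissenters = [vote.model for vote in votes if vote.position != winner]
    return winner, shares, dissenters


votes = [
    Vote("model-a", "migrate", 0.75),
    Vote("model-b", "stay", 0.5),
    Vote("model-c", "migrate", 0.25),
]
winner, shares, dissenters = tally(votes)
print(winner, shares, dissenters)
```

Summing confidences is only one possible aggregation rule; a real system might cap, calibrate, or otherwise adjust self-reported confidence before counting it.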

8 Deliberation Modes

Choose the right deliberation strategy for your use case.

Quick (~15s)

Single round, fastest response. Best for simple questions needing a fast sanity check.

Council (~45s)

Multi-round deliberation between models. The default mode for most decisions.

Deep (~90s)

Extended deliberation with sub-agent research for complex, high-stakes questions.

Blind (~45s)

Model names hidden until scored. Eliminates brand bias from evaluation.

Red Team (~120s)

Adversarial assessment where models actively try to break each other's arguments.

Jury (~60s)

Panel deliberation with structured voting. Models must reach consensus or declare dissent.

Market (~90s)

Prediction market style confidence aggregation. Models stake credibility on positions.

Auto (~45s)

Automatically selects the best deliberation mode based on topic complexity.
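To make the Blind mode idea concrete, hiding model names until scoring is done could look something like the sketch below. The `anonymize` and `reveal` helpers, the "Participant N" labels, and the length-based stand-in judge are hypothetical; they show the technique, not how Consilium implements it.

```python
import random
from typing import Optional


def anonymize(
    answers: dict[str, str], seed: Optional[int] = None
) -> tuple[dict[str, str], dict[str, str]]:
    """Swap model names for neutral labels; return (blinded answers, label -> model key)."""
    names = list(answers)
    random.Random(seed).shuffle(names)  # shuffle so label order reveals nothing
    key = {f"Participant {i + 1}": name for i, name in enumerate(names)}
    blinded = {label: answers[name] for label, name in key.items()}
    return blinded, key


def reveal(scores: dict[str, float], key: dict[str, str]) -> dict[str, float]:
    """Map scores keyed by neutral label back to the real model names."""
    return {key[label]: score for label, score in scores.items()}


answers = {
    "claude-sonnet-4-6": "Migrate incrementally.",
    "gpt-5.4": "Stay on the monolith for now.",
}
blinded, key = anonymize(answers, seed=0)
# Stand-in judge: it only ever sees neutral labels (here it scores by text length).
scores = {label: float(len(text)) for label, text in blinded.items()}
print(reveal(scores, key))
```

The label-to-model key stays out of the judge's view and is applied only after scores are final.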

Why Deliberation > Orchestration

Orchestration runs models in parallel and picks the best. Deliberation makes them argue until the truth emerges.

Capability	Deliberation	Orchestration
Multiple model perspectives
Models challenge each other
Structured argumentation
Dissent tracking
Confidence-weighted voting
Adversarial red-teaming
Blind evaluation mode
Audit trail of reasoning

One command to get started

Install the CLI

Run debates from your terminal - pipe in files, diffs, or stdin and stream the deliberation live.

# 1. Install the CLI globally
npm install -g @myconsilium/cli

# 2. Sign in (or run on the free tier with no key)
consilium login

# 3. Run your first debate
consilium debate "What's the best way to ship this feature?" \
  --mode council
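For scripted use, the same command can be driven from Python. The sketch below uses only the `debate` subcommand and `--mode` flag shown above; the helper names are hypothetical, and it assumes the CLI reads piped context from stdin and writes its result to stdout, which should be checked against the CLI docs.

```python
import subprocess


def build_debate_command(question: str, mode: str = "council") -> list[str]:
    """Build the argv for `consilium debate`, using only the flag shown above."""
    return ["consilium", "debate", question, "--mode", mode]


def debate_on_diff(question: str, mode: str = "council") -> str:
    """Pipe the current `git diff` into the CLI and return whatever it prints.

    Assumption: the CLI accepts extra context on stdin and writes to stdout.
    """
    diff = subprocess.run(
        ["git", "diff"], capture_output=True, text=True, check=True
    ).stdout
    completed = subprocess.run(
        build_debate_command(question, mode),
        input=diff,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Show the command that would run; nothing is executed here.
print(build_debate_command("Is this change safe to merge?"))
```

Calling `debate_on_diff("Is this change safe to merge?")` inside a git repository with the CLI installed would run the debate against the working-tree diff.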

SDK Examples

Integrate deliberation into your stack in minutes.

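from consilium import ConsiliumClient, DeliberationMode

# Connect to the hosted API; replace the placeholder with your own key.
client = ConsiliumClient(
    api_url="https://api.myconsilium.xyz",
    api_key="your-key",
)

# Run a council-mode deliberation across three models from different providers.
result = client.deliberate(
    "Should we migrate to microservices?",
    mode=DeliberationMode.COUNCIL,
    models=[
        "claude-sonnet-4-6",
        "gpt-5.4",
        "gemini-3-flash-preview",
    ],
)

# Print the fields returned by the deliberation.
print(result.golden_prompt)
print(result.confidence_scores)
print(result.dissent_report)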
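One way to act on those fields is to gate automated decisions on confidence and dissent. The sketch below does not call the SDK: `StubResult` is a stand-in, and the field shapes (a model-to-confidence mapping and a list of dissent notes), the `needs_human_review` helper, and the 0.7 threshold are all assumptions that should be checked against the actual SDK.

```python
from dataclasses import dataclass, field


@dataclass
class StubResult:
    """Stand-in for the SDK result; field shapes are assumptions, not the SDK's API."""

    golden_prompt: str
    confidence_scores: dict[str, float]  # assumed: model name -> confidence in [0, 1]
    dissent_report: list[str] = field(default_factory=list)  # assumed: one note per dissent


def needs_human_review(result: StubResult, min_confidence: float = 0.7) -> bool:
    """Flag a result when any model is unsure or any dissent was recorded."""
    lowest = min(result.confidence_scores.values(), default=0.0)
    return lowest < min_confidence or bool(result.dissent_report)


settled = StubResult("Migrate in stages.", {"model-a": 0.9, "model-b": 0.8})
contested = StubResult(
    "Migrate in stages.",
    {"model-a": 0.9, "model-b": 0.4},
    ["model-b: operational cost is underestimated"],
)
print(needs_human_review(settled), needs_human_review(contested))
```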

Supported Providers

Bring your own API keys. Consilium works with all major LLM providers.

Anthropic

OpenAI

Google

Groq

xAI

Moonshot

OpenRouter

Available in the CLI and Web app

Models on the Council

Mix any combination across providers. Models marked Free run on the no-key-required free tier.

Anthropic

Claude 4 family - strongest reasoning and synthesis.

Claude Haiku 4.5 - claude-haiku-4-5-20251001
Claude Sonnet 4.6 - claude-sonnet-4-6
Claude Opus 4.6 - claude-opus-4-6
Claude Opus 4.7 - claude-opus-4-7

OpenAI

GPT-5 series - nano, mini, standard, and pro tiers.

GPT-5.4 Nano - gpt-5.4-nano
GPT-5.4 Mini - gpt-5.4-mini
GPT-5.4 - gpt-5.4
GPT-5.5 - gpt-5.5
GPT-5.5 Pro - gpt-5.5-pro

Google

Gemini 3 - long context and fast multimodal.

Gemini 3.1 Flash-Lite - gemini-3.1-flash-lite-preview
Gemini 3 Flash - gemini-3-flash-preview
Gemini 3.1 Pro - gemini-3.1-pro-preview

Groq

Sub-second inference. Free tier available.

Llama 3.1 8B Instant - llama-3.1-8b-instant (Free)
Llama 3.3 70B Versatile - llama-3.3-70b-versatile (Free)
GPT-OSS 120B (via Groq) - openai/gpt-oss-120b (Free)
GPT-OSS 20B (via Groq) - openai/gpt-oss-20b (Free)
Groq Compound - groq/compound

xAI

Grok 4 - code-focused and reasoning variants.

Grok Code Fast - grok-code-fast-1
Grok 4.1 Fast (non-reasoning) - grok-4-1-fast-non-reasoning
Grok 4.1 Fast (reasoning) - grok-4-1-fast-reasoning
Grok 4.20 - grok-4-20

Moonshot

Kimi K2 - long-context reasoning.

Kimi K2.6 - kimi-k2.6

No key, no problem. Start a debate with zero setup - Consilium routes free-tier requests through Groq and OpenRouter automatically. Bring your own keys anytime for premium models.
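When mixing models, it can help to check up front which council members run on the free tier and which need your own key. The sketch below copies the IDs marked Free in the list above; the `FREE_TIER_MODELS` set and the `split_by_tier` helper are illustrative and not part of the CLI or SDK.

```python
# Model IDs marked Free in the list above; the set itself is illustrative.
FREE_TIER_MODELS = {
    "llama-3.1-8b-instant",
    "llama-3.3-70b-versatile",
    "openai/gpt-oss-120b",
    "openai/gpt-oss-20b",
}


def split_by_tier(models: list[str]) -> tuple[list[str], list[str]]:
    """Return (free, needs_key), preserving the caller's order."""
    free = [m for m in models if m in FREE_TIER_MODELS]
    needs_key = [m for m in models if m not in FREE_TIER_MODELS]
    return free, needs_key


council = ["claude-sonnet-4-6", "llama-3.3-70b-versatile", "openai/gpt-oss-20b"]
free, needs_key = split_by_tier(council)
print("free tier:", free)
print("needs a provider key:", needs_key)
```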

Research Backed

Consilium's deliberation approach is grounded in peer-reviewed research.

Debating with More Persuasive LLMs Leads to More Truthful Answers

Akbir Khan et al. - ICML 2024

When stronger models debate in front of a weaker judge, both non-expert models and humans pick the correct answer more often than when they consult a single model, even though one debater is assigned the wrong answer.

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Yilun Du et al. - ICML 2024

Multi-agent debate significantly improves factual accuracy and mathematical reasoning across multiple benchmarks.

LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play

Lu et al. - COLM 2024

Structured discussion between LLMs produces more creative and diverse outputs than individual generation.

Scalable AI Safety via Doubly-Efficient Debate

Brown-Cohen, Irving, and Piliouras - ICML 2024

Debate between AI systems provides a scalable mechanism for aligning AI behavior with human values.

Your keys. Your control.

Bring your own provider keys and pay only for what you use.

End-to-end encryption · Bring Your Own Keys · CLI + SDK