Frequently Asked Questions
General
What is Consilium?
Consilium is a multi-AI agent deliberation platform. Unlike orchestration tools that run models in parallel and pick one answer, Consilium implements formal debate protocols where models propose claims, challenge each other's reasoning with typed challenges (factual errors, missing evidence, flawed logic), defend positions with categorized rebuttals (concede/refute/qualify/redirect), vote using social choice theory (Condorcet/Borda/Ranked Pairs), and converge when a measured convergence score reaches ≥ 0.85. The result is a synthesized "golden prompt" with tracked confidence scores, dissent reports, and complete audit trails.
How is this different from asking a single model?
Single models give you one perspective. Consilium orchestrates structured debate between multiple models — Claude, GPT-4o, Gemini, Grok, Llama — making them cross-examine each other before synthesizing. Research shows multi-agent debate improves factual accuracy by 8-15% over single-model responses (ICML 2024).
What deliberation modes are available?
Quick (1 round, fastest), Council (3 rounds, default), Deep (5 rounds, most thorough), Blind (model identities stripped to reduce bias), Red Team (adversarial with 8 attack categories), Jury (mandatory dissent reporting), Market (probability aggregation via log-opinion pooling), Auto (complexity-based routing).
What does a deliberation produce?
Every deliberation produces: Golden Prompt (synthesized answer), Confidence Scores (per-model, calibrated via explanation stability), Dissent Report (majority/minority positions via clustering), Vote Results (Condorcet winner, Borda scores, full ranking), Audit Trail (per-step tokens/cost/latency), and Cost Breakdown (per-model, per-round).
Is Consilium free?
Yes. The hosted version at myconsilium.xyz is free to use with BYOK (Bring Your Own Keys). You pay only for LLM API calls through your own provider keys. Groq models (Llama 3.1 8B, 3.3 70B, 4 Scout) are completely free. Pro and Max tiers with additional features are coming soon.
Who built Consilium?
Saad Kadri.
Technical
Which models are supported?
15 models across 5 providers: Anthropic (Claude Opus 4.6, Sonnet 4.5, Haiku 4.5), OpenAI (GPT-4o, 4o-mini, 4.1, o3-mini), Google (Gemini 2.0 Flash, 2.5 Flash, 2.5 Pro), Groq (Llama 3.1 8B, 3.3 70B, 4 Scout — all free), xAI (Grok 2, Grok 2 Mini).
How does voting work?
Four formal social choice algorithms: Condorcet checks whether any candidate beats all others pairwise (weighted by confidence). Borda count assigns points = (n - 1 - rank) × confidence_weight. Ranked Pairs locks in the strongest victories while preventing cycles. Copeland scores wins minus losses. Pipeline: Borda → ranking → Condorcet check → Ranked Pairs fallback.
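The confidence-weighted Borda step can be sketched in a few lines. The types and names here are illustrative, not the platform's actual API:

```typescript
// Illustrative Borda step: points = (n - 1 - rank) × confidence weight.
type Ballot = { ranking: string[]; confidence: number }; // confidence in [0, 1]

function bordaScores(ballots: Ballot[]): Map<string, number> {
  const scores = new Map<string, number>();
  for (const { ranking, confidence } of ballots) {
    const n = ranking.length;
    ranking.forEach((candidate, rank) => {
      // Top rank earns n-1 points, last earns 0, scaled by the ballot's confidence.
      const points = (n - 1 - rank) * confidence;
      scores.set(candidate, (scores.get(candidate) ?? 0) + points);
    });
  }
  return scores;
}

// Two weighted ballots over three proposals:
const tally = bordaScores([
  { ranking: ["A", "B", "C"], confidence: 1.0 }, // A: 2,   B: 1, C: 0
  { ranking: ["B", "A", "C"], confidence: 0.5 }, // A: 0.5, B: 1, C: 0
]);
console.log(tally.get("A")); // 2.5
```

The resulting scores give the full ranking; the Condorcet check and Ranked Pairs fallback then operate on the same pairwise preferences.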
How is convergence detected?
Three metrics combined: Kendall tau (ranking correlation between rounds), Jaccard similarity (proposal content overlap), and concession rate (fraction of rebuttals where models yield). Formula: 0.4 × ranking + 0.35 × proposal + 0.25 × concession. Converged when ≥ 0.85. Minimum 2 rounds to establish a baseline.
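As a sketch, the documented weights combine directly, assuming each metric is already normalized to [0, 1] (the function name is ours):

```typescript
// Sketch of the documented convergence formula over three normalized metrics.
const CONVERGENCE_THRESHOLD = 0.85;

function convergenceScore(
  rankingCorrelation: number, // Kendall tau between rounds, rescaled to [0, 1]
  proposalOverlap: number,    // Jaccard similarity of proposal content
  concessionRate: number      // fraction of rebuttals where a model yields
): number {
  return 0.4 * rankingCorrelation + 0.35 * proposalOverlap + 0.25 * concessionRate;
}

const score = convergenceScore(0.9, 0.8, 0.9); // ≈ 0.865
console.log(score >= CONVERGENCE_THRESHOLD); // true — the debate has converged
```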
How are dissenting positions identified?
Agglomerative clustering on a Jaccard similarity matrix between proposals. The closest clusters are merged iteratively (threshold ≥ 0.5). A single cluster means consensus; multiple clusters mean dissent, with majority and minority positions reported separately.
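A minimal sketch of the Jaccard metric the clustering runs on, assuming proposals are compared as sets of whitespace-delimited tokens (the tokenization is our assumption):

```typescript
// Jaccard similarity between two proposals, treated as sets of tokens:
// |A ∩ B| / |A ∪ B|, in [0, 1].
function jaccard(a: string, b: string): number {
  const setA = new Set(a.toLowerCase().split(/\s+/));
  const setB = new Set(b.toLowerCase().split(/\s+/));
  let intersection = 0;
  for (const token of setA) if (setB.has(token)) intersection++;
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Above the 0.5 merge threshold, these two proposals would join one cluster:
console.log(jaccard("add a cache layer", "add a cache")); // 0.75
```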
Do I need API keys for every provider?
No. You need at least one. Groq is free and serves as the automatic fallback. For best results, use 2-3 different providers to get genuine model diversity in debates.
How does real-time streaming work?
Server-Sent Events (SSE). Connect to /deliberation/:id/stream. Events include: phase:proposal, agent:chunk, convergence:detected, dissent:report, cost:update, red_team:attack, market:bet, and more. Both the SDKs and the CLI support streaming.
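The event names above ride on the standard SSE wire format. Here is a minimal frame parser for illustration (a browser or SDK client would normally use EventSource instead of parsing by hand):

```typescript
// Parses one SSE frame into its event name and data payload.
// Event names like "agent:chunk" come from the stream's documented event list;
// the framing itself is the standard Server-Sent Events wire format.
function parseSseFrame(frame: string): { event: string; data: string } {
  let event = "message"; // SSE default when no "event:" field is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice("event:".length).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice("data:".length).trim());
  }
  return { event, data: dataLines.join("\n") }; // multi-line data joins with \n
}

const frame = 'event: agent:chunk\ndata: {"model":"gpt-4o","text":"..."}';
console.log(parseSseFrame(frame).event); // "agent:chunk"
```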
Data & Infrastructure
Where is my data stored?
PostgreSQL via Prisma ORM. Debate sessions, rounds, messages, audit entries, and user data are all stored relationally. API keys are encrypted with AES-256-GCM before storage and never stored in plaintext.
Security
How are my API keys protected?
AES-256-GCM encryption. Keys are encrypted before writing to the database. Never stored in plaintext. Never logged. Never transmitted to any third party.
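For reference, this is what AES-256-GCM encryption and decryption look like in Node.js. The key handling and storage layout here are illustrative, not Consilium's internals:

```typescript
// AES-256-GCM round trip using Node's built-in crypto module.
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

type SealedBox = { iv: Buffer; tag: Buffer; ciphertext: Buffer };

function encrypt(plaintext: string, key: Buffer): SealedBox {
  const iv = randomBytes(12); // 96-bit nonce, the standard size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext }; // tag authenticates the data
}

function decrypt(box: SealedBox, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, box.iv);
  decipher.setAuthTag(box.tag); // any tampering makes final() throw
  return Buffer.concat([decipher.update(box.ciphertext), decipher.final()]).toString("utf8");
}

const key = randomBytes(32); // 256-bit key
const box = encrypt("sk-your-api-key", key);
console.log(decrypt(box, key)); // "sk-your-api-key"
```

Because GCM is authenticated encryption, a tampered ciphertext or wrong key fails loudly at decryption rather than returning garbage.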
How does authentication work?
Clerk for web authentication (JWT-based). The CLI uses long-lived tokens (hashed, not plaintext) stored in ~/.consilium/config.json. The API uses a Bearer token in the Authorization header.
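"Hashed, not plaintext" typically means the server stores only a digest of the token and compares digests on each request. A sketch under that assumption (the token value is made up):

```typescript
// Token verification by digest comparison, a common pattern for long-lived
// CLI tokens: the database stores hashToken(token), never the token itself.
import { createHash, timingSafeEqual } from "node:crypto";

function hashToken(token: string): Buffer {
  return createHash("sha256").update(token).digest(); // 32-byte digest
}

function verifyToken(presented: string, storedHash: Buffer): boolean {
  // Both sides are fixed-length sha256 digests, so a constant-time compare works.
  return timingSafeEqual(hashToken(presented), storedHash);
}

const stored = hashToken("csl_example_token"); // what the database would keep
console.log(verifyToken("csl_example_token", stored)); // true
```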
Is there an audit trail?
Yes. Every deliberation records per step: model ID, input summary, output summary, latency (ms), tokens in/out, cost, and round number. Stored in the AuditEntry model.
Can Consilium be used in regulated or compliance-sensitive environments?
Self-hosted Consilium can be deployed in compliant infrastructure. BYOK ensures keys never leave your environment, and audit trails provide the required record-keeping. Consult your compliance team for specific requirements.
Pricing and Costs
How much does a deliberation cost?
It depends on the mode and models. Quick with GPT-4o-mini: ~$0.001. Council with 3 premium models: ~$0.05-0.15. Deep with 5 models: ~$0.20-0.50. With Groq models: $0.00.
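These estimates reduce to simple token arithmetic. The prices in this sketch are placeholders only; check your provider's current rates:

```typescript
// Back-of-envelope cost for one model call: tokens × per-million-token price.
type Pricing = { inPerMTok: number; outPerMTok: number }; // USD per 1M tokens

function callCost(tokensIn: number, tokensOut: number, p: Pricing): number {
  return (tokensIn / 1e6) * p.inPerMTok + (tokensOut / 1e6) * p.outPerMTok;
}

// A hypothetical budget model at $0.15/M input, $0.60/M output:
console.log(callCost(1_000_000, 0, { inPerMTok: 0.15, outPerMTok: 0.6 })); // 0.15
```

A full deliberation sums this over every model, every round, which is exactly what the per-model, per-round Cost Breakdown reports.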