Blog
Latest updates, research, and benchmarks from Consilium
What We Found Auditing Our Own Model Catalog Against the Live Provider Docs
We re-verified every model ID Consilium ships against each provider's own documentation page. Three real bugs surfaced, including an xAI model ID we'd been spelling with a dot when the API uses a dash. Here are the receipts.
BYOK With a Safety Net: How Consilium Falls Back to a Free Tier Without Touching Your Keys
When a debate runs without a user-supplied key for the requested provider, Consilium routes through a platform-hosted Groq or OpenRouter pool — but only as a last step, with full transparency via SSE. Here's the four-step resolver chain.
The Model Deprecation Calendar: What Retires Between June and October 2026
Six widely used model IDs are scheduled for shutdown in the next six months. We list them, the date each one dies, and how Consilium's alias map keeps apps working past the cutoff.
Council Deliberation vs Single Models: What Our Benchmarks Actually Show
We ran MMLU, TruthfulQA, and HumanEval through Consilium's council mode. The raw scores are not yet representative — our answer checker is too strict. Here's what we ran, what broke, and the research baselines we measure ourselves against in the meantime.
Why Deliberation Beats Orchestration
Most multi-agent frameworks treat models as workers in a pipeline. Consilium treats them as adversaries in a structured debate. Here's the difference, and why it produces answers that survive cross-examination.
The Eight Deliberation Modes, Explained
Quick, council, deep, blind, redteam, jury, market, auto. Each has a specific shape — different round count, voting method, and judge behavior — picked to match the task. Here's how to choose.
Getting Started with the Consilium SDK
A practical walkthrough: install the CLI or SDK, set up a key (or skip it and use the free tier), run your first council debate, and stream events as they happen.