BYOK With a Safety Net: How Consilium Falls Back to a Free Tier Without Touching Your Keys
Bring-Your-Own-Key (BYOK) is the right default for an AI-mediating product — users keep control of cost, rate limits, and provider relationships. But BYOK alone has a sharp edge: when a user hasn't added a key for the provider their debate happens to need, the debate just fails. Consilium's free-tier fallback is the safety net for exactly that case.
The four-step resolver
Every model request runs through FreeTierResolver (apps/agents/src/features/free_tier/resolver.py). It returns a tuple of (effective model, effective provider, effective key, is_fallback). The chain has four steps and stops at the first one that succeeds:
- User BYOK. If the requested model's provider has a key in the request payload (openaiKey, anthropicKey, etc.), use it. No fallback.
- Self-hosted env var. If OPENAI_API_KEY / ANTHROPIC_API_KEY / etc. is set on the engine host, use it. This is the single-tenant deployment case where one operator funds every debate.
- Free-tier Groq. If CONSILIUM_FREE_TIER_GROQ_KEY is set, route to Groq with a tier-equivalent open model: fast → llama-3.1-8b-instant, balanced → llama-3.3-70b-versatile, deep → openai/gpt-oss-120b. Tier is inferred from the requested model's catalog cost.
- Free-tier OpenRouter. Backup path. If CONSILIUM_FREE_TIER_OPENROUTER_KEY is set, route through OpenRouter's free roster: fast → google/gemma-4-26b-a4b-it:free, balanced → qwen/qwen3-coder:free, deep → nvidia/nemotron-3-super-120b-a12b:free.
If none of the four match, the resolver raises NoKeyAvailableError. The debate fails with a clear message about which provider needs a key, not a 401 from the upstream API.
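The chain above can be sketched in a few lines of Python. This is an illustrative sketch, not the actual FreeTierResolver: the function name resolve, the Resolution tuple, the payload key pattern (provider name plus "Key", generalized from openaiKey and anthropicKey), and the tier argument are assumptions. The real resolver infers the tier from catalog cost, whereas this sketch takes it as an input.

```python
import os
from typing import Mapping, NamedTuple


class Resolution(NamedTuple):
    model: str
    provider: str
    key: str
    is_fallback: bool


class NoKeyAvailableError(Exception):
    """Raised when none of the four steps yields a usable key."""


# Tier-equivalent substitutes, copied from the list above.
GROQ_MODELS = {
    "fast": "llama-3.1-8b-instant",
    "balanced": "llama-3.3-70b-versatile",
    "deep": "openai/gpt-oss-120b",
}
OPENROUTER_MODELS = {
    "fast": "google/gemma-4-26b-a4b-it:free",
    "balanced": "qwen/qwen3-coder:free",
    "deep": "nvidia/nemotron-3-super-120b-a12b:free",
}


def resolve(model: str, provider: str, tier: str,
            payload: Mapping[str, str],
            env: Mapping[str, str] = os.environ) -> Resolution:
    # Step 1: user BYOK from the request payload (assumed "<provider>Key").
    user_key = payload.get(f"{provider}Key")
    if user_key:
        return Resolution(model, provider, user_key, False)

    # Step 2: operator key on the engine host, e.g. OPENAI_API_KEY.
    host_key = env.get(f"{provider.upper()}_API_KEY")
    if host_key:
        return Resolution(model, provider, host_key, False)

    # Step 3: platform-funded Groq pool (separate env var on purpose).
    groq_key = env.get("CONSILIUM_FREE_TIER_GROQ_KEY")
    if groq_key:
        return Resolution(GROQ_MODELS[tier], "groq", groq_key, True)

    # Step 4: platform-funded OpenRouter pool as the backup path.
    openrouter_key = env.get("CONSILIUM_FREE_TIER_OPENROUTER_KEY")
    if openrouter_key:
        return Resolution(OPENROUTER_MODELS[tier], "openrouter",
                          openrouter_key, True)

    raise NoKeyAvailableError(
        f"No API key available for provider '{provider}'."
    )
```

Note that in this sketch a host with only GROQ_API_KEY set does not serve an Anthropic request: step 2 looks up the requested provider's key, and the free-tier steps read only the CONSILIUM_FREE_TIER_* variables.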
Why the platform pool uses separate env vars
The free-tier env vars (CONSILIUM_FREE_TIER_GROQ_KEY, CONSILIUM_FREE_TIER_OPENROUTER_KEY) are deliberately distinct from the standard GROQ_API_KEY and OPENROUTER_API_KEY. An operator running Consilium for a single internal team probably wants their own provider keys to handle every debate — that's the step-2 path. A platform operator funding a free pool for users who haven't signed up for a provider yet uses the step-3 / step-4 path. The two should not collide.
Transparency at the surface
When fallback fires, the engine emits a routing:fallback SSE event before the first round runs. The payload lists every model that got rerouted, the substitution it received, and a human-readable reason. Critically, the API key never appears in the event — only the substitution metadata.
event: routing:fallback
data: {
  "count": 1,
  "resolutions": [{
    "requested_model": "claude-opus-4-7",
    "requested_provider": "anthropic",
    "effective_model": "openai/gpt-oss-120b",
    "effective_provider": "groq",
    "is_fallback": true,
    "fallback_reason": "No anthropic API key configured. Routed claude-opus-4-7 to groq free tier..."
  }],
  "message": "1 model(s) routed to Consilium free tier..."
}

The CLI surfaces this as a pre-flight notice (yellow banner with the substitution and a hint to add a key in consilium config). The web app surfaces it on the debate detail page. Either way, the user sees the substitution before they see the result.
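One way to guarantee that the key stays out of the event is to build the payload from an explicit allowlist of fields rather than serializing the whole resolution. The sketch below assumes a hypothetical Resolved record and fallback_event helper; only the field names and the event name come from the sample above.

```python
import json
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Resolved:
    requested_model: str
    requested_provider: str
    effective_model: str
    effective_provider: str
    effective_key: str  # secret: deliberately absent from PUBLIC_FIELDS
    is_fallback: bool
    fallback_reason: str = ""


# Only these fields may leave the engine in a routing:fallback event.
PUBLIC_FIELDS = (
    "requested_model",
    "requested_provider",
    "effective_model",
    "effective_provider",
    "is_fallback",
    "fallback_reason",
)


def fallback_event(resolutions: Sequence[Resolved]) -> Optional[str]:
    """Return an SSE frame for rerouted models, or None if nothing fell back."""
    rerouted = [r for r in resolutions if r.is_fallback]
    if not rerouted:
        return None
    data = {
        "count": len(rerouted),
        "resolutions": [
            {name: getattr(r, name) for name in PUBLIC_FIELDS}
            for r in rerouted
        ],
        # Message wording is truncated in the sample above; kept as shown.
        "message": f"{len(rerouted)} model(s) routed to Consilium free tier...",
    }
    # json.dumps emits a single line, so one "data:" field is enough.
    return f"event: routing:fallback\ndata: {json.dumps(data)}\n\n"
```

With an allowlist, a new secret field added to the record later stays out of the event unless someone adds it to PUBLIC_FIELDS on purpose.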
What happens to legacy model IDs
The resolver runs after the alias map, so legacy IDs forward first and then resolve through the chain. A request for gpt-4o is first forwarded to gpt-5.4 by the alias map, then gpt-5.4 goes through the chain. If no OpenAI key exists anywhere, it falls back to Groq's balanced tier. The user sees requested_model: "gpt-4o" → effective_model: "llama-3.3-70b-versatile" and knows exactly what ran.
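The ordering can be illustrated with a short sketch. The ALIASES table holds only the one alias mentioned here, and resolve_with_alias and the stub resolver are hypothetical names for illustration. The point is that the alias is applied before resolution while the reported requested_model stays the ID the user typed.

```python
from typing import Callable, Dict

# Only the alias named in the text; the real alias map is not shown.
ALIASES: Dict[str, str] = {"gpt-4o": "gpt-5.4"}


def canonicalize(model_id: str) -> str:
    """Forward a legacy ID to its current model, or return it unchanged."""
    return ALIASES.get(model_id, model_id)


def resolve_with_alias(requested: str,
                       resolve_model: Callable[[str], str]) -> Dict[str, str]:
    # Alias first, then the four-step chain (passed in as resolve_model).
    canonical = canonicalize(requested)
    effective = resolve_model(canonical)
    # Report the ID the user asked for, not the intermediate alias target.
    return {"requested_model": requested, "effective_model": effective}


def stub_chain_without_openai_key(model: str) -> str:
    # Stand-in for the resolver when no OpenAI key exists anywhere:
    # gpt-5.4 lands on Groq's balanced tier, per the example above.
    return "llama-3.3-70b-versatile" if model == "gpt-5.4" else model


result = resolve_with_alias("gpt-4o", stub_chain_without_openai_key)
```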
Why this matters
The combination — BYOK preferred, free pool as backstop, substitution surfaced explicitly — means a user can try Consilium with zero setup and still see real multi-provider debate behavior. They're not locked into a paid signup before they know if they want the product. And they're never silently routed to a different model than they asked for.