AI Providers & Models
7 providers, 25+ models, from free to frontier. Bring your own keys and mix models from different providers in the same deliberation — or run on Consilium's free-tier pool when you don't have keys.
All Models & Pricing
Prices are per 1 million tokens, charged by the provider (not Consilium). Consilium is BYOK — you pay providers directly through your own API keys.
| Provider | Model | Input/1M | Output/1M | Tier |
|---|---|---|---|---|
| Anthropic | Claude Opus 4.7 | $5.00 | $25.00 | Most Capable |
| Anthropic | Claude Opus 4.6 | $5.00 | $25.00 | Previous Flagship |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | Balanced |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | Fast |
| OpenAI | GPT-5.5 Pro | $30.00 | $180.00 | Most Capable |
| OpenAI | GPT-5.5 | $5.00 | $30.00 | Flagship |
| OpenAI | GPT-5.4 | $2.00 | $8.00 | Reasoning |
| OpenAI | GPT-5.4 Mini | $0.20 | $0.80 | Cost-effective |
| OpenAI | GPT-5.4 Nano | $0.08 | $0.30 | Lowest Cost |
| Google | Gemini 3.1 Pro | $1.25 | $5.00 | Most Capable |
| Google | Gemini 3 Flash | $0.15 | $0.60 | Balanced |
| Google | Gemini 3.1 Flash-Lite | $0.05 | $0.20 | Lowest Cost |
| Groq | Llama 3.1 8B | Free | Free | Instant |
| Groq | Llama 3.3 70B | Free | Free | Versatile |
| Groq | GPT-OSS 120B | Free | Free | Open-Weight Flagship |
| Groq | GPT-OSS 20B | Free | Free | Open-Weight |
| Groq | Groq Compound | $0.80 | $1.60 | Agentic |
| xAI | Grok 4.20 | $3.00 | $15.00 | Most Capable |
| xAI | Grok 4.1 Fast (reasoning) | $1.00 | $4.00 | Reasoning |
| xAI | Grok 4.1 Fast | $0.50 | $2.00 | Fast |
| xAI | Grok Code Fast | $0.30 | $1.20 | Coding |
| Moonshot | Kimi K2.6 | $1.20 | $2.50 | 1T Open-Source |
| OpenRouter | Gemma 4 26B (free) | Free | Free | Free Tier |
| OpenRouter | Gemma 4 31B (free) | Free | Free | Free Tier |
| OpenRouter | Qwen3 Coder (free) | Free | Free | Free Tier |
| OpenRouter | Nemotron 3 Super 120B (free) | Free | Free | Free Tier |
| OpenRouter | Ling 2.6 1T (free) | Free | Free | Free Tier |
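Because pricing is per 1M tokens and billed by the provider under BYOK, per-call cost is a simple proportion. A minimal sketch, using a few rows from the table above; the `PRICING` dict and `estimate_cost` helper are illustrative, not part of Consilium's API:

```python
# Estimate the provider-billed cost of a single call from the per-1M-token
# prices in the table above. Illustrative helper, not a Consilium API.
PRICING = {  # model: (input $/1M tokens, output $/1M tokens)
    "claude-sonnet-4-6": (3.00, 15.00),
    "gpt-5.4-mini": (0.20, 0.80),
    "gemini-3-flash": (0.15, 0.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return estimated USD cost for one call, charged by the provider."""
    in_price, out_price = PRICING[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 10k-input / 2k-output call on Claude Sonnet 4.6 costs about $0.06:
print(round(estimate_cost("claude-sonnet-4-6", 10_000, 2_000), 4))
```

Mixing tiers in one deliberation (say, Sonnet for proposals and Flash for critiques) lets you trade cost against depth per role.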
Provider Details
Claude models excel at nuanced reasoning, following complex instructions, and agentic coding. Claude Opus 4.7 is Anthropic's most capable generally available model with a step-change improvement in agentic coding over Opus 4.6. Claude Sonnet 4.6 offers the best balance of speed and intelligence with a 1M-token context. Claude Haiku 4.5 is the fastest model with near-frontier intelligence.
Environment Variable
ANTHROPIC_API_KEY
API Client
anthropic.AsyncAnthropic
Strengths
Nuanced reasoning, instruction following, agentic coding, 1M context, structured output
GPT-5.5 and GPT-5.5 Pro are OpenAI's latest flagship models with 1M-token context windows, available through the Responses and Chat Completions APIs. GPT-5.4 (and its Mini/Nano variants) remains the cost-tier option for high-volume workloads. GPT-4o, GPT-4.1, and o4-mini were retired by OpenAI in February 2026 and are no longer supported.
Environment Variable
OPENAI_API_KEY
API Client
openai.AsyncOpenAI (httpx.AsyncClient)
Strengths
General capability, code generation, agentic tool use, multimodal, 1M context
Gemini 3.1 Pro is Google's most advanced reasoning model, optimized for complex agentic workflows and coding. Gemini 3 Flash provides strong frontier-class performance at low cost, while Gemini 3.1 Flash-Lite is the cheapest option for high-volume, latency-sensitive traffic. Gemini 1.x and 2.x lines have been deprecated.
Environment Variable
GOOGLE_API_KEY
API Client
google.generativeai.GenerativeModel
Strengths
Cost efficiency, multimodal, long context, frontier reasoning
Groq provides ultra-fast inference for open-weight models including Llama 3.1 8B, Llama 3.3 70B, and OpenAI's GPT-OSS 120B/20B at zero cost through their free tier. Groq Compound and Compound Mini are agentic systems with built-in web search and code execution. Consilium uses Groq as the primary platform free-tier fallback (CONSILIUM_FREE_TIER_GROQ_KEY) when no BYOK key is configured.
Environment Variable
GROQ_API_KEY
API Client
OpenAI-compatible (api.groq.com/openai/v1)
Strengths
Free open-weight models, fastest inference, agentic compound systems
xAI launched Grok 4.20 in February 2026 with a four-agent architecture for reasoning. Grok 4.1 Fast (reasoning and non-reasoning variants) and Grok Code Fast cover the lower-cost tiers. The API uses the OpenAI-compatible format. Grok 2/2-mini and grok-beta are legacy models whose traffic has been migrated to the newer line.
Environment Variable
XAI_API_KEY
API Client
OpenAI-compatible (api.x.ai/v1)
Strengths
Multi-agent reasoning, real-time knowledge, fast coding tasks
Moonshot's Kimi K2.6 (released April 2026) is a frontier-scale 1T-parameter open-source MoE model with a 262k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads. The API is OpenAI-compatible.
Environment Variable
MOONSHOT_API_KEY
API Client
OpenAI-compatible (platform.moonshot.ai)
Strengths
1T parameters, agentic tool use, long-context coding stability
OpenRouter aggregates access to dozens of models behind one OpenAI-compatible endpoint, including a free tier for popular community models like Gemma 4, Qwen3 Coder, Nemotron 3 Super 120B, and Ling 2.6 1T (rate-limited at 20 req/min, 50 req/day per OpenRouter's April 2026 free-tier policy). Consilium uses OpenRouter as the secondary free-tier fallback (CONSILIUM_FREE_TIER_OPENROUTER_KEY) when Groq is unavailable.
Environment Variable
OPENROUTER_API_KEY
API Client
OpenAI-compatible (openrouter.ai/api/v1)
Strengths
Free tier breadth, single endpoint for many providers, easy fallback
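Groq, xAI, Moonshot, and OpenRouter all expose the same OpenAI-compatible Chat Completions wire format, differing only in base URL and key. A stdlib-only sketch of building such a request; the model ID and key placeholder are illustrative:

```python
import json
import urllib.request

# Base URLs from the provider sections above. All four share the
# OpenAI-compatible Chat Completions request shape.
BASES = {
    "groq": "https://api.groq.com/openai/v1",
    "xai": "https://api.x.ai/v1",
    "moonshot": "https://api.moonshot.ai/v1",
    "openrouter": "https://openrouter.ai/api/v1",
}

def build_request(provider: str, api_key: str, model: str, prompt: str):
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASES[provider]}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("groq", "gsk_placeholder", "llama-3.3-70b-versatile", "Hello")
print(req.full_url)  # https://api.groq.com/openai/v1/chat/completions
```

In practice you would pass `base_url` to the `openai` SDK client instead; the point is that one request shape covers all four providers.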
Judge Model Priority
The judge model evaluates proposals and produces the final synthesis. Consilium selects the judge by walking a fixed provider priority order, using the first provider for which you have a valid key.
Free-Tier Fallback (BYOK Preserved)
Consilium is BYOK-first. When you supply your own provider API key, that key is always used — no fallback occurs. When no key is set for the requested provider, Consilium routes through a platform-hosted free-tier pool so you can keep working at zero cost. Resolution order:
- Your BYOK key for the requested provider (always wins)
- Self-hosted env var (e.g. OPENAI_API_KEY)
- Groq free-tier pool — CONSILIUM_FREE_TIER_GROQ_KEY
- OpenRouter free-tier pool — CONSILIUM_FREE_TIER_OPENROUTER_KEY
Tier is inferred from the requested model's catalog cost (fast / balanced / deep) and routed to a tier-equivalent free model. The CLI prints a pre-flight notice and a routing:fallback SSE event is emitted so you always know when fallback is active.
Tier-equivalent free models
- Groq (preferred)
- OpenRouter (backup)
Context Overflow Fallback
When a model returns a 413 or 400 error indicating the context is too large, Consilium automatically retries with a cheaper, smaller-context variant:
| Original Model | Fallback Model |
|---|---|
| gpt-5.5-pro | gpt-5.4-mini |
| gpt-5.5 | gpt-5.4-mini |
| claude-opus-4-7 | claude-haiku-4-5-20251001 |
| claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| gemini-3.1-pro-preview | gemini-3-flash-preview |
| grok-4-20 | grok-4-1-fast-non-reasoning |
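The retry behavior above amounts to a one-shot fallback lookup. A minimal sketch; `ContextTooLarge` and the `call_model` stand-in are hypothetical names for illustration, while the mapping itself comes from the table:

```python
# Fallback mapping from the table above.
CONTEXT_FALLBACKS = {
    "gpt-5.5-pro": "gpt-5.4-mini",
    "gpt-5.5": "gpt-5.4-mini",
    "claude-opus-4-7": "claude-haiku-4-5-20251001",
    "claude-sonnet-4-6": "claude-haiku-4-5-20251001",
    "gemini-3.1-pro-preview": "gemini-3-flash-preview",
    "grok-4-20": "grok-4-1-fast-non-reasoning",
}

class ContextTooLarge(Exception):
    """Hypothetical error for a provider 413/400 context-size rejection."""

def with_context_fallback(call_model, model: str, prompt: str):
    """Try the requested model; on a context error, retry once with fallback."""
    try:
        return call_model(model, prompt)
    except ContextTooLarge:
        fallback = CONTEXT_FALLBACKS.get(model)
        if fallback is None:
            raise                      # no smaller variant to retry with
        return call_model(fallback, prompt)

# Simulated provider that rejects the oversized request:
def fake_call(model, prompt):
    if model == "gpt-5.5-pro":
        raise ContextTooLarge()
    return f"ok:{model}"

print(with_context_fallback(fake_call, "gpt-5.5-pro", "x"))  # ok:gpt-5.4-mini
```

Models with no entry in the table simply re-raise, since there is no cheaper variant to fall back to.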
Security
All API keys are encrypted with AES-256-GCM before storage. Keys are never stored in plaintext, never logged, and never transmitted to any third party.
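The at-rest pattern described above can be sketched with the `cryptography` package's AESGCM primitive. The nonce-prefixed storage format below is an assumption for illustration, not Consilium's actual schema:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_key(master_key: bytes, api_key: str) -> bytes:
    """Encrypt an API key with AES-256-GCM; returns nonce || ciphertext."""
    nonce = os.urandom(12)                 # 96-bit nonce, unique per record
    ct = AESGCM(master_key).encrypt(nonce, api_key.encode(), None)
    return nonce + ct                      # assumed storage layout

def decrypt_key(master_key: bytes, blob: bytes) -> str:
    """Split off the nonce and decrypt; raises on any tampering (GCM tag)."""
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(master_key).decrypt(nonce, ct, None).decode()

master = AESGCM.generate_key(bit_length=256)   # 256-bit master key
blob = encrypt_key(master, "sk-ant-example")
assert decrypt_key(master, blob) == "sk-ant-example"
```

GCM's authentication tag means a flipped bit in storage fails decryption outright rather than yielding a corrupted key.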