
AI Providers & Models

7 providers, 25+ models, from free to frontier. Bring your own keys and mix models from different providers in the same deliberation — or run on Consilium's free-tier pool when you don't have keys.

All Models & Pricing

Prices are per 1 million tokens, charged by the provider (not Consilium). Consilium is BYOK — you pay providers directly through your own API keys.

| Provider | Model | Input/1M | Output/1M | Tier |
| --- | --- | --- | --- | --- |
| Anthropic | Claude Opus 4.7 | $5.00 | $25.00 | Most Capable |
| Anthropic | Claude Opus 4.6 | $5.00 | $25.00 | Previous Flagship |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | Balanced |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | Fast |
| OpenAI | GPT-5.5 Pro | $30.00 | $180.00 | Most Capable |
| OpenAI | GPT-5.5 | $5.00 | $30.00 | Flagship |
| OpenAI | GPT-5.4 | $2.00 | $8.00 | Reasoning |
| OpenAI | GPT-5.4 Mini | $0.20 | $0.80 | Cost-effective |
| OpenAI | GPT-5.4 Nano | $0.08 | $0.30 | Lowest Cost |
| Google | Gemini 3.1 Pro | $1.25 | $5.00 | Most Capable |
| Google | Gemini 3 Flash | $0.15 | $0.60 | Balanced |
| Google | Gemini 3.1 Flash-Lite | $0.05 | $0.20 | Lowest Cost |
| Groq | Llama 3.1 8B | Free | Free | Instant |
| Groq | Llama 3.3 70B | Free | Free | Versatile |
| Groq | GPT-OSS 120B | Free | Free | Open-Weight Flagship |
| Groq | GPT-OSS 20B | Free | Free | Open-Weight |
| Groq | Groq Compound | $0.80 | $1.60 | Agentic |
| xAI | Grok 4.20 | $3.00 | $15.00 | Most Capable |
| xAI | Grok 4.1 Fast (reasoning) | $1.00 | $4.00 | Reasoning |
| xAI | Grok 4.1 Fast | $0.50 | $2.00 | Fast |
| xAI | Grok Code Fast | $0.30 | $1.20 | Coding |
| Moonshot | Kimi K2.6 | $1.20 | $2.50 | 1T Open-Source |
| OpenRouter | Gemma 4 26B (free) | Free | Free | Free Tier |
| OpenRouter | Gemma 4 31B (free) | Free | Free | Free Tier |
| OpenRouter | Qwen3 Coder (free) | Free | Free | Free Tier |
| OpenRouter | Nemotron 3 Super 120B (free) | Free | Free | Free Tier |
| OpenRouter | Ling 2.6 1T (free) | Free | Free | Free Tier |

Provider Details

Anthropic
Judge Priority #1

Claude models excel at nuanced reasoning, following complex instructions, and agentic coding. Claude Opus 4.7 is Anthropic's most capable generally available model with a step-change improvement in agentic coding over Opus 4.6. Claude Sonnet 4.6 offers the best balance of speed and intelligence with a 1M-token context. Claude Haiku 4.5 is the fastest model with near-frontier intelligence.

Environment Variable

ANTHROPIC_API_KEY

API Client

anthropic.AsyncAnthropic

Strengths

Nuanced reasoning, instruction following, agentic coding, 1M context, structured output
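
As a hedged sketch (not Consilium's actual code), the request body that anthropic.AsyncAnthropic's messages.create(...) accepts can be built like this; the model ID claude-sonnet-4-6 follows the naming used in the fallback table on this page:

```python
def anthropic_messages_body(prompt: str, model: str = "claude-sonnet-4-6") -> dict:
    # Minimal Anthropic Messages API request body; passing these fields as
    # keyword arguments to client.messages.create(**body) issues the call.
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
```

With the official SDK, anthropic.AsyncAnthropic() reads ANTHROPIC_API_KEY from the environment by default, so no key needs to appear in code.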

OpenAI
Judge Priority #3

GPT-5.5 and GPT-5.5 Pro are OpenAI's latest flagship models with 1M-token context windows, available through the Responses and Chat Completions APIs. GPT-5.4 (and the Mini/Nano variants) remain the cost-tier options for high-volume workloads. Retired models like GPT-4o, GPT-4.1, and o4-mini are no longer supported by OpenAI as of February 2026.

Environment Variable

OPENAI_API_KEY

API Client

openai.AsyncOpenAI (httpx.AsyncClient)

Strengths

General capability, code generation, agentic tool use, multimodal, 1M context

Google
Judge Priority #2

Gemini 3.1 Pro is Google's most advanced reasoning model, optimized for complex agentic workflows and coding. Gemini 3 Flash provides strong frontier-class performance at low cost, while Gemini 3.1 Flash-Lite is the cheapest option for high-volume, latency-sensitive traffic. Gemini 1.x and 2.x lines have been deprecated.

Environment Variable

GOOGLE_API_KEY

API Client

google.generativeai.GenerativeModel

Strengths

Cost efficiency, multimodal, long context, frontier reasoning

Groq
Judge Priority #6

Groq provides ultra-fast inference for open-weight models including Llama 3.1 8B, Llama 3.3 70B, and OpenAI's GPT-OSS 120B/20B at zero cost through their free tier. Groq Compound and Compound Mini are agentic systems with built-in web search and code execution. Consilium uses Groq as the primary platform free-tier fallback (CONSILIUM_FREE_TIER_GROQ_KEY) when no BYOK key is configured.

Environment Variable

GROQ_API_KEY

API Client

OpenAI-compatible (api.groq.com/openai/v1)

Strengths

Free open-weight models, fastest inference, agentic compound systems
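
Because the endpoint is OpenAI-compatible, a plain HTTP POST works. The sketch below builds (but does not send) a Chat Completions request using only the standard library; the model ID reuses llama-3.3-70b-versatile from the free-tier list on this page:

```python
import json
import os
import urllib.request

def groq_chat_request(prompt: str, model: str = "llama-3.3-70b-versatile"):
    # Groq serves the standard Chat Completions shape at its
    # OpenAI-compatible base URL, so openai.AsyncOpenAI(base_url=...)
    # or a plain HTTP POST both work.
    return urllib.request.Request(
        "https://api.groq.com/openai/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

# Send with: urllib.request.urlopen(groq_chat_request("..."))
```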

xAI
Judge Priority #4

xAI launched Grok 4.20 in February 2026 with a four-agent architecture for reasoning. Grok 4.1 Fast (reasoning and non-reasoning variants) and Grok Code Fast cover the lower-cost tiers. All Grok models use the OpenAI-compatible API format. Grok 2/2-mini and grok-beta are legacy models and have been migrated.

Environment Variable

XAI_API_KEY

API Client

OpenAI-compatible (api.x.ai/v1)

Strengths

Multi-agent reasoning, real-time knowledge, fast coding tasks

Moonshot
Judge Priority #5

Moonshot's Kimi K2.6 (released April 2026) is a frontier-scale 1T-parameter open-source MoE model with a 262k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads. The API is OpenAI-compatible.

Environment Variable

MOONSHOT_API_KEY

API Client

OpenAI-compatible (platform.moonshot.ai)

Strengths

1T parameters, agentic tool use, long-context coding stability

OpenRouter
Judge Priority #7

OpenRouter aggregates access to dozens of models behind one OpenAI-compatible endpoint, including a free tier for popular community models like Gemma 4, Qwen3 Coder, Nemotron 3 Super 120B, and Ling 2.6 1T (rate-limited at 20 req/min, 50 req/day per OpenRouter's April 2026 free-tier policy). Consilium uses OpenRouter as the secondary free-tier fallback (CONSILIUM_FREE_TIER_OPENROUTER_KEY) when Groq is unavailable.

Environment Variable

OPENROUTER_API_KEY

API Client

OpenAI-compatible (openrouter.ai/api/v1)

Strengths

Free tier breadth, single endpoint for many providers, easy fallback

Judge Model Priority

The judge model evaluates proposals and produces the final synthesis. Consilium selects the judge based on this priority order (using the first provider for which you have a valid key):

  1. Anthropic
  2. Google
  3. OpenAI
  4. xAI
  5. Moonshot
  6. Groq
  7. OpenRouter
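
A minimal sketch of that rule (the lowercase provider IDs here are illustrative, not Consilium's actual identifiers):

```python
JUDGE_PRIORITY = ["anthropic", "google", "openai", "xai", "moonshot", "groq", "openrouter"]

def pick_judge(available_keys: set) -> "str | None":
    # Return the first provider in priority order with a valid key.
    for provider in JUDGE_PRIORITY:
        if provider in available_keys:
            return provider
    return None
```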

Free-Tier Fallback (BYOK Preserved)

Consilium is BYOK-first. When you supply your own provider API key, that key is always used — no fallback occurs. When no key is set for the requested provider, Consilium routes through a platform-hosted free-tier pool so you can keep working at zero cost. Resolution order:

  1. Your BYOK key for the requested provider (always wins)
  2. Self-hosted env var (e.g. OPENAI_API_KEY)
  3. Groq free-tier pool — CONSILIUM_FREE_TIER_GROQ_KEY
  4. OpenRouter free-tier pool — CONSILIUM_FREE_TIER_OPENROUTER_KEY

The tier (fast / balanced / deep) is inferred from the requested model's catalog cost, and the request is routed to a tier-equivalent free model. The CLI prints a pre-flight notice, and a routing:fallback SSE event is emitted, so you always know when fallback is active.
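
The four-step resolution can be sketched as follows. resolve_key and its return shape are hypothetical, written only to mirror the documented order; the environment variable names are the ones listed on this page:

```python
import os

ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY", "groq": "GROQ_API_KEY",
    "xai": "XAI_API_KEY", "moonshot": "MOONSHOT_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}

def resolve_key(provider: str, byok_key: str = None):
    if byok_key:                                  # 1. BYOK always wins
        return ("byok", byok_key)
    env_key = os.environ.get(ENV_VARS[provider])
    if env_key:                                   # 2. self-hosted env var
        return ("env", env_key)
    groq_pool = os.environ.get("CONSILIUM_FREE_TIER_GROQ_KEY")
    if groq_pool:                                 # 3. Groq free-tier pool
        return ("groq-free", groq_pool)
    # 4. OpenRouter free-tier pool (may be None if nothing is configured)
    return ("openrouter-free", os.environ.get("CONSILIUM_FREE_TIER_OPENROUTER_KEY"))
```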

Tier-equivalent free models

Groq (preferred)

llama-3.1-8b-instant (fast)
llama-3.3-70b-versatile (balanced)
openai/gpt-oss-120b (deep)

OpenRouter (backup)

gemma-2-9b-it:free (fast)
llama-3.3-70b-instruct:free (balanced)
qwen-2.5-72b-instruct:free (deep)
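
The two pools above map onto the tier system like this (a sketch with hypothetical function names; the model IDs are the ones listed above):

```python
FREE_TIER_MODELS = {
    "groq": {  # preferred pool
        "fast": "llama-3.1-8b-instant",
        "balanced": "llama-3.3-70b-versatile",
        "deep": "openai/gpt-oss-120b",
    },
    "openrouter": {  # backup pool
        "fast": "gemma-2-9b-it:free",
        "balanced": "llama-3.3-70b-instruct:free",
        "deep": "qwen-2.5-72b-instruct:free",
    },
}

def route_free_model(tier: str, groq_available: bool = True) -> str:
    # Groq is preferred; fall back to the OpenRouter pool when unavailable.
    pool = "groq" if groq_available else "openrouter"
    return FREE_TIER_MODELS[pool][tier]
```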

Context Overflow Fallback

When a model returns a 413 or 400 error indicating the context is too large, Consilium automatically retries with a cheaper, smaller-context variant:

| Original Model | Fallback Model |
| --- | --- |
| gpt-5.5-pro | gpt-5.4-mini |
| gpt-5.5 | gpt-5.4-mini |
| claude-opus-4-7 | claude-haiku-4-5-20251001 |
| claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| gemini-3.1-pro-preview | gemini-3-flash-preview |
| grok-4-20 | grok-4-1-fast-non-reasoning |
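
The retry behavior can be sketched as follows. call_with_context_fallback and ContextTooLarge are hypothetical stand-ins; in practice the trigger is the provider's 413/400 response:

```python
class ContextTooLarge(Exception):
    # Stand-in for a provider's 413 / 400 context-size error.
    pass

CONTEXT_FALLBACKS = {
    "gpt-5.5-pro": "gpt-5.4-mini",
    "gpt-5.5": "gpt-5.4-mini",
    "claude-opus-4-7": "claude-haiku-4-5-20251001",
    "claude-sonnet-4-6": "claude-haiku-4-5-20251001",
    "gemini-3.1-pro-preview": "gemini-3-flash-preview",
    "grok-4-20": "grok-4-1-fast-non-reasoning",
}

def call_with_context_fallback(call, model: str):
    # `call` is any function taking a model ID; on a context overflow,
    # retry once with the cheaper, smaller-context variant.
    try:
        return call(model)
    except ContextTooLarge:
        fallback = CONTEXT_FALLBACKS.get(model)
        if fallback is None:
            raise
        return call(fallback)
```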

Security

All API keys are encrypted with AES-256-GCM before storage. Keys are never stored in plaintext, never logged, and never transmitted to any third party.