System Architecture

How Consilium's microservices, database, queue system, and streaming infrastructure work together.

System Overview

┌─────────────────────────────────────────────────────────────────┐
│                        CLIENT LAYER                             │
│  Web App (Next.js 15)  │  CLI (Commander.js)  │  SDKs (Py/TS)  │
└──────────────┬─────────────────┬──────────────────┬─────────────┘
               │                 │                  │
               ▼                 ▼                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                     API LAYER (Port 4000)                       │
│  NestJS 11 / Fastify                                            │
│  ├── Clerk Auth Guard (JWT verification)                        │
│  ├── REST Controllers (debates, deliberation, agents, personas) │
│  ├── BullMQ Queue (debate-jobs)                                 │
│  ├── SSE Proxy (streams from FastAPI → client)                  │
│  ├── Prisma ORM (PostgreSQL)                                    │
│  └── Encryption Service (AES-256-GCM for API keys)              │
└──────────────┬──────────────────────────────────────────────────┘
               │ HTTP + SSE
               ▼
┌─────────────────────────────────────────────────────────────────┐
│                  AGENT LAYER (Port 8000)                         │
│  FastAPI / LangGraph                                             │
│  ├── Deliberation Graph (state machine)                          │
│  │   ├── Phase Handlers (propose, challenge, rebut, evaluate...) │
│  │   ├── Voting Engine (Condorcet, Borda, Ranked Pairs, Copeland)│
│  │   ├── Convergence Detector (Kendall tau, Jaccard, concessions)│
│  │   ├── Dissent Detector (agglomerative clustering)             │
│  │   └── Confidence Calibrator (explanation stability)           │
│  ├── Agent Factory (5 providers × 15 models)                     │
│  ├── Cost Router (complexity scoring → mode selection)            │
│  ├── Template Registry (6 vertical templates)                     │
│  └── Benchmark Runner (MMLU, TruthfulQA, HumanEval)              │
└─────────────────────────────────────────────────────────────────┘
                  │
     ┌────────────┼────────────┐
     ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│PostgreSQL│ │  Redis   │ │   LLM    │
│  (Neon)  │ │(Upstash) │ │   APIs   │
│  :5432   │ │  :6379   │ │ (5 cos.) │
└──────────┘ └──────────┘ └──────────┘

Service Architecture

| Service | Stack | Port | Purpose |
|---|---|---|---|
| Web App | Next.js 15, React 19, Tailwind, shadcn/ui, Clerk, Zustand | 3000 | Frontend, marketing, dashboard, debate UI |
| API | NestJS 11, Fastify, Prisma, BullMQ, Clerk SDK | 4000 | REST API, auth, queue processing, database |
| Agents | FastAPI, LangGraph, 5 LLM providers | 8000 | Deliberation engine, benchmarks, templates |
| Database | PostgreSQL 16 (Neon managed) | 5432 | Persistent storage via Prisma ORM |
| Cache/Queue | Redis 7 (Upstash managed) | 6379 | BullMQ jobs, SSE relay, session cache |

Data Flow

1. User submits a topic via the Web App, CLI, or SDK.
2. API creates a DebateSession (status: pending) and enqueues a BullMQ job.
3. A BullMQ worker picks up the job and calls FastAPI POST /api/v1/deliberation/start.
4. FastAPI runs the LangGraph state machine through its phases (PROPOSAL → ... → OUTPUT).
5. Each phase streams SSE events back through the API to the client in real time.
6. On completion: golden_prompt, dissent_report, and cost are stored in PostgreSQL.
7. An AuditEntry records per-step model, tokens, cost, and latency for full transparency.
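Step 3 above, the worker-to-agent handoff, can be sketched in a few lines. The endpoint path is from this page; the payload field names (`session_id`, `topic`, `mode`) are assumptions for illustration, not the documented request schema.

```python
# Hypothetical sketch of the BullMQ worker handing a queued debate
# job to the agent layer (the real worker is NestJS/TypeScript).
import json
import urllib.request

AGENTS_URL = "http://localhost:8000/api/v1/deliberation/start"

def build_start_payload(session_id: str, topic: str, mode: str) -> dict:
    """Shape the queued job's data into a request body (field names assumed)."""
    return {"session_id": session_id, "topic": topic, "mode": mode}

def start_deliberation(payload: dict) -> None:
    """POST the job to FastAPI; SSE events then flow back through the API."""
    req = urllib.request.Request(
        AGENTS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)
```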

Database Schema

PostgreSQL via Prisma ORM. All models are relationally connected. Managed by Neon in production.

| Model | Key Fields |
|---|---|
| User | clerkId, email, firstName, lastName, encrypted API keys (AES-256-GCM), cliTokenHash |
| DebateSession | userId, topic, status, modelsUsed, totalCost, goldenPrompt, mode, judgeModel |
| DebateRound | sessionId, roundNumber, status |
| DebateMessage | roundId, agentId, modelUsed, content, promptTokens, completionTokens, cost, latencyMs |
| ConversationV2 | userId, title, decisionLog, projectContext, debates[] |
| DeliberationRun | userId, topic, mode, models, judgeModel, status, goldenPrompt, dissentReport, costTotal, tokensTotal |
| AuditEntry | deliberationId, step, modelId, inputSummary, outputSummary, latencyMs, tokensIn, tokensOut, cost, roundNumber |
| Agent | userId, name, provider, modelId, description, isActive, tenantId |
| AgentPersona | userId, name, description, systemPrompt, isDefault |
| UsageRecord | tenantId, agentId, tokens, cost, recordedAt |
| AuthLog | userId, event, ip, userAgent, metadata, severity |
| AgentFailure | modelId, provider, errorType, debateId |
| Waitlist | email, source, metadata, notified |

SSE Event Types

Real-time streaming uses Server-Sent Events. Connect to /deliberation/:id/stream to receive typed events as the debate progresses.

| Category | Events |
|---|---|
| Deliberation | deliberation:start, deliberation:complete |
| Phases | phase:proposal, phase:challenge, phase:rebuttal, phase:evaluation, phase:voting, phase:aggregation |
| Agents | agent:start, agent:chunk, agent:complete |
| Convergence | convergence:detected, convergence:not_detected |
| Dissent | dissent:consensus, dissent:report |
| Red Team | red_team:attack, red_team:defense, red_team:judgment |
| Market | market:bet, market:update, market:converged |
| System | cost:update, error, done, debate:cancelled |
| Rounds | round:start, round:complete |
| Judge | judge_start, judge_retry, synthesis:start |
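A client consuming the stream parses standard SSE frames (`event:` / `data:` lines, blank-line delimited) and dispatches on the event names above. A minimal sketch, assuming each event's `data` field carries a JSON payload:

```python
# Minimal SSE frame parser for /deliberation/:id/stream.
# The event names come from the table above; the wire format is
# standard Server-Sent Events.
import json

def parse_sse(raw: str):
    """Yield (event_name, payload) pairs from a raw SSE stream."""
    for frame in raw.split("\n\n"):
        event, data = "message", ""
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        if data:  # skip empty frames (e.g. keep-alives)
            yield event, json.loads(data)
```

In practice you would feed chunks from the HTTP response into a buffer and parse complete frames as they arrive; this sketch assumes the full stream is already in memory.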

Authentication Flow

Web App

Clerk SDK → JWT session → ClerkAuthGuard middleware → CurrentUser decorator extracts userId. Supports email, Google, GitHub sign-in.

CLI

consilium login → opens browser → Clerk auth → CLI token generated and stored (hashed, not plaintext) in ~/.consilium/config.json. One token per user.

API / SDK

Bearer token in Authorization header → Clerk SDK verifies JWT → userId extracted from session claims. API keys for LLM providers stored encrypted (AES-256-GCM) in User model.
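An authenticated API/SDK call therefore only needs the Clerk-issued JWT in the `Authorization` header. A minimal sketch, assuming the API runs locally on its documented port 4000 and exposes a hypothetical `/debates` path:

```python
# Sketch of a raw authenticated request; the API verifies the
# Bearer token via the Clerk SDK and extracts userId from the claims.
import urllib.request

API_URL = "http://localhost:4000"

def authed_request(path: str, token: str) -> urllib.request.Request:
    """Build a request carrying the Clerk JWT as a Bearer token."""
    return urllib.request.Request(
        f"{API_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
```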

Error Handling & Resilience

Circuit Breakers — Per-provider failure tracking. After consecutive failures, requests are short-circuited to prevent cascading failures.
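A per-provider breaker of this kind can be sketched as a small state machine. The failure threshold and cooldown below are assumptions (the actual values are not documented here); only the consecutive-failure/short-circuit behavior is from the text:

```python
# Minimal per-provider circuit breaker sketch.
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold   # consecutive failures before opening (assumed)
        self.cooldown = cooldown     # seconds before a probe is allowed (assumed)
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        """Short-circuit requests while the breaker is open."""
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None    # half-open: allow one probe request
            self.failures = 0
            return True
        return False

    def record(self, ok: bool) -> None:
        """Track consecutive failures; open after the threshold is hit."""
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
```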

Retry Logic — MAX_RETRIES: 2 attempts, RETRY_BACKOFF: [2s, 5s] exponential. Only retries on transient errors (5xx, 429).
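The documented policy maps directly to a small retry loop. The constants are from the text; the `call` contract (returning a status/body pair) is an assumption for illustration:

```python
# Sketch of the retry policy: up to 2 retries with [2s, 5s] backoff,
# retrying only transient failures (HTTP 5xx and 429).
import time

MAX_RETRIES = 2
RETRY_BACKOFF = [2.0, 5.0]  # seconds, from the docs

def is_transient(status: int) -> bool:
    """Only rate limits (429) and server errors (5xx) are retried."""
    return status == 429 or 500 <= status < 600

def call_with_retry(call, backoff=RETRY_BACKOFF):
    """call() returns (status, body); re-invoke on transient errors only."""
    for attempt in range(MAX_RETRIES + 1):
        status, body = call()
        if not is_transient(status) or attempt == MAX_RETRIES:
            return status, body
        time.sleep(backoff[attempt])
```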

Error Classification — Errors categorized as: rate_limit, auth, timeout, server_error, unknown. Raised as LLMProviderError(provider, error_type, original_error).
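The category names and the `LLMProviderError(provider, error_type, original_error)` shape are from the text; the status-code buckets below are assumptions about how each category is detected:

```python
# Sketch of the error classification; bucket boundaries are assumed.
class LLMProviderError(Exception):
    def __init__(self, provider: str, error_type: str, original_error: Exception):
        super().__init__(f"{provider}: {error_type}: {original_error}")
        self.provider = provider
        self.error_type = error_type
        self.original_error = original_error

def classify(status: int) -> str:
    """Map an HTTP status to one of the documented error categories."""
    if status == 429:
        return "rate_limit"
    if status in (401, 403):
        return "auth"
    if status == 408:
        return "timeout"
    if 500 <= status < 600:
        return "server_error"
    return "unknown"
```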

Context Overflow — On 413/400 errors, automatically retries with cheaper model variant (e.g., gpt-5.4 → gpt-5.4-mini).

Timeout — 60 seconds per API call. Configurable per-provider.

CI/CD Pipeline

| Workflow | Trigger | Purpose |
|---|---|---|
| ci.yml | Push/PR to main | Lint + typecheck across monorepo |
| security.yml | Push/PR/weekly | CodeQL, pip-audit, bandit, gitleaks |
| coverage.yml | PR | Code coverage reporting |
| docker.yml | Push to main | Build and push Docker images |
| consilium-review.yml | PR open/sync | Multi-model AI code review (Sonnet + Haiku fallback) |
| pr-checks.yml | PR | Pre-merge validation |
| publish-npm.yml | Release | Publish TypeScript SDK to npm |
| publish-pypi.yml | Release | Publish Python SDK to PyPI |