System Architecture

How Consilium's microservices, database, queue system, and streaming infrastructure work together.

System Overview

┌─────────────────────────────────────────────────────────────────┐
│                        CLIENT LAYER                             │
│  Web App (Next.js 15)  │  CLI (Commander.js)  │  SDKs (Py/TS)  │
└──────────────┬─────────────────┬──────────────────┬─────────────┘
               │                 │                  │
               ▼                 ▼                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                     API LAYER (Port 4000)                       │
│  NestJS 11 / Fastify                                            │
│  ├── Clerk Auth Guard (JWT verification)                        │
│  ├── REST Controllers (debates, deliberation, agents, personas) │
│  ├── BullMQ Queue (debate-jobs)                                 │
│  ├── SSE Proxy (streams from FastAPI → client)                  │
│  ├── Prisma ORM (PostgreSQL)                                    │
│  └── Encryption Service (AES-256-GCM for API keys)              │
└──────────────┬──────────────────────────────────────────────────┘
               │ HTTP + SSE
               ▼
┌─────────────────────────────────────────────────────────────────┐
│                  AGENT LAYER (Port 8000)                         │
│  FastAPI / LangGraph                                             │
│  ├── Deliberation Graph (state machine)                          │
│  │   ├── Phase Handlers (propose, challenge, rebut, evaluate...) │
│  │   ├── Voting Engine (Condorcet, Borda, Ranked Pairs, Copeland)│
│  │   ├── Convergence Detector (Kendall tau, Jaccard, concessions)│
│  │   ├── Dissent Detector (agglomerative clustering)             │
│  │   └── Confidence Calibrator (explanation stability)           │
│  ├── Agent Factory (5 providers × 15 models)                     │
│  ├── Cost Router (complexity scoring → mode selection)            │
│  ├── Template Registry (6 vertical templates)                     │
│  └── Benchmark Runner (MMLU, TruthfulQA, HumanEval)              │
└─────────────────────────────────────────────────────────────────┘
                  │
     ┌────────────┼────────────┐
     ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│PostgreSQL│ │  Redis   │ │   LLM    │
│  (Neon)  │ │(Upstash) │ │   APIs   │
│  :5432   │ │  :6379   │ │ (5 cos.) │
└──────────┘ └──────────┘ └──────────┘

Service Architecture

| Service | Stack | Port | Purpose |
|---|---|---|---|
| Web App | Next.js 15, React 19, Tailwind, shadcn/ui, Clerk, Zustand | 3000 | Frontend, marketing, dashboard, debate UI |
| API | NestJS 11, Fastify, Prisma, BullMQ, Clerk SDK | 4000 | REST API, auth, queue processing, database |
| Agents | FastAPI, LangGraph, 5 LLM providers | 8000 | Deliberation engine, benchmarks, templates |
| Database | PostgreSQL 16 (Neon managed) | 5432 | Persistent storage via Prisma ORM |
| Cache/Queue | Redis 7 (Upstash managed) | 6379 | BullMQ jobs, SSE relay, session cache |

Data Flow

1. User submits a topic via the Web App, CLI, or SDK.
2. API creates a DebateSession (status: pending) and enqueues a BullMQ job.
3. A BullMQ worker picks up the job and calls FastAPI POST /api/v1/deliberation/start.
4. FastAPI runs the LangGraph state machine through its phases (PROPOSAL → ... → OUTPUT).
5. Each phase streams SSE events back through the API to the client in real time.
6. On completion: golden_prompt, dissent_report, and cost are stored in PostgreSQL.
7. An AuditEntry records per-step model, tokens, cost, and latency for full transparency.
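Step 3 above, the worker-to-agent handoff, can be sketched in a few lines. The endpoint path is from this page; the payload field names (`session_id`, `topic`, `mode`) are assumptions for illustration, not the documented request schema.

```python
# Hypothetical sketch of the BullMQ worker handing a queued debate
# job to the agent layer (the real worker is NestJS/TypeScript).
import json
import urllib.request

AGENTS_URL = "http://localhost:8000/api/v1/deliberation/start"

def build_start_payload(session_id: str, topic: str, mode: str) -> dict:
    """Shape the queued job's data into a request body (field names assumed)."""
    return {"session_id": session_id, "topic": topic, "mode": mode}

def start_deliberation(payload: dict) -> None:
    """POST the job to FastAPI; SSE events then flow back through the API."""
    req = urllib.request.Request(
        AGENTS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)
```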

Database Schema

PostgreSQL via Prisma ORM. All models are relationally connected. Managed by Neon in production.

| Model | Key Fields |
|---|---|
| User | clerkId, email, firstName, lastName, encrypted API keys (AES-256-GCM), cliTokenHash |
| DebateSession | userId, topic, status, modelsUsed, totalCost, goldenPrompt, mode, judgeModel |
| DebateRound | sessionId, roundNumber, status |
| DebateMessage | roundId, agentId, modelUsed, content, promptTokens, completionTokens, cost, latencyMs |
| ConversationV2 | userId, title, decisionLog, projectContext, debates[] |
| DeliberationRun | userId, topic, mode, models, judgeModel, status, goldenPrompt, dissentReport, costTotal, tokensTotal |
| AuditEntry | deliberationId, step, modelId, inputSummary, outputSummary, latencyMs, tokensIn, tokensOut, cost, roundNumber |
| Agent | userId, name, provider, modelId, description, isActive, tenantId |
| AgentPersona | userId, name, description, systemPrompt, isDefault |
| UsageRecord | tenantId, agentId, tokens, cost, recordedAt |
| AuthLog | userId, event, ip, userAgent, metadata, severity |
| AgentFailure | modelId, provider, errorType, debateId |
| Waitlist | email, source, metadata, notified |

SSE Event Types

Real-time streaming uses Server-Sent Events. Connect to /deliberation/:id/stream to receive typed events as the debate progresses.

| Category | Events |
|---|---|
| Deliberation | deliberation:start, deliberation:complete |
| Phases | phase:proposal, phase:challenge, phase:rebuttal, phase:evaluation, phase:voting, phase:aggregation |
| Agents | agent:start, agent:chunk, agent:complete |
| Convergence | convergence:detected, convergence:not_detected |
| Dissent | dissent:consensus, dissent:report |
| Red Team | red_team:attack, red_team:defense, red_team:judgment |
| Market | market:bet, market:update, market:converged |
| System | cost:update, error, done, debate:cancelled |
| Rounds | round:start, round:complete |
| Judge | judge_start, judge_retry, synthesis:start |
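A client consuming the stream parses standard SSE frames (`event:` / `data:` lines, blank-line delimited) and dispatches on the event names above. A minimal sketch, assuming each event's `data` field carries a JSON payload:

```python
# Minimal SSE frame parser for /deliberation/:id/stream.
# The event names come from the table above; the wire format is
# standard Server-Sent Events.
import json

def parse_sse(raw: str):
    """Yield (event_name, payload) pairs from a raw SSE stream."""
    for frame in raw.split("\n\n"):
        event, data = "message", ""
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        if data:  # skip empty frames (e.g. keep-alives)
            yield event, json.loads(data)
```

In practice you would feed chunks from the HTTP response into a buffer and parse complete frames as they arrive; this sketch assumes the full stream is already in memory.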

Authentication Flow

Web App

Clerk SDK → JWT session → ClerkAuthGuard middleware → CurrentUser decorator extracts userId. Supports email, Google, GitHub sign-in.

CLI

consilium login → opens browser → Clerk auth → CLI token generated and stored (hashed, not plaintext) in ~/.consilium/config.json. One token per user.

API / SDK

Bearer token in Authorization header → Clerk SDK verifies JWT → userId extracted from session claims. API keys for LLM providers stored encrypted (AES-256-GCM) in User model.
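An authenticated API/SDK call therefore only needs the Clerk-issued JWT in the `Authorization` header. A minimal sketch, assuming the API runs locally on its documented port 4000 and exposes a hypothetical `/debates` path:

```python
# Sketch of a raw authenticated request; the API verifies the
# Bearer token via the Clerk SDK and extracts userId from the claims.
import urllib.request

API_URL = "http://localhost:4000"

def authed_request(path: str, token: str) -> urllib.request.Request:
    """Build a request carrying the Clerk JWT as a Bearer token."""
    return urllib.request.Request(
        f"{API_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
```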

Error Handling & Resilience

Circuit Breakers — Per-provider failure tracking. After consecutive failures, requests are short-circuited to prevent cascading failures.
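A per-provider breaker of this kind can be sketched as a small state machine. The failure threshold and cooldown below are assumptions (the actual values are not documented here); only the consecutive-failure/short-circuit behavior is from the text:

```python
# Minimal per-provider circuit breaker sketch.
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold   # consecutive failures before opening (assumed)
        self.cooldown = cooldown     # seconds before a probe is allowed (assumed)
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        """Short-circuit requests while the breaker is open."""
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None    # half-open: allow one probe request
            self.failures = 0
            return True
        return False

    def record(self, ok: bool) -> None:
        """Track consecutive failures; open after the threshold is hit."""
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
```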

Retry Logic — MAX_RETRIES: 2 attempts, RETRY_BACKOFF: [2s, 5s] exponential. Only retries on transient errors (5xx, 429).
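The documented policy maps directly to a small retry loop. The constants are from the text; the `call` contract (returning a status/body pair) is an assumption for illustration:

```python
# Sketch of the retry policy: up to 2 retries with [2s, 5s] backoff,
# retrying only transient failures (HTTP 5xx and 429).
import time

MAX_RETRIES = 2
RETRY_BACKOFF = [2.0, 5.0]  # seconds, from the docs

def is_transient(status: int) -> bool:
    """Only rate limits (429) and server errors (5xx) are retried."""
    return status == 429 or 500 <= status < 600

def call_with_retry(call, backoff=RETRY_BACKOFF):
    """call() returns (status, body); re-invoke on transient errors only."""
    for attempt in range(MAX_RETRIES + 1):
        status, body = call()
        if not is_transient(status) or attempt == MAX_RETRIES:
            return status, body
        time.sleep(backoff[attempt])
```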

Error Classification — Errors categorized as: rate_limit, auth, timeout, server_error, unknown. Raised as LLMProviderError(provider, error_type, original_error).
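The category names and the `LLMProviderError(provider, error_type, original_error)` shape are from the text; the status-code buckets below are assumptions about how each category is detected:

```python
# Sketch of the error classification; bucket boundaries are assumed.
class LLMProviderError(Exception):
    def __init__(self, provider: str, error_type: str, original_error: Exception):
        super().__init__(f"{provider}: {error_type}: {original_error}")
        self.provider = provider
        self.error_type = error_type
        self.original_error = original_error

def classify(status: int) -> str:
    """Map an HTTP status to one of the documented error categories."""
    if status == 429:
        return "rate_limit"
    if status in (401, 403):
        return "auth"
    if status == 408:
        return "timeout"
    if 500 <= status < 600:
        return "server_error"
    return "unknown"
```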

Context Overflow — On 413/400 errors, automatically retries with cheaper model variant (e.g., gpt-5.4 → gpt-5.4-mini).

Timeout — 60 seconds per API call. Configurable per-provider.

CI/CD Pipeline

| Workflow | Trigger | Purpose |
|---|---|---|
| ci.yml | Push/PR to main | Lint + typecheck across monorepo |
| security.yml | Push/PR/weekly | CodeQL, pip-audit, bandit, gitleaks |
| coverage.yml | PR | Code coverage reporting |
| docker.yml | Push to main | Build and push Docker images |
| consilium-review.yml | PR open/sync | Multi-model AI code review (Sonnet + Haiku fallback) |
| pr-checks.yml | PR | Pre-merge validation |
| publish-npm.yml | Release | Publish TypeScript SDK to npm |
| publish-pypi.yml | Release | Publish Python SDK to PyPI |