What does the Consilium sandbox protect against?

The sandbox blocks the CLI and any tool it spawns from reading or writing outside the current workspace, opening network connections to non-allowlisted hosts, or modifying user-level config files. It is a defence-in-depth layer against prompt-injection attacks in which a model is tricked into running a destructive shell command.

Where is the workspace trust file?

The trust file lives at ~/.consilium/workspace-trust.json. Each entry records the workspace's absolute path, a trust level (always or session), and the approval timestamp. Session-level approvals are wiped on CLI restart; always-level approvals persist until you remove them via /trust remove.

How do I enable or disable the sandbox?

Pass --sandbox to a single command to force enforcement for that invocation. Pass --no-sandbox-strict to downgrade hard blocks to warnings while you debug a profile. Set sandbox.default in ~/.consilium/config.json to make either behaviour your user-level default; the file lives in your home directory, so it is not scoped to one project.

What's the difference between always and session trust?

Always trust persists across CLI restarts and is the right level for your own repos. Session trust lasts only for the current CLI process and is the right level when you clone someone else's repo for a one-time review, because the trust is gone when you exit.

What overhead does the sandbox add?

Process startup adds roughly 30-60 ms on macOS (Seatbelt) and 10-20 ms on Linux (bwrap). Steady-state file I/O is unaffected because the kernel does the enforcement. Windows worktree creation is a one-time per-session cost of around 100-300 ms.

OS-Level Sandbox

Q: How is the sandbox implemented on each OS?

macOS uses Apple's Seatbelt via sandbox-exec with a project-scoped .sb profile. Linux uses bubblewrap (bwrap) with unshared mount, PID, and user namespaces. Windows lacks a comparable primitive, so Consilium falls back to running the agent in a per-session git worktree with restricted file ACLs.

The Consilium CLI sandbox isolates every spawned tool from the rest of your machine. On macOS it uses Apple Seatbelt via sandbox-exec. On Linux it uses bubblewrap with unshared mount, PID, and user namespaces. On Windows it falls back to a per-session git worktree with restricted ACLs. Workspaces are gated by a trust file at ~/.consilium/workspace-trust.json.

Why a sandbox?

Anthropic's 2024 research on prompt-injection attacks reported that “agent systems with shell access remain vulnerable to indirect prompt injection even when the top-of-stack model rejects the request, because intermediate tool calls can be hijacked by adversarial content in the workspace.” A workspace-scoped sandbox is the industry-standard mitigation: even if a tool is hijacked, it cannot reach files or hosts outside the approved boundary.

The performance cost is modest. Seatbelt adds ~30-60 ms of startup overhead per spawned process and zero steady-state file I/O overhead because the kernel does the enforcement. Bubblewrap on Linux is even cheaper at ~10-20 ms. Windows pays a one-time ~100-300 ms cost to materialise the worktree at session start.
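To put those figures in context, the arithmetic can be sketched as below. The ranges are copied from the paragraph above; the function name and the spawn count of 200 are illustrative assumptions, and the Windows branch returns only the one-time worktree cost because that is the only Windows figure given.

```python
# Startup overhead ranges in milliseconds, copied from the figures above.
STARTUP_MS = {"macos": (30, 60), "linux": (10, 20)}
# One-time worktree creation cost per session on Windows.
WINDOWS_SESSION_MS = (100, 300)


def session_overhead_ms(os_name: str, spawns: int) -> tuple:
    """Return (low, high) total sandbox overhead in ms for one session.

    Hypothetical helper: macOS and Linux pay per spawned process,
    Windows pays once per session regardless of the spawn count.
    """
    if os_name == "windows":
        return WINDOWS_SESSION_MS
    low, high = STARTUP_MS[os_name]
    return (low * spawns, high * spawns)
```

A session that spawns 200 tool processes would therefore spend roughly 6-12 s in sandbox startup on macOS (`session_overhead_ms("macos", 200)` returns `(6000, 12000)`) and 2-4 s on Linux.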

How is the sandbox implemented on each OS?

OS	Primitive	Profile path	Overhead
macOS	Seatbelt via sandbox-exec	~/.consilium/sandbox/profile.sb	~30-60 ms startup
Linux	bubblewrap (bwrap) + namespace unshare	~/.consilium/sandbox/bwrap-args.json	~10-20 ms startup
Windows	git worktree + restricted file ACLs	~/.consilium/sandbox/worktree-root	~100-300 ms session create
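The table amounts to a lookup keyed on the host platform. A minimal sketch of that dispatch follows; `Backend`, `BACKENDS`, and `pick_backend` are hypothetical names, and the keys are the values Python reports in `sys.platform`, not identifiers taken from the CLI.

```python
import sys
from typing import NamedTuple


class Backend(NamedTuple):
    primitive: str
    profile_path: str


# Rows copied from the table above; keys are Python's sys.platform values.
BACKENDS = {
    "darwin": Backend("Seatbelt via sandbox-exec",
                      "~/.consilium/sandbox/profile.sb"),
    "linux": Backend("bubblewrap (bwrap) + namespace unshare",
                     "~/.consilium/sandbox/bwrap-args.json"),
    "win32": Backend("git worktree + restricted file ACLs",
                     "~/.consilium/sandbox/worktree-root"),
}


def pick_backend(platform: str = sys.platform) -> Backend:
    """Return the sandbox backend for a platform, or fail loudly."""
    try:
        return BACKENDS[platform]
    except KeyError:
        raise RuntimeError(f"no sandbox backend for platform {platform!r}") from None
```

Failing on an unknown platform, rather than silently running unsandboxed, keeps the default safe.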

What does a Seatbelt profile look like (macOS)?

The CLI generates a project-scoped .sb file at session start and passes it to sandbox-exec -f:

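(version 1)
(deny default)
(allow process-exec)
(allow process-fork)
(allow file-read*)
(allow file-write* (subpath "/Users/you/myrepo"))
; /tmp is a symlink on macOS; Seatbelt matches the resolved path.
(allow file-write* (subpath "/private/tmp"))
; Seatbelt network filters take "*" or "localhost" as the host, not a DNS name,
; so this rule limits the port only. Restricting traffic to api.anthropic.com
; and api.openai.com has to be enforced by a separate layer.
(allow network-outbound
  (remote tcp "*:443"))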
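Generating a profile of this shape per workspace could look roughly like the sketch below. It is not the CLI's actual generator: `sb_quote` and `render_profile` are hypothetical names, and network filters are passed in as ready-made SBPL strings and emitted verbatim, so the sketch takes no position on which filters are appropriate.

```python
import os
from typing import Iterable


def sb_quote(value: str) -> str:
    """Quote a string as an SBPL literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_profile(workspace: str,
                   writable_extra: Iterable[str] = (),
                   network_filters: Iterable[str] = ()) -> str:
    """Render a deny-by-default profile that allows writes to the workspace."""
    lines = [
        "(version 1)",
        "(deny default)",
        "(allow process-exec)",
        "(allow process-fork)",
        "(allow file-read*)",
    ]
    for path in [workspace, *writable_extra]:
        # Resolve symlinks first, because the sandbox sees resolved paths.
        resolved = os.path.realpath(path)
        lines.append(f"(allow file-write* (subpath {sb_quote(resolved)}))")
    filters = list(network_filters)
    if filters:
        body = "\n".join(f"  {item}" for item in filters)
        lines.append(f"(allow network-outbound\n{body})")
    return "\n".join(lines) + "\n"
```

The rendered text would be written to the .sb file and handed to sandbox-exec -f. Omitting `network_filters` leaves outbound traffic denied by the `(deny default)` rule.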

What does the bwrap call look like (Linux)?

bwrap \
  --unshare-pid \
  --unshare-user \
  --unshare-net \
  --ro-bind / / \
  --bind /home/you/myrepo /home/you/myrepo \
  --bind /tmp /tmp \
  --proc /proc \
  --dev /dev \
  --setenv CONSILIUM_SANDBOX 1 \
  -- /usr/local/bin/consilium-tool "$@"

The network namespace is unshared so the tool starts with no route to the host network; the CLI then attaches a slirp4netns-backed allowlist for outbound LLM API hosts. Adding --share-net would override --unshare-net and expose the host network unfiltered, so it is left out. bwrap always creates a new mount namespace, and the --ro-bind / / mount makes everything outside the workspace and /tmp read-only.
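When the call is assembled programmatically, building it as an argument list avoids shell-quoting problems with workspace paths. The sketch below is an assumption about how that might be done, not the CLI's code: `bwrap_argv` is a hypothetical name, and the network flags are a parameter so the caller decides the network policy.

```python
import shlex
from typing import List, Sequence


def bwrap_argv(workspace: str, tool: str,
               tool_args: Sequence[str] = (),
               network_flags: Sequence[str] = ("--unshare-net",)) -> List[str]:
    """Build a bwrap argument vector with a read-only root and a writable workspace."""
    argv = ["bwrap", "--unshare-pid", "--unshare-user", *network_flags]
    # bwrap applies mounts in order, so the read-only root must come first
    # and the writable binds are layered on top of it.
    argv += ["--ro-bind", "/", "/"]
    argv += ["--bind", workspace, workspace]
    argv += ["--bind", "/tmp", "/tmp"]
    argv += ["--proc", "/proc", "--dev", "/dev"]
    argv += ["--setenv", "CONSILIUM_SANDBOX", "1"]
    # Everything after "--" is the command to run inside the sandbox.
    argv += ["--", tool, *tool_args]
    return argv
```

The list can be passed straight to `subprocess.run(argv)`, or rendered for logging with `shlex.join(argv)`.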

How is the Windows fallback different?

Windows does not expose a primitive comparable to Seatbelt or user namespaces from userspace. The CLI instead creates a per-session worktree with git worktree add under ~/.consilium/sandbox/worktree-root and applies restrictive ACLs so that spawned tools see only the worktree, not the original repo. This is weaker than the kernel-enforced sandboxes but still blocks the most common prompt-injection patterns (deleting sibling files, exfiltrating from parent directories, modifying shell rc files).
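The worktree half of that fallback can be sketched with standard git commands. The helper names and the idea of one subdirectory per session ID are assumptions for illustration; the ACL step is left out because the text does not say which permissions are applied.

```python
from pathlib import Path
from typing import List


def worktree_add_command(repo: str, worktree_root: str, session_id: str) -> List[str]:
    """Command that creates a detached worktree for one session."""
    target = Path(worktree_root) / session_id
    # --detach checks out the current commit without creating a branch.
    return ["git", "-C", repo, "worktree", "add", "--detach", str(target)]


def worktree_remove_command(repo: str, worktree_root: str, session_id: str) -> List[str]:
    """Command that discards the session worktree, including local changes."""
    target = Path(worktree_root) / session_id
    return ["git", "-C", repo, "worktree", "remove", "--force", str(target)]
```

Both functions only build the argument list; running them with `subprocess.run(cmd, check=True)` at session start and exit would create and clean up the worktree.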

How does workspace trust work?

The first time the CLI is launched in a new workspace, it prompts for a trust decision. The choice is persisted to ~/.consilium/workspace-trust.json with one of two levels:

always - trusted across restarts. Right level for your own long-lived repos.
session - trusted only for the current CLI process. Right level when reviewing untrusted code.

{
  "workspaces": [
    {
      "path": "/Users/you/myrepo",
      "level": "always",
      "trustedAt": "2026-05-20T14:02:31Z"
    },
    {
      "path": "/tmp/review-pr-129",
      "level": "session",
      "trustedAt": "2026-05-20T14:11:18Z"
    }
  ]
}
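Reading this file and applying the restart rule might look like the following sketch. The function names are hypothetical; the file location and field names come from the example above, and matching on the exact path (rather than on parent directories) is an assumption.

```python
import json
from pathlib import Path
from typing import List, Optional

# Location taken from the text above.
TRUST_FILE = Path.home() / ".consilium" / "workspace-trust.json"


def load_trust(path: Path = TRUST_FILE) -> List[dict]:
    """Return the workspace entries, or an empty list if the file is missing."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8")).get("workspaces", [])


def drop_session_entries(entries: List[dict]) -> List[dict]:
    """Model a CLI restart: only 'always' approvals survive."""
    return [entry for entry in entries if entry.get("level") == "always"]


def trust_level(entries: List[dict], workspace: str) -> Optional[str]:
    """Return the level recorded for an exact workspace path, or None."""
    target = str(Path(workspace))  # normalises a trailing slash
    for entry in entries:
        if str(Path(entry["path"])) == target:
            return entry["level"]
    return None
```

With the two entries shown above, `/tmp/review-pr-129` resolves to `session` before a restart and to `None` after `drop_session_entries` has run.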

Trust slash commands (/trust)

/trust list	Show every trusted workspace with its level.
/trust add	Mark the current workspace as trusted. Prompts for always or session.
/trust remove [path]	Remove a workspace from the trust file. Defaults to the current cwd.
/trust status	Show whether the current workspace is trusted and at what level.
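The behaviour of these commands can be modelled on an in-memory list of entries, as in the sketch below. The handler names are hypothetical, and replacing an existing entry on /trust add is an assumption; the default-to-cwd rule for /trust remove follows the table above.

```python
import os
from datetime import datetime, timezone
from typing import Dict, List, Optional

Entry = Dict[str, str]


def trust_remove(entries: List[Entry], path: Optional[str] = None) -> bool:
    """Remove a workspace; defaults to the current working directory."""
    path = path or os.getcwd()
    before = len(entries)
    entries[:] = [entry for entry in entries if entry["path"] != path]
    return len(entries) != before


def trust_add(entries: List[Entry], level: str, path: Optional[str] = None) -> None:
    """Record a trust decision, replacing any earlier one for the same path."""
    if level not in ("always", "session"):
        raise ValueError(f"unknown trust level: {level!r}")
    path = path or os.getcwd()
    trust_remove(entries, path)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entries.append({"path": path, "level": level, "trustedAt": stamp})


def trust_status(entries: List[Entry], path: Optional[str] = None) -> Optional[str]:
    """Return the trust level for a workspace, or None if it is untrusted."""
    path = path or os.getcwd()
    for entry in entries:
        if entry["path"] == path:
            return entry["level"]
    return None
```

/trust list then reduces to iterating over `entries` and printing each path with its level.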

CLI flags

--sandbox - force sandbox enforcement for this invocation, even if the workspace is not trusted.
--no-sandbox-strict - downgrade hard blocks to warnings. Useful while iterating on a profile. Never use in CI.
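The difference between strict and non-strict mode can be pictured as the same check either raising or warning. This is a minimal illustration only; `SandboxViolation` and `report_violation` are hypothetical names, not part of the CLI.

```python
import warnings


class SandboxViolation(PermissionError):
    """Raised when a tool attempts something the sandbox blocks."""


def report_violation(message: str, strict: bool = True) -> None:
    """Hard-block in strict mode; downgrade to a warning otherwise."""
    if strict:
        raise SandboxViolation(message)
    # Non-strict mode: the operation proceeds, but the attempt is surfaced.
    warnings.warn(f"sandbox (non-strict): {message}", RuntimeWarning, stacklevel=2)
```

Because a warning does not stop the operation, non-strict mode gives no protection, which is why it has no place in CI.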

The full implementation (profile generators, trust store, slash command handlers) lives in the public CLI repository: github.com/skadri1601/consilium-cli.