Keep your context window lean. Let the sandbox do the heavy lifting.
A context-aware MCP server for Claude Code that compresses tool output by processing it in isolated subprocesses. Raw data stays in the sandbox — only concise summaries enter your context window.
Install context-compress — an MCP server that compresses tool output for Claude Code.
Raw data stays in sandboxed subprocesses; only concise summaries enter your context window.
Saves ~99% of tokens on large outputs while keeping everything searchable via FTS5.
npm install -g context-compress
claude mcp add context-compress -- node $(which context-compress)
context-compress doctor
More info: https://github.com/Open330/context-compress
Every byte of tool output that enters your Claude Code context window reduces quality and speed.
A single git log or npm test can dump 50KB+ into context — that's ~12,000 tokens gone.
context-compress intercepts these tools, processes output in a sandbox, and returns only what matters:
Before: git log --oneline -100 → 8.2KB into context
After: execute("git log ...") → 0.3KB summary + full data searchable in FTS5
Based on context-mode by Mert Koseoğlu — rewritten in TypeScript with security hardening, architectural improvements, and better DX.
npm install -g context-compress
claude mcp add context-compress -- node $(which context-compress)

Or add to your project's .mcp.json:
{
  "mcpServers": {
    "context-compress": {
      "command": "node",
      "args": ["/path/to/context-compress/dist/index.js"]
    }
  }
}

context-compress doctor

┌─────────────────────────────────────────────────────────┐
│                       Claude Code                       │
│                                                         │
│  "Run tests" ──→ PreToolUse Hook intercepts             │
│                        │                                │
│                        ▼                                │
│               ┌──────────────────┐                      │
│               │ context-compress │                      │
│               │    MCP Server    │                      │
│               └────────┬─────────┘                      │
│                        │                                │
│            ┌───────────┼───────────┐                    │
│            ▼           ▼           ▼                    │
│      ┌──────────┐ ┌──────────┐ ┌──────────┐             │
│      │ Executor │ │  Store   │ │  Stats   │             │
│      │ (11 lang)│ │  (FTS5)  │ │ Tracker  │             │
│      └──────────┘ └──────────┘ └──────────┘             │
│            │           │                                │
│            ▼           ▼                                │
│       Raw output   Indexed &   Only summary             │
│       stays here   searchable  enters context           │
└─────────────────────────────────────────────────────────┘
| Tool | What it does |
|---|---|
| execute | Run code in 11 languages. Only stdout enters context. |
| execute_file | Process a file via the FILE_CONTENT variable; the file never enters context. |
| index | Chunk markdown/text into an FTS5 knowledge base for search. |
| search | BM25 search with Porter stemming → trigram → fuzzy fallback. |
| fetch_and_index | Fetch URL → HTML-to-markdown → auto-index. Preview only in context. |
| batch_execute | Run N commands + search in ONE call. Replaces 30+ tool calls. |
| stats | Real-time session statistics: bytes saved, tokens avoided, savings ratio. |
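The fallback chain in the search row can be pictured as a list of strategies tried in order, stopping at the first one that returns hits. The sketch below is a toy in-memory version, not the server's code: the real tool relies on SQLite FTS5 for BM25 ranking and Porter stemming, whereas `wordTier`, `trigramTier`, `fuzzyTier`, and `searchWithFallback` are hypothetical names that use plain string matching as stand-ins.

```typescript
type Doc = { id: number; text: string };
type Tier = { name: string; run: (query: string, docs: Doc[]) => Doc[] };

const words = (s: string): string[] => s.toLowerCase().split(/\W+/).filter((w) => w.length > 0);

// Stand-in for tier 1 (BM25 + Porter stemming in the real tool): whole-word match.
const wordTier: Tier = {
  name: "word",
  run: (q, docs) => docs.filter((d) => words(d.text).includes(q.toLowerCase())),
};

// Stand-in for tier 2: every 3-character window of the query must occur in the text.
function trigrams(s: string): string[] {
  const t = s.toLowerCase();
  const out: string[] = [];
  for (let i = 0; i + 3 <= t.length; i++) out.push(t.slice(i, i + 3));
  return out;
}
const trigramTier: Tier = {
  name: "trigram",
  run: (q, docs) => {
    const grams = trigrams(q);
    if (grams.length === 0) return [];
    return docs.filter((d) => grams.every((g) => d.text.toLowerCase().includes(g)));
  },
};

// Classic dynamic-programming Levenshtein distance, kept to two rows.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const cur: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur.push(Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = cur;
  }
  return prev[b.length];
}

// Stand-in for tier 3: accept any word within one edit of the query.
const fuzzyTier: Tier = {
  name: "fuzzy",
  run: (q, docs) =>
    docs.filter((d) => words(d.text).some((w) => editDistance(w, q.toLowerCase()) <= 1)),
};

// Try each tier in order and report which one produced the hits.
function searchWithFallback(
  query: string,
  docs: Doc[],
  tiers: Tier[] = [wordTier, trigramTier, fuzzyTier],
): { tier: string; hits: Doc[] } {
  for (const tier of tiers) {
    const hits = tier.run(query, docs);
    if (hits.length > 0) return { tier: tier.name, hits };
  }
  return { tier: "none", hits: [] };
}
```

With two sample documents, "authentication middleware" and "database migrations", the query "middleware" is answered by the first tier, the partial word "authent" falls through to the trigram tier, and the typo "databse" is only caught by the fuzzy tier.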
javascript · typescript · python · shell · ruby · go · rust · php · perl · r · elixir
Bun auto-detected for 3-5x faster JS/TS execution.
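Since TypeScript is one of the listed languages, a script handed to execute_file might look like the sketch below. Per the tool table, the file arrives in a FILE_CONTENT variable and only what the script prints reaches the context. The constant declared here is a stand-in so the example runs on its own, and `summarizeLog` is a hypothetical helper, not part of context-compress.

```typescript
// Stand-in for the variable that execute_file is described as providing.
const FILE_CONTENT = [
  "INFO boot",
  "ERROR db timeout",
  "INFO ready",
  "ERROR db timeout",
].join("\n");

// Reduce a log to a line count plus the most frequent ERROR lines.
function summarizeLog(content: string): string {
  const lines = content.split("\n");
  const counts = new Map<string, number>();
  for (const line of lines) {
    if (!line.startsWith("ERROR")) continue;
    counts.set(line, (counts.get(line) ?? 0) + 1);
  }
  const top = Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 5);
  const header = `${lines.length} lines, ${counts.size} distinct error(s)`;
  return [header, ...top.map(([msg, n]) => `${n}x ${msg}`)].join("\n");
}

// Only this printed summary would enter the context window.
console.log(summarizeLog(FILE_CONTENT));
```

The design point is that the script does the reading and counting inside the sandbox, so the size of the printed summary is independent of the size of the file.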
context-compress offers three compression modes that trade fidelity for compactness. Pass --mode to the CLI, set CONTEXT_COMPRESS_MODE in your environment, or let the default (balanced) just work.
| Mode | Strategy | Use when |
|---|---|---|
| conservative | ANSI strip only; preserves every byte of meaningful content | You need full fidelity, debugging output, or archival logs |
| balanced (default) | Strips noise (progress bars, deprecation warnings, hint lines) but keeps metadata (commit bodies, file dates, full test failures) | Day-to-day agent work where context might be re-read |
| aggressive | Drops metadata too: git log → oneline, ls -la → name+size, find with a lower threshold, grep output grouped | Maximum token savings; the agent will rarely need the dropped detail |
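To make the difference between the first two modes concrete, here is a minimal sketch of how such filtering could work. It is not the project's implementation: the `NOISE` patterns are hypothetical examples of "progress bars, deprecation warnings, hint lines", and the per-command metadata rules of aggressive mode are left out.

```typescript
type Mode = "conservative" | "balanced" | "aggressive";

// Matches CSI escape sequences such as colors and cursor movement.
const ANSI = /\x1b\[[0-9;?]*[A-Za-z]/g;

// Hypothetical noise patterns; the real filter rules are not documented here.
const NOISE: RegExp[] = [
  /^npm warn deprecated/i,        // deprecation warnings
  /^hint:/i,                      // git-style hint lines
  /^\s*\[[#=>\-\s]*\]\s*\d+%/,    // simple "[###   ] 50%" progress bars
];

function compress(text: string, mode: Mode): string {
  // Every mode strips ANSI sequences; conservative stops there.
  const clean = text.replace(ANSI, "");
  if (mode === "conservative") return clean;
  // Balanced also drops lines that match a noise pattern.
  // Aggressive would additionally rewrite per-command metadata (omitted in this sketch).
  return clean
    .split("\n")
    .filter((line) => !NOISE.some((re) => re.test(line)))
    .join("\n");
}
```

Given colored output with a hint line and a progress bar, conservative returns the same lines without color codes, while balanced keeps only the lines that carry information.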
# CLI flag (per-call override)
context-compress wrap --mode aggressive "git log -50"
# Env var (set once for the session)
export CONTEXT_COMPRESS_MODE=aggressive

The PreToolUse hook also forwards CONTEXT_COMPRESS_MODE automatically when wrapping Bash commands, so agents transparently get whatever mode you've configured.
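The precedence described above (per-call flag, then environment variable, then the balanced default) fits in a few lines. This is a sketch rather than the CLI's actual parser: `resolveMode` is a hypothetical helper, and silently skipping an unrecognized value is an assumption made here.

```typescript
const MODES = ["conservative", "balanced", "aggressive"] as const;
type Mode = (typeof MODES)[number];

function isMode(value: string | undefined): value is Mode {
  return value !== undefined && (MODES as readonly string[]).includes(value);
}

// A --mode flag overrides CONTEXT_COMPRESS_MODE, which overrides the default.
function resolveMode(
  cliFlag: string | undefined,
  env: Record<string, string | undefined>,
): Mode {
  if (isMode(cliFlag)) return cliFlag;
  const fromEnv = env["CONTEXT_COMPRESS_MODE"];
  if (isMode(fromEnv)) return fromEnv;
  return "balanced";
}
```

Passing the environment in as a parameter (instead of reading it globally) keeps the function easy to test with different combinations.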
Head-to-head with RTK
Reproduce locally:
git clone https://github.com/rtk-ai/rtk /tmp/rtk && (cd /tmp/rtk && cargo build --release)
RTK_BIN=/tmp/rtk/target/release/rtk tsx scripts/benchmark-vs-rtk.ts

Result on this repository (RTK 0.39.0 vs context-compress 2026.3.22):
| Command | Raw | RTK | CC conservative | CC balanced | CC aggressive |
|---|---|---|---|---|---|
| git status | 637 B | 279 B (56%) | 637 B (0%) | 435 B (32%) | 225 B (65%) |
| git log -10 (full) | 15.1 KB | 2.8 KB (81%) | 15.1 KB (0%) | 11.5 KB (24%) | 891 B (94%) |
| git log -50 (full) | 28.9 KB | 9.1 KB (69%) | 28.9 KB (0%) | 20.2 KB (30%) | 2.9 KB (90%) |
| git diff --stat | 682 B | 681 B (0%) | 682 B (0%) | 682 B (0%) | 682 B (0%) |
| ls src/ | 149 B | 229 B (-54%) | 149 B (0%) | 149 B (0%) | 149 B (0%) |
| ls -laR src/ | 3.7 KB | 229 B (94%) | 3.7 KB (0%) | 3.7 KB (0%) | 919 B (76%) |
| find *.ts | 1.0 KB | 576 B (44%) | 1.0 KB (0%) | 1.0 KB (0%) | 183 B (82%) |
| npm test | 18.8 KB | 114 B (99%) | 14.3 KB (24%) | 120 B (99%) | 120 B (99%) |
| Overall (byte-weighted) | 69.0 KB | 14.0 KB (79.7%) | 64.5 KB (6.5%) | 37.8 KB (45.3%) | 6.0 KB (91.2%) |
context-compress aggressive beats RTK by 11.5 percentage points overall while still letting you fall back to balanced (preserves metadata) or conservative (just ANSI) for fidelity-sensitive workloads. Numbers are byte-weighted across the full set; per-command splits show the trade-offs.
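"Byte-weighted" means the overall figure compares summed sizes rather than averaging the per-command percentages, so large outputs such as git log dominate. The sketch below recomputes the RTK and aggressive columns from the published table. The sizes are the rounded table values with 1 KB taken as 1000 bytes, which is an assumption, so the results only approximate the benchmark script's exact numbers.

```typescript
type Row = { command: string; raw: number; rtk: number; aggressive: number };

// Sizes in bytes, transcribed from the table (1 KB assumed to be 1000 bytes).
const rows: Row[] = [
  { command: "git status", raw: 637, rtk: 279, aggressive: 225 },
  { command: "git log -10", raw: 15100, rtk: 2800, aggressive: 891 },
  { command: "git log -50", raw: 28900, rtk: 9100, aggressive: 2900 },
  { command: "git diff --stat", raw: 682, rtk: 681, aggressive: 682 },
  { command: "ls src/", raw: 149, rtk: 229, aggressive: 149 },
  { command: "ls -laR src/", raw: 3700, rtk: 229, aggressive: 919 },
  { command: "find *.ts", raw: 1000, rtk: 576, aggressive: 183 },
  { command: "npm test", raw: 18800, rtk: 114, aggressive: 120 },
];

// Reduction over summed sizes, in percent.
function byteWeightedReduction(rawSizes: number[], outSizes: number[]): number {
  const raw = rawSizes.reduce((a, b) => a + b, 0);
  const out = outSizes.reduce((a, b) => a + b, 0);
  return raw === 0 ? 0 : (1 - out / raw) * 100;
}

const rawSizes = rows.map((r) => r.raw);
const rtkReduction = byteWeightedReduction(rawSizes, rows.map((r) => r.rtk));
const aggressiveReduction = byteWeightedReduction(rawSizes, rows.map((r) => r.aggressive));

// Prints values close to the table's 79.7% and 91.2%, and their gap in percentage points.
console.log(
  rtkReduction.toFixed(1),
  aggressiveReduction.toFixed(1),
  (aggressiveReduction - rtkReduction).toFixed(1),
);
```

Note how the ls src/ row, where RTK's output is larger than the input, barely moves the overall figure because the row is so small.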
RTK has a single fixed compression strategy, comparable to context-compress aggressive. context-compress lets the agent choose: reach for aggressive when the question is "what changed", balanced when the question is "explain why".
context-compress achieves 99.2% token reduction across a typical 12-operation coding session.
| Operation | Before | After | Reduction |
|---|---|---|---|
| Read bundled file (776KB) | 194,076 tok | 105 tok | 99.9% |
| Playwright snapshot (56KB) | 14,000 tok | 75 tok | 99.5% |
| Read CSV/JSON data (100KB) | 25,000 tok | 125 tok | 99.5% |
| Read source file (21KB) | 5,250 tok | 88 tok | 98.3% |
| npm install log (15KB) | 3,750 tok | 50 tok | 98.7% |
| curl API response (12KB) | 3,000 tok | 88 tok | 97.1% |
| npm test (42 tests) | 935 tok | 45 tok | 95.2% |
| batch_execute (5 cmds) | 6,250 tok | 375 tok | 94.0% |
| fetch_and_index (45KB page) | 11,250 tok | 750 tok | 93.3% |
| grep (small output) | 361 tok | 361 tok | 0% |
| Session Total | 267,121 tok | 2,223 tok | 99.2% |
Without context-compress, 12 operations consume 133% of the 200K context window — overflowing it entirely. With context-compress, the same operations use 1.1%, leaving 98.9% free for actual conversation.
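The arithmetic behind these figures is easy to check. The per-row token counts in the table are consistent with a rule of thumb of about 4 bytes per token (for example 56KB → 14,000 tok). That ratio is inferred from the table rather than stated by the project, and KB is taken as 1000 bytes here.

```typescript
const BYTES_PER_TOKEN = 4;      // heuristic inferred from the table rows
const CONTEXT_WINDOW = 200000;  // tokens, as stated in the text

function estimateTokens(bytes: number): number {
  return Math.round(bytes / BYTES_PER_TOKEN);
}

// Share of the context window consumed, in percent.
function windowShare(tokens: number): number {
  return (tokens / CONTEXT_WINDOW) * 100;
}

const sessionBefore = 267121; // session total without compression
const sessionAfter = 2223;    // session total with compression

console.log(estimateTokens(56000));                  // 14000
console.log(windowShare(sessionBefore).toFixed(1));  // "133.6", quoted as 133% in the text
console.log(windowShare(sessionAfter).toFixed(1));   // "1.1"
console.log(((1 - sessionAfter / sessionBefore) * 100).toFixed(1)); // "99.2"
```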
Data isn't deleted — it's indexed in FTS5 and searchable on demand. Small outputs (<5KB) pass through uncompressed.
Read the full Token Reduction Report — includes cost analysis, architecture deep-dive, and FAQ on context loss trade-offs.
| | context-mode | context-compress |
|---|---|---|
| Credentials | 20+ auth env vars passed by default | Opt-in only (passthroughEnvVars: []) |
| Hook writes | Self-modifies settings.json | Zero filesystem writes |
| Rust compile | Shell string → injection risk | execFileSync with array args |
| Upgrade | git clone arbitrary code | Removed entirely |
| FTS5 indexing | Always dual-table (Porter + trigram) | Lazy trigram (50% fewer writes) |
| Runtime detect | Sequential execSync ~250ms | Parallel Promise.all ~40ms |
| batch_execute | Sequential commands | Promise.allSettled parallel |
| Config | None | ENV + file + defaults |
| Errors | 23 silent catch blocks | CONTEXT_COMPRESS_DEBUG=1 logs all |
| Uninstall | None | context-compress uninstall |
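The batch_execute row contrasts sequential execution with Promise.allSettled. The sketch below shows why that primitive suits batches: all tasks start together, and one failure does not discard the other results. The `Task` shape and the `batchExecute` function are hypothetical, and the tasks are simulated with plain promises instead of real subprocesses.

```typescript
type Task = { label: string; run: () => Promise<string> };
type BatchResult = { label: string; ok: boolean; output: string };

// Start every task at once and collect each outcome, success or failure.
async function batchExecute(tasks: Task[]): Promise<BatchResult[]> {
  const settled = await Promise.allSettled(tasks.map((t) => t.run()));
  return settled.map((result, i) => ({
    label: tasks[i].label,
    ok: result.status === "fulfilled",
    output: result.status === "fulfilled" ? result.value : String(result.reason),
  }));
}

// Simulated commands: two succeed, one fails.
const demoTasks: Task[] = [
  { label: "git status", run: async () => "clean" },
  { label: "npm test", run: async () => { throw new Error("2 tests failed"); } },
  { label: "ls src/", run: async () => "index.ts" },
];
```

With Promise.all, the rejected npm test task would reject the whole batch; with Promise.allSettled, the caller still receives the output of the other two commands alongside the error.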
Loaded in order: ENV vars → .context-compress.json → defaults
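One plausible reading of this order is that environment variables override the config file, which in turn overrides built-in defaults. The sketch below implements that reading, which is an assumption and not confirmed behavior. The field names and variable names come from this README; `fromEnv`, `resolveConfig`, and the default values for the blocking flags are inferred for illustration.

```typescript
type Config = {
  passthroughEnvVars: string[];
  blockCurl: boolean;
  blockWebFetch: boolean;
  debug: boolean;
};
type Env = Record<string, string | undefined>;

// Defaults inferred from the README: empty passthrough list, blocking on, debug off.
const DEFAULTS: Config = {
  passthroughEnvVars: [],
  blockCurl: true,
  blockWebFetch: true,
  debug: false,
};

// Translate documented environment variables into config fields, skipping unset ones.
function fromEnv(env: Env): Partial<Config> {
  const out: Partial<Config> = {};
  const list = env["CONTEXT_COMPRESS_PASSTHROUGH_ENV"];
  if (list !== undefined) {
    out.passthroughEnvVars = list.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
  }
  const curl = env["CONTEXT_COMPRESS_BLOCK_CURL"];
  if (curl !== undefined) out.blockCurl = curl !== "0";
  const web = env["CONTEXT_COMPRESS_BLOCK_WEBFETCH"];
  if (web !== undefined) out.blockWebFetch = web !== "0";
  const debug = env["CONTEXT_COMPRESS_DEBUG"];
  if (debug !== undefined) out.debug = debug === "1";
  return out;
}

// Later spreads win, so the order encodes the assumed precedence: defaults < file < env.
function resolveConfig(env: Env, file: Partial<Config>): Config {
  return { ...DEFAULTS, ...file, ...fromEnv(env) };
}
```

For example, a file that sets "debug": true combined with CONTEXT_COMPRESS_BLOCK_CURL=0 in the environment yields debug on, curl blocking off, and every other field at its default.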
# Enable debug logging (stderr)
CONTEXT_COMPRESS_DEBUG=1
# Pass specific env vars to subprocesses (default: none)
CONTEXT_COMPRESS_PASSTHROUGH_ENV=GH_TOKEN,AWS_PROFILE
# Disable curl/wget blocking
CONTEXT_COMPRESS_BLOCK_CURL=0
# Disable WebFetch blocking
CONTEXT_COMPRESS_BLOCK_WEBFETCH=0
# Disable Read/Grep nudges
CONTEXT_COMPRESS_NUDGE_READ=0
CONTEXT_COMPRESS_NUDGE_GREP=0
# Compression mode: conservative | balanced (default) | aggressive
CONTEXT_COMPRESS_MODE=balanced
# RTK-style transparent Bash wrapping (default: off)
CONTEXT_COMPRESS_FILTER_BASH=1
# Override path to the context-compress binary used by the hook
CONTEXT_COMPRESS_BIN=/usr/local/bin/context-compress

Create .context-compress.json in your project root or home directory:
{
  "passthroughEnvVars": ["GH_TOKEN", "AWS_PROFILE", "KUBECONFIG"],
  "blockCurl": true,
  "blockWebFetch": true,
  "debug": false
}

context-compress # Start MCP server (stdio)
context-compress setup # Detect runtimes, show install instructions
context-compress setup --auto # One-line: write ~/.claude/settings.json
context-compress init --auto # Alias for setup --auto
context-compress doctor # Diagnose: runtimes, hooks, FTS5, version
context-compress uninstall # Clean removal: hooks, MCP reg, stale DBs
# RTK-style transparent compression — use anywhere, agent doesn't need MCP
context-compress wrap "npm test" # default = balanced
context-compress wrap --mode aggressive "git log -50" # max compression
context-compress wrap --stream "tail -f /var/log/app.log" # line-by-line for long-running cmds
context-compress filter --cmd "git push" < captured.log # pipe filter

Set CONTEXT_COMPRESS_FILTER_BASH=1 and the PreToolUse hook will route output-heavy Bash calls through context-compress wrap automatically, so the agent doesn't need to call execute() to benefit. Combine with CONTEXT_COMPRESS_MODE=aggressive for maximum compression.
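As a rough illustration of that routing, the sketch below rewrites a Bash command into a wrap call when filtering is enabled. It is not the hook's real logic: the `HEAVY_PREFIXES` list is a hypothetical stand-in for whatever the hook treats as "output-heavy", and passing the mode as a --mode flag (rather than through the environment) is an assumption. The environment variable names and the wrap syntax are the ones documented above.

```typescript
type Env = Record<string, string | undefined>;

const MODES = ["conservative", "balanced", "aggressive"];

// Hypothetical prefixes for output-heavy commands; the real matcher is not documented here.
const HEAVY_PREFIXES = ["git log", "git diff", "npm test", "ls -la", "find "];

// Quote a string for POSIX shells by closing, escaping, and reopening single quotes.
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

function rewriteBash(command: string, env: Env): string {
  // Filtering is opt-in: leave the command untouched unless the flag is set to 1.
  if (env["CONTEXT_COMPRESS_FILTER_BASH"] !== "1") return command;
  if (!HEAVY_PREFIXES.some((p) => command.startsWith(p))) return command;
  const bin = env["CONTEXT_COMPRESS_BIN"] ?? "context-compress";
  const mode = env["CONTEXT_COMPRESS_MODE"];
  // Only forward a mode that is one of the three documented values.
  const modeFlag = mode !== undefined && MODES.includes(mode) ? ` --mode ${mode}` : "";
  return `${bin} wrap${modeFlag} ${shellQuote(command)}`;
}
```

Quoting the original command as a single argument matters here, because wrap receives the command as one string and any unescaped quote inside it would otherwise change what the shell executes.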
context-compress doctor
[PASS] Performance: FAST — Bun detected
[PASS] Language coverage: 7/11 (64%)
[PASS] Server test: OK
[PASS] PreToolUse hook configured
[PASS] FTS5 / better-sqlite3 works
Version: v2026.3.3
All checks passed.
context-compress/
├── src/
│   ├── index.ts              # Entry point
│   ├── server.ts             # MCP server (7 tools)
│   ├── executor.ts           # SubprocessExecutor
│   ├── store.ts              # ContentStore (FTS5)
│   ├── config.ts             # Config system
│   ├── logger.ts             # Debug logger
│   ├── snippet.ts            # FTS5 snippet extraction
│   ├── stats.ts              # Session tracker
│   ├── types.ts              # Shared types
│   ├── runtime/
│   │   ├── plugin.ts         # LanguagePlugin interface
│   │   ├── index.ts          # Registry + parallel detection
│   │   └── languages/        # 11 language plugins
│   ├── hooks/
│   │   └── pretooluse.ts     # PreToolUse hook (no self-mod)
│   └── cli/
│       ├── index.ts          # CLI entry
│       ├── setup.ts          # Interactive setup
│       ├── doctor.ts         # Diagnostics
│       └── uninstall.ts      # Clean removal
├── tests/
│   ├── unit/                 # 7 unit test files
│   └── integration/          # 3 integration test files
├── hooks/
│   └── hooks.json            # Hook matcher config
├── skills/                   # Slash command definitions
└── dist/                     # Compiled output
| Threat | Mitigation |
|---|---|
| Credential leakage | passthroughEnvVars defaults to [] — zero env vars passed unless opted in |
| Shell injection (Rust) | execFileSync with array arguments — no string interpolation |
| Hook self-modification | No fs.writeFileSync in hooks — zero filesystem side effects |
| Arbitrary code execution | No upgrade command — no git clone or npm install at runtime |
| Silent failures | Debug mode surfaces all catch block errors to stderr |
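The first two mitigations in this table combine naturally: build the child environment from an allowlist, and pass arguments as an array so no shell ever parses them. The sketch below uses Node's `execFileSync` from `node:child_process`. `buildEnv` and `runIsolated` are hypothetical helper names, and the project's own executor may differ, for instance in which baseline variables it keeps.

```typescript
import { execFileSync } from "node:child_process";

// Copy only explicitly allowed variables from the parent environment.
function buildEnv(
  parent: Record<string, string | undefined>,
  allow: string[],
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const name of allow) {
    const value = parent[name];
    if (value !== undefined) env[name] = value;
  }
  return env;
}

// Run a binary with an argument array. No shell is involved, so characters
// such as ";" or "$(...)" inside an argument stay literal text.
function runIsolated(file: string, args: string[], env: Record<string, string>): string {
  return execFileSync(file, args, { env, encoding: "utf8" });
}
```

A call such as `runIsolated("rustc", [sourcePath, "-o", outputPath], buildEnv(process.env, []))` would hand the compiler a source path containing spaces or semicolons as a single argument, whereas interpolating the same path into a shell string would let the shell interpret it.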
git clone https://github.com/Open330/context-compress
cd context-compress
npm install
npm run typecheck # Type checking
npm run lint # Biome linting
npm run test:unit # Unit tests
npm run test # All tests (unit + integration)
npm run build # Compile + bundle

MIT. Based on context-mode by Mert Koseoğlu.