Skip to content

KakkoiDev/bench

Repository files navigation

bench

Simple command timing + server monitoring + AI-friendly persistent logs.

Why bench?

The gap: Existing tools (time, hyperfine, k6, ab, wrk) don't track server CPU/memory during execution, and most output to stdout only.

bench adds:

  • Multi-process CPU/memory monitoring via --pid or --port (repeatable)
  • Persistent, organized JSON logs you can compare across runs
  • AI-friendly output for LLM analysis

bench doesn't replace specialized tools - wrap them to add server monitoring:

bench --port 8080 "hyperfine 'curl localhost:8080'"
bench --port 8080 "k6 run load-test.js"

Installation

curl -fsSL https://raw.githubusercontent.com/KakkoiDev/bench/main/install.sh | sh

Or clone and run locally:

git clone https://github.com/KakkoiDev/bench.git
cd bench
./install.sh
With Claude Code integration

Installs a Claude Code skill and agent for running benchmarks and analyzing results.

Remote:

curl -fsSL https://raw.githubusercontent.com/KakkoiDev/bench/main/install.sh | sh -s -- --with-claude

Local:

./install.sh --with-claude
Manual installation
chmod +x bench
sudo ln -s "$(pwd)/bench" /usr/local/bin/bench
Install options
./install.sh [OPTIONS]

Options:
  --dir PATH        Install directory (default: ~/.local/bin or /usr/local/bin)
  --with-claude     Also install Claude Code skill and agent
  --skip-deps       Skip dependency checks
  --uninstall       Remove bench and optional Claude Code files
  --help            Show this help

Usage

Basic timing:

bench "echo hello world"

Server monitoring:

# Start a test server
python3 -m http.server 8080 &

# Benchmark with CPU/memory tracking
bench --runs 20 --port 8080 "curl -s localhost:8080"

Track optimization iterations:

# Baseline
bench --name "api" --message "baseline" --runs 100 --port 8080 "curl -s localhost:8080/export"

# After adding cache
bench --name "api" --message "with cache" --runs 100 --port 8080 "curl -s localhost:8080/export"

# Compare
jq -r '"\(.message): \(.timing.mean)ms"' bench-results/api/*/benchmark.json

Scripting with --quiet:

# --quiet suppresses progress, outputs only the results path
RESULTS=$(bench --quiet --runs 10 "curl -s localhost:8080")
jq .timing "$RESULTS/benchmark.json"

# One-liner
jq .timing "$(bench --quiet --runs 5 "echo test")/benchmark.json"

Options

bench [OPTIONS] COMMAND

Options:
  --runs N            Number of runs (default: 10)
  --name NAME         Named group for organizing results
  --message TEXT      Describe what changed (e.g., "baseline", "with cache")
  --quiet             Suppress progress output, only print results path
  --pid [NAME:]PID    Monitor process CPU/memory by PID (repeatable)
  --port [NAME:]PORT  Monitor process by port (repeatable)
  --metrics-interval MS  Metrics sampling interval (default: 500, min: 100)
  --help              Show help
  --version           Show version

Note: Process names cannot contain colons (: is the delimiter).

Output

Results saved to ./bench-results/<name>/<timestamp>/:

benchmark.json        # all metrics
runs/
  1.log               # stdout/stderr per run
  1.app.metrics       # CPU/memory samples (format: "timestamp cpu:% mem:MB")

benchmark.json:

{
  "schema_version": "2.0",
  "name": "api",
  "message": "baseline",
  "command": "curl -s localhost:8080",
  "timing": { "mean": 23.4, "median": 21.0, "min": 12.5, "max": 45.2, "p95": 38.1, "p99": 44.0 },
  "processes": [
    { "name": "app", "pid": 12345, "cpu": { "mean": 15.2 }, "memory": { "mean": 46.2, "delta": 2.8 } }
  ],
  "runs": [{ "run_number": 1, "duration_ms": 23.4, "exit_code": 0 }],
  "environment": { "os": "Linux", "shell": "/bin/bash" }
}

With other tools

Compare runs with jq:

jq -r '"\(.message): \(.timing.mean)ms"' bench-results/api/*/benchmark.json

Analyze with Claude Code:

claude --print "$(cat bench-results/api/*/benchmark.json) compare these runs, identify bottlenecks"

Stress test with xargs:

bench --name "stress" --port 8080 \
  "seq 100 | xargs -P 100 -I {} curl -s localhost:8080"

Monitor Docker Compose services:

# Get container PIDs
APP_PID=$(docker inspect --format '{{.State.Pid}}' myapp_app_1)
REDIS_PID=$(docker inspect --format '{{.State.Pid}}' myapp_redis_1)

# Benchmark with multi-process monitoring
bench --runs 20 \
  --pid "app:$APP_PID" \
  --pid "redis:$REDIS_PID" \
  "curl -s localhost:5000/api/data"

Contributing

# Development
./bench --runs 5 "echo test"

# Run tests (requires BATS)
bats tests/

See CONTRIBUTING.md for guidelines.

Philosophy

  • Composable. Works with any command, wraps existing tools.
  • Portable. POSIX shell, runs anywhere.
  • Persistent. Organized logs you can revisit and compare.
  • AI-friendly. Structured JSON for LLM analysis.

Resources

License

MIT License

About

Simple command timing + server monitoring + AI-friendly persistent logs.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages