High‑performance in‑memory HTTP cache & reverse proxy for latency‑sensitive workloads. Built in Go on top of fasthttp, with sharded storage, TinyLFU admission, background refresh, upstream controls, and minimal‑overhead observability (Prometheus + OpenTelemetry).
- Throughput: 160–170k RPS locally; ~250k RPS sustained on 24‑core bare‑metal with a 50GB cache.
- Memory overhead: 1.5–3GB at 50GB cache size (no traces); ~7GB at 100% OTEL sampling.
- Hot path discipline: zero allocations, sharded counters, per‑shard LRU, TinyLFU admission.
- Control plane: runtime API for toggles (admission, eviction, refresh, compression, observability).
- Observability: Prometheus/VictoriaMetrics metrics + OpenTelemetry tracing.
- Kubernetes‑friendly: health probes, config via ConfigMap, Docker image.
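The sharded, per‑shard‑LRU design in the feature list is the key to avoiding a single global lock on the hot path. advCache's actual storage code isn't shown here, so the following is a minimal illustrative sketch of the idea only; the `Cache` and `shard` names are hypothetical:

```go
package main

import (
	"container/list"
	"fmt"
	"hash/fnv"
	"sync"
)

const numShards = 256 // mirrors the starter config's shards: 256

// shard owns its lock and its LRU list, so goroutines hitting
// different shards never contend on a single global mutex.
type shard struct {
	mu    sync.Mutex
	items map[string]*list.Element
	lru   *list.List // front = most recently used
	cap   int
}

type entry struct{ key, val string }

type Cache struct {
	shards [numShards]*shard
}

func NewCache(perShardCap int) *Cache {
	c := &Cache{}
	for i := range c.shards {
		c.shards[i] = &shard{
			items: make(map[string]*list.Element),
			lru:   list.New(),
			cap:   perShardCap,
		}
	}
	return c
}

// shardFor hashes the key once and uses the hash to pick a shard.
func (c *Cache) shardFor(key string) *shard {
	h := fnv.New64a()
	h.Write([]byte(key))
	return c.shards[h.Sum64()%numShards]
}

func (c *Cache) Set(key, val string) {
	s := c.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	if el, ok := s.items[key]; ok {
		el.Value.(*entry).val = val
		s.lru.MoveToFront(el)
		return
	}
	s.items[key] = s.lru.PushFront(&entry{key, val})
	if s.lru.Len() > s.cap { // evict LRU within this shard only
		oldest := s.lru.Back()
		s.lru.Remove(oldest)
		delete(s.items, oldest.Value.(*entry).key)
	}
}

func (c *Cache) Get(key string) (string, bool) {
	s := c.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	if el, ok := s.items[key]; ok {
		s.lru.MoveToFront(el)
		return el.Value.(*entry).val, true
	}
	return "", false
}

func main() {
	c := NewCache(2)
	c.Set("/api/v2/pagedata?project=1", "body-a")
	v, ok := c.Get("/api/v2/pagedata?project=1")
	fmt.Println(v, ok) // body-a true
}
```

Eviction here is per‑shard LRU for brevity; the real evictor is a separate background component with soft/hard limits (see `cache.eviction` in the config below).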
Edit the CHANGEME fields and run. This is a complete config based on `advcache.cfg.yaml`, trimmed for a fast start but fully runnable.
```yaml
cache:
  env: prod
  enabled: true
  logs:
    level: debug
  runtime:
    gomaxprocs: 0
  api:
    name: advCache.local
    port: 8020 # <-- CHANGEME: API port to listen on
  upstream:
    backend:
      id: example-ams-web
      enabled: true
      policy: deny
      host: service-example:8080
      scheme: http
      rate: 15000
      concurrency: 4096
      timeout: 10s
      max_timeout: 1m
      use_max_timeout_header: ''
      healthcheck: /healthcheck
      addr: http://127.0.0.1:8081 # <-- CHANGEME: your upstream origin URL
      health_path: /health
  compression:
    enabled: true
    level: 1
  data:
    dump:
      enabled: true
      dump_dir: public/dump
      dump_name: cache.dump
      crc32_control_sum: true
      max_versions: 3
      gzip: false
    mock:
      enabled: false
      length: 1000000
  storage:
    mode: listing
    size: 53687091200 # 50 GiB
  admission:
    enabled: true
    capacity: 2000000
    sample_multiplier: 4
    shards: 256
    min_table_len_per_shard: 65536
    door_bits_per_counter: 12
  eviction:
    enabled: true
    replicas: 32
    soft_limit: 0.8
    hard_limit: 0.99
    check_interval: 100ms
  lifetime:
    enabled: true
    ttl: 2h
    on_ttl: refresh
    beta: 0.35
    rate: 1000
    replicas: 32
    coefficient: 0.25
  observability:
    enabled: true
    service_name: advCache.local
    service_version: dev
    service_tenant: star
    exporter: http
    endpoint: 127.0.0.1:4318 # <-- CHANGEME: your OTEL Collector (http/4318 or grpc/4317)
    insecure: true
    sampling_mode: ratio
    sampling_rate: 0.1
    export_batch_size: 512
    export_batch_timeout: 3s
    export_max_queue: 1024
  forceGC:
    enabled: true
    interval: 6m
  metrics:
    enabled: true
  k8s:
    probe:
      timeout: 5s
  rules:
    /api/v2/pagedata:
      cache_key:
        query:
          - project
          - language
          - timezone
        headers:
          - Accept-Encoding
      cache_value:
        headers:
          - Vary
          - Server
          - Content-Type
          - Content-Length
          - Content-Encoding
          - Cache-Control
```
What to change first:
- `cache.api.port` — the port advCache listens on.
- `cache.upstream.backend.addr` — point to your origin.
- `cache.compression.enabled` — enable if your latency budget allows (a runtime toggle is also available).
- `cache.observability.*` — set `enabled: true` and the `endpoint` of your OTEL Collector; adjust sampling.
- `cache.admission.enabled` — `true` to protect the hot set; TinyLFU/Doorkeeper details are in the main config comments.
- `cache.upstream.policy` — both `deny` and `await` are production‑ready; choose the behavior:
  - `deny` → fail fast under pressure (good for synthetic load / when back‑pressure is handled elsewhere).
  - `await` → apply back‑pressure (the preferred default in many prod setups).

Full field descriptions and advanced knobs are documented inline in the canonical `advcache.cfg.yaml`.
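The admission knobs (`door_bits_per_counter`, shards, counters) point at the classic TinyLFU design with a doorkeeper. advCache's real implementation sits behind `cache.admission.*`; the following is a deliberately simplified sketch of the idea, with all names (`admitter`, `record`, `admit`) hypothetical:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// TinyLFU-style admission sketch. The doorkeeper is a small bit set
// that absorbs one-hit wonders: a key's main frequency counter is only
// bumped on its *second* sighting, so rare keys never build up enough
// frequency to displace established hot entries.

type admitter struct {
	door []uint64         // doorkeeper bit set
	freq map[uint64]uint8 // saturating frequency counters
	mask uint64
}

func newAdmitter(bits uint64) *admitter {
	return &admitter{
		door: make([]uint64, bits/64),
		freq: map[uint64]uint8{},
		mask: bits - 1, // bits must be a power of two
	}
}

func hash(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

// record notes one access to the key with the given hash.
func (a *admitter) record(h uint64) {
	idx := h & a.mask
	word, bit := idx/64, idx%64
	if a.door[word]&(1<<bit) == 0 { // first sighting: only mark doorkeeper
		a.door[word] |= 1 << bit
		return
	}
	if a.freq[h] < 15 { // saturate like a 4-bit counter
		a.freq[h]++
	}
}

// admit decides whether a candidate may displace the eviction victim:
// only if its estimated frequency is strictly higher.
func (a *admitter) admit(candidate, victim string) bool {
	return a.freq[hash(candidate)] > a.freq[hash(victim)]
}

func main() {
	a := newAdmitter(1 << 16)
	for i := 0; i < 5; i++ {
		a.record(hash("hot-key"))
	}
	a.record(hash("one-hit-wonder"))
	fmt.Println(a.admit("one-hit-wonder", "hot-key")) // false
}
```

A production filter would also periodically halve all counters (the "reset" that keeps TinyLFU's window fresh); that step is omitted here for brevity.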
- Main: `GET /{any:*}`
- Health: `GET /k8s/probe`
- Metrics: `GET /metrics` (Prometheus/VictoriaMetrics)
- Bypass: `/cache/bypass`, `/on`, `/off`
- Compression: `/cache/http/compression`, `/on`, `/off`
- Config dump: `/cache/config`
- Entry by key: `/cache/entry?key=<uint64>`
- Clear (two‑step): `/cache/clear` → then `/cache/clear?token=<...>`
- Invalidate: `/cache/invalidate` (supports `X-Entries-Remove` and `_path`)
- Upstream policy: `/cache/upstream/policy`, `/await`, `/deny`
- Evictor: `/cache/eviction`, `/on`, `/off`, `/scale?to=<n>`
- Lifetime manager: `/cache/lifetime-manager`, `/on`, `/off`, `/rate?to=<n>`, `/scale?to=<n>`, `/policy`, `/policy/remove`, `/policy/refresh`
- Force GC: `/cache/force-gc`, `/on`, `/off`, `/call`
- Admission: `/cache/admission`, `/on`, `/off`
- Tracing: `/cache/observability`, `/on`, `/off`
- Spans: `ingress` (server), `upstream` (client on miss/proxy), `refresh` (background).
- When disabled: fast no‑op provider (atomic toggle only).
- When enabled: stdout exporter → sync; OTLP (`grpc`/`http`) → batch exporter.
Enable quickly: set it in YAML and/or toggle at runtime:

```
GET /cache/observability/on    # enable tracing now
GET /cache/observability/off   # disable tracing
```
```bash
go build -o advCache ./cmd/main.go
./advCache -cfg ./advcache.starter.yaml

# Docker
docker build -t advcache .
docker run --rm -p 8020:8020 -v "$PWD/public/dump:/app/public/dump" advcache -cfg /app/advcache.starter.yaml
```
- Local (4–6 CPU, 1–16KB docs, 20–25GB store): 160–170k RPS steady.
- Bare‑metal (24 CPU, 50GB store, prod traffic): ~250k RPS sustained.
- Memory overhead at 50GB: 1.5–3GB (no traces) • ~7GB (100% sampling).
```bash
go test -race ./tests -count=1 -run .
```
Apache‑2.0 — see LICENSE.
Maintainer: Borislav Glazunov — [email protected] · Telegram @gl_c137