
Graceful shutdown behavior is undocumented and untested under Kubernetes lifecycle #70

@avelino

Description


Problem

The proxy handles SIGTERM and SIGINT with graceful shutdown logic (via src/serve.rs):

  1. Stop accepting new connections
  2. Finish in-flight requests
  3. Drain all connected backends in parallel
  4. Each backend gets 5 seconds for graceful shutdown() before being force-killed via kill_on_drop
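The drain step can be sketched as follows (a minimal std-only sketch with hypothetical names; the real implementation in src/serve.rs is async and holds child handles spawned with kill_on_drop(true)):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical stand-in for a connected backend; the real proxy holds an
// async child-process handle spawned with kill_on_drop(true).
struct Backend;

impl Backend {
    // Hypothetical graceful shutdown: send the shutdown request and wait
    // for the backend to exit cleanly.
    fn shutdown(self) {}
}

// Drain all backends in parallel. Each drain races a shared deadline; a
// backend that misses it is simply dropped, and kill_on_drop force-kills
// the underlying child process. Returns the number of clean shutdowns.
fn drain_backends(backends: Vec<Backend>, per_backend: Duration) -> usize {
    let deadline = Instant::now() + per_backend;
    let receivers: Vec<_> = backends
        .into_iter()
        .map(|b| {
            let (tx, rx) = mpsc::channel();
            thread::spawn(move || {
                b.shutdown();
                let _ = tx.send(());
            });
            rx
        })
        .collect();
    // Because all drains start together and race one shared deadline,
    // total wall time is bounded by per_backend, not per_backend * N.
    receivers
        .iter()
        .filter(|rx| {
            let remaining = deadline.saturating_duration_since(Instant::now());
            rx.recv_timeout(remaining).is_ok()
        })
        .count()
}
```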

This works well on bare metal and Docker, but Kubernetes has specific shutdown semantics that introduce edge cases:

1. terminationGracePeriodSeconds vs backend count

Kubernetes sends SIGTERM, then waits terminationGracePeriodSeconds (default: 30s) before sending SIGKILL. The proxy shuts down backends in parallel with a 5s timeout each, but:

  • If multiple backends stall on shutdown, the parallel join still waits up to 5s total (not 5s × N) — this is fine
  • However, the proxy also has a 10s overall shutdown timeout (src/serve.rs:1333): if the full drain exceeds 10s, it logs "shutdown timed out — forcing exit" and calls process::exit(1)
  • The process::exit(1) bypasses any remaining Kubernetes lifecycle hooks and may not flush all logs

The interaction between the proxy's internal 10s timeout, Kubernetes' default 30s grace period, and any preStop hooks is undocumented and untested.
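The internal watchdog described above follows a pattern like this (a std-only sketch with hypothetical names, not the actual src/serve.rs code):

```rust
use std::process;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Run the drain on a helper thread and force-exit if it overruns the
// overall budget (src/serve.rs uses 10s). Returns true if the drain
// finished in time; the timeout branch never returns.
fn shutdown_with_watchdog(drain: impl FnOnce() + Send + 'static, budget: Duration) -> bool {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        drain();
        let _ = tx.send(());
    });
    if rx.recv_timeout(budget).is_ok() {
        true
    } else {
        eprintln!("shutdown timed out — forcing exit");
        // process::exit skips destructors and any remaining lifecycle
        // hooks, and buffered logs may never be flushed.
        process::exit(1);
    }
}
```

Note that because the watchdog's budget (10s) is well under Kubernetes' default grace period (30s), the proxy will always self-terminate via process::exit(1) before SIGKILL arrives, making the exit-code-1 path the common one even when Kubernetes would have allowed more time.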

2. preStop hook and connection draining

Kubernetes removes the pod from the Service endpoints asynchronously with respect to SIGTERM delivery. This means:

  • New requests can arrive at the pod after SIGTERM is sent
  • The proxy stops accepting connections immediately on SIGTERM, so these late requests get a connection refused error
  • The standard mitigation is a preStop hook with a small sleep to allow endpoint propagation

There's no guidance or default preStop configuration for this.
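A manifest fragment along these lines would be the usual mitigation (hypothetical, not shipped by the project; container name and image are placeholders):

```yaml
# Hypothetical pod spec fragment. preStop runs before SIGTERM is sent,
# so the sleep gives endpoint controllers time to stop routing new
# traffic here before the proxy begins refusing connections.
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: mcp-proxy
      image: example/mcp-proxy:latest
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "5"]
```

The sleep duration plus the proxy's internal 10s shutdown budget must stay under terminationGracePeriodSeconds, or SIGKILL will arrive mid-drain.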

3. Stdio backends and PID namespace

When running in Kubernetes, the proxy spawns stdio backends as child processes. If the container uses a shared PID namespace (shareProcessNamespace: true), or if the container runtime sends signals to all processes in the cgroup, stdio backends may receive SIGTERM before the proxy has a chance to shut them down gracefully.

The proxy's kill_on_drop guarantee assumes it controls the lifecycle of its children. External signal delivery to children can cause:

  • Backends exiting before the proxy sends shutdown()
  • Broken pipe errors on the proxy's stdin/stdout transport
  • Race conditions in the reaper task
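One possible mitigation (a hypothetical sketch, not what the proxy currently does) is to spawn stdio backends in their own process group, so a group-directed SIGTERM aimed at the proxy does not also reach the children. This does not help if the runtime kills the whole cgroup (e.g. cgroup.kill), but it does shield against process-group signals:

```rust
use std::io;
use std::process::{Child, Command, Stdio};

// Hypothetical mitigation (Unix only): place the stdio backend in its own
// process group so signals sent to the proxy's group do not reach the
// child directly. The proxy then remains the sole sender of shutdown
// signals, preserving the kill_on_drop lifecycle assumption.
#[cfg(unix)]
fn spawn_stdio_backend(program: &str) -> io::Result<Child> {
    use std::os::unix::process::CommandExt;
    Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        // 0 = new process group whose pgid equals the child's pid.
        .process_group(0)
        .spawn()
}
```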

4. No readiness gate for startup

The proxy binds and starts serving immediately, but backend discovery is async and lazy. On startup in Kubernetes:

  • The pod reports Ready (assuming a simple /health check) before any backends are connected
  • The first request triggers backend connection, adding latency
  • If backend connection fails, the first N requests get errors while the proxy reports healthy

There's no startup probe configuration or readiness gate that accounts for the lazy initialization model.
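Until such a gate exists, a deployment could at least separate liveness from readiness along these lines (hypothetical fragment: /health is assumed from the issue text, /ready is a not-yet-implemented endpoint that would report "at least one backend connected", and port 8080 is a placeholder):

```yaml
# Hypothetical probe configuration for the lazy-initialization model.
containers:
  - name: mcp-proxy
    livenessProbe:
      httpGet: { path: /health, port: 8080 }
      periodSeconds: 10
    readinessProbe:
      httpGet: { path: /ready, port: 8080 }   # hypothetical endpoint
      periodSeconds: 5
    startupProbe:
      httpGet: { path: /ready, port: 8080 }   # hypothetical endpoint
      failureThreshold: 30
      periodSeconds: 2
```

With a startupProbe in place, the kubelet withholds traffic until backends are actually reachable, instead of letting the first N requests fail while the pod reports healthy.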


Expected behavior

The proxy's shutdown behavior should be documented and tested under Kubernetes semantics: SIGTERM + grace period + preStop hooks + async endpoint removal. Any edge cases (late requests, PID namespace, startup readiness) should be explicitly addressed.

Metadata


Assignees

No one assigned

Labels

  • documentation: Docs improvements or additions
  • enhancement: New feature or improvement
  • infrastructure: Docker, Kubernetes, deployment
  • kubernetes: Kubernetes manifests, Helm, and cluster deployment
  • proxy: Serve/proxy mode (mcp serve)
