fix(certgen): trigger rolling restart of Rate Limit on cert rotation #8535
Open
OliverBailey wants to merge 2 commits into envoyproxy:main
… disruption

Previously, when certgen --overwrite rotated certificates, the ca.crt field of each control-plane Secret was replaced atomically with the new CA. Kubernetes propagates Secret updates to pods via the kubelet volume sync loop, and Envoy reloads its xDS TLS context via SDS; neither is instantaneous. During the convergence window, pods that have picked up a new leaf cert (signed by the new CA) are rejected by peers that still hold only the old CA in their trust store, causing mTLS authentication failures. This is the backwards-incompatible rotation problem described in envoyproxy#4891 and reproduced on v1.6.1 by users in that thread.

Fix: when updating an existing Secret that already contains a ca.crt, bundle the outgoing CA together with the incoming CA so that every component trusts both during the transition. Concretely, CreateOrUpdateSecrets now calls bundleCACerts(newCA, oldCA), which:

1. Starts the bundle with all certs from newCA (the freshly generated CA).
2. Appends the first non-expired, non-duplicate cert from oldCA (the CA that was active at the previous rotation).
3. Skips any further certs from oldCA.

The cap of one carry-over cert keeps the bundle at a maximum of two entries regardless of how frequently rotations occur. The reasoning is: by the time an operator runs certgen --overwrite a second time, all components (kubelet sync period + SDS reload) will have converged on the certs written during the first rotation. The CA from two rotations ago is therefore never needed, and carrying it forward indefinitely would cause unbounded bundle growth for long-lived CAs (e.g. the default 5-year lifetime). The single carry-over is dropped automatically at the rotation after it would have been needed.

The HMAC secret (envoy-oidc-hmac) carries no ca.crt and is unaffected.

Fixes envoyproxy#4891 (partial; Rate Limit CA hot-reload addressed separately)

Signed-off-by: Oliver Bailey <github@obailey.co.uk>
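The three bundling steps above can be sketched in Go as follows. This is a minimal illustration, not the code in this PR: only the name bundleCACerts and its two arguments come from the commit message, while parseCerts, encodeCert, isIn, countCerts and selfSignedCA are hypothetical helpers added so the sketch can be exercised on its own.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"math/big"
	"time"
)

// parseCerts decodes every CERTIFICATE block in a PEM bundle,
// silently skipping blocks that are not parsable certificates.
func parseCerts(bundle []byte) []*x509.Certificate {
	var certs []*x509.Certificate
	for {
		var block *pem.Block
		block, bundle = pem.Decode(bundle)
		if block == nil {
			return certs
		}
		if block.Type != "CERTIFICATE" {
			continue
		}
		if cert, err := x509.ParseCertificate(block.Bytes); err == nil {
			certs = append(certs, cert)
		}
	}
}

func encodeCert(c *x509.Certificate) []byte {
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: c.Raw})
}

func isIn(certs []*x509.Certificate, c *x509.Certificate) bool {
	for _, other := range certs {
		if other.Equal(c) {
			return true
		}
	}
	return false
}

// bundleCACerts returns all certs from newCA followed by at most one
// non-expired, non-duplicate cert carried over from oldCA.
func bundleCACerts(newCA, oldCA []byte) []byte {
	fresh := parseCerts(newCA)
	var out []byte
	for _, c := range fresh {
		out = append(out, encodeCert(c)...)
	}
	now := time.Now()
	for _, old := range parseCerts(oldCA) {
		if now.After(old.NotAfter) || isIn(fresh, old) {
			continue
		}
		out = append(out, encodeCert(old)...)
		break // cap the carry-over at one cert
	}
	return out
}

// countCerts reports how many certificates a PEM bundle holds (illustration only).
func countCerts(bundle []byte) int { return len(parseCerts(bundle)) }

// selfSignedCA builds a throwaway CA cert for experiments (hypothetical helper).
// A negative validHours yields an already-expired certificate.
func selfSignedCA(cn string, validHours int) []byte {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(now.UnixNano()),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             now.Add(-48 * time.Hour),
		NotAfter:              now.Add(time.Duration(validHours) * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}
```

Calling bundleCACerts(selfSignedCA("new", 24), selfSignedCA("old", 24)) yields a two-cert bundle, while an expired or identical old CA leaves only the new cert in the bundle.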
Rate Limit loads its CA certificate once at startup and does not watch the mounted Secret volume for changes. After certgen --overwrite rotates certificates, the kubelet updates the /certs volume on disk, but the running Rate Limit process continues verifying client certs against the old CA in memory. Any Envoy pod that has already reloaded its new leaf cert via SDS is subsequently rejected by Rate Limit, causing mTLS failures that persist until Rate Limit is manually restarted.

This was the root cause of the incident described in envoyproxy#4891, where the failure was observed after a weekend rotation: the pods had been running long enough that the previous restart (which would have loaded the fresh CA) was well in the past.

Fix: after writing the rotated Secrets, patch the Rate Limit Deployment's pod-template annotation with the current timestamp. This is the same mechanism used by kubectl rollout restart. Kubernetes will then perform a rolling replacement of Rate Limit pods using the Deployment's existing RollingUpdate strategy, which by default (maxUnavailable: 25%) keeps at least 75% of replicas available at all times. A PodDisruptionBudget limits evictions only and does not constrain a Deployment rollout.

The restart is gated on --overwrite so it does not fire on the initial install (where Rate Limit has just started with the correct certs). If the Rate Limit Deployment does not exist (Rate Limit not enabled) the function is a no-op.

Note: this fix depends on the CA bundling change introduced in fix/ca-bundle-rotation. During the rolling restart, old and new Rate Limit pods run concurrently for a brief period. The CA bundle (new CA + previous CA) written by the prior fix ensures that Envoy can authenticate against both the old and new Rate Limit pods throughout the overlap window.

Fixes envoyproxy#4891

Signed-off-by: Oliver Bailey <github@obailey.co.uk>
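The pod-template annotation patch described above can be sketched as the body of a strategic merge patch. This is a minimal sketch, not the code in this PR: restartPatch is a hypothetical helper, and the annotation key kubectl.kubernetes.io/restartedAt is the one kubectl rollout restart writes; whether this change uses the same key is an assumption.

```go
package main

import "encoding/json"

// restartPatch builds a patch body that sets a timestamp annotation on the
// Deployment's pod template. Any change to the pod template makes the
// Deployment controller start a new rollout.
func restartPatch(timestamp string) (string, error) {
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"template": map[string]interface{}{
				"metadata": map[string]interface{}{
					"annotations": map[string]string{
						// Assumed key: the one written by kubectl rollout restart.
						"kubectl.kubernetes.io/restartedAt": timestamp,
					},
				},
			},
		},
	}
	body, err := json.Marshal(patch)
	return string(body), err
}
```

A caller would typically pass time.Now().Format(time.RFC3339) as the timestamp and send the result to the Deployment with its Kubernetes client of choice.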
Summary
Completes the fix for #4891.
Problem
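Rate Limit loads its CA certificate once at startup and does not watch the mounted Secret volume for changes. After certgen --overwrite rotates certificates:

1. The kubelet updates the /certs volume on disk for the Rate Limit pod.
2. The running Rate Limit process continues verifying client certs against the old CA held in memory.
3. Any Envoy pod that has already reloaded its new leaf cert via SDS is rejected by Rate Limit, and the mTLS failures persist until Rate Limit is manually restarted.

Unlike Envoy (which uses SDS path-based reload) or Envoy Gateway (which uses GetConfigForClient to re-read certs per connection), Rate Limit has no equivalent hot-reload path for its CA.

This is the root cause of the incident described in #4891: the failure was observed after a weekend rotation, when pods had been running long enough that no natural restart had occurred since the previous cert write.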
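For contrast, a per-connection reload of the kind attributed to Envoy Gateway above can be sketched with the standard library's tls.Config.GetConfigForClient hook. This is an illustration under assumptions, not Envoy Gateway's actual code: reloadingServerConfig and the caPath parameter are hypothetical names.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"os"
)

// reloadingServerConfig returns a server TLS config that re-reads the CA
// bundle from caPath on every incoming handshake, so a rotated CA on disk
// takes effect without restarting the process.
func reloadingServerConfig(caPath string) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS13,
		GetConfigForClient: func(*tls.ClientHelloInfo) (*tls.Config, error) {
			pemBytes, err := os.ReadFile(caPath)
			if err != nil {
				return nil, err
			}
			pool := x509.NewCertPool()
			if !pool.AppendCertsFromPEM(pemBytes) {
				return nil, errors.New("no CA certificates found in " + caPath)
			}
			// A real server would also set Certificates or GetCertificate
			// here so it can present its own leaf cert.
			return &tls.Config{
				MinVersion: tls.VersionTLS13,
				ClientAuth: tls.RequireAndVerifyClientCert,
				ClientCAs:  pool,
			}, nil
		},
	}
}
```

A process that instead builds its CA pool once at startup, as described for Rate Limit, keeps trusting only the CA it read at that moment.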
Fix
After writing rotated Secrets (--overwrite), patch the Rate Limit Deployment's pod-template annotation with the current timestamp. This is the same mechanism that kubectl rollout restart uses.

Kubernetes performs a rolling replacement of Rate Limit pods using the Deployment's existing RollingUpdate strategy. By default (maxUnavailable: 25%) this keeps at least 75% of replicas available at all times. A PodDisruptionBudget limits evictions only and does not constrain the rollout.

The restart is gated on --overwrite so it does not fire on the initial install, where Rate Limit has just started with the correct certs. If the Rate Limit Deployment does not exist (Rate Limit not enabled), the function is a no-op.

Why this depends on #8534
During the rolling restart, old and new Rate Limit pods run concurrently for a brief overlap period. The CA bundle written by #8534 ([newCA, previousCA]) ensures Envoy can authenticate against both old and new Rate Limit pods throughout that window. Without the bundle, the rolling restart itself would cause a brief mTLS failure as new Rate Limit pods come up with new leaf certs before Envoy has reloaded the new CA.

Note on single-replica deployments
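If Rate Limit runs a single replica and its rollout settings allow that replica to be unavailable (for example, a Recreate strategy or maxUnavailable: 1), there will be a brief gap between the old pod terminating and the new pod becoming ready. This is a pre-existing limitation of single-replica deployments and is not introduced by this change. Users with availability requirements should run at least 2 replicas, or keep maxUnavailable: 0 with a non-zero maxSurge so that the replacement pod is ready before the old one is removed. A PodDisruptionBudget with minAvailable: 1 protects against evictions but does not affect the rollout.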
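How percentage-based RollingUpdate settings translate into pod counts can be sketched as below. This is an illustration of the rounding rules in the Kubernetes documentation (maxUnavailable rounds down, maxSurge rounds up), not code from this PR; rolloutBounds is a hypothetical helper.

```go
package main

// rolloutBounds resolves percentage-based RollingUpdate settings into the
// minimum number of available pods and the maximum total pods during a rollout.
func rolloutBounds(replicas, maxUnavailablePct, maxSurgePct int) (minAvailable, maxTotal int) {
	maxUnavailable := replicas * maxUnavailablePct / 100 // rounds down
	maxSurge := (replicas*maxSurgePct + 99) / 100        // rounds up
	if maxUnavailable == 0 && maxSurge == 0 {
		// With no room to surge, the controller must be allowed to take
		// one pod down, otherwise the rollout could never progress.
		maxUnavailable = 1
	}
	return replicas - maxUnavailable, replicas + maxSurge
}
```

With the default 25%/25% settings, rolloutBounds(4, 25, 25) gives at least 3 available pods and at most 5 in total, and rolloutBounds(1, 25, 25) gives 1 and 2, because maxUnavailable rounds down to 0 for a single replica.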