-
Now we get into the concerns of possible vs. complex, which leads us to: how easy is this to support, and can we prove to the customer that the sharded option is truly optimizing something for them (do these two options give a noticeably different, provable experience for their customers)? We also have to recognize that K8s constantly rebalances pods, while this design and the original blog assumed long-lived machines. Am I following the proposal...
-
@CVanF5 does it address the manual cache purging issue (i.e. the PURGE request goes to the pod where the cache was stored)? Wouldn't the hash change if a pod goes down or a new one comes up?
-
"not a global shared cache, but behaves like one" There is still an unspoken need for pod volume persistence across pod restarts. StatefulSet for example. |
-
The following is a design proposal that describes a way for F5 Kubernetes Ingress to implement the Shared Caches with NGINX Plus Cache Clusters architecture described in the link.
This approach:
Fundamentally, the two-tier architecture relies on dual NGINX server directives that could be expressed in the following way:
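As a minimal sketch of what those dual server blocks could look like (the upstream names, port, cache path, and hash key below are illustrative assumptions, not the controller's actual generated config):

```nginx
# Tier 2 upstream: every Ingress replica, selected by a consistent hash
# so that a given cache key always maps to the same pod.
upstream cache_tier {
    hash $scheme$proxy_host$request_uri consistent;
    server ingress-0.cache-tier.default.svc:8181;
    server ingress-1.cache-tier.default.svc:8181;
}

# Tier 1 (load-balancing tier): no cache, just routes to the cache tier.
server {
    listen 80;
    location / {
        proxy_pass http://cache_tier;
    }
}

# Tier 2 (cache tier): holds the shard of the cache for keys hashed to it,
# and proxies cache misses to the origin service.
proxy_cache_path /var/cache/nginx keys_zone=shard_cache:10m;
server {
    listen 8181;
    location / {
        proxy_cache shard_cache;
        proxy_pass http://origin_service;
    }
}
```

Both server blocks run in every replica, which is what lets tier 2 "be itself or any other Ingress replica" from tier 1's point of view.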
With this design, a client connection can land on any Ingress replica on tier 1 (the load-balancing tier). NGINX then uses `consistent-hash` to select a server from tier 2 (the cache tier), which can be itself or any other Ingress replica. Tier 2 has the desired `proxy_cache` settings and can respond from cache should it be available. The upstream for tier 2 is the origin service.

There are two ways to visualise this architecture:

From a virtual perspective, tier 1 and tier 2 are separate hops.
From a Kubernetes user perspective, tier 1 and tier 2 are the same ReplicaSet.
It should be noted that this design is not a "global shared cache" but more of a distributed, partitioned (sharded) cache system, where each NGINX replica has its own local on-disk cache that is warmed up individually over time via the consistent-hash algorithm.
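To make the rebalancing behaviour concrete (and the earlier question about pods coming and going): with consistent hashing, only the keys that hashed to a removed pod move to another shard; the rest of the cache stays warm. The following is an illustrative Python model of a ketama-style ring, not the actual NGINX implementation; pod names and key patterns are made up.

```python
import hashlib
from bisect import bisect

def ring(nodes, vnodes=100):
    """Build a consistent-hash ring: each node owns `vnodes` points."""
    points = []
    for node in nodes:
        for i in range(vnodes):
            h = int(hashlib.md5(f"{node}-{i}".encode()).hexdigest(), 16)
            points.append((h, node))
    return sorted(points)

def lookup(points, key):
    """Map a cache key to the first ring point at or after its hash."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    i = bisect(points, (h,))
    return points[i % len(points)][1]

pods = ["pod-0", "pod-1", "pod-2"]
before = ring(pods)
after = ring(pods[:-1])  # pod-2 goes down

keys = [f"/asset/{n}" for n in range(1000)]
moved = sum(lookup(before, k) != lookup(after, k) for k in keys)
# Only the keys that mapped to the removed pod change shard
# (roughly a third here); every other key keeps its warm cache.
print(moved)
```

A plain modulo hash (`hash(key) % len(pods)`) would instead remap almost every key on any membership change, which is why the design leans on `consistent`.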
The advantages of this solution:
This mode could be activated with a `mode: sharded` directive. For the implementation, the controller needs to generate the dual server directives and a Kubernetes Service with `clusterIP: None` for the cache tier when the cache mode is set to `sharded`.
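As a sketch of those two pieces (the resource shape, field names, and port are hypothetical, since the proposal does not fix them):

```yaml
# Hypothetical user-facing cache setting on an Ingress/VirtualServer-style resource
cache:
  mode: sharded   # enables the two-tier consistent-hash design

---
# Headless Service generated by the controller for the cache tier.
# clusterIP: None exposes the individual pod addresses, giving tier 1
# a stable set of tier-2 endpoints to consistent-hash over.
apiVersion: v1
kind: Service
metadata:
  name: ingress-cache-tier
spec:
  clusterIP: None
  selector:
    app: nginx-ingress
  ports:
    - port: 8181
```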
Any feedback is welcome!