feat(flowcontrol): Implement registry shard #1187
Conversation
This commit introduces the core operational layer of the Flow Registry's sharded architecture. It deliberately focuses on implementing the `registryShard`, the concurrent, high-performance data plane for a single worker, rather than the top-level `FlowRegistry` administrative control plane, which will be built upon this foundation.

The `registryShard` provides the concrete implementation of the `contracts.RegistryShard` port, giving each `FlowController` worker a safe, partitioned view of the system's state. This design is fundamental to achieving scalability by minimizing cross-worker contention on the hot path.

The key components are:

- **`registry.Config`**: The master configuration blueprint for the entire `FlowRegistry`. It is validated once and then partitioned, with each shard receiving its own slice of the configuration, notably for capacity limits.
- **`registry.registryShard`**: The operational heart of this commit. It manages the lifecycle of queues and policies within a single shard, providing the read-oriented access needed by a `FlowController` worker. It ensures concurrency safety through a combination of mutexes for structural changes and lock-free atomics for statistics.
- **`registry.managedQueue`**: A stateful decorator that wraps a raw `framework.SafeQueue`. Its two primary responsibilities are to enable the sharded model by providing atomic, upwardly-reconciled statistics, and to enforce lifecycle state (active vs. draining), which is essential for the graceful draining of flows during future administrative updates.
- **Contracts and Errors**: New sentinel errors are added to the `contracts` package to create a clear, stable API boundary between the registry and its consumers.

This work establishes the robust, scalable, and concurrent foundation upon which the top-level `FlowRegistry` administrative interface will be built.
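To make the `managedQueue` role concrete, here is a minimal sketch of a decorator that keeps lock-free atomic statistics and gates writes on lifecycle state, as described above. This is an illustration under assumed names (`queue`, `lifecycleState`, `markDraining`), not the actual implementation in this PR:

```go
package registry

import (
	"errors"
	"sync/atomic"
)

type lifecycleState int32

const (
	stateActive   lifecycleState = iota
	stateDraining // existing items drain out; new adds are rejected
)

var errQueueDraining = errors.New("queue is draining; new items rejected")

// queue stands in for the wrapped framework.SafeQueue (illustrative).
type queue interface {
	Add(sizeBytes int64) error
	Remove() (sizeBytes int64, ok bool)
}

type managedQueue struct {
	inner    queue
	state    atomic.Int32 // lifecycleState, stored atomically
	length   atomic.Int64 // item count, reconciled upward into shard stats
	byteSize atomic.Int64 // total queued bytes, the signal JSQ-Bytes reads
}

// Add rejects new work unless the queue is active, then updates the
// lock-free statistics that the shard aggregates.
func (mq *managedQueue) Add(sizeBytes int64) error {
	if lifecycleState(mq.state.Load()) != stateActive {
		return errQueueDraining
	}
	if err := mq.inner.Add(sizeBytes); err != nil {
		return err
	}
	mq.length.Add(1)
	mq.byteSize.Add(sizeBytes)
	return nil
}

// Remove succeeds regardless of lifecycle state, so a draining queue
// can still empty out gracefully.
func (mq *managedQueue) Remove() (int64, bool) {
	sizeBytes, ok := mq.inner.Remove()
	if ok {
		mq.length.Add(-1)
		mq.byteSize.Add(-sizeBytes)
	}
	return sizeBytes, ok
}

// markDraining flips the queue out of active service.
func (mq *managedQueue) markDraining() {
	mq.state.Store(int32(stateDraining))
}
```

The point of the atomics is that a worker on the hot path never takes a lock just to update counters; the shard reconciles these values upward into its own statistics.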
/ok-to-test

How are we sharding exactly?
Great question. Here’s the high-level overview of the sharding strategy.

**The Short Answer**

We shard by running multiple parallel worker loops (`FlowController` workers), each with its own `registryShard`. When a request arrives, a top-level distributor sends it to the shard with the lightest load for that specific flow. This prevents a central bottleneck and allows us to process requests in parallel. Think of it as having multiple, independent dispatchers instead of just one.

**How it Works: Join-the-Shortest-Queue-by-Bytes (JSQ-Bytes)**

The distributor uses a Join-the-Shortest-Queue-by-Bytes (JSQ-Bytes) algorithm: for each incoming request, it inspects the byte size of the target flow's queue on every shard and enqueues the request on the shard with the smallest backlog. This balances the load (in terms of memory pressure and queuing capacity) across the shards in real time.

**Current Status & Next Steps**

This PR implements the `registryShard` itself; the distributor that performs JSQ-Bytes placement lands with the execution engine in a subsequent PR.

**Deeper Dive on Design Rationale**

The design of the Flow Controller is built on this sharded architecture to enable parallel processing and prevent the central dispatch loop from becoming a bottleneck at high request rates.

**Critical Assumption: Workload Homogeneity Within Flows**

The effectiveness of the sharded model hinges on a critical assumption: while the system as a whole manages a heterogeneous set of flows, the traffic within a single logical flow is assumed to be roughly homogeneous. A logical flow represents a single workload or tenant, so we expect request characteristics to be statistically similar within that flow. JSQ-Bytes distribution helps keep this assumption safe in practice, because any imbalance in request sizes shows up directly in the byte counts that drive placement.
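A tiny sketch of the JSQ-Bytes placement decision described above; `shardStats` and `QueueByteSize` are illustrative names, not the repository's real interface:

```go
package flowcontrol

// shardStats exposes the per-flow byte backlog of one shard (illustrative).
type shardStats interface {
	// QueueByteSize reports the current byte size of the queue for flowID.
	QueueByteSize(flowID string) int64
}

// pickShardJSQBytes returns the index of the shard with the smallest byte
// backlog for this flow, so load spreads in proportion to real memory pressure.
func pickShardJSQBytes(shards []shardStats, flowID string) int {
	best, bestBytes := 0, int64(-1)
	for i, s := range shards {
		if b := s.QueueByteSize(flowID); bestBytes < 0 || b < bestBytes {
			best, bestBytes = i, b
		}
	}
	return best
}
```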
**Stateful Policies in a Sharded Registry**

Sharding means that when the critical assumption holds, the independent actions of policies on each shard result in emergent, approximate global fairness. Achieving true, deterministic global fairness would require policies to depend on an external, globally-consistent state store, which adds significant complexity.
Worth considering a waterfall algorithm, meaning create multiple shards, but the requests should be added to one shard until it is at capacity and then spill to the next one.
This PR is fine, but again, hard to reason about without the full picture.

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ahg-g, LukeAVanDrie.
That's a great point, and it's a valid architectural pattern for certain types of systems. I considered a "fill-and-spill" (waterfall) approach during the design phase, but I concluded that a "spread" approach using JSQ-Bytes is a better fit for the specific goals of the Flow Controller.

The core trade-off comes down to what we are optimizing for. A waterfall algorithm is excellent for systems that need to optimize for resource packing or locality. However, the primary goals of our Flow Controller are high throughput, low latency, and fairness between concurrent flows, and the waterfall model presents challenges for all three in our context: it concentrates work on the first shard instead of parallelizing it, it runs that shard at saturation before any spill occurs, and it makes a flow's placement depend on overall system load rather than on the flow's own backlog.
To summarize the trade-off: waterfall optimizes for resource packing and locality, while the JSQ-Bytes spread optimizes for the throughput, latency, and fairness goals of this component.
For these reasons, I strongly believe that our current JSQ-Bytes "spread" approach is the correct choice to meet the core requirements of this component.
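For contrast with the JSQ-Bytes sketch above, here is a minimal sketch of the fill-and-spill selection being discussed; `shardStats`, `QueueByteSize`, and `capacityBytes` are illustrative assumptions, not this repository's API:

```go
package flowcontrol

// shardStats as in the earlier JSQ-Bytes sketch (illustrative, not real API).
type shardStats interface {
	QueueByteSize(flowID string) int64
}

// pickShardWaterfall sketches the "fill-and-spill" alternative: requests stay
// on the first shard until its backlog for the flow reaches capacity, then
// spill to the next shard. capacityBytes is an assumed per-shard limit.
func pickShardWaterfall(shards []shardStats, flowID string, capacityBytes int64) int {
	for i, s := range shards {
		if s.QueueByteSize(flowID) < capacityBytes {
			return i // first shard with headroom wins
		}
	}
	return len(shards) - 1 // everything is full: fall back to the last shard
}
```

Under steady load this keeps shard 0 pinned at its capacity limit, which is exactly the saturation pressure the spread approach avoids.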
Thanks for the LGTM and approval, Abdullah! I agree, this is hard to reason about without the full picture. The next PR I'm preparing introduces the execution engine that will consume this state. Once you see how a worker uses the `registryShard`, the layering here should be much easier to evaluate.
This work tracks #674
This pull request introduces the operational layer of the new sharded Flow Registry architecture. It delivers the concurrent, high-performance "data plane" that each `FlowController` worker will interact with, paving the way for the top-level administrative control plane (`FlowRegistry`) to be built in a subsequent PR.

The core architectural principle here is the separation of concerns between the global registry's administrative functions and the per-worker operational view. This PR focuses exclusively on the latter by implementing the `registryShard`.

The key components introduced are:

- **`registry.Config`**: A new configuration object that defines the master layout for the entire registry. It includes robust validation for checking compatibility between policies and queues, defaulting logic, and a `partition()` method to distribute capacity allocations across all shards (a sketch of this partitioning idea follows the list).
- **`registry.registryShard`**: The heart of the PR. It is a concrete and concurrent-safe implementation of the `contracts.RegistryShard` interface. It provides a `FlowController` worker with its partitioned view of the system state, managing all flow queues and policies within its boundary. Concurrency is managed via an `RWMutex` for structural map changes and lock-free `atomic` operations for all statistics, ensuring high performance on the hot path.
- **`registry.managedQueue`**: A critical decorator that wraps a `framework.SafeQueue`. It serves two essential functions in the architecture: it provides the atomic, upwardly-reconciled statistics that make the sharded model possible, and it enforces lifecycle state (`active` vs. `draining`). The latter is a forward-looking feature that is essential for enabling graceful administrative actions in the future, such as changing a flow's priority or decommissioning a shard without losing requests.
- **Contracts and Errors**: New sentinel errors (`ErrFlowInstanceNotFound`, `ErrPriorityBandNotFound`, `ErrPolicyQueueIncompatible`) have been added to the `contracts` package to create a stable and predictable API for consumers of the registry.
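As an illustration of the `partition()` idea mentioned above, here is a sketch that splits a global byte limit evenly across shards, with the remainder going to the earliest shards so the per-shard limits still sum to the global limit. Field and method names are assumptions; the real `registry.Config` may differ:

```go
package registry

// Config holds a global capacity limit to be divided among shards
// (a simplified, illustrative stand-in for the real configuration).
type Config struct {
	MaxBytes int64 // global byte capacity limit (illustrative field)
}

// partition splits the global byte limit across shardCount shards. The first
// MaxBytes % shardCount shards each get one extra byte, so the per-shard
// limits always sum exactly to the global limit.
func (c *Config) partition(shardCount int) []Config {
	out := make([]Config, shardCount)
	base := c.MaxBytes / int64(shardCount)
	rem := c.MaxBytes % int64(shardCount)
	for i := range out {
		out[i].MaxBytes = base
		if int64(i) < rem {
			out[i].MaxBytes++
		}
	}
	return out
}
```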
**Testing**

All new components are accompanied by comprehensive unit tests, which can be found in `pkg/epp/flowcontrol/registry/`. The tests cover:

- `managedQueue` behavior, including statistics reconciliation and concurrency safety.
- `registryShard` accessors, error paths, and statistics aggregation.