Hello!
In a bigger k8s cluster we noticed very chaotic behavior when a gRPC client uses kuberesolver. There was no load balancing beyond simple round robin, yet backend pods received a very inconsistent number of requests (which also changed over time). Given that there were many pods making requests (using kuberesolver) and many backend pods, we expected a more or less equal distribution of requests.
Upon investigation we believe the EndpointSlices implementation in kuberesolver is broken: the code watching EndpointSlice objects in k8s seems to assume that whenever a changed EndpointSlice object is received, it contains the list of all endpoints (the whole state).
This is true only when the pod count is low: a single EndpointSlice may contain at most 100 endpoints (configurable in the api-server).
When there are hundreds of pods there are many EndpointSlices, and all of them should be used.
Basically it appears kuberesolver only uses a subset of endpoints at any given time: specifically the endpoints from the EndpointSlice that was modified most recently. This:
- limits the endpoints provided by kuberesolver to 100 (no matter whether there are 200 or 5000 endpoints in reality)
- causes chaotic changes when EndpointSlices are being modified by k8s
This can also be observed in the kuberesolver_endpoints_total metric, which never exceeds 100 and generally reflects the endpoint count of whichever EndpointSlice changed most recently on the k8s API server.
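
For illustration, here is a minimal sketch of what we would expect instead (this is not kuberesolver's actual code; the type and function names are made up): cache every EndpointSlice of the service by name as watch events arrive, and hand the gRPC balancer the union of addresses over all cached slices rather than the contents of the single slice from the latest event.

```go
// Minimal sketch, assuming the goal is to aggregate endpoints across *all*
// EndpointSlices of a service. Names are illustrative, not kuberesolver's API.
package slicewatch

import (
	discoveryv1 "k8s.io/api/discovery/v1"
	"k8s.io/apimachinery/pkg/watch"
)

// sliceCache holds the last seen state of every EndpointSlice of one service,
// keyed by slice name.
type sliceCache map[string]*discoveryv1.EndpointSlice

// apply updates the cache from a single watch event instead of replacing the
// whole endpoint set with the contents of that one slice.
func (c sliceCache) apply(ev watch.Event) {
	slice, ok := ev.Object.(*discoveryv1.EndpointSlice)
	if !ok {
		return
	}
	switch ev.Type {
	case watch.Added, watch.Modified:
		c[slice.Name] = slice
	case watch.Deleted:
		delete(c, slice.Name)
	}
}

// addresses returns the union of ready addresses over all cached slices; this
// is the set that should be passed to the gRPC balancer.
func (c sliceCache) addresses() []string {
	var out []string
	for _, slice := range c {
		for _, ep := range slice.Endpoints {
			if ep.Conditions.Ready != nil && !*ep.Conditions.Ready {
				continue
			}
			out = append(out, ep.Addresses...)
		}
	}
	return out
}
```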
How to reproduce:
- use kuberesolver with a service that has > 100 pods, check the behavior and/or the kuberesolver metrics (a small helper for counting the real endpoints is sketched below)
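
To double-check the discrepancy, something like the following client-go snippet (a hypothetical helper, not part of kuberesolver; the namespace and service name are placeholders) lists all EndpointSlices of the service and sums their endpoints; with more than 100 pods the total exceeds what kuberesolver_endpoints_total ever reports.

```go
// Hypothetical verification helper: count the service's real endpoints across
// all of its EndpointSlices and compare with kuberesolver_endpoints_total.
package main

import (
	"context"
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig() // or a kubeconfig-based config when running outside the cluster
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// EndpointSlices are linked to their Service via the kubernetes.io/service-name label.
	// "default" and "my-service" are placeholders for the cluster under test.
	slices, err := client.DiscoveryV1().EndpointSlices("default").List(context.Background(),
		metav1.ListOptions{LabelSelector: discoveryv1.LabelServiceName + "=my-service"})
	if err != nil {
		panic(err)
	}

	total := 0
	for _, s := range slices.Items {
		fmt.Printf("slice %s: %d endpoints\n", s.Name, len(s.Endpoints))
		total += len(s.Endpoints)
	}
	// With >100 backend pods this total exceeds 100, while the resolver only ever uses one slice's worth.
	fmt.Printf("total endpoints across %d slices: %d\n", len(slices.Items), total)
}
```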