-
Notifications
You must be signed in to change notification settings - Fork 4.2k
fix: VPA add readiness and liveness for vpa admission #8346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -151,6 +151,10 @@ func main() { | |||
as.Serve(w, r) | |||
healthCheck.UpdateLastActivity() | |||
}) | |||
http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) { | |||
healthCheck.ServeHTTP(w, r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could just respond with 200, but we were already using the healthcheck struct so assumed we should just use it here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a /health-check
endpoint that all three VPA components have
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh got it, will update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this cause an issue when no pods are admitted for a minute(idle cluster)? the probes would start failing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not as far as I can tell. Why? Have you seen evidence of that happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the admission-controller never runs healthCheck.StartMonitoring()
, so that situation will never happen.
The other components do run that, and they also operate on a loop, so they keep updating the lastActivity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh missed that. so it should be good as is then. thank you!
Thanks for the update. Since we are already updating the admission controller, can we also update other things at the same time? (like the recommender and updater) |
livenessProbe: | ||
httpGet: | ||
path: /health-check | ||
port: prometheus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this port is weird, since it's mixed used.
I don't know if that happens much though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, seemed safer since port is a flag in code.
Could you update the release note? |
/lgtm cc @omerap12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Agree on the port name though
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jklaw90, omerap12 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Server doesn't have a health check, so gets traffic before its ready causing failed open requests.
Which issue(s) this PR fixes:
Fixes #8344
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: