✨ Metrics Summary #2134
Force-pushed: 37b6efa to f6a3350
@@ -22,13 +22,13 @@ spec:
       annotations:
         description: "container {{ $labels.container }} of pod {{ $labels.pod }} experienced OOM event(s); count={{ $value }}"
     - alert: operator-controller-memory-growth
-      expr: deriv(sum(container_memory_working_set_bytes{pod=~"operator-controller.*",container="manager"})[5m:]) > 50_000
+      expr: deriv(sum(container_memory_working_set_bytes{pod=~"operator-controller.*",container="manager"})[5m:]) > 100_000
These values were too sensitive.
I might just note this in the commit message or put it in another commit.
Good call out 👍 I'll add a note to the commit message.
I've added the following to the commit and the PR description:
Extra: Tuned the Prometheus alerts to be less sensitive to memory growth. The tests naturally add to the memory footprint at the beginning of the e2e run, so the thresholds need to account for that. Also pinned version tags on a couple of images that were implicitly using 'latest', so nodes won't have to pull them on every test run.
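For readers unfamiliar with the alert expression: Prometheus's `deriv` estimates a per-second rate of change by fitting a simple linear regression to the samples in the window, so the threshold in the rule is in bytes per second. The Go sketch below illustrates that calculation only; the `sample` type, the helper names, and the data in `main` are hypothetical and are not part of this PR or of Prometheus itself.

```go
package main

import "fmt"

// sample is one (timestamp, value) point, similar to an entry in a
// Prometheus range vector. Hypothetical type for illustration only.
type sample struct {
	t float64 // seconds
	v float64 // bytes
}

// slope returns the least-squares slope of v over t, in units of v per second.
// It returns 0 when there are fewer than two samples or no spread in t.
func slope(samples []sample) float64 {
	n := float64(len(samples))
	if n < 2 {
		return 0
	}
	var sumT, sumV, sumTV, sumTT float64
	for _, s := range samples {
		sumT += s.t
		sumV += s.v
		sumTV += s.t * s.v
		sumTT += s.t * s.t
	}
	denom := n*sumTT - sumT*sumT
	if denom == 0 {
		return 0
	}
	return (n*sumTV - sumT*sumV) / denom
}

// exceeds reports whether the fitted growth rate is above the threshold.
func exceeds(samples []sample, threshold float64) bool {
	return slope(samples) > threshold
}

func main() {
	// Made-up data: a working set growing by 60_000 bytes/s,
	// sampled once a minute across a 5-minute window.
	var samples []sample
	for i := 0; i < 6; i++ {
		t := float64(i * 60)
		samples = append(samples, sample{t: t, v: 100e6 + 60_000*t})
	}
	fmt.Printf("slope: %.0f bytes/s\n", slope(samples))
	fmt.Println("fires at 50_000: ", exceeds(samples, 50_000))
	fmt.Println("fires at 100_000:", exceeds(samples, 100_000))
}
```

With this made-up series, the growth rate sits between the old and new thresholds, which is the kind of startup growth the tuned rule is meant to tolerate.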
@@ -129,15 +129,15 @@ func (c *MetricsTestConfig) getServiceAccountToken(t *testing.T) string {
 func (c *MetricsTestConfig) createCurlMetricsPod(t *testing.T) {
 	t.Logf("Creating curl pod (%s/%s) to validate the metrics endpoint", c.namespace, c.curlPodName)
 	cmd := exec.Command(c.client, "run", c.curlPodName,
-		"--image=curlimages/curl",
+		"--image=curlimages/curl:8.15.0",
So we don't force kind to pull this image every time we create the pod.
@@ -58,7 +58,7 @@ spec:
       terminationGracePeriodSeconds: 0
       containers:
       - name: busybox
-        image: busybox
+        image: busybox:1.36
Same logic as with the curl pod.
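The reasoning behind both image pins follows from Kubernetes' documented defaulting: when `imagePullPolicy` is not set, an image with no tag or with the `latest` tag defaults to `Always`, while an image with any other tag or a digest defaults to `IfNotPresent`. The sketch below models that rule in Go. `defaultPullPolicy` is a hypothetical helper written for illustration, not a function from Kubernetes or from this repository, and it assumes the pod specs here leave `imagePullPolicy` unset.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultPullPolicy models the pull policy Kubernetes assigns when a
// container spec omits imagePullPolicy. Hypothetical helper for illustration.
func defaultPullPolicy(image string) string {
	// A digest reference (name@sha256:...) is immutable, so a cached copy is fine.
	if strings.Contains(image, "@") {
		return "IfNotPresent"
	}
	// Look for a tag only in the last path segment, so a registry port
	// such as localhost:5000/busybox is not mistaken for a tag.
	last := image[strings.LastIndex(image, "/")+1:]
	i := strings.LastIndex(last, ":")
	if i < 0 || last[i+1:] == "latest" {
		return "Always"
	}
	return "IfNotPresent"
}

func main() {
	for _, img := range []string{
		"curlimages/curl",
		"curlimages/curl:8.15.0",
		"busybox",
		"busybox:1.36",
	} {
		fmt.Printf("%-24s %s\n", img, defaultPullPolicy(img))
	}
}
```

Under that assumption, the untagged references would be pulled on every pod creation, while the pinned ones can reuse the copy already on the node.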
Force-pushed: f6a3350 to 4961a42
This is very cool
Force-pushed: 4961a42 to 5bbcfa9
That's cool!
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main    #2134   +/-   ##
=======================================
  Coverage   72.83%   72.83%
=======================================
  Files          79       79
  Lines        7340     7340
=======================================
  Hits         5346     5346
  Misses       1645     1645
  Partials      349      349
Adds a util to the e2e suite which queries Prometheus at the end of the test run for alerts and metrics data. This data is then processed into markdown which is displayed to the contributor at the end of their test runs.

Extra: Tuned the Prometheus alerts to be less sensitive to memory growth. The tests naturally add to the memory footprint at the beginning of the e2e run, so the thresholds need to account for that. Also pinned version tags on a couple of images that were implicitly using 'latest', so nodes won't have to pull them on every test run.

Signed-off-by: Daniel Franz <[email protected]>
Force-pushed: 5bbcfa9 to a28cab6
/lgtm
Description
Adds a util to the e2e suite which queries Prometheus at the end of the test run for alerts and metrics data. This data is then processed into markdown and displayed to the contributor at the end of their test runs.
Extra: Tuned the Prometheus alerts to be less sensitive to memory growth. The tests naturally add to the memory footprint at the beginning of the e2e run, so the thresholds need to account for that. Also pinned version tags on a couple of images that were implicitly using 'latest', so nodes won't have to pull them on every test run.
The principal idea here is that if we are to fail a test run based on performance results, then those results need to be easily visible to the contributor to avoid a potentially frustrating development experience.
The markdown will be visible here when the e2e completes.
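As a rough illustration of the approach described above, the sketch below turns a response from Prometheus's `/api/v1/alerts` HTTP endpoint into a markdown table. It is not the util added by this PR: `renderAlertsMarkdown`, the struct names, the table layout, and the sample payload in `main` are all assumptions made for the example, and only a subset of the response fields is decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// alertsResponse covers the parts of the /api/v1/alerts payload used here.
type alertsResponse struct {
	Status string `json:"status"`
	Data   struct {
		Alerts []alert `json:"alerts"`
	} `json:"data"`
}

type alert struct {
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
	State       string            `json:"state"`
}

// renderAlertsMarkdown decodes an alerts response body and renders it as a
// markdown section. Hypothetical helper; the layout is an assumption.
func renderAlertsMarkdown(body []byte) (string, error) {
	var resp alertsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decoding alerts response: %w", err)
	}
	if resp.Status != "success" {
		return "", fmt.Errorf("unexpected status %q", resp.Status)
	}
	var b strings.Builder
	b.WriteString("## Alerts\n\n")
	if len(resp.Data.Alerts) == 0 {
		b.WriteString("No alerts fired.\n")
		return b.String(), nil
	}
	b.WriteString("| Alert | State | Description |\n|---|---|---|\n")
	for _, a := range resp.Data.Alerts {
		fmt.Fprintf(&b, "| %s | %s | %s |\n",
			a.Labels["alertname"], a.State, a.Annotations["description"])
	}
	return b.String(), nil
}

func main() {
	// Made-up payload shaped like a Prometheus alerts response.
	body := []byte(`{"status":"success","data":{"alerts":[{
		"labels":{"alertname":"operator-controller-memory-growth"},
		"annotations":{"description":"memory growing"},
		"state":"firing"}]}}`)
	md, err := renderAlertsMarkdown(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(md)
}
```

Keeping the rendering separate from the HTTP call makes the formatting easy to unit test with canned payloads, which fits the goal of making performance results easy for contributors to read.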