-
Notifications
You must be signed in to change notification settings - Fork 220
Fix possibly excessive memory use when computer test result overview #6850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
It looks like the current version of the test results overview actually doesn't return results across all builds via I'll review whether the new behavior makes sense in the different test cases (manual and unit tests). So for the new behavior makes sense in my manual tests which would mean the current code is already cutting corners somewhere which now seems no longer necessary. EDIT: I've also just tested this with OSD data. The version of this PR keeps the memory usage limited (the final RSS of the process is 178.7 MiB) and it is notably faster than the current version (which ends up with an RSS of 7.4 GiB). With OSD data the difference in the jobs being present on I also get meaningful results for e.g. The only regression I've seen so far is a missing EDIT: The regression about the missing |
c33d9f5 to
a8e88bc
Compare
|
Looks like the current code determines the latest build automatically when no I also added the following to the commit message because this in one of the remaining differences and I haven't found a good way to avoid it:
|
|
I almost fixed all mistakes but the test |
e78ad16 to
8bcf758
Compare
|
The case tested via the sub test I now also covered the API route that is the equivalent of the overview page. With this the only remaining unaltered usage of With all these changes to the PR the memory usage of |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #6850 +/- ##
=======================================
Coverage 99.26% 99.26%
=======================================
Files 402 402
Lines 41493 41522 +29
=======================================
+ Hits 41187 41218 +31
+ Misses 306 304 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
8bcf758 to
572f5a4
Compare
572f5a4 to
ef8c95b
Compare
ef8c95b to
79f52e8
Compare
| @jobs = @jobs[0 .. ($limit - 1)] if $limit_exceeded; | ||
| my @jobids = map { $_->id } @jobs; | ||
| my @jobs = $jobs->all; | ||
| my $preferred_machines = _calculate_preferred_machines(\@jobs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a note. The $jobs are used here only for calculate the $preferred_machines which is used later for this condition
&& $preferred_machines->{$job->ARCH}
&& $preferred_machines->{$job->ARCH} ne $job->MACHINE)```
So the `_prepare_job_results` needs this only and not the whole ResultSet. If you call `_calculate_preferred_machines(\@jobs);` before https://github.com/Martchus/openQA/blob/79f52e8c934da96eaa20e95fca56a8891dfc361a/lib/OpenQA/WebAPI/Controller/Test.pm#L848C5-L848C94 and just pass this. I think that way, it would simplify the dependencies between the invocations and the functions would be more clear.
But lets not change anything now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The $jobs are used here only for calculate …
The "only" in this sentence makes it wrong. Just search for @jobs in the subsequent code of this function; you'll find 4 occurrences.
d3flex
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished with the review. I would approve but I would like to ask you for a tiny typo in the 3rd commit. it says That means a jobs will no longer show as. it meant to be without a, right? Other than that I am ready to approve it
When accessing the test results overview page (especially without parameters, e.g. http://localhost:9526/tests/overview) a very huge set of jobs is considered. This leads to a very excessive memory usage, e.g. with current OSD data the RSS rises over 7 GiB and the request takes very long. There's already a limit of 2000 jobs which is also effective but unfortunately not passed to `complex_query` and only used later on. I tracked the memory usage while running the code with debug printing it it was very clear that `complex_query` is the culprit. This change uses the limit from the start. This means additional filtering done in Perl code also needs to happen as part of the now limited initial query (using a sub query, see the added comment). So this change replaces these Perl loops/grep with passing the filter parameters directly to the database. This slightly changes the order of things but should lead to the same end result. With this change the "preferred" machine (basically the most frequent machine per architecture) is only determined based on the jobs that are actually displayed. That means jobs will no longer show as e.g. `kde@uefi` when filtering for `machine=uefi` because with this filter the machine `uefi` becomes the preferred (most frequent) machine (as now all displayed jobs have that machine). I think this change is acceptable. It means that some tests had to be adjusted accordingly. Related ticket: https://progress.opensuse.org/issues/192448
591da33 to
c018f21
Compare
|
I fixed the typo in the 3rd commit. |
When accessing the test results overview page (especially without parameters, e.g. http://localhost:9526/tests/overview) a very huge set of jobs is considered. This leads to a very excessive memory usage, e.g. with current OSD data the RSS rises over 7 GiB and the request takes very long. There's already a limit of 2000 jobs which is also effective but unfortunately not passed to
complex_queryand only used later on. I tracked the memory usage while running the code with debug printing it it was very clear thatcomplex_queryis the culprit.This change uses the limit from the start. This means additional filtering done in Perl code also needs to happen as part of the now limited initial query. So this change replaces these Perl loops/grep with passing the filter parameters directly to the database. This slightly changes the order of things but should lead to the same end result.
Still a draft as I'll also have to take care of other places where we use
complex_queryin a similar way. I probably also still need to change some details depending on test results. If I took care of other places I can probably also remove code that is then no longer required, e.g.latest_jobs.Related ticket: https://progress.opensuse.org/issues/192448