Conversation

@d3flex
Contributor

@d3flex d3flex commented Oct 22, 2025

With this commit once the job goes to its end, it will check if it can restarted and if it fulfills the criteria, it will send the new job to the minion queue via enqueue_restart, where now it should populate the event.

issue: https://progress.opensuse.org/issues/190557

@d3flex
Contributor Author

d3flex commented Oct 22, 2025

  • I want to see where it will fail (some tests failed on my side) and the Codecov report.
  • Also, I can't see where a test for this should go. t/10-jobs.t is an obvious candidate, but it doesn't seem to cover the events.

@codecov

codecov bot commented Oct 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.26%. Comparing base (1801a6d) to head (4300915).
⚠️ Report is 16 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6816   +/-   ##
=======================================
  Coverage   99.26%   99.26%           
=======================================
  Files         402      402           
  Lines       41396    41460   +64     
=======================================
+ Hits        41090    41154   +64     
  Misses        306      306           

@perlpunk
Contributor

> Also, I can't see where a test for this should go. t/10-jobs.t is an obvious candidate, but it doesn't seem to cover the events.

t/23-amqp.t looks right in this case. Take a job and call enqueue_restart on it.

Member

@okurz okurz left a comment

Needs tests

@d3flex d3flex force-pushed the feat/amqp_restart branch from 8632b55 to e5e5fd2 Compare October 23, 2025 07:44
Member

@okurz okurz left a comment

code looks good so far. I suggest you do a manual system test with a local openQA instance and ensure that no duplicate events are sent.

Also in your commit message I don't quite understand "With this commit once the job goes to its end, it will check if it can restarted and if it fulfills the criteria". What "end" and what "criteria"? Also "check if it can restarted" is grammatically incorrect

@d3flex d3flex force-pushed the feat/amqp_restart branch from e5e5fd2 to 44e622e Compare October 23, 2025 08:12
@d3flex
Contributor Author

d3flex commented Oct 23, 2025

> code looks good so far. I suggest you do a manual system test with a local openQA instance and ensure that no duplicate events are sent.
>
> Also in your commit message I don't quite understand "With this commit once the job goes to its end, it will check if it can restarted and if it fulfills the criteria". What "end" and what "criteria"? Also "check if it can restarted" is grammatically incorrect

image

@d3flex d3flex force-pushed the feat/amqp_restart branch from 44e622e to 0b69cf5 Compare October 23, 2025 13:05
@d3flex
Contributor Author

d3flex commented Oct 23, 2025

> code looks good so far. I suggest you do a manual system test with a local openQA instance and ensure that no duplicate events are sent.
>
> Also in your commit message I don't quite understand "With this commit once the job goes to its end, it will check if it can restarted and if it fulfills the criteria". What "end" and what "criteria"? Also "check if it can restarted" is grammatically incorrect

updated.

@d3flex d3flex marked this pull request as ready for review October 23, 2025 13:06
@d3flex d3flex removed the not-ready label Oct 23, 2025
@perlpunk
Contributor

> With this commit once the job goes to execute its final stage, it will
> check if it can be restarted. If yes, it will send the new job to
> the minion queue via enqueue_restart, where now it should populate
> the event.

The only change in this commit is that, if the job is restarted via enqueue_restart, it now publishes a restart event.
The description sounds overly complicated ("goes to execute its final stage").
"it will check if it can be restarted": that's already happening without your commit.

How about: "When a job is enqueued to be restarted via the RETRY feature, it now publishes a job.restart event."

@d3flex
Contributor Author

d3flex commented Oct 23, 2025

> > With this commit once the job goes to execute its final stage, it will
> > check if it can be restarted. If yes, it will send the new job to
> > the minion queue via enqueue_restart, where now it should populate
> > the event.
>
> The only change in this commit is that, if the job is restarted via enqueue_restart, it now publishes a restart event. The description sounds overly complicated ("goes to execute its final stage"). "it will check if it can be restarted": that's already happening without your commit.
>
> How about: "When a job is enqueued to be restarted via the RETRY feature, it now publishes a job.restart event."

I avoided referring to RETRY because it is not the only path to enqueue_restart. Are you good with "When a job is enqueued to be restarted, it now publishes a job.restart event"?

@perlpunk
Contributor

When a job is enqueued to be restarted as part of the $job->done handling, a job.restart event is now published

@d3flex d3flex force-pushed the feat/amqp_restart branch 2 times, most recently from c3478ec to c8fe5b4 Compare October 25, 2025 04:12
@d3flex d3flex force-pushed the feat/amqp_restart branch 3 times, most recently from 0499fd0 to b987963 Compare November 4, 2025 09:38
Contributor

@Martchus Martchus left a comment

As mentioned in the inline comments, this might be much harder to make actually work than we estimated. Maybe we should reevaluate the best way of implementing this and possibly also re-consider whether it is worth it.

sub restart_openqa_job ($minion_job, $openqa_job) {
    my $cloned_job_or_error = $openqa_job->auto_duplicate;
    my $is_ok = ref $cloned_job_or_error || $cloned_job_or_error =~ qr/(already.*clone|direct parent)/i;
    OpenQA::Events->singleton->emit_event('job_restart', data => {id => $openqa_job->id}) if $is_ok;
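
A minimal sketch of how the `$is_ok` expression in this hunk classifies what `auto_duplicate` returns. The helper name `restart_is_ok` and the sample error strings are assumptions made for illustration; only the regular expression is taken from the hunk.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the condition from the hunk.
# `ref` is a named unary operator and binds tighter than `||`, so this
# reads as (ref $result) || ($result =~ /.../i).
sub restart_is_ok {
    my ($result) = @_;
    return (ref $result || $result =~ qr/(already.*clone|direct parent)/i) ? 1 : 0;
}

# A hash reference stands in for a cloned job object (assumed shape).
printf "%d\n", restart_is_ok({id => 2});

# The strings below are made-up examples, not openQA's actual messages.
printf "%d\n", restart_is_ok('Job 1 already has a clone');
printf "%d\n", restart_is_ok('Database connection lost');
```

With this reading, a successful clone and the two "benign" error texts count as OK, while any other error string does not, so no event would be emitted for it.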
Contributor

To make this actually work we would probably need to make sure the AMQP plugin is loaded in the Gru service. It probably isn't so emitting this event probably has no effect. Now that I think about it, running the AMQP plugin within the Gru service and the processes it forks to run the particular Minion jobs is probably not so easy and could end up becoming quite a mess.

Contributor Author

That's a good one. Let's take a step back.

> It probably isn't so emitting this event probably has no effect

I am not sure about that, but the emit_event in lib/OpenQA/Events.pm implies that this method is intended for non-controller contexts, such as Minion background tasks. Right?

> To make this actually work we would probably need to make sure the AMQP plugin is loaded in the Gru service.

Maybe it is enough to make OpenQA::Events::emit_event clever and enqueue an event from there? Could that possibly work?

Contributor

Yes, it is intended to be used from non-controller context but not Minion tasks. I think "non-controller context" mainly refers to DBIx result classes or some utility modules.

We probably don't need to make emit_event clever. We just need to make sure that in the process that runs the Minion task the AMQP plugin is initialized (and maybe also other similar plugins for the sake of consistency). The problem with that is just that it might get messy due to the involved forking in combination with having to manage a permanent connection to the AMQP server.

Of course we could actually make emit_event clever and make it invoke some internal route of the normal web UI service to emit the event in case it is running from the Gru service. That might be the easiest solution - even though it requires a new internal route to do IPC.
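
To illustrate the concern in this thread: a subscriber registered in one process does not exist in another, so an event emitted where nothing is subscribed is silently dropped. The sketch below models each process as its own emitter object. `TinyEmitter` and all variable names are hypothetical, and this is not the `OpenQA::Events` API.

```perl
use strict;
use warnings;

package TinyEmitter;    # hypothetical stand-in for a per-process event bus

sub new { my ($class) = @_; return bless {subscribers => {}}, $class }

# Register a callback for an event name.
sub on {
    my ($self, $name, $cb) = @_;
    push @{$self->{subscribers}{$name}}, $cb;
    return $self;
}

# Call every subscriber; return how many were notified.
sub emit {
    my ($self, $name, @args) = @_;
    my $subs = $self->{subscribers}{$name} || [];
    $_->(@args) for @$subs;
    return scalar @$subs;
}

package main;

my @published;    # stands in for messages sent to the AMQP server

# "Web UI" process: the plugin subscribed during startup.
my $webui = TinyEmitter->new;
$webui->on(job_restart => sub { push @published, $_[0] });

# "Gru" process: the plugin was never loaded, so nothing is subscribed.
my $gru = TinyEmitter->new;

my $webui_count = $webui->emit(job_restart => {id => 42});
my $gru_count   = $gru->emit(job_restart => {id => 42});

print "web UI notified $webui_count subscriber(s)\n";
print "Gru notified $gru_count subscriber(s)\n";
print "messages published: ", scalar(@published), "\n";
```

Under these assumptions the emit call in the second process succeeds without error but reaches nobody, which is why the plugin has to be initialized (or the event forwarded) in the process that runs the Minion task.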

Contributor Author

I guess this can be closed as well?

Comment on lines 48 to 49
combined_like { $t->post_ok('/api/v1/jobs/99926/restart?force=1')->status_is(200) } qr/Job 99926 duplicated/,
'Job restarted successfully';
Contributor

This route is not what your ticket is about. Of course it also makes sense to add an explicit test for this in case we don't already have one. However, the relevant route would be the one that ends with /done, I suppose. And then we'd of course need to run the Minion job enqueued by that route. And then you'll probably run into the problem mentioned in my other review comment.

Contributor

Yes, and I use this opportunity to remind about TDD (test driven development).
I took this PR, but only the new test and not the code changes.
The test passed.

It should be clear that this should not be the case. If you write a new test for a new or changed feature, then the test must fail until you add the actual code changes. If it passes, then you are not testing the new feature.

For checking this rule it isn't even important whether you started development with a test or not.

So for the future: This is an easy rule to check for yourself before you push something to a PR.
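
A small sketch of that rule, assuming two hypothetical implementations of a restart helper, one without and one with the new event. A check that passes for both does not test the feature; a check that passes only with the new code does. None of the names below come from openQA.

```perl
use strict;
use warnings;

my @events;    # stands in for whatever records emitted events

# Hypothetical implementation before the change: restarts, emits nothing.
sub restart_old { my ($id) = @_; return "Job $id duplicated" }

# Hypothetical implementation after the change: also records an event.
sub restart_new {
    my ($id) = @_;
    push @events, {type => 'job_restart', id => $id};
    return "Job $id duplicated";
}

# Weak check: only inspects the returned message.
sub weak_check {
    my ($impl) = @_;
    @events = ();
    return $impl->(99926) =~ /duplicated/ ? 1 : 0;
}

# Strong check: requires exactly one restart event to be recorded.
sub strong_check {
    my ($impl) = @_;
    @events = ();
    $impl->(99926);
    return (@events == 1 && $events[0]{type} eq 'job_restart') ? 1 : 0;
}

printf "weak:   old=%d new=%d\n", weak_check(\&restart_old),   weak_check(\&restart_new);
printf "strong: old=%d new=%d\n", strong_check(\&restart_old), strong_check(\&restart_new);
```

The weak check cannot tell the two implementations apart, which mirrors a new test that still passes once the code change is reverted.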

Contributor Author

If I remember correctly, I added that change mainly because this route was not covered in the API tests, which is apparently why it was added under t/api. There may have been another reason as well. In short, this test was not meant to cover the new case. The emission from the Minion job should have been implemented in 10-jobs.t, in the subtest "AMQP event emission for minion restarts".

Contributor

If this is covering an existing case that was not covered before, it should be in its own commit.

@d3flex d3flex force-pushed the feat/amqp_restart branch 2 times, most recently from 6a94068 to 4d32ac5 Compare November 24, 2025 10:31
t/10-jobs.t Outdated
%published = ();
$job->done(result => FAILED);
stdout_like { perform_minion_jobs($minion) } qr/Job \d+ duplicated as \d+/;
my $expected_topic = 'suse.openqa.job.restart';
Contributor

IMHO it's hard to read what is tested here, and some tests are redundant.
I suggest the following diff:

diff --git a/t/10-jobs.t b/t/10-jobs.t
index d8a706c28..9719980d7 100644
--- a/t/10-jobs.t
+++ b/t/10-jobs.t
@@ -974,22 +974,20 @@ subtest 'AMQP event emission for minion restarts' => sub {
     %published = ();
     $job->done(result => FAILED);
     stdout_like { perform_minion_jobs($minion) } qr/Job \d+ duplicated as \d+/;
-    my $event = OpenQA::Test::Case::find_most_recent_event($schema, 'job_restart');
-    my $expected_topic = 'suse.openqa.job.restart';
-    my @restart_events = grep { $_ eq $expected_topic } keys %published;
-    ok exists $published{$expected_topic}, 'restart event published via AMQP';
-    is $published{$expected_topic}{id}, $job_id, 'event contains original job ID';
-    is $published{$expected_topic}{auto}, 1, 'event marked as auto restart';
-    is scalar(@restart_events), 1, 'exactly one job restart event emitted';
-    is $event_body[0], 'openqa_job_restart', 'event type is openqa_job_restart';
+    is scalar keys %published, 1, 'exactly one job restart event emitted';
+    my $event = $published{'suse.openqa.job.restart'};
+    is $event->{id}, $job_id, 'event contains original job ID';
+    is $event->{auto}, 1, 'event marked as auto restart';
+
     my ($user_id, $connection, $type, $data) = @{$event_body[1]};
     is $user_id, undef, 'user_id is undef for Minion restart';
     is $connection, undef, 'connection is undef for Minion restart';
     is $type, 'openqa_job_restart', 'type matches event type';
-    is $data->{id}, $job_id, 'data contains original job ID';
-    is $data->{auto}, 1, 'data marked as auto restart';
-    is ref($data->{result}), 'HASH', 'result should be a plain hash like API';
-    is_deeply $data->{result}, {$job_id => $data->{result}{$job_id}}, 'result shows cloned job info';
+
+    is_deeply $data, $event, 'published event equals emitted event';
+    ok exists $data->{result}->{$job_id}, 'old job id is in result';
+    $job->discard_changes;
+    is $job->clone_id, $data->{result}->{$job_id}, 'clone_id points to reported id';
 };
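
A self-contained sketch of the assertion pattern in the proposed diff, using hand-built stand-ins for `%published` and `@event_body`. In the real test these would be filled by the mocked AMQP publishing and event emission; the helper `make_payload` and the job IDs here are made up. It shows how one `is_deeply` on the whole payload can replace the field-by-field checks.

```perl
use strict;
use warnings;
use Test::More;

my $job_id   = 99926;    # made-up original job ID
my $clone_id = 99982;    # made-up ID of the clone

# Hypothetical helper building the payload shape assumed from the diff.
sub make_payload { return {id => $job_id, auto => 1, result => {$job_id => $clone_id}} }

# Stand-ins for what the mocks would have captured.
my %published  = ('suse.openqa.job.restart' => make_payload());
my @event_body = ('openqa_job_restart', [undef, undef, 'openqa_job_restart', make_payload()]);

is scalar(keys %published), 1, 'exactly one job restart event emitted';
my $event = $published{'suse.openqa.job.restart'};
is $event->{id},   $job_id, 'event contains original job ID';
is $event->{auto}, 1,       'event marked as auto restart';

my ($user_id, $connection, $type, $data) = @{$event_body[1]};
is $user_id,    undef,                'user_id is undef for Minion restart';
is $connection, undef,                'connection is undef for Minion restart';
is $type,       'openqa_job_restart', 'type matches event type';

# One deep comparison covers id, auto and result at once.
is_deeply $data, $event, 'published event equals emitted event';
ok exists $data->{result}{$job_id}, 'old job id is in result';
is $data->{result}{$job_id}, $clone_id, 'result maps old job to its clone';

done_testing;
```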

t/10-jobs.t Outdated
Comment on lines 977 to 991
my $expected_topic = 'suse.openqa.job.restart';
my @restart_events = grep { $_ eq $expected_topic } keys %published;
ok exists $published{$expected_topic}, 'restart event published via AMQP';
is $published{$expected_topic}{id}, $job_id, 'event contains original job ID';
is $published{$expected_topic}{auto}, 1, 'event marked as auto restart';
is scalar(@restart_events), 1, 'exactly one job restart event emitted';
is $event_body[0], 'openqa_job_restart', 'event type is openqa_job_restart';
my ($user_id, $connection, $type, $data) = @{$event_body[1]};
is $user_id, undef, 'user_id is undef for Minion restart';
is $connection, undef, 'connection is undef for Minion restart';
is $type, 'openqa_job_restart', 'type matches event type';
is $data->{id}, $job_id, 'data contains original job ID';
is $data->{auto}, 1, 'data marked as auto restart';
is ref($data->{result}), 'HASH', 'result should be a plain hash like API';
is_deeply $data->{result}, {$job_id => $data->{result}{$job_id}}, 'result shows cloned job info';
Contributor

Alternatively here as a suggestion:

Suggested change
- my $expected_topic = 'suse.openqa.job.restart';
- my @restart_events = grep { $_ eq $expected_topic } keys %published;
- ok exists $published{$expected_topic}, 'restart event published via AMQP';
- is $published{$expected_topic}{id}, $job_id, 'event contains original job ID';
- is $published{$expected_topic}{auto}, 1, 'event marked as auto restart';
- is scalar(@restart_events), 1, 'exactly one job restart event emitted';
- is $event_body[0], 'openqa_job_restart', 'event type is openqa_job_restart';
- my ($user_id, $connection, $type, $data) = @{$event_body[1]};
- is $user_id, undef, 'user_id is undef for Minion restart';
- is $connection, undef, 'connection is undef for Minion restart';
- is $type, 'openqa_job_restart', 'type matches event type';
- is $data->{id}, $job_id, 'data contains original job ID';
- is $data->{auto}, 1, 'data marked as auto restart';
- is ref($data->{result}), 'HASH', 'result should be a plain hash like API';
- is_deeply $data->{result}, {$job_id => $data->{result}{$job_id}}, 'result shows cloned job info';
+ is scalar keys %published, 1, 'exactly one job restart event emitted';
+ my $event = $published{'suse.openqa.job.restart'};
+ is $event->{id}, $job_id, 'event contains original job ID';
+ is $event->{auto}, 1, 'event marked as auto restart';
+ my ($user_id, $connection, $type, $data) = @{$event_body[1]};
+ is $user_id, undef, 'user_id is undef for Minion restart';
+ is $connection, undef, 'connection is undef for Minion restart';
+ is $type, 'openqa_job_restart', 'type matches event type';
+ is_deeply $data, $event, 'published event equals emitted event';
+ ok exists $data->{result}->{$job_id}, 'old job id is in result';
+ $job->discard_changes;
+ is $job->clone_id, $data->{result}->{$job_id}, 'clone_id points to reported id';

Contributor Author

I used the latter.

@d3flex d3flex force-pushed the feat/amqp_restart branch 2 times, most recently from 8e27e9a to 96463df Compare November 24, 2025 15:22
Contributor

@perlpunk perlpunk left a comment

I don't like the third commit message subject much "Make restart_openqa_job emit event same as API" (what should "same as API" tell us?)
Also the message from the first commit wasn't fixed yet, it still contains the "TBD" thing from me.
Otherwise I'd approve.

@d3flex d3flex force-pushed the feat/amqp_restart branch 2 times, most recently from 5298baf to bb1fc9d Compare November 24, 2025 16:54
@d3flex
Contributor Author

d3flex commented Nov 24, 2025

> I don't like the third commit message subject much "Make restart_openqa_job emit event same as API" (what should "same as API" tell us?) Also the message from the first commit wasn't fixed yet, it still contains the "TBD" thing from me. Otherwise I'd approve.

I will look at the commit.
I was not totally sure what the TBD was about: editing the commit message, or a feature still to be done. However, I have modified it a bit.
Do not approve yet. I am trying to fix the Codecov report.

d3flex and others added 3 commits November 24, 2025 18:31
That should make sure that the event is sent to AMQP for jobs with
`RETRY` once it is actually triggered.

Using the `restart_openqa_job` from the Restart module in order to not
duplicate the event which is emitted by the API, which also invokes
`auto_duplicate`. It should also trigger the event when the job is processed
and not while it is queued.
To do so, it loads the AMQP plugin into the Gru job. The `one_tick` is
called to trigger the `next_tick` in the `register` function of the AMQP module
immediately, otherwise it seems like it is registered late and the event is not
emitted.

issue: https://progress.opensuse.org/issues/190557
Signed-off-by: Ioannis Bonatakis <[email protected]>
The plugin gets loaded automatically via OpenQA::Setup
Start Mojo::IOLoop and let it end when the AMQP plugin has finished
(or failed) publishing the event
@d3flex d3flex force-pushed the feat/amqp_restart branch 2 times, most recently from 35b4dad to 7f47099 Compare November 24, 2025 17:42
@d3flex
Contributor Author

d3flex commented Nov 24, 2025

> I don't like the third commit message subject much "Make restart_openqa_job emit event same as API" (what should "same as API" tell us?) Also the message from the first commit wasn't fixed yet, it still contains the "TBD" thing from me. Otherwise I'd approve.

> I will look at the commit. I was not totally sure what the TBD was about: editing the commit message, or a feature still to be done. However, I have modified it a bit. Do not approve yet. I am trying to fix the Codecov report.

Commit messages were updated.
I didn't find a solution for the missing coverage (https://app.codecov.io/gh/os-autoinst/openQA/pull/6816?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=checks&utm_campaign=pr+comments&utm_term=os-autoinst). I marked it with `# uncoverable statement`. Theoretically it could be covered by a unit test, but in 10-jobs.t? Is it worth it after all?

And create test coverage
- Covers the emission of events on Minion restarts
- Loads the AMQP plugin
- Checks event body for consistency with events from API

issue: https://progress.opensuse.org/issues/190557
Signed-off-by: Ioannis Bonatakis <[email protected]>
@mergify mergify bot merged commit 9eb96c2 into os-autoinst:master Nov 25, 2025
51 checks passed
4 participants