JAVGan
Collaborator

@JAVGan JAVGan commented Jun 25, 2025

This commit changes the Azure module to retry publishing the VM image whenever a submission in progress/conflict happens.

It will first attempt to change the target to preview or live up to 3 times. If those attempts still fail with a ConflictError or RunningSubmissionError, it will restart the publishing task.

In order to accomplish that it introduces new models/exceptions for handling fine grained errors from Azure API as well as some new functions in utils to detect whether an operation in progress/conflict error is sent.
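A minimal sketch of the flow described above, not the PR's actual code. The function names `publish_with_retry` and `_set_target_with_attempts`, the `run_task`/`set_target` callables, and the `max_restarts` limit are all hypothetical; only the exception names and the count of 3 attempts come from the description. The exception classes are empty stand-ins so the example is self-contained.

```python
class ConflictError(Exception):
    """Stand-in for the PR's conflict exception (real definition not shown here)."""


class RunningSubmissionError(Exception):
    """Stand-in for the PR's 'submission in progress' exception."""


def _set_target_with_attempts(set_target, target, attempts):
    """Call set_target(target) up to `attempts` times, re-raising the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return set_target(target)
        except Exception:
            if attempt == attempts:
                raise  # attempts exhausted: let the caller decide what to do


def publish_with_retry(run_task, set_target, target, attempts=3, max_restarts=1):
    """Run the publishing task, then move the submission to `target`.

    If the target change keeps failing with a conflict-type error, the whole
    task is restarted, at most `max_restarts` times (a hypothetical limit).
    Any other error propagates immediately after the attempts are exhausted.
    """
    restarts = 0
    while True:
        run_task()
        try:
            return _set_target_with_attempts(set_target, target, attempts)
        except (ConflictError, RunningSubmissionError):
            if restarts >= max_restarts:
                raise
            restarts += 1
```

The sketch omits any waiting between attempts; a real implementation would probably add a delay or backoff before retrying.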

Refers to SPSTRAT-549

@JAVGan JAVGan requested review from jajreidy and lslebodn as code owners June 25, 2025 21:22
@JAVGan JAVGan force-pushed the azure_retry_conflict branch 2 times, most recently from 7ff6b5c to b26f169 Compare June 25, 2025 21:30
@JAVGan
Collaborator Author

JAVGan commented Jun 26, 2025

@lslebodn @ashwini3326 PTAL

@JAVGan JAVGan force-pushed the azure_retry_conflict branch from b26f169 to 0aa9062 Compare August 11, 2025 15:44
@JAVGan JAVGan force-pushed the azure_retry_conflict branch from 0aa9062 to 944b276 Compare August 21, 2025 19:44
@ashwgit
Collaborator

ashwgit commented Aug 29, 2025

looks good to me.

    err_lookup = r"An In Progress submission [0-9]+ already exists."
    for error in job_details.errors:
        err_msg = error.message
        if error.code == "internalServerError" and re.match(err_lookup, err_msg):
Collaborator


@JAVGan Are the error code and message documented somewhere in the API?
The same question applies to check_for_conflict.

BTW, I would prefer to defer the retry logic and first solve SPSTRAT-595,
which means I should review !131. I will try to do that at the weekend.

Meanwhile, we could check the documentation to see how reliable the error messages are.

Collaborator Author


Their documentation is really poor in this sense... Most things we discover in practice, either via Postman or by using the tooling and figuring it out through the logs (which was the case here).

Collaborator Author


IIRC I was able to get this error state via Postman as well. But yeah, it's a bit difficult to be 100% sure whether this is the expected behavior of the API or a malfunction. We might introduce this as is, and if we catch other types of errors for the same issue we can update it... Or contact them and wait for the response, but... Maybe it's faster to rely on practice 🙂
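Since the thread notes that the error code and message were found empirically rather than in documentation, one way to keep the check easy to update is a small table of known signatures. This is only a sketch: `JobError`, `KNOWN_IN_PROGRESS_SIGNATURES`, and `has_running_submission` are hypothetical names, and the single entry reuses the code and regex from the diff excerpt under review.

```python
import re
from dataclasses import dataclass


@dataclass
class JobError:
    """Hypothetical stand-in for one entry of job_details.errors."""

    code: str
    message: str


# (error code, message regex) pairs observed in practice; extend this list
# if other responses turn out to signal the same condition.
KNOWN_IN_PROGRESS_SIGNATURES = [
    ("internalServerError", r"An In Progress submission [0-9]+ already exists."),
]


def has_running_submission(errors):
    """Return True if any error matches a known 'in progress' signature."""
    return any(
        error.code == code and re.match(pattern, error.message) is not None
        for error in errors
        for code, pattern in KNOWN_IN_PROGRESS_SIGNATURES
    )
```

Matching on both the code and the message keeps an unrelated `internalServerError` from being treated as a running submission, at the cost of breaking silently if the wording of the message ever changes.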

@JAVGan JAVGan force-pushed the azure_retry_conflict branch from 944b276 to ae37379 Compare September 1, 2025 18:03
@JAVGan
Collaborator Author

JAVGan commented Sep 1, 2025

@ashwgit can you please use the approval from GH besides just commenting? thanks in advance!

@lslebodn
Collaborator

lslebodn commented Sep 2, 2025

> @ashwgit can you please use the approval from GH besides just commenting? thanks in advance!

Could we postpone this PR a bit and first solve SPSTRAT-595?

This commit introduces a new model named `ConfigureError` and exceptions
named `ConflictError` and `RunningSubmissionError`, which aim to provide
a more detailed status of an Azure publishing error.

The goal is to be able to differentiate certain errors which are caused
by submission in progress/conflict.
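A rough sketch of what such a model and exception pair could look like. Only the three names `ConfigureError`, `ConflictError`, and `RunningSubmissionError` come from the commit message; the fields, the `PublishingError` base class, and the `errors` attribute are assumptions made for illustration and may differ from the actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigureError:
    """One error entry reported by the API (field names are assumed)."""

    code: str
    message: str


class PublishingError(Exception):
    """Hypothetical common base that carries the underlying error entries."""

    def __init__(self, message, errors=()):
        super().__init__(message)
        self.errors = tuple(errors)


class ConflictError(PublishingError):
    """Raised when the API reports a conflicting submission."""


class RunningSubmissionError(PublishingError):
    """Raised when another submission is already in progress."""
```

Separate exception types let callers catch only the conflict-type failures, for example with `except (ConflictError, RunningSubmissionError)`, while other publishing errors still propagate.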

Assisted-By: Cursor
Signed-off-by: Jonathan Gangi <[email protected]>
@JAVGan JAVGan force-pushed the azure_retry_conflict branch from ae37379 to a8949a3 Compare September 2, 2025 17:28
This commit changes the Azure module to retry publishing the VM image
whenever a submission in progress/conflict happens.

It will first attempt to change the target to `preview` or `live` up to
3 times. If those attempts still fail with a `ConflictError` or
`RunningSubmissionError`, it will restart the publishing task.

Assisted-By: Cursor
Signed-off-by: Jonathan Gangi <[email protected]>
This commit updates existing tests and creates new ones to make sure
the newly implemented exception handlers for retrying on an in-progress
submission/conflict work properly.

Assisted-by: Cursor
Signed-off-by: Jonathan Gangi <[email protected]>
@JAVGan JAVGan force-pushed the azure_retry_conflict branch from a8949a3 to 9ffe3ea Compare September 2, 2025 17:35
@JAVGan
Collaborator Author

JAVGan commented Sep 2, 2025

@lslebodn @ashwgit I've rebased and fixed the conflicts, PTAL again. Then we can continue with other improvements for this library.

@lslebodn
Collaborator

lslebodn commented Sep 3, 2025

TBH I worry a lot about such retry logic in cloudpub.
IMHO it is very dangerous considering the use case where there are parallel releases from different advisories to the same offer.

Quite often we are not 100% sure about the real failure in some reports ("intermittent issue"): SPSTRAT-595, SPSTRAT-585, RHELDST-33573. Improved logging in !131 might help to better analyze the issue.

Or we might focus on modular publish (SPSTRAT-604).

@JAVGan
Collaborator Author

JAVGan commented Sep 3, 2025

Ok!

Let's focus on modular push then and keep this open until we're more confident about it.

3 participants