-
Notifications
You must be signed in to change notification settings - Fork 0
SPSTRAT-549: Azure: Retry publishing when a submission is in progress/conflict #119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7ff6b5c
to
b26f169
Compare
@lslebodn @ashwini3326 PTAL |
b26f169
to
0aa9062
Compare
0aa9062
to
944b276
Compare
looks good to me. |
err_lookup = r"An In Progress submission [0-9]+ already exists." | ||
for error in job_details.errors: | ||
err_msg = error.message | ||
if error.code == "internalServerError" and re.match(err_lookup, err_msg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JAVGan Is error code ad message documented somewhere int eh API?
similar for check_for_conflict
BTW I would prefer to defer the retry logic and firstly solve SPSTRAT-595.
which means I should review !131. will try to at weekend.
But meanwhile we could check in documentation how reliable are error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Their documentation is really poor in this sense... Most of the things we discover in practice either via Postman or using the tooling and figuring out through logs (which was this case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC I was able to get this error state via Postman as well. But yeah, it's a bit difficult to be 100% sure if this is the expected behavior the API or a malfunction but we might introduce this as is and if we caught other type of errors for the same issue we might update... Or contact them and wait for the response, but... Maybe it's faster to rely on practice 🙂
944b276
to
ae37379
Compare
@ashwgit can you please use the approval from GH besides just commenting? thanks in advance! |
Could we postpone this PR a bit and firstly solve SPSTRAT-595? |
This commit introduces a new model named `ConfigureError` and exceptions named `ConflictError` and `RunningSubmissionError` which aims to provide a more detailed status of an Azure Publishing Error. The goal is to be able to differentiate certain errors which are caused by submission in progress/conflict. Assisted-By: Cursor Signed-off-by: Jonathan Gangi <[email protected]>
ae37379
to
a8949a3
Compare
This commit changes the Azure module to retry publishing the VM image whenever a submission in progress/conflict happens. It will first attempt to change the target to `preview` or `live` for 3 times and then, if the exception comes as `ConflictError` or `RunningSubmissionError` it will restart the publishing task. Assisted-By: Cursor Signed-off-by: Jonathan Gangi <[email protected]>
This commit updates and creates new tests to make sure the new implemented exception handlers for retrying on in progress submission/conflict are properly working. Assisted-by: Cursor Signed-off-by: Jonathan Gangi <[email protected]>
a8949a3
to
9ffe3ea
Compare
TBH I worry a lot about such retry logic in cloudpub. Quite often we are not 100% sure of real failure in some reports ("intermittent issue") SPSTRAT-595, SPSTRAT-585, RHELDST-33573. Improved logging in !131 might help to better analyze the issue. Or it we might focus on modular publish SPSTRAT-604 |
Ok! Let's focus on modular push then and keep this open until we're more confident about it. |
This commit changes the Azure module to retry publishing the VM image whenever a submission in progress/conflict happens.
It will first attempt to change the target to
preview
orlive
for 3 times and then, if the exception comes asConflictError
orRunningSubmissionError
it will restart the publishing task.In order to accomplish that it introduces new models/exceptions for handling fine grained errors from Azure API as well as some new functions in
utils
to detect whether an operation in progress/conflict error is sent.Refers to SPSTRAT-549