Improve Helix Error Reporting #16548

Draft
nagilson wants to merge 2 commits into main from nagilson-produce-actionable-helix-failures

@nagilson

# Motivators

When Helix fails, several core problems make the build analysis tab unhelpful.

  1. Failures do not appear; instead, this error is displayed from `dotnet-helix-service`: This is a helix work item crash with status: `BadExit`.

  2. Work Item failures simply say `Test has failed. Check the log: log_link` when they could include the error itself, so we don't have to click through and read the log.

  3. When I click the '[Console]' link, it takes me here: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1310620&view=ms.vss-test-web.build-test-results-tab&runId=36607218&resultId=113379&paneView=debug

This view is not helpful. It shows: "The Helix Work Item failed. Often this is due to a test crash. Please see the 'Artifacts' tab above for additional logs." The failures for a "Work Item" provide nothing actionable. What I really want to see is a link to the specific test failure for that build, such as https://dev.azure.com/dnceng-public/public/_build/results?buildId=1310620&view=ms.vss-test-web.build-test-results-tab&runId=36607090&resultId=100720&paneView=debug, which shows `Expected string to be <null> or empty because Expected command to not output to stderr but it was not:`

# Fixes

Adds more error details from the log into the output messages. Note: I have not worked in the Arcade repo, so I'm not familiar with this logic. Lengthening the output string with more error details may also be controversial.

My goal is to speed up diagnostic time and reduce the number of clicks it takes to see what failed in a PR.
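
For illustration only, the general idea of carrying log details into the failure message could look roughly like the Python sketch below. This is not the code in this PR: `summarize_failure`, the `FAILURE_PATTERN` regex, and the line and length limits are all hypothetical, and real Helix console logs may need different matching rules.

```python
import re

# Hypothetical pattern for "interesting" log lines; an assumption, not a Helix convention.
FAILURE_PATTERN = re.compile(r"\[FAIL\]|error|exception", re.IGNORECASE)


def summarize_failure(log_text: str, log_link: str,
                      max_lines: int = 3, max_chars: int = 300) -> str:
    """Build a failure message that includes the first few error lines from the log."""
    hits = [line.strip() for line in log_text.splitlines()
            if FAILURE_PATTERN.search(line)]
    if not hits:
        # Nothing recognizable in the log: fall back to the generic message.
        return f"Test has failed. Check the log: {log_link}"
    details = " | ".join(hits[:max_lines])
    if len(details) > max_chars:
        # Cap the length so the message stays readable in the build analysis tab.
        details = details[: max_chars - 3] + "..."
    return f"Test has failed: {details} (full log: {log_link})"
```

Capping the extracted text is one way to address the concern about overly long output strings while still surfacing the first actionable error.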
