
Conversation

damanm24 (Contributor) commented Oct 6, 2025

This PR resolves #2010.

With #1991 it is possible (however unlikely) for a contributor to introduce a test that never runs on any of the vmm_tests jobs in CI. This PR adds a step that verifies that every test built as part of a CI run is executed on at least one vmm_tests job.

Here is the list of changes that make this possible (a sketch of the comparison follows the list):

  1. Introduce a new flowey node that generates a nextest list command, which dumps the tests built into the nextest archive.
  2. Introduce a new flowey node that consumes the output from nextest list and the corresponding JUnit XML produced by all vmm_tests runs. This node gathers the full set of tests from both artifacts and compares them to make sure the sets are equal.
  3. Expand flowey's built-in artifact publishing API to allow marking artifacts as force-publish, meaning they are still published even if a previous step in the job fails.
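
Conceptually, the comparison in step 2 boils down to a set difference. Here is a minimal sketch (hypothetical names, not the PR's actual flowey node) of that check, assuming anyhow for error reporting:

```rust
use std::collections::BTreeSet;

// Sketch: `built` holds test names from the `cargo nextest list` JSON,
// `ran` holds names collected from the JUnit XML of every vmm_tests job.
fn verify_all_tests_ran(built: &BTreeSet<String>, ran: &BTreeSet<String>) -> anyhow::Result<()> {
    let never_ran: Vec<_> = built.difference(ran).collect();
    if never_ran.is_empty() {
        Ok(())
    } else {
        anyhow::bail!("tests built but never run on any vmm_tests job: {never_ran:?}")
    }
}
```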

damanm24 requested a review from a team as a code owner October 6, 2025 17:20
Copilot AI review requested due to automatic review settings October 6, 2025 17:20
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces infrastructure to verify test coverage by creating a step that parses nextest JUnit XML files to collect test names and ensures all tests run on at least one runner. The implementation adds nextest list command generation, JSON artifact publishing, and a new verification node for comparing test results.

  • Adds a new verification step to parse JUnit XML files and extract test names
  • Extends nextest integration to generate list commands and publish JSON artifacts
  • Updates test result publishing to include nextest list JSON output
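
As an illustration of the first bullet, extracting the executed test names from a JUnit report could look like the following sketch; it assumes the roxmltree crate, and the PR's actual parsing code may differ:

```rust
// Hypothetical sketch: JUnit reports list each executed test as a
// <testcase name="..."> element, so collect those attributes into a set.
fn junit_test_names(xml: &str) -> anyhow::Result<std::collections::BTreeSet<String>> {
    let doc = roxmltree::Document::parse(xml)?;
    Ok(doc
        .descendants()
        .filter(|n| n.has_tag_name("testcase"))
        .filter_map(|n| n.attribute("name").map(str::to_owned))
        .collect())
}
```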

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Summary per file:

  • flowey/flowey_lib_hvlite/src/_jobs/verify_all_tests_run.rs: New node that parses JUnit XML files to extract test names for verification
  • flowey/flowey_lib_common/src/run_cargo_nextest_run.rs: Extended to generate nextest list commands and capture JSON output
  • flowey/flowey_lib_common/src/gen_cargo_nextest_list_cmd.rs: New module for generating cargo nextest list commands
  • flowey/flowey_lib_common/src/publish_test_results.rs: Updated to publish nextest list JSON as artifacts
  • .github/workflows/openvmm-pr.yaml: Generated workflow updates reflecting the new artifact publishing
  • .github/workflows/openvmm-pr-release.yaml: Generated workflow updates reflecting the new artifact publishing

@damanm24 damanm24 requested a review from a team as a code owner October 6, 2025 20:15

damanm24 changed the title from "[WIP - DNR] flowey: vmm_tests: introduce step to verify all tests run on at least one runner" to "flowey: vmm_tests: introduce step to verify all tests run on at least one runner" Oct 8, 2025
damanm24 requested a review from smalis-msft October 9, 2025 18:21
smalis-msft (Contributor) commented Oct 9, 2025

Very cool idea, but I think it needs a bunch of refactoring to clean up the cases where steps have been combined that really should be separate.

```rust
fn get_nextest_list_output_from_stdout(output: &str) -> anyhow::Result<serde_json::Value> {
    // nextest list prints a few lines of non-json output before the actual
    // JSON output, so we need to find the first line that is valid JSON
    for line in output.lines() {
```
Contributor:
Is it always the same number of lines? Could we just skip that many?
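
One way to sidestep that question entirely is to parse from the first character that can start the JSON document, so the preamble's line count never matters. A sketch (it assumes the JSON payload is the last thing on stdout, which the current line-scanning code also implicitly assumes):

```rust
// Sketch: skip whatever human-readable preamble nextest prints and parse
// everything from the first '{' onward; this also tolerates pretty-printed
// (multi-line) JSON, which a line-by-line scan would not.
fn get_nextest_list_output_from_stdout(output: &str) -> anyhow::Result<serde_json::Value> {
    let start = output
        .find('{')
        .ok_or_else(|| anyhow::anyhow!("no JSON object in nextest list output"))?;
    Ok(serde_json::from_str(&output[start..])?)
}
```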

```rust
let parse = ctx.emit_rust_step(
    "parse and analyze junit logs and nextest list output",
    |ctx| {
        // This step takes all of the junit XML files (i.e. the tests that
        // were run) and the nextest list output (i.e. the tests that were
        // built)
```
Contributor:
Just thinking through the logic here. When we run nextest list I think that's going to use the same filtering that nextest run does. So I would expect the outputs to always match if we run list on the test runners. I think the correct approach would be to run nextest list on each vmm test builder with the include-ignored flag to get a complete list of everything built from that platform. Then combine that with the lists of everything run on the different runners. Thoughts?
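
For reference, a hedged sketch of what that builder-side invocation might look like; the archive name is hypothetical, and it assumes nextest's run-ignored filter applies to `list` the same way it applies to `run`:

```rust
use std::process::Command;

// Hypothetical builder-side invocation: list every test in the archive,
// including ignored ones, as machine-readable JSON.
fn nextest_list_all(archive: &str) -> std::io::Result<std::process::Output> {
    Command::new("cargo")
        .args(["nextest", "list", "--archive-file", archive])
        .args(["--message-format", "json"])
        // assumption: the run-ignored filter also applies to `list`
        .args(["--run-ignored", "all"])
        .output()
}
```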

Contributor:
Additionally, have you tried causing a failure here by manually tweaking something to verify that the logic is working?

Contributor:
I do think running list on the builders instead of the runners is the way to go. With the include-ignored flag I'd expect the runners to all output the same lists for their platform; moving to the builder just dedups it.


Development

Successfully merging this pull request may close these issues.

vmm_tests: Add flowey pass to ensure all vmm_tests are run on at least one host
