
Conversation

damanm24 (Contributor) commented Oct 6, 2025

This PR resolves #2010.

With #1991 it is possible (however unlikely) for a contributor to introduce a test that never runs on any of the vmm_tests jobs in CI. This PR adds a step that verifies that every test built as part of a CI run is executed on at least one vmm_tests job.

Here is the list of changes that make this possible (a sketch of the comparison follows the list):

  1. Introduce a new flowey node that generates a nextest list command, which dumps the tests built into the nextest archive.
  2. Introduce a new flowey node that consumes the output from nextest list and the corresponding JUnit XML produced by all vmm_tests runs. This node gathers the full set of tests from both artifacts and compares them to make sure the sets are equal.
  3. Expand flowey's built-in artifact publishing API to allow marking artifacts as force-publish, meaning they are still published even if a previous step in the job fails.
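
Conceptually, the comparison in step 2 boils down to a set difference. Here is a minimal sketch (hypothetical names, not the PR's actual flowey node) of that check, assuming anyhow for error reporting:

```rust
use std::collections::BTreeSet;

// Sketch: `built` holds test names from the `cargo nextest list` JSON,
// `ran` holds names collected from the JUnit XML of every vmm_tests job.
fn verify_all_tests_ran(built: &BTreeSet<String>, ran: &BTreeSet<String>) -> anyhow::Result<()> {
    let never_ran: Vec<_> = built.difference(ran).collect();
    if never_ran.is_empty() {
        Ok(())
    } else {
        anyhow::bail!("tests built but never run on any vmm_tests job: {never_ran:?}")
    }
}
```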

damanm24 requested a review from a team as a code owner October 6, 2025 17:20
Copilot AI review requested due to automatic review settings October 6, 2025 17:20
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces infrastructure to verify test coverage by creating a step that parses nextest JUnit XML files to collect test names and ensures all tests run on at least one runner. The implementation adds nextest list command generation, JSON artifact publishing, and a new verification node for comparing test results.

  • Adds a new verification step to parse JUnit XML files and extract test names
  • Extends nextest integration to generate list commands and publish JSON artifacts
  • Updates test result publishing to include nextest list JSON output
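
As an illustration of the first bullet, extracting the executed test names from a JUnit report could look like the following sketch; it assumes the roxmltree crate, and the PR's actual parsing code may differ:

```rust
// Hypothetical sketch: JUnit reports list each executed test as a
// <testcase name="..."> element, so collect those attributes into a set.
fn junit_test_names(xml: &str) -> anyhow::Result<std::collections::BTreeSet<String>> {
    let doc = roxmltree::Document::parse(xml)?;
    Ok(doc
        .descendants()
        .filter(|n| n.has_tag_name("testcase"))
        .filter_map(|n| n.attribute("name").map(str::to_owned))
        .collect())
}
```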

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Summary per file:

  • flowey/flowey_lib_hvlite/src/_jobs/verify_all_tests_run.rs: New node that parses JUnit XML files to extract test names for verification
  • flowey/flowey_lib_common/src/run_cargo_nextest_run.rs: Extended to generate nextest list commands and capture JSON output
  • flowey/flowey_lib_common/src/gen_cargo_nextest_list_cmd.rs: New module for generating cargo nextest list commands
  • flowey/flowey_lib_common/src/publish_test_results.rs: Updated to publish nextest list JSON as artifacts
  • .github/workflows/openvmm-pr.yaml: Generated workflow updates reflecting the new artifact publishing
  • .github/workflows/openvmm-pr-release.yaml: Generated workflow updates reflecting the new artifact publishing

@damanm24 damanm24 requested a review from a team as a code owner October 6, 2025 20:15

damanm24 changed the title from "[WIP - DNR] flowey: vmm_tests: introduce step to verify all tests run on at least one runner" to "flowey: vmm_tests: introduce step to verify all tests run on at least one runner" Oct 8, 2025
damanm24 requested a review from smalis-msft October 9, 2025 18:21
smalis-msft (Contributor) commented Oct 9, 2025

Very cool idea, but I think it needs a bunch of refactoring to clean up the cases where steps have been combined that really should be separate.

```rust
fn get_nextest_list_output_from_stdout(output: &str) -> anyhow::Result<serde_json::Value> {
    // nextest list prints a few lines of non-json output before the actual
    // JSON output, so we need to find the first line that is valid JSON
    for line in output.lines() {
```
Contributor:
Is it always the same number of lines? Could we just skip that many?
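
One way to sidestep that question entirely is to parse from the first character that can start the JSON document, so the preamble's line count never matters. A sketch (it assumes the JSON payload is the last thing on stdout, which the current line-scanning code also implicitly assumes):

```rust
// Sketch: skip whatever human-readable preamble nextest prints and parse
// everything from the first '{' onward; this also tolerates pretty-printed
// (multi-line) JSON, which a line-by-line scan would not.
fn get_nextest_list_output_from_stdout(output: &str) -> anyhow::Result<serde_json::Value> {
    let start = output
        .find('{')
        .ok_or_else(|| anyhow::anyhow!("no JSON object in nextest list output"))?;
    Ok(serde_json::from_str(&output[start..])?)
}
```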

```rust
let parse = ctx.emit_rust_step(
    "parse and analyze junit logs and nextest list output",
    |ctx| {
        // This step takes all of the junit XML files (i.e. the tests that
        // were run) and the nextest list output (i.e. the tests that were
        // built)
```
Contributor:
Just thinking through the logic here. When we run nextest list I think that's going to use the same filtering that nextest run does. So I would expect the outputs to always match if we run list on the test runners. I think the correct approach would be to run nextest list on each vmm test builder with the include-ignored flag to get a complete list of everything built from that platform. Then combine that with the lists of everything run on the different runners. Thoughts?
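
For reference, a hedged sketch of what that builder-side invocation might look like; the archive name is hypothetical, and it assumes nextest's run-ignored filter applies to `list` the same way it applies to `run`:

```rust
use std::process::Command;

// Hypothetical builder-side invocation: list every test in the archive,
// including ignored ones, as machine-readable JSON.
fn nextest_list_all(archive: &str) -> std::io::Result<std::process::Output> {
    Command::new("cargo")
        .args(["nextest", "list", "--archive-file", archive])
        .args(["--message-format", "json"])
        // assumption: the run-ignored filter also applies to `list`
        .args(["--run-ignored", "all"])
        .output()
}
```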

Contributor:
Additionally, have you tried causing a failure here by manually tweaking something to verify that the logic is working?

Contributor:
I do think running list on the builders instead of the runners is the way to go. With the include-ignored flag I'd expect the runners to all output the same lists for their platform; moving to the builder just dedups it.


Development

Successfully merging this pull request may close these issues.

vmm_tests: Add flowey pass to ensure all vmm_tests are run on at least one host
