machine/hyperV: move ssh mounts to after the ready check#28871

Merged
l0rd merged 1 commit into
podman-container-tools:mainfrom
Luap99:hyperv-flake
Jun 19, 2026

Luap99 commented Jun 5, 2026

Member

We are seeing frequent flakes in hyperV machine tests. The machine start fails with an ssh handshake failure:

ssh: handshake failed: read tcp 127.0.0.1:56425->127.0.0.1:56377:
wsarecv: An existing connection was forcibly closed by the remote host.

Normally we do the ssh probe in conductVMReadinessCheck() with a retry mechanism. However, because the hyperV mount code already used ssh in PostStartNetworking(), we never got there and failed early.
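To illustrate why the retry matters, here is a minimal sketch of a retried probe next to a single unretried call. It is not podman's actual readiness code: `probeWithRetry`, `errHandshake`, and the simulated `flaky` probe are hypothetical names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errHandshake imitates the transient failure seen while sshd is still starting.
var errHandshake = errors.New("ssh: handshake failed")

// probeWithRetry calls probe until it succeeds or attempts run out,
// sleeping delay between tries. Hypothetical helper, for illustration only.
func probeWithRetry(attempts int, delay time.Duration, probe func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = probe(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("not ready after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Simulated ssh probe that fails twice before the server is up.
	flaky := func() error {
		calls++
		if calls < 3 {
			return errHandshake
		}
		return nil
	}

	// A single unretried call (like an early ssh mount) hits the startup window.
	fmt.Println("first try:", flaky())

	// The retried probe keeps going until the server answers.
	fmt.Println("with retry:", probeWithRetry(5, 10*time.Millisecond, flaky))
}
```

Any ssh use that runs before such a probe gets exactly one chance, which matches the early failure described above.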

PostStartNetworking() seems like the wrong place to mount anyway, so move this to MountVolumesToVM() instead. That function already runs after the ready check, so it should have a working ssh connection by then.
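The ordering described above can be sketched roughly as follows. This is a simplified model, not the real podman start path: the `vm` type, its fields, and the lowercase method names are assumptions that only mirror the roles of PostStartNetworking(), conductVMReadinessCheck(), and MountVolumesToVM().

```go
package main

import (
	"errors"
	"fmt"
)

// vm is a stand-in for a machine (hypothetical type).
// sshReady flips once the guest's ssh server answers.
type vm struct {
	sshReady bool
	log      []string
}

// postStartNetworking only sets up networking here; it no longer touches ssh.
func (m *vm) postStartNetworking() error {
	m.log = append(m.log, "networking")
	return nil
}

// readinessCheck stands in for the retried ssh probe.
func (m *vm) readinessCheck() error {
	m.sshReady = true
	m.log = append(m.log, "ready")
	return nil
}

// mountVolumes needs ssh, so it fails if it runs before the readiness check.
func (m *vm) mountVolumes() error {
	if !m.sshReady {
		return errors.New("ssh: handshake failed")
	}
	m.log = append(m.log, "mount")
	return nil
}

// start runs the steps in order and stops at the first error.
func start(m *vm) error {
	steps := []func() error{m.postStartNetworking, m.readinessCheck, m.mountVolumes}
	for _, step := range steps {
		if err := step(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	m := &vm{}
	fmt.Println(start(m), m.log)
}
```

In this model, calling `mountVolumes` before `readinessCheck` returns the handshake error, which corresponds to the behavior before the move.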

Does this PR introduce a user-facing change?

Fixed a possible race condition when starting hyperV machines.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Luap99 added the "No New Tests" label (Allow PR to proceed without adding regression tests) Jun 5, 2026

Luap99 commented Jun 5, 2026

Member Author

cc @baude @mheon

Luap99 commented Jun 5, 2026

Member Author
2026-06-05T18:20:50.6580288Z podman machine set
2026-06-05T18:20:50.6580919Z D:/a/podman/podman/pkg/machine/e2e/set_test.go:15
2026-06-05T18:20:50.6582140Z   set machine cpus, disk, memory
2026-06-05T18:20:50.6583082Z   D:/a/podman/podman/pkg/machine/e2e/set_test.go:31
2026-06-05T18:20:50.6584226Z   > Enter [BeforeEach] TOP-LEVEL - D:/a/podman/podman/pkg/machine/e2e/machine_test.go:218 @ 06/05/26 18:20:50.657
2026-06-05T18:20:50.6594902Z   < Exit [BeforeEach] TOP-LEVEL - D:/a/podman/podman/pkg/machine/e2e/machine_test.go:218 @ 06/05/26 18:20:50.658 (1ms)
2026-06-05T18:20:50.6596410Z   > Enter [It] set machine cpus, disk, memory - D:/a/podman/podman/pkg/machine/e2e/set_test.go:31 @ 06/05/26 18:20:50.658
2026-06-05T18:20:50.6598982Z   D:\a\podman\podman\bin\windows\podman.exe machine init --disk-size 11 --image C:\Users\RUNNER~1\AppData\Local\Temp\podman-machine.x86_64.hyperv.vhdx 9d5b1a93fca1
2026-06-05T18:21:23.3820837Z   Machine init complete
2026-06-05T18:21:23.3821609Z   To start your machine run:
2026-06-05T18:21:23.3821892Z 
2026-06-05T18:21:23.3837796Z   	podman machine start 9d5b1a93fca1
2026-06-05T18:21:23.3838255Z 
2026-06-05T18:21:23.3910517Z   D:\a\podman\podman\bin\windows\podman.exe machine set --memory 524288 9d5b1a93fca1
2026-06-05T18:21:23.5157753Z   Error: requested amount of memory (524288 MB) greater than total system memory (16378 MB)
2026-06-05T18:21:23.5309499Z   D:\a\podman\podman\bin\windows\podman.exe machine set --cpus 2 --disk-size 102 --memory 4096 9d5b1a93fca1
2026-06-05T18:21:25.2914933Z   D:\a\podman\podman\bin\windows\podman.exe machine set --cpus 2 --disk-size 5 --memory 4096 9d5b1a93fca1
2026-06-05T18:21:25.4184815Z   Error: new disk size must be larger than 102 GB
2026-06-05T18:21:25.4303190Z   D:\a\podman\podman\bin\windows\podman.exe machine start 9d5b1a93fca1
2026-06-05T18:21:25.5487241Z   Starting machine "9d5b1a93fca1"
2026-06-05T18:22:41.9670457Z 
2026-06-05T18:22:41.9673097Z   This machine is currently configured in rootless mode. If your containers
2026-06-05T18:22:41.9674251Z   require root permissions (e.g. ports < 1024), or if you run into compatibility
2026-06-05T18:22:41.9676692Z   issues with non-podman clients, you can switch using the following command:
2026-06-05T18:22:41.9677303Z 
2026-06-05T18:22:41.9678584Z   	podman machine set --rootful 9d5b1a93fca1
2026-06-05T18:22:41.9678941Z 
2026-06-05T18:23:13.2504756Z   Error: machine did not transition into running state: ssh error: ssh: handshake failed: read tcp 127.0.0.1:63178->127.0.0.1:63088: wsarecv: An existing connection was forcibly closed by the remote host.
2026-06-05T18:23:13.2746605Z   [FAILED] Expected
2026-06-05T18:23:13.2747395Z       <int>: 125
2026-06-05T18:23:13.2748714Z   to match exit code:
2026-06-05T18:23:13.2749102Z       <int>: 0
2026-06-05T18:23:13.2897930Z   In [It] at: D:/a/podman/podman/pkg/machine/e2e/set_test.go:57 @ 06/05/26 18:23:13.273

Hmm, that does not seem to work; it still fails, though now only after the longer retry timeout. So it seems more likely that ssh is not coming up at all, or that there is a gvproxy error.

mheon commented Jun 5, 2026

Contributor

Even if it's not fixing the issue, still LGTM for me. This is a much cleaner way of doing things and should be less race-prone.

Luap99 commented Jun 16, 2026

Member Author

I agree it is worth merging regardless, but it is unlikely to fix the root cause of the flake. Maybe it helps a bit; that is hard to tell from just one PR run.

@baude PTAL

@ashley-cui

Contributor

LGTM

l0rd merged commit 835c8f2 into podman-container-tools:main Jun 19, 2026
225 of 232 checks passed
Luap99 deleted the hyperv-flake branch June 19, 2026 11:03
Labels

machine, No New Tests (Allow PR to proceed without adding regression tests)
