@jaypoulz jaypoulz commented Oct 2, 2025

No description provided.

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Oct 2, 2025
openshift-ci bot requested review from jeff-roche and sjenning October 2, 2025 19:17

openshift-ci bot commented Oct 2, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jaypoulz
Once this PR has been reviewed and has the lgtm label, please assign neisw for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


// VerifyHypervisorConnectivity verifies SSH connectivity to the hypervisor and checks
// that virsh and libvirt are available.
func VerifyHypervisorConnectivity(sshConfig *SSHConfig, knownHostsPath string) error {

I would rename this to VerifyHypervisor or VerifyHypervisorAvailability, since you're checking for more than connectivity and there is already a VerifyConnectivity function.


// SSH to hypervisor, then to surviving node to run pcs debug-start
// We need to chain the SSH commands: host -> hypervisor -> surviving node
output, stderr, err := PcsCommand(fmt.Sprintf("%s && %s", pcsResourceDebugStop, formatPcsCommandString(pcsResourceDebugStart, pcsResourceDebugStartEnvVars)), sshConfig, localKnownHostsPath, remoteKnownHostsPath, nodeIP)

I think the inconsistency here, using "formatPcsCommandString" for the second command but calling fmt.Sprintf directly for the first one (in the first parameter), makes this a little harder to read than it should be. Is it worth also encapsulating the first one in a function?


Reading the code further, why aren't we doing this like the PcsDebugStart below, which uses ExecuteRemoteSSHCommand? That's much easier to parse.

Contributor Author


I just removed the utility function and replaced all of the pcs commands with the formatPcsCommandString option for simplicity :)
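
For readers following this thread, here is a minimal sketch of formatting every pcs command through one helper and chaining the results, which is the consistency the review asked for. The real signature of formatPcsCommandString is not shown in this conversation, so `buildPcsCommand`, `chainCommands`, the resource name, and the environment variable below are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildPcsCommand is a hypothetical helper (not the PR's formatPcsCommandString).
// It prefixes command with KEY=VALUE assignments, sorted by key so the result
// is deterministic. Values are not shell-escaped in this sketch.
func buildPcsCommand(command string, envVars map[string]string) string {
	if len(envVars) == 0 {
		return command
	}
	keys := make([]string, 0, len(envVars))
	for k := range envVars {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys)+1)
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%s", k, envVars[k]))
	}
	parts = append(parts, command)
	return strings.Join(parts, " ")
}

// chainCommands joins commands with "&&" so that each one runs only if the
// previous one succeeded.
func chainCommands(commands ...string) string {
	return strings.Join(commands, " && ")
}

func main() {
	// "example-resource" and EXAMPLE_VAR are placeholders, not values from the PR.
	stop := buildPcsCommand("pcs resource debug-stop example-resource", nil)
	start := buildPcsCommand("pcs resource debug-start example-resource",
		map[string]string{"EXAMPLE_VAR": "1"})
	fmt.Println(chainCommands(stop, start))
}
```

Routing both the stop and the start command through the same helper keeps the call site free of a mix of fmt.Sprintf and helper calls.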


// ExecuteRemoteSSHCommand executes a command on an OpenShift node via two-hop SSH (local → hypervisor → node).
// Uses 'core' user for the node connection.
func ExecuteRemoteSSHCommand(nodeIP, command string, sshConfig *SSHConfig, localKnownHostsPath, remoteKnownHostsPath string) (string, string, error) {

I would rename nodeIP to remoteNodeIP to make it more explicit, and sshConfig to hypervisorSSHConfig. That way it's easier to tell what information each parameter provides to the function.
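
As a sketch of how the suggested names read in practice, the following builds the argument list for a two-hop SSH call (local → hypervisor → node). It is an illustration under assumptions, not the PR's code: `HypervisorSSHConfig` and its fields, `shellQuote`, and `buildTwoHopSSHArgs` are hypothetical, and the sketch only constructs arguments rather than running ssh.

```go
package main

import (
	"fmt"
	"strings"
)

// HypervisorSSHConfig is a hypothetical stand-in for the PR's SSHConfig type;
// its real fields are not shown in this conversation.
type HypervisorSSHConfig struct {
	User           string
	Host           string
	PrivateKeyPath string
}

// shellQuote wraps s in single quotes and escapes embedded single quotes so a
// POSIX shell treats it as one word.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// buildTwoHopSSHArgs returns arguments for a local ssh invocation that connects
// to the hypervisor and, from there, runs command on the node as the 'core' user.
func buildTwoHopSSHArgs(remoteNodeIP, command string, hypervisorSSHConfig *HypervisorSSHConfig,
	localKnownHostsPath, remoteKnownHostsPath string) []string {
	// Second hop: executed by the hypervisor's shell, so quote its pieces.
	innerSSH := fmt.Sprintf("ssh -o UserKnownHostsFile=%s core@%s %s",
		shellQuote(remoteKnownHostsPath), remoteNodeIP, shellQuote(command))
	// First hop: passed as separate argv entries, so no extra quoting is needed.
	return []string{
		"-i", hypervisorSSHConfig.PrivateKeyPath,
		"-o", "UserKnownHostsFile=" + localKnownHostsPath,
		hypervisorSSHConfig.User + "@" + hypervisorSSHConfig.Host,
		innerSSH,
	}
}

func main() {
	// All values below are placeholders.
	cfg := &HypervisorSSHConfig{User: "root", Host: "hypervisor.example.com", PrivateKeyPath: "/tmp/id_ed25519"}
	args := buildTwoHopSSHArgs("192.0.2.10", "sudo pcs status", cfg,
		"/tmp/local_known_hosts", "/tmp/remote_known_hosts")
	fmt.Println("ssh", strings.Join(args, " "))
}
```

With the renamed parameters, it is clear at the call site that the config and the local known_hosts file belong to the first hop, while the node IP and the remote known_hosts file belong to the second.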

jaypoulz force-pushed the two-node-disruption-test-libs branch 4 times, most recently from 3bde7a8 to 323411e, October 3, 2025 15:23
jaypoulz force-pushed the two-node-disruption-test-libs branch from 323411e to ef0ff06, October 3, 2025 15:38

openshift-ci bot commented Oct 3, 2025

@jaypoulz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-serial-2of2 | ef0ff06 | link | true | `/test e2e-aws-ovn-serial-2of2` |
| ci/prow/okd-scos-e2e-aws-ovn | ef0ff06 | link | false | `/test okd-scos-e2e-aws-ovn` |
| ci/prow/e2e-gcp-csi | ef0ff06 | link | true | `/test e2e-gcp-csi` |
| ci/prow/e2e-aws-ovn-fips | ef0ff06 | link | true | `/test e2e-aws-ovn-fips` |
| ci/prow/e2e-aws-ovn-single-node-serial | ef0ff06 | link | false | `/test e2e-aws-ovn-single-node-serial` |
| ci/prow/e2e-aws-ovn-single-node-upgrade | ef0ff06 | link | false | `/test e2e-aws-ovn-single-node-upgrade` |
| ci/prow/e2e-aws-csi | ef0ff06 | link | true | `/test e2e-aws-csi` |
| ci/prow/e2e-openstack-ovn | ef0ff06 | link | false | `/test e2e-openstack-ovn` |



openshift-trt bot commented Oct 3, 2025

Job Failure Risk Analysis for sha: ef0ff06

| Job Name | Failure Risk |
| --- | --- |
| pull-ci-openshift-origin-main-e2e-aws-ovn-single-node-upgrade | IncompleteTests. Tests for this run (32) are below the historical average (3677): not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems. |
| pull-ci-openshift-origin-main-okd-scos-e2e-aws-ovn | IncompleteTests. Tests for this run (140) are below the historical average (1532): not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems. |
