Skip to content

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Sep 26, 2025

All jobs using e2e-k8s.sh with PARALLEL=true automatically skipped serial tests even though technically they now could run in the job: https://github.com/kubernetes-sigs/kind/blame/d1eecc46e30cac9d35cd32dc52677ef75ec22e18/hack/ci/e2e-k8s.sh#L226-L234

kubernetes-sigs/kind#4015 is changing that mandatory skip for jobs using LABEL_FILTER because it may be desirable to include serial tests, depending on the job. It's also better to be explicit about it in each job's LABEL_FILTER to avoid confusion and potential mistakes (not running tests that were expected to run).

To prepare for that change, jobs get updated based on the following principles:

  • If a presubmit runs infrequently and is only invoked to test certain aspects, then including serial tests is desirable to get full test coverage of that aspect. Example: pull-kubernetes-e2e-kind-beta-features
  • If a presubmit runs always, serial tests should be excluded. Example: pull-kubernetes-e2e-kind
  • Periodic jobs should always run all supported tests, including the serial ones. Example: ci-kubernetes-e2e-kind
  • Canary jobs match the behavior of the non-canary variant. Example: pull-kubernetes-e2e-kind-canary
  • Jobs for release branches should better not change for the sake of preserving the old behavior (but that
    is debatable). Example: ci-kubernetes-e2e-kind-1-34

To make it more obvious where the upcoming e2e-k8s.sh will change test selection, "Includes serial tests for the sake of completeness." comments get added. Those are not true right now, but will be once the script is changed.

All jobs using e2e-k8s.sh with PARALLEL=true automatically skipped serial tests
even though technically they now could run in the job:
https://github.com/kubernetes-sigs/kind/blame/d1eecc46e30cac9d35cd32dc52677ef75ec22e18/hack/ci/e2e-k8s.sh#L226-L234

kubernetes-sigs/kind#4015 is changing that mandatory
skip for jobs using LABEL_FILTER because it may be desirable to include serial
tests, depending on the job. It's also better to be explicit about it in each
job's LABEL_FILTER to avoid confusion and potential mistakes (not running tests
that were expected to run).

To prepare for that change, jobs get updated based on the following principles:

- If a presubmit runs infrequently and is only invoked to test certain aspects,
  then including serial tests is desirable to get full test coverage of that
  aspect. Example: pull-kubernetes-e2e-kind-beta-features
- If a presubmit runs always, serial tests should be excluded.
  Example: pull-kubernetes-e2e-kind
- Periodic jobs should always run all supported tests, including the serial
  ones. Example: ci-kubernetes-e2e-kind
- Canary jobs match the behavior of the non-canary variant.
  Example: pull-kubernetes-e2e-kind-canary

To make it more obvious where the upcoming e2e-k8s.sh will change test selection,
"Includes serial tests for the sake of completeness." comments get added.
Those are not true *right now*, but will be once the script is changed.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 26, 2025
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. area/config Issues or PRs related to code in /config area/conformance Issues or PRs related to kubernetes conformance tests labels Sep 26, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pohly
Once this PR has been reviewed and has the lgtm label, please assign madhavjivrajani for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/jobs area/release-eng Issues or PRs related to the Release Engineering subproject sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Sep 26, 2025
@pohly
Copy link
Contributor Author

pohly commented Sep 26, 2025

/hold

Needs to be reviewed together with kubernetes-sigs/kind#4015. This PR then needs to be merged first.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 26, 2025
@pohly
Copy link
Contributor Author

pohly commented Sep 26, 2025

/assign @BenTheElder

@dims: this changes some of your jobs. Do you agree that including serial tests makes sense for them?

@aojea: same with your network jobs. You expressed a desired to exclude serial tests from presubmits, so I leaned in that direction for them.

@pohly
Copy link
Contributor Author

pohly commented Sep 26, 2025

Maintaining two variants is more work and makes the testgrid dashboards larger.

I think that's done for jobs not using LABEL_FILTER yet. There is a ci-kubernetes-kube-network-policies-conformance-parallel but no corresponding ci-kubernetes-kube-network-policies-conformance-serial. With this change there is no need for it.

I was worried that that tests like test/e2e/apimachinery/garbage_collector.go: framework.ConformanceIt("should orphan pods created by rc if delete options say so", framework.WithSerial() might not have run as often as desired. But
ci-kubernetes-conformance-kind-ga-only runs everything sequentially, so at least serial conformance tests are covered as release blocking.

The same cannot be said for other tests like test/e2e/storage/testsuites/volumelimits.go: f.It("should support volume limits", f.WithSerial(). That's not in kind-master, nor is there a serial variant of it. It is in gce-cos-master-serial though as release informing.

pohly added a commit to pohly/test-infra that referenced this pull request Oct 1, 2025
This mirrors the presubmit counterparts that were added in
kubernetes#35534 and tested in
kubernetes/kubernetes#134250.

I can help monitor the existing alpha/beta features jobs. Those need to be
renamed in testgrid to avoid confusion. Other than that they remain unchanged
for now.

Potential future work:

- Changes related to serial and/or slow jobs.

  Serial tests are excluded implicitly by e2e-k8s.sh because the jobs enables
  PARALLEL (kubernetes#35594).

  Slow jobs are disabled in LABEL_FILTER because that is what the existing periodics
  did. We might be able to run them because as long as they overlap with other
  tests there shouldn't be much impact on overall job duration (same applies
  to presubmits!). Scheduling of slow tests may be relevant
  (onsi/ginkgo#1599).

- Release informing/blocking.

  The existing jobs are release informing. alpha-beta-features shouldn't be
  because breaking alpha tests is not something that the release team should
  have to deal with. Instead, the new jobs should get promoted once they
  are known to be stable. beta-features can remain release informing,
  tests for beta features (even if off-by-default) need to be stable.

- Decision about "enabled-conformance".

  The conformance jobs got included because it was suggested on Slack.
  They run a subset of the tests run by their "enabled" counterparts.
  It remains to be seen whether having two jobs instead of one really
  provides a better release signal.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/config Issues or PRs related to code in /config area/conformance Issues or PRs related to kubernetes conformance tests area/jobs area/release-eng Issues or PRs related to the Release Engineering subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants