
Conversation

@dasionov
Contributor

@dasionov dasionov commented Jul 5, 2025

VEP Metadata

Tracking issue: #71
SIG label: /sig api

What this PR does

Add support for the --grace-period flag when deleting VMs via kubectl, allowing graceful shutdown before termination. Improves consistency with Kubernetes behavior and user expectations.
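For context, a minimal sketch of what the flag maps to on the wire: `kubectl delete --grace-period=N` sends a Kubernetes meta/v1 `DeleteOptions` body with `gracePeriodSeconds` set. The helper name below is ours, purely illustrative; only the field names follow the actual Kubernetes schema.

```python
import json

def delete_options(grace_period_seconds=None):
    """Build the DeleteOptions body that `kubectl delete --grace-period=N`
    sends to the API server. Field names follow the meta/v1 schema."""
    body = {"apiVersion": "meta.k8s.io/v1", "kind": "DeleteOptions"}
    if grace_period_seconds is not None:
        body["gracePeriodSeconds"] = int(grace_period_seconds)
    return body

# `kubectl delete vm my-vm --grace-period=10` corresponds roughly to:
print(json.dumps(delete_options(10), sort_keys=True))
```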

Special notes for your reviewer

@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Jul 5, 2025
@kubevirt-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xpivarc for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

  • Approvers can indicate their approval by writing /approve in a comment
  • Approvers can cancel approval by writing /approve cancel in a comment

@dasionov dasionov force-pushed the graceful-deletion-vms branch 2 times, most recently from 239a831 to c0b0591 on July 6, 2025 at 12:54
@dankenigsberg
Member

dankenigsberg commented Jul 6, 2025 via email

@dasionov
Contributor Author

dasionov commented Jul 9, 2025

/cc @iholder101 @enp0s3

@kubevirt-bot kubevirt-bot requested review from enp0s3 and iholder101 July 9, 2025 10:53
@dasionov dasionov force-pushed the graceful-deletion-vms branch from c0b0591 to f9911dd on July 9, 2025 at 14:30
@dasionov
Contributor Author

dasionov commented Jul 9, 2025

Please inline the reference to kubernetes/kubernetes#56567 : we already tried Alternative 1 some 8 years ago.

done

@dasionov dasionov force-pushed the graceful-deletion-vms branch from f9911dd to e590c99 on July 13, 2025 at 01:14

@enp0s3 enp0s3 left a comment


@dasionov Thank you! I have a few questions below


@enp0s3 enp0s3 left a comment


@dasionov I went over the KEP, and my take is that I prefer the alternative approach, for a couple of reasons:

  • I think that making API calls from a validating webhook's context is bad practice; it can introduce latency to the webhook.
  • The webhook modifies the VMI spec, which is also an anti-pattern; the spec is something that should be modified by the user.

@dasionov
Contributor Author

@dasionov I went over the KEP, and my take is that I prefer the alternative approach, for a couple of reasons:

  • I think that making API calls from a validating webhook's context is bad practice; it can introduce latency to the webhook.
  • The webhook modifies the VMI spec, which is also an anti-pattern; the spec is something that should be modified by the user.

The spec modification is derived directly from a user-provided value in a command such as:
kubectl delete vm my-vm --grace-period=10
In this case, it’s reasonable to assume the user intends for the grace period in the spec to be updated to 10, since it reflects their explicit request.
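To illustrate the mapping being debated, here is a minimal sketch (the function name and shapes are ours, not KubeVirt's) of the webhook-style behavior: mirror the grace period the user passed on deletion into the VMI spec as a JSON patch, doing nothing when no value was passed or the spec already matches.

```python
def spec_grace_patch(delete_options, vmi_spec):
    """Return a JSON patch aligning spec.terminationGracePeriodSeconds
    with the user's --grace-period value, or None if nothing to do.
    Assumes the spec field already exists ("replace" op)."""
    gp = delete_options.get("gracePeriodSeconds")
    if gp is None or vmi_spec.get("terminationGracePeriodSeconds") == gp:
        return None
    return [{"op": "replace",
             "path": "/spec/terminationGracePeriodSeconds",
             "value": gp}]

# The kubectl command above would yield a one-op patch setting the value to 10:
patch = spec_grace_patch({"gracePeriodSeconds": 10},
                         {"terminationGracePeriodSeconds": 30})
```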

Add support for the `--grace-period` flag when deleting VMs via
`kubectl`, allowing graceful shutdown before termination. Improves
consistency with Kubernetes behavior and user expectations.

Signed-off-by: Daniel Sionov <[email protected]>
@dasionov dasionov force-pushed the graceful-deletion-vms branch from e590c99 to aeadb7f on August 13, 2025 at 07:53
Comment on lines +112 to +113
## Scalability
No scalability concerns are anticipated, as the webhook operates on individual `DELETE` operations.
Contributor


I think scalability is something to consider here.
If we're going to use a webhook, it means all load is on virt-api (which has scalability issues of its own...). Think about mass-deletion of thousands of VMs at once. This will definitely increase the load on virt-api.

I'm not saying it's an issue, but definitely something to explore.

Contributor Author


I agree with you here; I am also not a fan of this approach. I am working with the k8s maintainers to try to reach an agreement and resolve this issue in k8s in some way:
kubernetes/kubernetes#132913

Given the outcome of that PR, the design can be adjusted to:

  1. The user issues a delete API call with the flag value provided.
  2. virt-controller takes the flag value from the metadata into account and patches Spec.TerminationGracePeriodSeconds.
  3. Since virt-handler: ensure grace period metadata sync before shutdown kubevirt#15170, the grace period will be synced, updated, and propagated properly to the launcher.

No virt-api overload, and everyone is happy.

virt-controller will issue the patch instead. What do we think about that?
I will add this to the design once I see it is a valid approach.
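Step 2 of the adjusted flow could look roughly like the sketch below. The annotation key and helper name are hypothetical: the comment only says the flag value lands in "the metadata" without fixing a concrete key, so everything here is an assumption for illustration.

```python
# Hypothetical key; the thread does not specify where the value is recorded.
GRACE_ANNOTATION = "kubevirt.io/delete-grace-period"

def controller_grace_patch(vmi):
    """Sketch of step 2: virt-controller reads the recorded flag value from
    metadata and emits a patch for spec.terminationGracePeriodSeconds."""
    annotations = vmi.get("metadata", {}).get("annotations") or {}
    raw = annotations.get(GRACE_ANNOTATION)
    if raw is None:
        return None  # user did not pass --grace-period; nothing to sync
    return [{"op": "replace",
             "path": "/spec/terminationGracePeriodSeconds",
             "value": int(raw)}]

vmi = {"metadata": {"annotations": {GRACE_ANNOTATION: "10"}},
       "spec": {"terminationGracePeriodSeconds": 30}}
patch = controller_grace_patch(vmi)
```

Because the patch is issued by virt-controller on its own reconcile loop rather than inline in a webhook, no extra load lands on virt-api during mass deletions.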

@kubevirt-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2025

Labels

dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
lifecycle/stale - Denotes an issue or PR has remained open with no activity and has become stale.
size/L


5 participants