|
| 1 | +# VEP #NNNN: Your short, descriptive title |
| 2 | + |
| 3 | +## Release Signoff Checklist |
| 4 | + |
| 5 | +Items marked with (R) are required *prior to targeting to a milestone / release*. |
| 6 | + |
| 7 | +- [X] (R) Enhancement issue created, which links to VEP dir in [kubevirt/enhancements] (not the initial VEP PR) |
| 8 | +- [ ] (R) Target version is explicitly mentioned and approved |
| 9 | +- [ ] (R) Graduation criteria filled |
| 10 | + |
| 11 | +## Overview |
| 12 | + |
| 13 | +Recently, memory and CPU hotplug were added to KubeVirt. |
| 14 | +This allows users to add memory and CPU to a running VM on the fly. |
| 15 | + |
| 16 | +When a VM gets hotplugged, the underlying virt-launcher pod's resources need to be modified accordingly. |
| 17 | +Traditionally in Kubernetes, pods are immutable. Once a pod is created, its resource requests and limits cannot be changed. |
| 18 | +Therefore, the hotplug feature was implemented by live-migrating the VM, which results in a new virt-launcher pod |
| 19 | +with the updated resources. |
| 20 | + |
| 21 | +The original hotplug [design proposal](https://github.com/kubevirt/community/blob/main/design-proposals/cpu-hotplug.md#goals) |
| 22 | +(which predates the VEP process) states: |
| 23 | +> Implementation should be achievable today, with Kubernetes APIs that are at least in beta. |
| 24 | +> Unfortunately, at the time of writing, the Kubernetes vertical pod scaling API is only alpha. |
| 25 | + |
| 26 | +Fortunately, the in-place pod resize feature [graduated to beta](https://github.com/kubernetes/enhancements/blob/61abddca34caac56d22b7db48734b7040dc68b43/keps/sig-node/1287-in-place-update-pod-resources/kep.yaml#L40) |
| 27 | +in Kubernetes 1.33. |
| 28 | +Therefore, KubeVirt should aim to move away from live-migrating the VM on hotplug and instead use the in-place pod resize feature. |
| 29 | + |
| 30 | +## Motivation |
| 31 | + |
| 32 | +With the in-place pod resize feature, the kubelet (through the CRI) can update the pod's resource requests and limits. |
| 33 | +This will allow us to avoid live-migrating the VM on hotplug, which saves a lot of resources, reduces downtime and risk |
| 34 | +and improves the user experience. |
| 35 | + |
| 36 | +The change should be as transparent to the user as possible, as this is essentially an implementation detail. |
| 37 | + |
| 38 | +## Goals |
| 39 | + |
| 40 | +* Implement in-place pod resize for CPU and memory hotplug. |
| 41 | +* Use in-place pod resize as the default strategy for hotplug. |
| 42 | + |
| 43 | +## Non Goals |
| 44 | + |
| 45 | +* Letting the user explicitly choose between in-place pod resize and live migration (migration no longer really makes sense for hotplug). |
| 46 | + |
| 47 | +## Definition of Users |
| 48 | + |
| 49 | +* VM owners. |
| 50 | +* Admins / namespace owners. |
| 51 | + |
| 52 | +## User Stories |
| 53 | + |
| 54 | +* As a user, I want to hotplug CPU and memory to my VM without having to live-migrate it. |
| 55 | +* As an admin, I want to save cluster resources and improve performance by avoiding live-migrations. |
| 56 | +* As a namespace owner with a ResourceQuota, I want to be able to hotplug CPU and memory to my VMs without having to worry about the quota being exceeded. |
| 57 | + |
| 58 | +## Repos |
| 59 | + |
| 60 | +kubevirt/kubevirt |
| 61 | + |
| 62 | +## Design |
| 63 | + |
| 64 | +Currently, whenever a VM is hotplugged, virt-controller updates a condition that triggers the workload updater controller |
| 65 | +which leads to a live-migration of the VM. |
| 66 | + |
| 67 | +Once this VEP is implemented, the controller would simply change the pod's resources and wait for them to be applied. |
| 68 | + |
| 69 | +TODO: should the controller wait for the kubelet to apply the changes? Expand this. |
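To make the flow described in this section concrete, here is a minimal Go sketch of the two steps: building the patch body the controller might send to the pod's `resize` subresource, and checking whether the kubelet has reported the new values. It uses only the standard library. The helper names `buildResizePatch` and `resizeApplied` are hypothetical, the container name `compute` is an assumption about the virt-launcher pod, and the exact-string comparison of quantities is a simplification. This is an illustration for discussion, not the proposed implementation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Minimal mirror of the pod fields touched by a resize patch.
// Quantities are kept as plain strings to avoid external dependencies.
type resources struct {
	Requests map[string]string `json:"requests,omitempty"`
	Limits   map[string]string `json:"limits,omitempty"`
}

type container struct {
	Name      string    `json:"name"`
	Resources resources `json:"resources"`
}

type podSpec struct {
	Containers []container `json:"containers"`
}

type podPatch struct {
	Spec podSpec `json:"spec"`
}

// buildResizePatch (hypothetical helper) returns a JSON patch body that
// updates the requests and limits of a single container.
func buildResizePatch(containerName string, requests, limits map[string]string) ([]byte, error) {
	if containerName == "" {
		return nil, errors.New("container name must not be empty")
	}
	if len(requests) == 0 && len(limits) == 0 {
		return nil, errors.New("nothing to resize")
	}
	patch := podPatch{Spec: podSpec{Containers: []container{{
		Name:      containerName,
		Resources: resources{Requests: requests, Limits: limits},
	}}}}
	return json.Marshal(patch)
}

// resizeApplied (hypothetical helper) reports whether every desired value
// matches what the kubelet reported for the container. It compares raw
// strings, so "1" and "1000m" are treated as different (a simplification).
func resizeApplied(desired, reported map[string]string) bool {
	for name, want := range desired {
		if got, ok := reported[name]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	desired := map[string]string{"cpu": "2", "memory": "4Gi"}

	// "compute" is assumed to be the name of the virt-launcher container.
	patch, err := buildResizePatch("compute", desired, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))

	// Before the kubelet acts, the reported values still show the old size.
	fmt.Println(resizeApplied(desired, map[string]string{"cpu": "1", "memory": "4Gi"}))
	// After the resize is actuated, the reported values match.
	fmt.Println(resizeApplied(desired, map[string]string{"cpu": "2", "memory": "4Gi"}))
}
```

Whether the controller should block on a check like `resizeApplied` before hotplugging the guest is exactly the open question in the TODO above; the sketch only shows what such a check could compare.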
| 70 | + |
| 71 | +## API Examples |
| 72 | + |
| 73 | +<!-- |
| 74 | +Tangible API examples used for discussion |
| 75 | +--> |
| 76 | + |
| 77 | +## Alternatives |
| 78 | + |
| 79 | +<!-- |
| 80 | +Outline any alternative designs that have been considered |
| 81 | +--> |
| 82 | + |
| 83 | +## Scalability |
| 84 | + |
| 85 | +<!-- |
| 86 | +Overview of how the design scales |
| 87 | +--> |
| 88 | + |
| 89 | +## Update/Rollback Compatibility |
| 90 | + |
| 91 | +<!-- |
| 92 | +Does this impact update compatibility and how? |
| 93 | +--> |
| 94 | + |
| 95 | +## Functional Testing Approach |
| 96 | + |
| 97 | +<!-- |
| 98 | +An overview of the approaches used to functionally test this design |
| 99 | +--> |
| 100 | + |
| 101 | +## Implementation Phases |
| 102 | + |
| 103 | +<!-- |
| 104 | +How/if this design will get broken up into multiple phases |
| 105 | +--> |
| 106 | + |
| 107 | +## Feature lifecycle Phases |
| 108 | + |
| 109 | +<!-- |
| 110 | +How and when will the feature progress through the Alpha, Beta and GA lifecycle phases |
| 111 | + |
| 112 | +Refer to https://github.com/kubevirt/community/blob/main/design-proposals/feature-lifecycle.md#releases for more details |
| 113 | +--> |
| 114 | + |
| 115 | +### Alpha |
| 116 | + |
| 117 | +### Beta |
| 118 | + |
| 119 | +### GA |