Commit 88936ee

in-place hotplug
Signed-off-by: Itamar Holder <[email protected]>
1 parent b19b7fd commit 88936ee

1 file changed, 115 additions and 0 deletions:
  • veps/sig-compute/45-in-place-hotplug

@@ -0,0 +1,115 @@
# VEP #NNNN: Your short, descriptive title

## Release Signoff Checklist

Items marked with (R) are required *prior to targeting to a milestone / release*.

- [X] (R) Enhancement issue created, which links to VEP dir in [kubevirt/enhancements] (not the initial VEP PR)
- [ ] (R) Target version is explicitly mentioned and approved
- [ ] (R) Graduation criteria filled

## Overview

Recently, memory and CPU hotplug were added to KubeVirt.
This allows users to add memory and CPU to a running VM on the fly.

When a VM gets hotplugged, the underlying virt-launcher pod's resources need to be modified accordingly.
Traditionally in Kubernetes, a pod's resources are immutable: once a pod is created, its resource requests and limits cannot be changed.
Therefore, the hotplug feature was implemented by live-migrating the VM, which results in a new virt-launcher pod
with the updated resources.

The original hotplug [design proposal](https://github.com/kubevirt/community/blob/main/design-proposals/cpu-hotplug.md#goals)
(which predates the VEP process) states:
> Implementation should be achievable today, with Kubernetes APIs that are at least in beta.
> Unfortunately, at the time of writing, the Kubernetes vertical pod scaling API is only alpha.

Fortunately, the in-place pod resize feature [graduated to beta](https://github.com/kubernetes/enhancements/blob/61abddca34caac56d22b7db48734b7040dc68b43/keps/sig-node/1287-in-place-update-pod-resources/kep.yaml#L40)
in Kubernetes 1.33.
Therefore, KubeVirt should aim to move away from live-migrating the VM on hotplug and instead use the in-place pod resize feature.

## Motivation

With the in-place pod resize feature, the kubelet (through the CRI) can update the resource requests and limits of a running pod without recreating it.
This lets us avoid live-migrating the VM on hotplug, which saves cluster resources (a migration needs a second virt-launcher pod and copies the guest's memory to it), reduces downtime and risk,
and improves the user experience.

The change should be as transparent to the user as possible, as this is essentially an implementation detail.
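
To make the mechanism concrete, here is a minimal sketch of the kind of patch body that could be sent to a pod's `resize` subresource. The function name `build_resize_patch` and the container name `compute` are assumptions made for illustration, not something this proposal specifies.

```python
import json
from typing import Optional

# Assumption: "compute" stands in for the name of the VM container inside
# the virt-launcher pod; adjust it to match the actual pod spec.
COMPUTE_CONTAINER = "compute"


def build_resize_patch(
    requests: dict,
    limits: Optional[dict] = None,
    container: str = COMPUTE_CONTAINER,
) -> dict:
    """Build a patch body for the pod `resize` subresource (hypothetical helper).

    Limits are only included when given, because an in-place resize must not
    change the pod's QoS class.
    """
    resources = {"requests": dict(requests)}
    if limits is not None:
        resources["limits"] = dict(limits)
    return {"spec": {"containers": [{"name": container, "resources": resources}]}}


if __name__ == "__main__":
    # Similar in spirit to:
    #   kubectl patch pod <pod> --subresource resize --patch '<json below>'
    print(json.dumps(build_resize_patch({"cpu": "4", "memory": "8Gi"}), indent=2))
```

A real controller would derive the requested values from the VM spec plus the virt-launcher overhead rather than take them as literals.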

## Goals

* Implement in-place pod resize for CPU and memory hotplug.
* Use in-place pod resize as the default strategy for hotplug.

## Non Goals

* Letting the user explicitly choose between in-place pod resize and live migration (migration no longer makes sense for hotplug once in-place resize is available).

## Definition of Users

* VM owners.
* Admins / namespace owners.

## User Stories

* As a user, I want to hotplug CPU and memory to my VM without having to live-migrate it.
* As an admin, I want to save cluster resources and improve performance by avoiding live-migrations.
* As a namespace owner with a ResourceQuota, I want to be able to hotplug CPU and memory to my VMs without having to worry about the quota being exceeded.

## Repos

kubevirt/kubevirt

## Design

Currently, whenever a VM is hotplugged, virt-controller updates a condition that triggers the workload updater controller,
which leads to a live migration of the VM.

Once this VEP is implemented, the controller would simply change the pod's resources and wait for the kubelet to apply them.
In turn, the workload updater controller would no longer live-migrate the VM in this situation.
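
The "wait for the kubelet to apply them" step could look roughly like the sketch below, which classifies a pod object (as a plain dict) by its resize progress. It assumes the Kubernetes 1.33 pod conditions `PodResizePending` and `PodResizeInProgress` and the `status.containerStatuses[].resources` field; the function name `resize_state` and its return values are hypothetical.

```python
def _find_by_name(items: list, name: str) -> dict:
    """Return the first item whose "name" matches, or an empty dict."""
    return next((item for item in items if item.get("name") == name), {})


def resize_state(pod: dict, container: str) -> str:
    """Classify in-place resize progress as 'pending', 'in-progress' or 'applied'.

    Hypothetical helper: a real controller would also handle timeouts and
    compare parsed quantities rather than raw strings ("1000m" vs "1").
    """
    status = pod.get("status", {})
    active = {
        cond.get("type")
        for cond in status.get("conditions", [])
        if cond.get("status") == "True"
    }
    if "PodResizePending" in active:
        return "pending"
    if "PodResizeInProgress" in active:
        return "in-progress"

    desired = _find_by_name(pod.get("spec", {}).get("containers", []), container)
    actual = _find_by_name(status.get("containerStatuses", []), container)
    desired_requests = desired.get("resources", {}).get("requests", {})
    actual_requests = actual.get("resources", {}).get("requests", {})
    return "applied" if desired_requests == actual_requests else "in-progress"
```

Only once the state is `applied` would the controller proceed with the actual hotplug inside the guest; until then it keeps waiting instead of triggering a migration.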

## API Examples

No API changes are expected.

## Alternatives

An alternative to completely dropping the live-migration update method is to keep it as a secondary option that needs
to be explicitly enabled by the user. This could help in situations where the pod cannot increase its resources
due to node constraints or other reasons.
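
If this alternative were adopted, the fallback decision might be sketched as follows. The sketch assumes that the kubelet reports a resize it can never satisfy on the current node through the `PodResizePending` condition with reason `Infeasible`; the `allow_migration` flag stands in for a hypothetical user opt-in and is not an existing KubeVirt API.

```python
def should_fall_back_to_migration(pod: dict, allow_migration: bool) -> bool:
    """Decide whether to live-migrate because an in-place resize cannot succeed.

    Hypothetical helper. Only an "Infeasible" pending resize triggers the
    fallback; a "Deferred" resize may still succeed later on the same node.
    """
    if not allow_migration:
        return False
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "PodResizePending" and cond.get("status") == "True":
            return cond.get("reason") == "Infeasible"
    return False
```

A fuller version might also fall back after a `Deferred` resize has been pending for longer than some timeout, which is a policy choice this sketch leaves out.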

## Scalability

This should improve scalability significantly, as hotplugs would no longer trigger live migrations.

## Update/Rollback Compatibility

No update/rollback compatibility issues are expected.

## Functional Testing Approach

Hotplug tests should be updated to test the in-place pod resize feature.

## Implementation Phases

<!--
How/if this design will get broken up into multiple phases
-->

## Feature lifecycle Phases

<!--
How and when will the feature progress through the Alpha, Beta and GA lifecycle phases

Refer to https://github.com/kubevirt/community/blob/main/design-proposals/feature-lifecycle.md#releases for more details
-->

### Alpha

- [ ] Implement in-place pod resize for CPU and memory hotplugs.

### Beta
- [ ] Turn this feature on by default.

### GA
- [ ] Ensure tests are consistently green.
