|
| 1 | +# VEP #NNNN: Your short, descriptive title |
| 2 | + |
| 3 | +## Release Signoff Checklist |
| 4 | + |
| 5 | +Items marked with (R) are required *prior to targeting to a milestone / release*. |
| 6 | + |
| 7 | +- [x] (R) Enhancement issue created, which links to VEP dir in [kubevirt/enhancements] (not the initial VEP PR) |
| 8 | +- [ ] (R) Target version is explicitly mentioned and approved |
| 9 | +- [ ] (R) Graduation criteria filled |
| 10 | + |
| 11 | +## Overview |
| 12 | + |
| 13 | +This proposal introduces a VM hibernation mechanism for KubeVirt, enabling users to stop and start virtual machines by saving and restoring their running memory state. |
| 14 | + |
| 15 | +## Motivation |
| 16 | + |
| 17 | +Some users want to shut down running virtual machines to free up resources, while keeping the virtual machine state identical when it is turned back on. |
| 18 | + |
| 19 | +## Goals |
| 20 | + |
| 21 | +Add VM hibernation functionality to KubeVirt. |
| 22 | + |
| 23 | +## Non Goals |
| 24 | + |
| 25 | +Any modification to the hotplug and hotunplug volume process. |
| 26 | + |
| 27 | +## Definition of Users |
| 28 | + |
| 29 | +End Users: these are people/programs that have permission to update Virtual Machine specifications |
| 30 | + |
| 31 | +## User Stories |
| 32 | + |
| 33 | +A user can edit a VM to hibernate it, which saves its memory to a PVC and stops the VM. The user can later edit the VM to restore it from the PVC. |
| 34 | + |
| 35 | +## Repos |
| 36 | + |
| 37 | +[https://github.com/kubevirt/kubevirt](https://github.com/kubevirt/kubevirt) |
| 38 | + |
| 39 | +## Design |
| 40 | + |
| 41 | +### **Triggering Hibernation** |
| 42 | + |
| 43 | +- The user sets `spec.runStrategy: Hibernate` in the VM object to initiate hibernation. |
| 44 | +- The controller detects the field change and starts the hibernation process. |
| 45 | + |
| 46 | +```yaml |
| 47 | +spec: |
| 48 | + runStrategy: Hibernate |
| 49 | +``` |
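The detection step can be sketched in Go as follows. This is a minimal illustration, not the controller's actual code: `shouldStartHibernation` is a hypothetical helper, `Hibernate` is the value proposed in this VEP, and the check deliberately ignores whether the VM is actually running.

```go
package main

import "fmt"

// RunStrategy mirrors the string-typed run strategy on the VM spec.
type RunStrategy string

const (
	RunStrategyAlways RunStrategy = "Always"
	RunStrategyHalted RunStrategy = "Halted"
	// RunStrategyHibernate is the value proposed by this VEP.
	RunStrategyHibernate RunStrategy = "Hibernate"
)

// shouldStartHibernation (hypothetical helper) reports whether an update
// changed the run strategy to Hibernate. An update that leaves the strategy
// at Hibernate must not trigger the process again.
func shouldStartHibernation(oldStrategy, newStrategy RunStrategy) bool {
	return oldStrategy != RunStrategyHibernate && newStrategy == RunStrategyHibernate
}

func main() {
	fmt.Println(shouldStartHibernation(RunStrategyAlways, RunStrategyHibernate))    // true
	fmt.Println(shouldStartHibernation(RunStrategyHibernate, RunStrategyHibernate)) // false
}
```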
| 50 | +
|
| 51 | +The transition of VM `runStrategy` is as follows: |
| 52 | + |
| 53 | + |
| 54 | + |
| 55 | +### VM State Transition |
| 56 | + |
| 57 | +We also need some new `VirtualMachinePrintableStatus` values: `Hibernating`, `Hibernated`, and `Resuming`. |
| 58 | + |
| 59 | + |
| 60 | + |
| 61 | +### Hibernation and Wake Strategy |
| 62 | + |
| 63 | +The hibernation configuration includes the method, timeout, and the PVC used. Also we should have a |
| 64 | + |
| 65 | +These should be at the VM level, so it is not suitable to be placed in the KubeVirt CR. There are two approaches: one is to add it to `vm.spec`, the other is to add it as annotations to the VM. |
| 66 | + |
| 67 | +**Method 1:** Add `HibernateStrategy` and `WakeStrategy` in `vm.spec` to specify hibernation-related configuration. |
| 68 | + |
| 69 | +```yaml |
| 70 | +HibernateStrategy: |
| 71 | + mode: save |
| 72 | + timeoutSeconds: 500 |
| 73 | + claimName: XXX-PVC |
| 74 | +WakeStrategy: |
| 75 | + enabled: false |
| 76 | +``` |
| 77 | + |
| 78 | +**Method 2:** Add control annotations to the VM. |
| 79 | + |
| 80 | +```yaml |
| 81 | +kubevirt.io/hibernation-strategy: save |
| 82 | +kubevirt.io/hibernation-strategy-timeout-seconds: "500" |
| 83 | +kubevirt.io/hibernation-strategy-claim-name: |
| 84 | +kubevirt.io/WakeStrategy-strategy: enabled |
| 85 | +``` |
| 86 | + |
| 87 | +If the `save` method is used and `claim name` is not set, KubeVirt will create a PVC based on the VM's memory size and render the corresponding strategy fields. If no `hibernation method` is set, the default is the `save` method. |
| 88 | + |
| 89 | +We need to consider scenarios where a VM is hibernated but we want to start it directly using the start interface. Therefore, we want to expose an interface to control this through `WakeStrategy`. If `WakeStrategy` is not set, the default is the `restore` method. To start a hibernated VM directly using the start interface, `WakeStrategy.enabled` must be set to `false`. |
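The defaulting rules for the hibernation strategy could look roughly like the Go sketch below. The struct layout follows the Method 1 YAML. The `withDefaults` helper and the generated claim name (`<vm-name>-hibernation`) are assumptions made for illustration only. The sketch picks a claim name but does not create or size the PVC.

```go
package main

import "fmt"

// HibernateStrategy mirrors the Method 1 fields proposed for vm.spec.
type HibernateStrategy struct {
	Mode           string
	TimeoutSeconds int64
	ClaimName      string
}

// defaultMode is the method used when none is set, as described above.
const defaultMode = "save"

// withDefaults (hypothetical helper) fills in the fields the user left empty.
// The generated claim name is an assumption, not part of the proposal.
func withDefaults(vmName string, in HibernateStrategy) HibernateStrategy {
	out := in
	if out.Mode == "" {
		out.Mode = defaultMode
	}
	if out.Mode == defaultMode && out.ClaimName == "" {
		out.ClaimName = vmName + "-hibernation"
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", withDefaults("my-vm", HibernateStrategy{TimeoutSeconds: 500}))
}
```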
| 90 | + |
| 91 | +### VM Status |
| 92 | + |
| 93 | +As with memory dump, add the `VirtualMachineHibernationStatuses` field to the VM's status: |
| 94 | + |
| 95 | +```yaml |
| 96 | +VirtualMachineHibernationStatuses: |
| 97 | + Phase: |
| 98 | + Claim: |
| 99 | + FileName: |
| 100 | + StartTimestamp: |
| 101 | + EndTimestamp: |
| 102 | + Message: |
| 103 | +``` |
| 104 | + |
| 105 | +The `Phase` field references the `dumpmemory` package and includes: |
| 106 | + |
| 107 | +```go |
| 108 | +const ( |
| 109 | + HibernationPhaseInitial HibernationPhase = "Initial" |
| 110 | + HibernationPhaseAssociating HibernationPhase = "Associating" |
| 111 | + HibernationPhaseInProgress HibernationPhase = "InProgress" |
| 112 | + HibernationPhaseCompleted HibernationPhase = "Completed" |
| 113 | + HibernationPhaseFailed HibernationPhase = "Failed" |
| 114 | +) |
| 115 | +``` |
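To make the constants compile on their own, they need a string type. The sketch below adds it and encodes the transitions described in the hibernation steps of this proposal. Allowing `Failed` from every non-terminal phase is an assumption, and `canTransition` is a hypothetical helper.

```go
package main

import "fmt"

// HibernationPhase is the string type behind the phase constants.
type HibernationPhase string

const (
	HibernationPhaseInitial     HibernationPhase = "Initial"
	HibernationPhaseAssociating HibernationPhase = "Associating"
	HibernationPhaseInProgress  HibernationPhase = "InProgress"
	HibernationPhaseCompleted   HibernationPhase = "Completed"
	HibernationPhaseFailed      HibernationPhase = "Failed"
)

// validTransitions lists the allowed next phases for each phase.
// Completed and Failed are terminal and therefore have no entry.
var validTransitions = map[HibernationPhase][]HibernationPhase{
	HibernationPhaseInitial:     {HibernationPhaseAssociating, HibernationPhaseFailed},
	HibernationPhaseAssociating: {HibernationPhaseInProgress, HibernationPhaseFailed},
	HibernationPhaseInProgress:  {HibernationPhaseCompleted, HibernationPhaseFailed},
}

// canTransition (hypothetical helper) reports whether from -> to is allowed.
func canTransition(from, to HibernationPhase) bool {
	for _, next := range validTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(HibernationPhaseInitial, HibernationPhaseAssociating)) // true
	fmt.Println(canTransition(HibernationPhaseCompleted, HibernationPhaseInProgress)) // false
}
```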
| 116 | + |
| 117 | +We also need a resume operation from the `Hibernated` state (called resume because restore is already in use). Add the `VirtualMachineResumeStatuses` field to the VM's status: |
| 118 | + |
| 119 | +```yaml |
| 120 | +VirtualMachineResumeStatuses: |
| 121 | + Phase: |
| 122 | + Claim: |
| 123 | + FileName: |
| 124 | + StartTimestamp: |
| 125 | + EndTimestamp: |
| 126 | + Message: |
| 127 | +``` |
| 128 | + |
| 129 | +The `Phase` field also references the `dumpmemory` package and includes: |
| 130 | + |
| 131 | +    const ( |
| 132 | + ResumePhaseRestoreAssociating ResumePhase = "Associating" |
| 133 | + ResumePhaseRestoreInProgress ResumePhase = "InProgress" |
| 134 | + ResumePhaseRestoreFailed ResumePhase = "Failed" |
| 135 | + ResumePhaseRestoreCompleted ResumePhase = "Completed" |
| 136 | + ResumePhaseRestoreUnmounting ResumePhase = "Dissociating" |
| 137 | + ResumePhaseClean ResumePhase = "Cleaned" |
| 138 | + ) |
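The happy-path order of these phases, as described in the restore steps of this proposal, can be sketched as a simple ordered list. `nextResumePhase` is a hypothetical helper, and the assumption that `Failed` has no successor is not stated in the proposal.

```go
package main

import "fmt"

// ResumePhase is the string type behind the resume phase constants.
type ResumePhase string

const (
	ResumePhaseRestoreAssociating ResumePhase = "Associating"
	ResumePhaseRestoreInProgress  ResumePhase = "InProgress"
	ResumePhaseRestoreFailed      ResumePhase = "Failed"
	ResumePhaseRestoreCompleted   ResumePhase = "Completed"
	ResumePhaseRestoreUnmounting  ResumePhase = "Dissociating"
	ResumePhaseClean              ResumePhase = "Cleaned"
)

// resumeOrder is the successful sequence of phases during a resume.
var resumeOrder = []ResumePhase{
	ResumePhaseRestoreAssociating,
	ResumePhaseRestoreInProgress,
	ResumePhaseRestoreCompleted,
	ResumePhaseRestoreUnmounting,
	ResumePhaseClean,
}

// nextResumePhase (hypothetical helper) returns the phase that follows
// current on the happy path. The boolean is false for the last phase and
// for phases outside the sequence, such as Failed.
func nextResumePhase(current ResumePhase) (ResumePhase, bool) {
	for i, phase := range resumeOrder {
		if phase == current && i+1 < len(resumeOrder) {
			return resumeOrder[i+1], true
		}
	}
	return current, false
}

func main() {
	next, ok := nextResumePhase(ResumePhaseRestoreCompleted)
	fmt.Println(next, ok) // Dissociating true
}
```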
| 139 | + |
| 140 | +### VMI Status |
| 141 | + |
| 142 | +In my opinion, we can talk about this later. |
| 143 | + |
| 144 | +--- |
| 145 | + |
| 146 | +### 1. Hibernation |
| 147 | + |
| 148 | +#### Step 1: Trigger & Initial Check |
| 149 | + |
| 150 | +1. The user triggers hibernation by setting `spec.runStrategy` of the VM to `Hibernate`. |
| 151 | + |
| 152 | +2. If no PVC is specified, create one. |
| 153 | + |
| 154 | +3. Generate the corresponding `FileName`. |
| 155 | + |
| 156 | +4. Render `VirtualMachineHibernationStatuses`. `FileName` should be related to the VMI, for example `hash(vmi.uuid)`: |
| 157 | + |
| 158 | + ```yaml |
| 159 | +   VirtualMachineHibernationStatuses: |
| 160 | +     Phase: Initial --> Associating |
| 161 | + Claim: PVCname |
| 162 | + FileName: filename |
| 163 | + ``` |
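One possible way to derive `FileName` from the VMI UID is sketched below. The proposal only suggests `hash(vmi.uuid)`. The choice of SHA-256, the 16-character truncation, and the `.save` suffix are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hibernationFileName (hypothetical helper) derives a stable file name from
// the VMI UID, so the same VMI always maps to the same file.
func hibernationFileName(vmiUID string) string {
	sum := sha256.Sum256([]byte(vmiUID))
	// Keep the first 8 bytes (16 hex characters) to get a short name.
	return hex.EncodeToString(sum[:8]) + ".save"
}

func main() {
	fmt.Println(hibernationFileName("3f1c9d2e-0000-4000-8000-000000000001"))
}
```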
| 164 | + |
| 165 | +#### Step 2: Hot Mount PVC |
| 166 | + |
| 167 | +1. Use hot-plug logic (similar to current `dumpmemory`; in the future may use "[Utility Volumes](https://github.com/kubevirt/enhancements/pull/91)") to mount the PVC to the `fileName` location. |
| 168 | + |
| 169 | +   `VirtualMachineHibernationStatuses.Phase` transitions from `Associating` to `InProgress`. |
| 170 | + |
| 171 | +#### Step 3: Perform Hibernation (currently only supports `virsh save`) |
| 172 | + |
| 173 | +1. Use the `save` interface to write memory to the file `VirtualMachineHibernationStatuses.FileName`. |
| 174 | +2. Record `StartTimestamp` at the beginning. |
| 175 | +3. Upon successful hibernation, record `EndTimestamp` and update `Phase` to `Completed`. If hibernation fails, update `Phase` to `Failed`. |
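The bookkeeping in this step can be sketched as follows. The `save` callback stands in for the real save call, which is not modeled here. `runSave`, `fixedClock`, and `errSaveFailed` are hypothetical names, and leaving `EndTimestamp` unset on failure is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// HibernationStatus holds the subset of status fields this step touches.
type HibernationStatus struct {
	Phase          string
	StartTimestamp time.Time
	EndTimestamp   time.Time
	Message        string
}

// errSaveFailed is a placeholder error used to demonstrate the failure path.
var errSaveFailed = errors.New("save failed")

// fixedClock returns a clock that always reports the same instant,
// which makes the timestamps predictable in examples.
func fixedClock(unixSeconds int64) func() time.Time {
	return func() time.Time { return time.Unix(unixSeconds, 0) }
}

// runSave records the start time, runs save, and then records either the
// end time and Completed, or Failed together with the error message.
func runSave(status *HibernationStatus, now func() time.Time, save func() error) {
	status.Phase = "InProgress"
	status.StartTimestamp = now()
	if err := save(); err != nil {
		status.Phase = "Failed"
		status.Message = err.Error()
		return
	}
	status.EndTimestamp = now()
	status.Phase = "Completed"
}

func main() {
	ok := &HibernationStatus{}
	runSave(ok, time.Now, func() error { return nil })
	fmt.Println(ok.Phase)

	failed := &HibernationStatus{}
	runSave(failed, time.Now, func() error { return errSaveFailed })
	fmt.Println(failed.Phase, failed.Message)
}
```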
| 176 | + |
| 177 | +#### Step 4: Cleanup |
| 178 | + |
| 179 | +1. Sequentially clean up the launcher pod and VMI. |
| 180 | + |
| 181 | +--- |
| 182 | + |
| 183 | +### 2. Restore |
| 184 | + |
| 185 | +#### Step 1: Trigger |
| 186 | + |
| 187 | +1. The user sets the VM's `spec.runStrategy` to `Always` (or another non-`Hibernate` value) and either sets `WakeStrategy.enabled` to `true` or leaves it unset. |
| 188 | + |
| 189 | +#### Step 2: Hot Mount PVC |
| 190 | + |
| 191 | +1. Use hot-plug logic (similar to current `dumpmemory`; in the future may use "[Utility Volumes](https://github.com/kubevirt/enhancements/pull/91)") to mount the PVC to the `fileName` location. |
| 192 | +   `VirtualMachineResumeStatuses.Phase` transitions from `Associating` to `InProgress`. |
| 193 | + |
| 194 | +#### Step 3: Execute Restore |
| 195 | + |
| 196 | +1. Use the restore interface to load the memory state from `HibernationInfo.TargetFileName`. |
| 197 | +2. Record `VirtualMachineResumeStatuses.StartTimestamp` at the beginning. |
| 198 | +3. Upon successful restore, record `VirtualMachineResumeStatuses.EndTimestamp` and update `Phase` to `Completed`. If the restore fails, update `Phase` to `Failed`. |
| 199 | +4. Once the restore has completed, `VirtualMachineResumeStatuses.Phase` moves on to `Dissociating`. |
| 200 | + |
| 201 | +#### Step 4: Cleanup |
| 202 | + |
| 203 | +1. Hot-unplug the PVC (`Dissociating`) and remove `vm.status.VirtualMachineHibernationStatuses` (`Cleaned`). |
| 204 | + |
| 205 | +--- |
| 206 | + |
| 207 | +### 3. Direct Start Without Restore |
| 208 | + |
| 209 | +#### Step 1: Trigger |
| 210 | + |
| 211 | +1. The user sets the VM's `spec.runStrategy` to `Always` (or another non-`Hibernate` value) and `WakeStrategy.enabled` to `false`. |
| 212 | + |
| 213 | +#### Step 2: Cleanup |
| 214 | + |
| 215 | +1. Remove `vm.status.VirtualMachineHibernationStatuses` (`Cleaned`). |
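The choice between the restore flow and the direct-start flow can be summarized in a small decision function. This is a sketch under the defaults stated earlier (an unset `WakeStrategy` means restore). `decideStart` and the `StartAction` values are hypothetical names.

```go
package main

import "fmt"

// StartAction names the flow the controller would follow on start.
type StartAction string

const (
	ActionNormalStart StartAction = "NormalStart" // VM was not hibernated
	ActionRestore     StartAction = "Restore"     // restore memory from the PVC
	ActionDirectStart StartAction = "DirectStart" // discard hibernation state and boot
)

// decideStart (hypothetical helper) picks the flow. wakeEnabled is nil when
// WakeStrategy.enabled was not set, which defaults to restoring.
func decideStart(hibernated bool, wakeEnabled *bool) StartAction {
	if !hibernated {
		return ActionNormalStart
	}
	if wakeEnabled != nil && !*wakeEnabled {
		return ActionDirectStart
	}
	return ActionRestore
}

func main() {
	disabled := false
	fmt.Println(decideStart(true, nil))       // Restore
	fmt.Println(decideStart(true, &disabled)) // DirectStart
	fmt.Println(decideStart(false, nil))      // NormalStart
}
```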
| 216 | + |
| 217 | + |
| 218 | + |
| 219 | +## API Examples |
| 220 | + |
| 221 | +### Hibernate a running VM |
| 222 | + |
| 223 | +before |
| 224 | + |
| 225 | +```yaml |
| 226 | +spec: |
| 227 | +  runStrategy: Always |
| 228 | +``` |
| 229 | +
|
| 230 | +after |
| 231 | +
|
| 232 | +```yaml |
| 233 | +spec: |
| 234 | + runStrategy: Hibernate |
| 235 | + HibernateStrategy: |
| 236 | + mode: save |
| 237 | + timeoutSeconds: 500 |
| 238 | + claimName: XXX-PVC |
| 239 | +``` |
| 240 | +
|
| 241 | +### Resume from a hibernated VM |
| 242 | +
|
| 243 | +before |
| 244 | +
|
| 245 | +```yaml |
| 246 | +spec: |
| 247 | + runStrategy: Hibernate |
| 248 | + HibernateStrategy: |
| 249 | + mode: save |
| 250 | + timeoutSeconds: 500 |
| 251 | + claimName: XXX-PVC |
| 252 | +``` |
| 253 | +
|
| 254 | +after |
| 255 | +
|
| 256 | +```yaml |
| 257 | +spec: |
| 258 | +  runStrategy: Always |
| 259 | + HibernateStrategy: |
| 260 | + mode: save |
| 261 | + timeoutSeconds: 500 |
| 262 | + claimName: XXX-PVC |
| 263 | +``` |
| 264 | +
|
| 265 | +### Start a hibernated VM |
| 266 | +
|
| 267 | +before |
| 268 | +
|
| 269 | +```yaml |
| 270 | +spec: |
| 271 | + runStrategy: Hibernate |
| 272 | + HibernateStrategy: |
| 273 | + mode: save |
| 274 | + timeoutSeconds: 500 |
| 275 | + claimName: XXX-PVC |
| 276 | +``` |
| 277 | +
|
| 278 | +after |
| 279 | +
|
| 280 | +```yaml |
| 281 | +spec: |
| 282 | +  runStrategy: Always |
| 283 | + HibernateStrategy: |
| 284 | + mode: save |
| 285 | + timeoutSeconds: 500 |
| 286 | + claimName: XXX-PVC |
| 287 | + WakeStrategy: |
| 288 | + enabled: false |
| 289 | +``` |
| 290 | +
|
| 291 | +## Alternatives |
| 292 | +
|
| 293 | +<!-- |
| 294 | +Outline any alternative designs that have been considered) |
| 295 | +--> |
| 296 | +
|
| 297 | +## Scalability |
| 298 | +
|
| 299 | +<!-- |
| 300 | +Overview of how the design scales) |
| 301 | +--> |
| 302 | +
|
| 303 | +## Update/Rollback Compatibility |
| 304 | +
|
| 305 | +<!-- |
| 306 | +Does this impact update compatibility and how?) |
| 307 | +--> |
| 308 | +
|
| 309 | +## Functional Testing Approach |
| 310 | +
|
| 311 | +<!-- |
| 312 | +An overview on the approaches used to functional test this design) |
| 313 | +--> |
| 314 | +
|
| 315 | +## Implementation Phases |
| 316 | +
|
| 317 | +<!-- |
| 318 | +How/if this design will get broken up into multiple phases) |
| 319 | +--> |
| 320 | +
|
| 321 | +## Feature lifecycle Phases |
| 322 | +
|
| 323 | +<!-- |
| 324 | +How and when will the feature progress through the Alpha, Beta and GA lifecycle phases |
| 325 | +
|
| 326 | +Refer to https://github.com/kubevirt/community/blob/main/design-proposals/feature-lifecycle.md#releases for more details |
| 327 | +--> |
| 328 | +
|
| 329 | +### Alpha |
| 330 | +
|
| 331 | +### Beta |
| 332 | +
|
| 333 | +### GA |