Setup resource types and proto for lvm support #12530
Conversation
Signed-off-by: Pranav Patil <[email protected]>
I'm still thinking about #12309 (comment).
I don't know much about how CSI works with handling LVs (still reading).
After playing around a bit with the following design where:
Talos can easily manage PVs and VGs. I tried setting up configs and controllers for both, and they worked fine. However, when handing this off to TopoLVM, the container / DaemonSet / pod is expected to perform operations such as:
These commands:
To make this work, the CSI pod must:
Which feels like giving the CSI pod more than enough power. While it may still be possible to have CSI manage Logical Volumes, this approach feels
We can have Logical Volumes handled via Talos; that might increase complexity within Talos, but I believe it would be more Talos-friendly (having APIs do the work). Let me know what would be the best approach to proceed.
As a user of Talos I've been following topics related to volume management with interest. What you are describing is possible; however, a CSI driver does not mount or execute the host lvm binary. This can be replicated with a privileged pod that includes
In fact, the csi-driver-lvm project carries this out end-to-end, including the creation of VolumeGroups on existing versions of Talos (I've verified this works on v1.11).
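As a rough illustration of what such a privileged pod amounts to (a sketch, not csi-driver-lvm's actual manifest; the image and mount names are assumptions):

import corev1 "k8s.io/api/core/v1"

// Sketch: a privileged plugin container with the host's /dev mounted via a
// hostPath volume, which is roughly what is needed to run pvcreate /
// vgcreate / lvcreate against host devices from inside a pod.
func lvmPluginContainer() corev1.Container {
	privileged := true

	return corev1.Container{
		Name:  "lvm-plugin",
		Image: "csi-driver-lvm:example", // hypothetical image reference
		SecurityContext: &corev1.SecurityContext{
			Privileged: &privileged,
		},
		VolumeMounts: []corev1.VolumeMount{
			{Name: "dev", MountPath: "/dev"}, // host /dev
		},
	}
}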
Yes, this would work, but my concern is whether it aligns with Talos’ philosophy.
which effectively gives it root-level access to the host from within the container.
What are next steps? Do you want to create controllers in the same PR, or...?
pkg/machinery/resources/block/lvm.go
//gotagsrewrite:gen
type VolumeGroupSpec struct {
	Name            string   `yaml:"name" protobuf:"1"`
	PhysicalVolumes []string `yaml:"physicalVolumes" protobuf:"2"`
}
Isn't this a circular dependency between VG and PV?
I don't think there would be a circular dependency here; we are not embedding any structs/packages.
LVMVolumeGroupSpec uses []string for PhysicalVolumes and LVMPhysicalVolumeSpec uses a string for VolumeGroupName; both are just string identifiers, not actual type references.
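For illustration, a minimal sketch of the two specs as described (field names taken from this thread, not the final API):

type LVMVolumeGroupSpec struct {
	Name            string   `yaml:"name"`
	PhysicalVolumes []string `yaml:"physicalVolumes"` // PV names only
}

type LVMPhysicalVolumeSpec struct {
	Name            string `yaml:"name"`
	VolumeGroupName string `yaml:"volumeGroupName"` // VG name only
}

Since neither struct embeds the other, there is no Go-level import or type cycle; any cycle would exist only logically, in how the controllers resolve the names.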
I meant from the logical standpoint - we will be tracking references by name to other objects in two controllers. Will we be able to detect these details from existing PVs/VGs created by a workload?
@shanduur, anything works. Can you help me answer this: #12530 (comment)? There can be two approaches:
Depending on the answer we'd have to set up configs for LVM.
I think Talos can manage all of PV, VG, and LV, and users can choose what Talos manages and what CSI manages.
Alright, I think I was worried too much about how things would work when there are conflicts, i.e. having Logical Volumes in configs while also telling CSI to manage them (conflicts between Talos and CSI). We can just add validations to avoid those scenarios.
Signed-off-by: Pranav Patil <[email protected]>
Sorry I didn't respond earlier; there was a bank holiday here in Poland. I agree with Noel: the Talos API should be self-sufficient and allow management of VG, PV, and LV through the Talos API, but it should also allow managing them outside of Talos, e.g. via CSI. One thing that needs to be clarified (and IMHO it should be the starting point for designing the LVM API) is how to detect the LVM objects created by external entities.
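One way to approach that detection, as a rough sketch: the lvm2 tools can emit machine-readable reports (pvs --reportformat json is standard lvm2), which a discovery controller could parse. Assuming Talos shells out to its bundled lvm tooling (imports os/exec and encoding/json; ctx is a context.Context):

// Sketch: list the PVs that already exist on the host, regardless of who
// created them (Talos, a CSI driver, or a manual workload).
out, err := exec.CommandContext(ctx, "lvm", "pvs", "--reportformat", "json").Output()
if err != nil {
	return err
}

var report struct {
	Report []struct {
		PV []struct {
			PVName string `json:"pv_name"`
			VGName string `json:"vg_name"`
		} `json:"pv"`
	} `json:"report"`
}

if err := json.Unmarshal(out, &report); err != nil {
	return err
}
// report.Report[0].PV now lists existing PVs to reconcile against.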
// LVMVolumeConfigV1Alpha1 is an LVM volume configuration document.
//
//   description: |
//     LVM volume configuration allows configuring LVM physical volumes, volume groups, and logical volumes.
//     Each component is optional, allowing Talos to manage only what you specify.
//     For example, you can manage only PVs and VGs, letting a CSI driver manage logical volumes.
//   examples:
//     - value: exampleLVMVolumeConfigSimple()
//     - value: exampleLVMVolumeConfigForCSI()
//   alias: LVMVolumeConfig
//   schemaRoot: true
//   schemaMeta: v1alpha1/LVMVolumeConfig
type LVMVolumeConfigV1Alpha1 struct {
	meta.Meta `yaml:",inline"`

	//   description: |
	//     Name of the LVM configuration.
	//
	//     Name can only contain lowercase and uppercase ASCII letters, digits, and hyphens.
	MetaName string `yaml:"name"`
	//   description: |
	//     Physical volumes to create.
	PhysicalVolumes []LVMPhysicalVolumeConfig `yaml:"physicalVolumes,omitempty"`
	//   description: |
	//     Volume groups to create.
	VolumeGroups []LVMVolumeGroupConfig `yaml:"volumeGroups,omitempty"`
	//   description: |
	//     Logical volumes to create.
	LogicalVolumes []LVMLogicalVolumeConfig `yaml:"logicalVolumes,omitempty"`
}
Note: if an LV, VG, or PV is removed from the config, should it be deleted, should the whole thing be recreated, or...?
No problem, yes I got the gist, and I think this might work; let me know how you feel about this. A user can have an LVM config like:
For all created resources, we can attach a metadata label such as:
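The label example itself is not shown above; purely as a hypothetical illustration (the key is invented), tagging a created resource via COSI metadata labels could look like:

// Hypothetical: mark a resource as created from the "database-storage" LVM
// config, so discovery can tell Talos-managed LVM objects apart.
res.Metadata().Labels().Set("talos.dev/lvm-config", "database-storage")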
I should've moved this to a draft PR 🤔; I already had changes for the label/discovery bit and was trying out the CSI stuff.
I do not like this design - it feels a little bit clunky.
If you generate the docs, the document for the LVM will look like this:
LVMVolumeConfig
apiVersion: v1alpha1
kind: LVMVolumeConfig
name: database-storage # Name of the LVM configuration.
# Physical volumes to create.
physicalVolumes:
- device: /dev/sdb # Device path for the physical volume.
# Volume groups to create.
volumeGroups:
- name: data-vg # Name of the volume group.
# Physical volumes to include in this volume group.
physicalVolumes:
- /dev/sdb
# # Extent size for the volume group.
# extentSize: 4MiB
# extentSize: 8MiB
# Logical volumes to create.
logicalVolumes:
- name: data-lv # Name of the logical volume.
volumeGroup: data-vg # Volume group name this logical volume belongs to.
size: 100GB # Size of the logical volume.
# Filesystem configuration for the logical volume.
filesystem:
type: xfs # Filesystem type. Default is `xfs`.
# Mount configuration for the logical volume.
mount:
path: /var/lib/data # Path where the volume should be mounted.
# # Encryption configuration for the logical volume.
# encryption:
# provider: luks2 # Encryption provider to use for the encryption.
# # Defines the encryption keys generation and storage method.
# keys:
# - slot: 0 # Key slot number for LUKS2 encryption.
# # Key which value is stored in the configuration file.
# static:
# passphrase: exampleKey # Defines the static passphrase value.
#
# # # KMS managed encryption key.
# # kms:
# # endpoint: https://192.168.88.21:4443 # KMS endpoint to Seal/Unseal the key.
# - slot: 1 # Key slot number for LUKS2 encryption.
# # KMS managed encryption key.
# kms:
# endpoint: https://example-kms-endpoint.com # KMS endpoint to Seal/Unseal the key.
# cipher: aes-xts-plain64 # Cipher to use for the encryption. Depends on the encryption provider.
# blockSize: 4096 # Defines the encryption sector size.
# # Additional --perf parameters for the LUKS2 encryption.
# options:
# - no_read_workqueue
# - no_write_workqueue

I think it does not represent the LVM relationships between primitives well.
Wouldn't splitting the resources in v1alpha1 give us better control over them?
---
apiVersion: v1alpha1
kind: LVMVolumeGroupConfig
name: vg-pool
physicalVolumes:
volumeSelector:
match: 'disk.dev_path == "/dev/sda0"'
---
apiVersion: v1alpha1
kind: LVMLogicalVolumeConfig
name: lv-data
lvType: linear
provisioning:
volumeGroup: vg-pool
maxSize: 50GiB # optional, default 100%
# minSize: 2.5GiB
---
apiVersion: v1alpha1
kind: UserVolumeConfig
name: lv-flex-data
volumeType: partition
provisioning:
diskSelector:
match: 'disk.dev_path == "/dev/vg-pool/lv-data"'
# ...

Signed-off-by: Pranav Patil <[email protected]>
	VolumeTypeSymlink           // symlink
	VolumeTypeOverlay           // overlay
	VolumeTypeExternal          // external
	LVMVolumeTypePhysicalVolume // physical volume
I feel that LVMs are not "volumes" in the way we treat volumes today in Talos.
Talos volumes represent something "mountable", while LVMs are raw disks (before they are properly formatted/used).
Moreover, I don't think we should start with any implementation work before we land on the "API" of the change (proposal) - what does the machine config look like, how does it work, etc.
Yes, and this file is all about volumes; I think we can have a different file just describing LVM types.
// LVMType describes LVM type.
LVMTypePhysicalVolume
LVMTypeVolumeGroup
LVMTypeLogicalVolume
Yes, agreed. I've been going round and round about implementing this; it would be better to outline what the configs look like and the basic workflow first.
I liked what @shanduur mentioned above: having separate configs for each LVM type.
- Physical Volumes
Physical Volumes will be Talos-managed, since CSI will only manage LVs.
We can have both declarative create and imperative create.
A Physical Volume config should always have a name, and that name must be unique.
A diskSelector can be used to find the device (i.e. which block device the PV needs to be created from), and Talos should be responsible for creating this PV.
apiVersion: v1alpha1
kind: LVMPhysicalVolumeConfig
name: pv-sd0
diskSelector:
match: 'disk.dev_path == "/dev/sda0"'
We can have an LVMDiscovery controller which keeps track of LVM resources.
Once the PV is created, it should have an inactive status (which LVMDiscovery should record/track).
If in the future the PV is used by a volume group, this status should go to active, and when it has extents allocated, the state can go to in-use (see the sketch at the end of this comment).
- After adding this config, if the PV is not present -> create the PV.
- For deletion:
  - If the PV is present and its state is active/in-use, we should block anyone trying to remove the config.
  - If the PV is present but the status is inactive, we can allow removal of the config (but the PV is still not deleted). (If the user tried to add the same config again, LVMDiscovery should note that the PV already exists with an inactive status and do nothing.)
(I'm having second thoughts about this; maybe we should not allow direct removal of the config even if it is not in use, and config removal should only be done via the API.)
As a user, if a config is removed while the PV still exists, a get of the configs would be missing them; LVMDiscovery might still record the LVM resources, but that could get confusing. So I'm more inclined to have configs removed only via talosctl lvm delete pv, and only for inactive ones.
Complete cleanup/deletion of a PV should be allowed only when it is inactive, and only through the API,
maybe like talosctl lvm delete pv-sd0.
For the API, it should follow what the config does.
We can have three functionalities:
- talosctl lvm create pv --disk
- talosctl lvm delete pv
- talosctl get pv
I don't think it would be a good idea to edit any existing PV; maybe it will be possible through a change in config, but not via the API.
Let me know how you feel about the above config and API; it's better to get one component of LVM straight before moving on to the next.
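To make the lifecycle above concrete, a hypothetical sketch of the phases in Go (the type and constant names are made up for illustration, not an agreed API):

// LVMPhysicalVolumePhase is a hypothetical phase for a Talos-managed PV,
// following the lifecycle described above.
type LVMPhysicalVolumePhase int

const (
	// PV exists but is not part of any volume group yet.
	LVMPhysicalVolumePhaseInactive LVMPhysicalVolumePhase = iota
	// PV belongs to a VG but has no extents allocated.
	LVMPhysicalVolumePhaseActive
	// PV has extents allocated to logical volumes.
	LVMPhysicalVolumePhaseInUse
)

// Deletion (talosctl lvm delete pv) would be permitted only in the
// Inactive phase, and only via the API.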
I definitely think that if we go the route of having PV as a separate entity, it should be immutable. But do we need to have them as a separate entity? Why don't we just rely on the LVMVolumeGroupConfig to have a selector, based on which the disks belonging to the specific VG are selected and provisioned into PVs? Users will have a harder time breaking things this way.
This could work as well, but what happens when there is no PV matching the selector? Do we automatically create those PVs? (I don't think that would be the correct answer.)
It should stay pending until such a PV appears.
But the question remains: how will we manage PVs?
there is no PV with the selector
Selector is only for the disks / devices. Based on that selector, we provision PVs and add them to VG.
Alright, I think I get it now: those are disks, and this applies when Talos sees those disks; when seen, Talos will create those PVs and they will be assigned to the VG.
I'll continue with the idea for the VG config next (proper design, config, and flow).
@shanduur, @smira For Volume Groups the config can be:
Although Talos always reconciles and manages the VG object itself, the managed-by field defines who is allowed to manage Logical Volumes (LVs) inside the VG. This helps determine the expected workflow and what configuration comes next.
VG and PV provisioning flow
LV ownership and behavior
A VG must have exactly one LV owner.
Deletion
Deletion flow:
Let me know what you think of the above design?
I feel this is a bit more complicated than what I would like it to be.
First of all, CSIs (as far as I know) today happily manage LVM volumes without any Talos involvement, so I'm not sure why we need to introduce any concept of it to Talos.
Second, I think the configuration should be something that speaks an easier language than the raw LVM language, so that it serves the purpose better, rather than trying to support all cases.
So I would start with a use case for LVM. Let's imagine I have 4 disks in my machine. One disk will be the Talos system disk; let's take it out of the picture. I want to combine the remaining 3 disks into a logical volume that spans all 3 disks.
Use case: I want to give a workload access to a volume that can utilize all three disks as a whole. What is the machine config I would like to have? How does it get reconciled if a new disk is added? I would imagine something where I just match all disks by saying e.g.
But I would like to have a higher-level, probably opinionated design which exposes the features in a thought-through and constrained way.
Another case might be RAID 0 (which certainly has different restrictions) and RAID 1. I don't know if there's any real case for slicing a disk, or whether this should be done some other way. But I would like a scenario and a solution, not a way to map all LVM features into Talos.
The ownership concept won't work because we have no real way to enforce or lock a VG from use outside of Talos. We would need to mask the devices, not propagate them to kubelet, or do some other not-fun hacks, e.g. requiring enforcing mode with SELinux. Otherwise it's just metadata that has no impact and does not block any actions - it is meaningless.
I strongly agree with the deletion mechanism. However, we should avoid a standalone
I agree here. But as @smira said, we need to build the user story here, to ensure this is a good enough design that can be extended in the future when new requirements are brought up.
This will bloat an already complicated controller; let's not do that. We should treat LVM as a 'Layer 2' storage entity: the block devices controller identifies raw block devices; the LVM controller consumes those.
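As a rough sketch of that layering (all names here are illustrative; only the COSI controller shape is taken from the runtime):

package block

import (
	"context"

	"github.com/cosi-project/runtime/pkg/controller"
	"go.uber.org/zap"
)

// LVMController is a hypothetical 'Layer 2' controller: it consumes the raw
// block devices identified by the existing block device controllers and
// reconciles LVM objects from them.
type LVMController struct{}

func (ctrl *LVMController) Name() string { return "block.LVMController" }

func (ctrl *LVMController) Inputs() []controller.Input {
	return []controller.Input{
		{
			Namespace: "runtime",               // assumed namespace of discovered disks
			Type:      "Disks.block.talos.dev", // assumed resource type name
			Kind:      controller.InputWeak,
		},
	}
}

func (ctrl *LVMController) Outputs() []controller.Output {
	return nil // outputs would be the LVM resource types added in this PR
}

func (ctrl *LVMController) Run(ctx context.Context, r controller.Runtime, logger *zap.Logger) error {
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-r.EventCh():
		}

		// reconcile LVM state ('Layer 2') from the discovered disks ('Layer 1')
	}
}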
Pull Request
Resolves a part of #12309
What? (description)
Set up resources and proto.
Updated with the following structs:
Why? (reasoning)
Set up structs for LVM support (physical volumes, volume groups, and logical volumes).
Acceptance
Please use the following checklist:
- make conformance
- make fmt
- make lint
- make docs
- make unit-tests