From 48924b9d71ac5c77d17319e4c6de025f0659690c Mon Sep 17 00:00:00 2001
From: Miguel Parada
Date: Fri, 5 Dec 2025 13:04:45 -0600
Subject: [PATCH] added vsphere and velero docs

---
 ...ero-with-swift-vsphere-csi-config-guide.md | 155 ++++++++++++++++++
 docs/vsphere-csi-config-guide.md              | 114 +++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 docs/velero-with-swift-vsphere-csi-config-guide.md
 create mode 100644 docs/vsphere-csi-config-guide.md

diff --git a/docs/velero-with-swift-vsphere-csi-config-guide.md b/docs/velero-with-swift-vsphere-csi-config-guide.md
new file mode 100644
index 0000000..f73693a
--- /dev/null
+++ b/docs/velero-with-swift-vsphere-csi-config-guide.md
@@ -0,0 +1,155 @@
+# Velero Backup Configuration
+
+## Overview
+Velero provides backup and disaster recovery for Kubernetes clusters using OpenStack Swift for object storage and vSphere CSI for volume snapshots.
+
+## Key Configuration Choices
+
+### Storage Backend
+```yaml
+backupStorageLocation:
+  - name: iad3-flex-dei7343-a9256
+    provider: community.openstack.org/openstack
+    bucket: k8s-dr-velero
+```
+**Why**: Uses OpenStack Swift via the community plugin for backup metadata storage.
+
+**Note**: Initially attempted to use the AWS S3 plugin with Swift's S3-compatible endpoint, but Swift doesn't support AWS chunked uploads (used for large objects). The native OpenStack plugin provides better compatibility with Swift's API.
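+
+As a rough illustration, the values above should correspond to a `BackupStorageLocation` resource along these lines. This is a sketch of the expected shape, not output captured from the cluster: the `velero` namespace is assumed from the verification commands in this guide, and any plugin-specific `config` keys are omitted because they are not shown here.
+
+```yaml
+# Sketch only: approximate BackupStorageLocation for the values above
+apiVersion: velero.io/v1
+kind: BackupStorageLocation
+metadata:
+  name: iad3-flex-dei7343-a9256
+  namespace: velero   # assumed install namespace
+spec:
+  provider: community.openstack.org/openstack
+  objectStorage:
+    bucket: k8s-dr-velero
+```
+
+`kubectl get backupstoragelocation -n velero` reports the phase of this resource, which should read `Available` once the plugin can reach the Swift container.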
+
+### CSI Snapshot Integration
+```yaml
+configuration:
+  features: EnableCSI
+  defaultSnapshotMoveData: false
+  defaultVolumesToFsBackup: false
+  volumeSnapshotLocation: []
+```
+**Why**:
+- `features: EnableCSI`: Enables CSI snapshot support for volume backups
+- `defaultSnapshotMoveData: false`: Keeps CSI snapshots on the storage system instead of moving snapshot data to the backup repository by default
+- `defaultVolumesToFsBackup: false`: Prevents automatic file-level backups (opt-in only)
+- `volumeSnapshotLocation: []`: Disables legacy VolumeSnapshotLocation (CSI uses VolumeSnapshotClass instead)
+
+### VolumeSnapshotClass
+```yaml
+extraObjects:
+  - apiVersion: snapshot.storage.k8s.io/v1
+    kind: VolumeSnapshotClass
+    metadata:
+      name: velero-vsphere-snapshot-class
+      labels:
+        velero.io/csi-volumesnapshot-class: "true"
+    driver: csi.vsphere.vmware.com
+    deletionPolicy: Delete
+```
+**Why**: Defines how Velero creates CSI snapshots. The label `velero.io/csi-volumesnapshot-class: "true"` tells Velero to use this class for backups.
+
+### Credentials
+```yaml
+podEnvFrom:
+  - secretRef:
+      name: cloud-credentials
+```
+**Why**: The OpenStack plugin requires environment variables (`OS_AUTH_URL`, `OS_APPLICATION_CREDENTIAL_ID`, etc.) for authentication. The secret contains both individual env vars and a `cloud` key for Velero's credential file mount.
+
+### Node Agent
+```yaml
+deployNodeAgent: false
+```
+**Why**: Currently disabled. Kopia (the file-level backup engine) doesn't support the OpenStack Swift backend. Enabling it requires further research into:
+- Using the S3-compatible Swift endpoint for Kopia
+- Alternative storage backends for file-level backups
+- A hybrid approach with separate storage for file-level data vs metadata
+
+CSI snapshots provide sufficient backup coverage for current needs.
+
+## Common Pitfalls
+
+### Kopia Backend Incompatibility
+**Problem**: "invalid backend type community.openstack.org/openstack" errors during file-level backups.
+
+**Solution**: Kopia (used by node-agent) doesn't support OpenStack. Set `defaultVolumesToFsBackup: false` to use CSI snapshots by default.
+
+### Missing Environment Variables
+**Problem**: "Missing input for argument [auth_url]" authentication errors.
+
+**Solution**: The secret must be exposed as environment variables using `podEnvFrom`. Individual `OS_*` keys in the secret are required, not just a clouds.yaml file.
+
+### Swift Temp URL Authentication
+**Problem**: "401 Unauthorized: Temp URL invalid" errors.
+
+**Solution**: Both `OS_SWIFT_TEMP_URL_KEY` and `OS_SWIFT_TEMP_URL_DIGEST` are required in the credentials secret. These must match the temp URL key configured on the Swift container.
+
+```bash
+# Set temp URL key on Swift container (if not already set)
+swift post -m "Temp-URL-Key: "
+```
+
+The `OS_SWIFT_TEMP_URL_KEY` value must match the key set on the container, and `OS_SWIFT_TEMP_URL_DIGEST` specifies the hash algorithm (typically `sha256`).
+
+### VolumeSnapshotLocation Errors
+**Problem**: "spec.provider: Required value" during Helm upgrade.
+
+**Solution**: Set `volumeSnapshotLocation: []` to disable legacy snapshot locations. CSI snapshots use VolumeSnapshotClass instead.
+
+### Pod Security Standards
+**Problem**: node-agent DaemonSet fails with "violates PodSecurity" errors.
+
+**Solution**: The Velero namespace requires the `privileged` Pod Security Standard because the node-agent mounts hostPath volumes.
+
+## Required Secrets
+
+### cloud-credentials
+Contains OpenStack authentication credentials. Must include both environment variables and a `cloud` key.
+
+```yaml
+stringData:
+  # Environment variables for OpenStack plugin
+  OS_AUTH_URL: https://keystone.api.iad3.rackspacecloud.com/v3
+  OS_APPLICATION_CREDENTIAL_ID:
+  OS_APPLICATION_CREDENTIAL_SECRET:
+  OS_REGION_NAME: IAD3
+  OS_SWIFT_TEMP_URL_KEY:
+  OS_SWIFT_TEMP_URL_DIGEST: sha256
+```
+
+**Key Fields**:
+- `OS_AUTH_URL`: OpenStack Keystone endpoint (required)
+- `OS_APPLICATION_CREDENTIAL_ID`: Application credential ID (required)
+- `OS_APPLICATION_CREDENTIAL_SECRET`: Application credential secret (required)
+- `OS_REGION_NAME`: OpenStack region (required)
+- `OS_SWIFT_TEMP_URL_KEY`: Temp URL key for Swift authentication (required, must match container setting)
+- `OS_SWIFT_TEMP_URL_DIGEST`: Hash algorithm for temp URLs (required, typically `sha256`)
+
+## Verification
+```bash
+# Check backup storage location
+kubectl get backupstoragelocation -n velero
+
+# Verify CSI snapshot class
+kubectl get volumesnapshotclass velero-vsphere-snapshot-class
+
+# Test backup
+velero backup create test --include-namespaces=default
+
+# Check backup status
+velero backup describe test --details
+
+# View backup logs
+velero backup logs test
+```
+
+## Backup Usage
+
+### CSI Snapshot Backup (Current Method)
+```bash
+velero backup create my-backup --include-namespaces=myapp
+```
+
+### File-Level Backup (Future)
+File-level backups via node-agent are currently disabled and require:
+- Compatible storage backend (Kopia doesn't support OpenStack)
+- Additional testing and validation
+- Possible migration to S3-compatible Swift endpoint or alternative backend
+
+For now, all backups use CSI snapshots exclusively.
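+
+The CSI snapshot backup command can also be written declaratively as a `Backup` resource. The manifest below is a minimal sketch that assumes Velero runs in the `velero` namespace and reuses the example names `my-backup` and `myapp`; the `ttl` value is purely illustrative and not taken from this setup.
+
+```yaml
+# Sketch only: declarative form of the `velero backup create` example
+apiVersion: velero.io/v1
+kind: Backup
+metadata:
+  name: my-backup
+  namespace: velero        # assumed install namespace
+spec:
+  includedNamespaces:
+    - myapp
+  snapshotVolumes: true    # take volume snapshots (CSI, per the settings in this guide)
+  ttl: 720h0m0s            # illustrative retention, not a value from this setup
+```
+
+Applying it with `kubectl apply -f` should behave like the CLI command, and `velero backup describe my-backup --details` can be used to inspect the result in the same way.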
diff --git a/docs/vsphere-csi-config-guide.md b/docs/vsphere-csi-config-guide.md
new file mode 100644
index 0000000..9bf09fd
--- /dev/null
+++ b/docs/vsphere-csi-config-guide.md
@@ -0,0 +1,114 @@
+# vSphere CSI Driver Configuration
+
+## Overview
+The vSphere CSI driver provides persistent storage and snapshot capabilities for Kubernetes workloads running on vSphere infrastructure.
+
+## Key Configuration Choices
+
+### Snapshot Support
+```yaml
+controller:
+  replicaCount: 3
+  config:
+    block-volume-snapshot: true
+  snapshotter:
+    image:
+      registry: registry.k8s.io
+      repository: sig-storage/csi-snapshotter
+      tag: v8.2.0
+```
+**Why**:
+- `block-volume-snapshot: true`: Enables block volume snapshot capability in the CSI driver
+- `snapshotter` sidecar: Required for the CSI controller to handle VolumeSnapshot requests from Velero
+
+Both settings are required for CSI snapshot functionality.
+
+### Snapshot Controller
+```yaml
+snapshot:
+  controller:
+    enabled: true
+```
+**Why**: Deploys the snapshot-controller, which watches VolumeSnapshot resources and coordinates with the CSI driver to create snapshots.
+
+## Common Pitfalls
+
+### Missing Snapshotter Sidecar
+**Problem**: VolumeSnapshots stuck in "Waiting for CSI driver" state.
+
+**Solution**: The `controller.snapshotter` configuration must be present in the Helm values. The snapshotter sidecar container is NOT enabled by default and must be explicitly configured.
+
+**Verification**:
+```bash
+# List container names across all pods in the namespace
+kubectl get pod -n vmware-system-csi -o jsonpath='{.items[*].spec.containers[*].name}'
+```
+Should include `csi-snapshotter` in the output.
+
+### Pod Security Standards
+**Problem**: CSI pods fail to start with "violates PodSecurity" errors.
+
+**Solution**: The vmware-system-csi namespace requires the `privileged` Pod Security Standard due to hostPath volumes and privileged containers.
+
+```yaml
+metadata:
+  labels:
+    pod-security.kubernetes.io/enforce: privileged
+```
+
+## Required Secrets
+
+### vsphere-config-secret (CSI Driver)
+Contains vSphere connection details for the CSI driver. Key: `csi-vsphere.conf`
+
+```ini
+[Global]
+cluster-id = "k8s-dr"
+
+[VirtualCenter "vcenter.example.com"]
+insecure-flag = "true"
+user = "administrator@vsphere.local"
+password = "password"
+port = "443"
+datacenters = "Datacenter1"
+```
+
+**Key Fields**:
+- `cluster-id`: Unique identifier for this Kubernetes cluster
+- `insecure-flag`: Set to "true" for self-signed certificates
+- `datacenters`: vSphere datacenter name(s)
+
+### vsphere-cpi-secret (Cloud Provider Interface)
+Contains vSphere configuration for the CPI. Key: `vsphere.conf`
+
+```yaml
+global:
+  port: 443
+  insecureFlag: true
+
+vcenter:
+  vcenter-name:
+    server: vcenter.example.com
+    user: administrator@vsphere.local
+    password: "password"
+    datacenters:
+      - Datacenter1
+```
+
+**Key Fields**:
+- `vcenter-name`: Arbitrary name for this vCenter (used as identifier)
+- `server`: vCenter hostname or IP
+- `datacenters`: List of datacenter names
+
+**Note**: Both secrets use the same vSphere credentials but different formats (INI vs YAML).
+
+## Verification
+```bash
+# Check CSI driver is registered
+kubectl get csidrivers csi.vsphere.vmware.com
+
+# Verify snapshot controller is running
+kubectl get pods -n vmware-system-csi | grep snapshot-controller
+
+# Test snapshot capability
+kubectl get volumesnapshotclass
+```
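+
+Listing snapshot classes only confirms that a class exists. To exercise the whole snapshot path, one option is to create a throwaway `VolumeSnapshot`. The manifest below is a sketch under stated assumptions: `test-snapshot` and `my-pvc` are hypothetical names, a class named `velero-vsphere-snapshot-class` is assumed to exist for the `csi.vsphere.vmware.com` driver, and `my-pvc` is assumed to be a PVC provisioned by that driver in the `default` namespace.
+
+```yaml
+# Sketch only: manual snapshot to check the CSI snapshot path end to end
+apiVersion: snapshot.storage.k8s.io/v1
+kind: VolumeSnapshot
+metadata:
+  name: test-snapshot      # hypothetical name
+  namespace: default
+spec:
+  volumeSnapshotClassName: velero-vsphere-snapshot-class   # assumed to exist
+  source:
+    persistentVolumeClaimName: my-pvc   # hypothetical PVC backed by the vSphere CSI driver
+```
+
+Once `kubectl get volumesnapshot test-snapshot -n default` shows `READYTOUSE` as `true`, the snapshot path is working; the test snapshot can be deleted afterwards.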