cluster-api-janitor-openstack is a Kubernetes operator that cleans up resources created in OpenStack by the OpenStack Cloud Controller Manager (OCCM) and the Cinder CSI plugin for Kubernetes clusters created using the Cluster API OpenStack infrastructure provider.
cluster-api-janitor-openstack can be installed using Helm:

```sh
helm repo add \
  cluster-api-janitor-openstack \
  https://azimuth-cloud.github.io/cluster-api-janitor-openstack

# Use the latest version from the main branch
helm upgrade \
  cluster-api-janitor-openstack \
  cluster-api-janitor-openstack/cluster-api-janitor-openstack \
  --install
```

We use tox to run unit tests and linters across the code. To run all the checks, including attempts to automatically fix linting issues, run:

```sh
tox
```

You can run individual unit tests by running:

```sh
tox -e py3 -- <name-of-unit-test>
```

Note that failures on your initial tox run may be automatically fixed where possible, so your second tox run may pass. This way we can run the default tox target in CI.
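To confirm that the release deployed, you can check its status. The pod label below is an assumption based on common Helm chart conventions, not taken from this chart:

```sh
# Show the status of the Helm release
helm status cluster-api-janitor-openstack

# List the operator pods; the label selector is illustrative and may differ for this chart
kubectl get pods -l app.kubernetes.io/name=cluster-api-janitor-openstack
```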
cluster-api-janitor-openstack will always clean up Octavia loadbalancers, and associated floating IPs, that are created by the OCCM for LoadBalancer services on Cluster API clusters.
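For context, a Service like the following causes the OCCM to provision an Octavia loadbalancer (and possibly a floating IP) in OpenStack; these are the resources the janitor removes when the cluster is deleted. The Service below is purely illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  # The OCCM provisions an Octavia loadbalancer to back this Service
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```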
By default, Cinder volumes created by the Cinder CSI plugin for PersistentVolumeClaims are also cleaned up. However, this behaviour carries a risk of deleting important data, so it can be customised in two ways.

The operator default can be changed to `keep`, meaning that volumes provisioned by the Cinder CSI plugin will be kept unless overridden by the cluster:

```sh
helm upgrade ... --set defaultVolumesPolicy=keep
```

Regardless of the operator default, individual OpenStackClusters can also be annotated to indicate whether volumes for that cluster should be kept or removed:
```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
  name: my-cluster
  annotations:
    janitor.capi.stackhpc.com/volumes-policy: "keep|delete"
```

NOTE: Any value other than `delete` means volumes will be kept.
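For example, to opt an existing cluster out of volume deletion, the annotation can be applied with kubectl (a minimal sketch, assuming an OpenStackCluster named my-cluster in the current namespace):

```sh
# Mark volumes for this cluster to be kept when it is deleted
kubectl annotate openstackcluster my-cluster \
  janitor.capi.stackhpc.com/volumes-policy=keep --overwrite
```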
Annotations on the Kubernetes resources are only available to administrators with access to the Cluster API management cluster's Kubernetes API. The Janitor therefore also provides an alternative, user-facing mechanism for marking volumes that should not be deleted during cluster cleanup. This is done by adding a property to the OpenStack volume using:

```sh
openstack volume set --property janitor.capi.azimuth-cloud.com/keep='true' <volume-name-or-id>
```

Any value other than 'true' will result in the volume being deleted when the workload cluster is deleted.
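Should the volume need to be cleaned up with the cluster after all, the property can be removed again using the standard OpenStack CLI:

```sh
# Remove the keep property so the volume is deleted along with the cluster again
openstack volume unset --property janitor.capi.azimuth-cloud.com/keep <volume-name-or-id>
```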
cluster-api-janitor-openstack watches for OpenStackClusters being created and adds its own finalizer to them. This prevents the OpenStackCluster, and hence the corresponding Cluster API Cluster, from being removed until the finalizer is removed.

cluster-api-janitor-openstack then waits for the OpenStackCluster to be deleted (specifically, it waits for the `deletionTimestamp` to be set, indicating that a deletion has been requested), at which point it uses the credential from `OpenStackCluster.spec.identityRef` to remove any dangling resources that were created by the OCCM or Cinder CSI with the same cluster name as the cluster being deleted.

The cluster name is determined by the `cluster.x-k8s.io/cluster-name` label on the OpenStackCluster resource, if present. If the label is not set, the name of the OpenStackCluster resource (`metadata.name`) is used instead.

Once all the resources have been deleted, the finalizer is removed.
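To illustrate the sequence, an OpenStackCluster that the janitor is still processing might look roughly like the following; the finalizer string is illustrative rather than the exact value the operator uses:

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
  name: my-cluster
  labels:
    # Used as the cluster name for matching OCCM/Cinder CSI resources, if present
    cluster.x-k8s.io/cluster-name: my-cluster
  finalizers:
    # Illustrative finalizer; blocks removal until cleanup completes
    - janitor.capi.stackhpc.com
  # Set by Kubernetes once deletion has been requested
  deletionTimestamp: "2024-01-01T00:00:00Z"
spec:
  identityRef:
    # Credential the janitor uses to delete dangling OpenStack resources
    kind: Secret
    name: my-cluster-cloud-credentials
```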
WARNING

The cluster name used by the OCCM and Cinder CSI must be set to the `metadata.name` of the OpenStackCluster resource, or to the value of the `cluster.x-k8s.io/cluster-name` label if it is present on the OpenStackCluster resource. For instance, the `openstack-cluster` chart from the capi-helm-charts ensures that this happens automatically, setting the OpenStackCluster's `metadata.name` as the cluster name for OCCM and Cinder CSI.
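In practice this typically means passing matching arguments to both components; the sketch below uses the upstream `--cluster-name` option of the cloud controller manager and the `--cluster` option of the Cinder CSI plugin, with an illustrative value:

```yaml
# OCCM container args (illustrative value)
- --cluster-name=my-cluster

# Cinder CSI controller plugin container args (illustrative value)
- --cluster=my-cluster
```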
The advantage of this approach over a task that runs before cluster deletion starts is that the external resource deletion happens after all the machines have been deleted, meaning there is no chance of racing with the OCCM and/or Cinder CSI still running on the cluster, which might otherwise keep recreating resources as they are cleaned up.

It is also not possible to run this cleanup as a post-cluster-deletion task, because some of the resources created by the OCCM may actually block cluster deletion completely. For example, a loadbalancer created by the OCCM for a LoadBalancer service maintains a port on the cluster network, meaning that the network cannot be cleaned up by the Cluster API OpenStack provider, which prevents deletion of the cluster.