|
| 1 | +# Overview |
| 2 | + |
| 3 | +The Slurm appliance supports mounting shared filesystems using [CephFS](https://docs.ceph.com/en/latest/cephfs/) via [OpenStack Manila](https://wiki.openstack.org/wiki/Manila). These docs explain: |
| 4 | + |
| 5 | +- How to create the shares in OpenStack Manila. |
| 6 | + |
| 7 | +- How to configure the Slurm Appliance to mount these Manila shares. |
| 8 | + |
| 9 | +- How to use Manila shares for a shared home directory. |
| 10 | + |
| 11 | +## Creating shares in OpenStack |
| 12 | + |
| 13 | +The Slurm appliance requires that the Manila shares already exist on the system. Follow the instructions below to create them. |
| 14 | + |
| 15 | +If this is the first time Manila is being used on the system, a CephFS share type will need to be created. You will need admin credentials to do this. |
| 16 | + |
| 17 | + ```bash |
| 18 | + openstack share type create cephfs-type false --extra-specs storage_protocol=CEPHFS vendor_name=Ceph |
| 19 | + ``` |
| 20 | + |
| 21 | +Once this exists, create a share using credentials for the Slurm project. An access rule also needs to be created, where the `access_to` argument (`openstack share access create <share> <access_type> <access_to>`) is the name of a user that will be created in Ceph. This name must be globally unique in Ceph, so it needs to be different for each OpenStack project. |
| 22 | + |
| 23 | + ```bash |
| 24 | + openstack share create CephFS 300 --description 'Scratch dir for Slurm prod' --name slurm-production-scratch --share-type cephfs-type --wait |
| 25 | + openstack share access create slurm-production-scratch cephx slurm-production |
| 26 | + ``` |
| 27 | + |
| 28 | +## Configuring the Slurm Appliance for Manila |
| 29 | + |
| 30 | +To mount shares onto hosts in a group, add them to the `manila` group. |
| 31 | + |
| 32 | + ```ini |
| 33 | + [manila:children] |
| 34 | + login |
| 35 | + compute |
| 36 | + ``` |
| 37 | + |
| 38 | +Set the version of Ceph which is running on the system. |
| 39 | + |
| 40 | + ```yaml |
| 41 | + os_manila_mount_ceph_version: "18.2.4" |
| 42 | + ``` |
| 43 | + |
| 44 | +Define the list of shares to be mounted, and the paths to mount them to. See the [stackhpc.os-manila-mount role](https://github.com/stackhpc/ansible-role-os-manila-mount) for further configuration options. |
| 45 | + |
| 46 | + ```yaml |
| 47 | + os_manila_mount_shares: |
| 48 | +   - share_name: slurm-production-scratch |
| 49 | +     mount_path: /scratch |
| 50 | + ``` |
| 51 | + |
| 52 | +### Shared home directory |
| 53 | + |
| 54 | +By default, the Slurm appliance will spin up a local NFS server and mount the home directories from it. When using Manila + CephFS for the home directory instead, this will need to be disabled. |
| 55 | + |
| 56 | + ```yaml |
| 57 | + nfs_configurations: [] |
| 58 | + ``` |
| 59 | + |
| 60 | +The `basic_users` home directory settings will need to be updated to point to this new shared directory. |
| 61 | + |
| 62 | + ```yaml |
| 63 | + basic_users_homedir_server: "{{ groups['login'] | first }}" # if not mounting /home on control node |
| 64 | + basic_users_homedir_server_path: /home |
| 65 | + ``` |
| 66 | + |
| 67 | +Set the Tofu variable `home_volume_size = 0` to stop Tofu from creating a new home volume. NB: If the control node has already been deployed, re-running Tofu will delete the home volume and delete/recreate the control node. |
| 68 | + |
| 69 | +Finally, add the home directory to the list of shares (the share must already exist in OpenStack). |
| 70 | + |
| 71 | + ```yaml |
| 72 | + os_manila_mount_shares: |
| 73 | +   - share_name: slurm-production-scratch |
| 74 | +     mount_path: /scratch |
| 75 | +   - share_name: slurm-production-home |
| 76 | +     mount_path: /home |
| 77 | + ``` |
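
Once the appliance has been configured, the resulting mounts can be checked on a node. The following is a minimal sketch and not part of the appliance: `check_ceph_mounts` is a hypothetical helper, and it assumes the shares are mounted with the kernel CephFS client, which reports the filesystem type `ceph` in `/proc/mounts` (a FUSE mount would report a different type and be flagged as missing).

```bash
# Hypothetical helper (not provided by the appliance): report whether each
# expected path appears as a CephFS mount in a /proc/mounts-style file.
# Assumption: kernel CephFS mounts, which show filesystem type "ceph".
check_ceph_mounts() {
  local mounts_file="$1"
  shift
  local path status=0
  for path in "$@"; do
    # Field 2 is the mount point and field 3 is the filesystem type.
    if awk -v p="$path" '$2 == p && $3 == "ceph" { found = 1 } END { exit !found }' "$mounts_file"; then
      echo "ok: $path"
    else
      echo "missing: $path"
      status=1
    fi
  done
  return "$status"
}

# Example usage on a login or compute node:
#   check_ceph_mounts /proc/mounts /scratch /home
```

The function returns a non-zero status if any path is missing, so it can be used as a simple post-deployment check with the paths from `os_manila_mount_shares`.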