
Conversation

@Foxboron
Contributor

@Foxboron Foxboron commented Apr 28, 2025

WIP draft PR for the feature. So far it works, but it's a bit slow on my machine due to the ID remapping, which should probably be investigated.

  • Investigate the ID remapping
  • I'm unsure about the different migration options
  • Only support vers=4.2, should be fine?
  • Error out on a missing source
  • Are there any specific options we should include?
  • The Info array needs some QA; I'm not sure I understand all of it.
  • NFS can't support xattrs; do we need to handle this anywhere other than in migrate?

Fixes: #1311

@Foxboron
Contributor Author

Foxboron commented May 8, 2025

@stgraber if you have any opinions or pointers on the checkboxes, feel free to look this over and I can investigate a bit.

I suspect some research needs to be done to figure out how we should interact with the uid/gid and squashing behavior of NFS mounts.

@bensmrs
Contributor

bensmrs commented Jul 21, 2025

Hi! Is this PR stalled? Do you need help?

@Foxboron
Contributor Author

@bensmrs I think I need a bit of help to make sure that the Info struct is correct and that I'm not missing any details from the migration steps. I was hoping @stgraber had time to point me in the right direction on this part.

@stgraber
Member

Ah, I just left a few comments in the Info function now.

@bensmrs
Contributor

bensmrs commented Jul 21, 2025

I was hoping @stgraber had time to point me in the right direction on this part.

Well now you’re served :þ

I can help review, fix stuff, and even write tests; don’t hesitate to ping me. I’d actually be happy to see it working pretty soon.

@Foxboron
Contributor Author

@bensmrs thanks!

I probably won't touch this in July, and there's a hacker camp and work stuff happening in August on my end that might not leave me a lot of energy to pick this up after hours.

I'd appreciate some pointers on the ID remapping Incus does and how that should play with the ID squashing NFS does. I think there should be some guidance and testing there to make sure it behaves as expected. The remapping also takes a bunch of time, which might not be needed if NFS reassigns UIDs/GIDs anyway.

If you have time to write tests and stuff I'd be happy to give you access to my fork.

@stgraber
Member

I don't believe that VFS idmap works on top of NFS at this point, so we'd be dealing with traditional shifting where Incus rewrites all the uid/gid as needed.

What you want to ensure is that NFS is mounted the same way on all machines and that no uid/gid squashing is performed; then it should work fine.
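
Roughly speaking, and just as a toy illustration (not the actual Incus code), that traditional shifting boils down to walking the tree and rewriting every uid/gid by a fixed offset into the host range, e.g. container uid 0 becoming host uid 1000000:

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// shiftTree offsets the uid/gid of every file under root, which is roughly
// what a traditional (non-idmapped) shift does to on-disk ownership.
func shiftTree(root string, offset int) error {
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		st, ok := info.Sys().(*syscall.Stat_t)
		if !ok {
			return fmt.Errorf("no stat info for %q", path)
		}
		// Lchown so symlinks themselves are shifted rather than their targets.
		return os.Lchown(path, int(st.Uid)+offset, int(st.Gid)+offset)
	})
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: shift <path>")
		os.Exit(1)
	}

	if err := shiftTree(os.Args[1], 1000000); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}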

@Foxboron
Contributor Author

Should we call the driver nfs4, to make sure we don't end up in a weird situation down the line if we ever want to support a v5 alongside the v4 code? Is that realistic?

@stgraber
Member

I think we should stick to nfs, as nfs4 may give the impression that we don't support NFSv3, whereas the feature set we really need works perfectly fine on NFSv3; if anything, some of the bits in NFSv4 may be problematic (built-in uid/gid mapping and such).

Exactly what kind of server version and configuration we can handle is probably something that's best addressed through the documentation for the driver.

@symgryph

NFS would be SO nice. Just an interested party!

@Foxboron Foxboron force-pushed the morten/nfs branch 2 times, most recently from b617326 to 49f1676 on October 19, 2025 15:00
@Foxboron Foxboron marked this pull request as ready for review October 19, 2025 15:00
@Foxboron
Contributor Author

@bensmrs

I made some changes so we can pass IPv6 source= paths. It's not great, but I couldn't come up with a more reliable way to split up ::1:/somepath or similar paths.
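
For illustration, here's a minimal sketch of the idea (not necessarily the exact code in the PR): since the exported path always starts with "/", splitting on the last ":" that directly precedes a "/" keeps IPv6 hosts such as ::1 intact.

package main

import (
	"fmt"
	"strings"
)

// splitNFSSource splits "[host]:<path>" into its host and path parts.
// The remote path always starts with "/", so the host/path boundary is
// the last ":" that is immediately followed by a "/".
func splitNFSSource(source string) (host string, path string) {
	idx := strings.LastIndex(source, ":/")
	if idx < 0 {
		// No host part; the whole string is a local path.
		return "", source
	}

	return source[:idx], source[idx+1:]
}

func main() {
	fmt.Println(splitNFSSource("::1:/srv/nfs"))      // ::1 /srv/nfs
	fmt.Println(splitNFSSource("10.1.0.227:/media")) // 10.1.0.227 /media
	fmt.Println(splitNFSSource("/media"))            // (empty host) /media
}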

What sort of testing do we want on this? Locally, creating pools, containers, and VMs works, as does attaching additional volumes.

@bensmrs
Contributor

bensmrs commented Oct 19, 2025

What sort of testing do we want on this? Locally, creating pools, containers, and VMs works, as does attaching additional volumes.

We’d basically extend the available filesystem drivers in the test matrix, to test everything we’re already testing:

backend:
- dir
- btrfs
- lvm
- zfs
- ceph
- linstor
- random

The test initialization routines allow you to define how your driver should be initialized in test/backends, although that’s not enough on its own. I can have a quick look at it as soon as I’m done with my other PR.

@Foxboron Foxboron force-pushed the morten/nfs branch 2 times, most recently from 0080a77 to 7070ad5 on October 19, 2025 20:32
@github-actions github-actions bot added the Documentation (Documentation needs updating) label on Oct 19, 2025
@bensmrs
Contributor

bensmrs commented Oct 20, 2025

I started looking at the tests and am a bit confused by the following error:

> incus storage create incustest-KAg nfs source=10.1.0.227:/media
Error: NFS driver requires "nfs.host" to be specified or included in "source": [<remote host>:]<remote path>

Did I do something wrong here?
IMO, if you have an nfs.host option, I don’t think you should look for a host in source. Just make source be the path on nfs.host, or on localhost if no nfs.host is provided.

@Foxboron
Contributor Author

Did I do something wrong here?

Uhh, so it turns out that the Makefile has moved from building things into build/ to running go install. So there is a slight chance I missed testing an iteration while working on this.

IMO, if you have an nfs.host option, I don’t think you should look for a host in source. Just make source be the path on nfs.host, or on localhost if no nfs.host is provided.

I was thinking of mimicking the behavior of truenas.host and source from the truenas driver, where you can just specify nfs.host globally and pass a path to source. But I suspect I haven't nailed the entire behavior there. I'm a little unsure what the most ergonomic behavior for the options would be that also harmonizes with the existing drivers.
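
Something like this is what I had in mind, if the truenas-style split ends up being the final behavior (hypothetical invocations; the two forms would be equivalent):

incus storage create pool1 nfs source=10.1.0.227:/media
incus storage create pool2 nfs nfs.host=10.1.0.227 source=/media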

@bensmrs
Contributor

bensmrs commented Oct 20, 2025

Uhh, so it turns out that the Makefile has moved from building things into build/ to running go install. So there is a slight chance I missed testing an iteration while working on this.

No problem, I ended up using nfs.host for driver initialization.

I was thinking of mimicking the behavior of truenas.host and source from the truenas driver, where you can just specify nfs.host globally and pass a path to source. But I suspect I haven't nailed the entire behavior there. I'm a little unsure what the most ergonomic behavior for the options would be that also harmonizes with the existing drivers.

Ok I can’t beat this argument :)

I’m getting some permission denied errors in my tests; I’m investigating.

@bensmrs
Contributor

bensmrs commented Oct 20, 2025

So, maybe I’m missing something when initializing the pool, but I can’t get past permission errors. Container creation and launch work well, but that’s not enough, see https://github.com/bensmrs/incus/actions/runs/18648669885/job/53161380407?pr=4

Am I missing something in my setup? See

echo "/media 10.0.0.0/8(rw,sync,no_subtree_check,no_root_squash,no_all_squash)" | sudo tee /etc/exports
sudo exportfs -a

@Foxboron
Contributor Author

Foxboron commented Oct 20, 2025

Hmmm, I've only tried incus exec container bash and manually creating files on NFS volumes on both ends, and it worked. Are we sure that /media doesn't have any weird permissions on Ubuntu? Does it exist, etc.?

@bensmrs
Contributor

bensmrs commented Oct 20, 2025

Container launch works, so regular operations don’t seem to cause any problem.
I’ll debug a bit more later; I was just wondering whether your setup had anything specific other than the squash options.

@bensmrs
Contributor

bensmrs commented Oct 20, 2025

Well, I’m currently out of ideas. user_is_instance_user fails with a permission error. We can’t list /root from within the container: it appears owned by 0:0, whoami fails because user 0 is not found, and from the host the mount appears owned by 1000000:1000000. I don’t know where to go from there; maybe I’m missing something obvious happening in the OpenFGA tests.

@bensmrs
Contributor

bensmrs commented Oct 28, 2025

And couldn’t an idmapped bind mount be a solution? The host mounts the NFS shares somewhere, then bind-mounts it to /var/lib/incus/containers/<instance>/ with the proper shift.

Or maybe it’s bindfs-specific, in which case we’re back to using FUSE…

@stgraber
Member

And couldn’t an idmapped bind mount be a solution? The host mounts the NFS shares somewhere, then bind-mounts it to /var/lib/incus/containers/<instance>/ with the proper shift.

Or maybe it’s bindfs-specific, in which case we’re back to using FUSE…

That's 1): an idmapped bind mount requires per-filesystem kernel code to handle it, and NFS does not have support for it. FUSE in general does, but it requires each filesystem to add some logic, which is why I also brought up 2) as a quicker option to get something done.

@bensmrs
Contributor

bensmrs commented Oct 28, 2025

Ok, I had in mind bindfs’ uid-offset and gid-offset options and just thought that maybe this translation mechanism existed for bind mounts (the bind of bindfs made my brain shortcut to my previous question).

I’ll have a quick look at fuse-nfs.

@stgraber stgraber marked this pull request as draft October 29, 2025 01:16
@bensmrs
Contributor

bensmrs commented Oct 29, 2025

https://github.com/bensmrs/fuse-nfs should be ok-ish for FUSE3; I’ll try to implement VFS idmap tomorrow.

@bensmrs
Contributor

bensmrs commented Oct 29, 2025

My patches should now work (I also integrated a long-waiting PR that I feel is desirable for us), but testing it will probably be painful. Are there quick ways for me to test the patched libfuse, or do I have to compile @mihalicyn’s tree? (in which case, big meh on my side as I have little time for it).
Or you could test on your side @stgraber if you already have everything ready…

@stgraber
Member

Cool!

If we get this working, we'd probably do a scheme where /var/lib/storage-pools/my-nfs is the NFS kernel mount and /var/lib/storage-pools/my-nfs/.fuse is the same share mounted over FUSE. Then whenever we're dealing with an unprivileged container, we'd use the .fuse path, and everything else goes through the kernel client.
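
In driver terms that could be as simple as picking the mount point based on whether the instance is unprivileged (a hypothetical sketch, names made up):

package main

import (
	"fmt"
	"path/filepath"
)

// instanceMountPath returns the path an instance should use: unprivileged
// containers go through the FUSE mount at <pool>/.fuse, everything else
// uses the kernel NFS mount at the pool root.
func instanceMountPath(poolMountPoint string, unprivileged bool) string {
	if unprivileged {
		return filepath.Join(poolMountPoint, ".fuse")
	}

	return poolMountPoint
}

func main() {
	fmt.Println(instanceMountPath("/var/lib/storage-pools/my-nfs", true))  // /var/lib/storage-pools/my-nfs/.fuse
	fmt.Println(instanceMountPath("/var/lib/storage-pools/my-nfs", false)) // /var/lib/storage-pools/my-nfs
}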

@Foxboron
Contributor Author

Foxboron commented Nov 7, 2025

@stgraber thanks for investigating the issue!

What's the proper way forward for this? Try the FUSE strategy? I guess that means we'd also need more dependency packages in downstreams?

@stgraber
Member

stgraber commented Nov 7, 2025

I think the next steps should be:

  • Try to get the FUSE filesystem to run on FUSE3 and support VFS idmap (I think @bensmrs was looking into that)
  • Get the driver to detect the presence of the FUSE client and if present, mount both the kernel version and user space version of the share
  • Tweak the storage driver to adjust its featureset based on what's present on the system

Given the new FUSE client is going to take a little while to be packaged in all distros, I don't want us to hard-depend on it.

On my end, I'll be happy to just build and ship the client as part of the Zabbly package, so we can at least get it out to a whole bunch of users (including folks using IncusOS) pretty rapidly while the distro packagers get it through to the other distros.
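
For the detection part of the second step, it could be as simple as looking for the userspace client on the PATH (a hypothetical sketch; the binary name and the resulting behavior are assumptions):

package main

import (
	"fmt"
	"os/exec"
)

// haveFUSEClient reports whether a userspace NFS FUSE client appears to be
// installed; "fuse-nfs" is an assumed binary name here.
func haveFUSEClient() bool {
	_, err := exec.LookPath("fuse-nfs")
	return err == nil
}

func main() {
	if haveFUSEClient() {
		fmt.Println("FUSE client present: mount both the kernel and user-space versions of the share")
	} else {
		fmt.Println("no FUSE client: kernel mount only, with a reduced feature set")
	}
}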

@bensmrs
Contributor

bensmrs commented Nov 7, 2025

Try to get the FUSE filesystem to run on FUSE3 and support VFS idmap (I think @bensmrs was looking into that)

I’d half-confidently say that my fork does that already. The blocker for me is to have a kernel with the necessary patches to make it run. I don’t really have time planned to build such a kernel before, say, December. I’d be more than happy to steal a disk image to deploy on my staging environment though ^^

@stgraber
Member

stgraber commented Nov 9, 2025

I’d half-confidently say that my fork does that already. The blocker for me is to have a kernel with the necessary patches to make it run. I don’t really have time planned to build such a kernel before, say, December. I’d be more than happy to steal a disk image to deploy on my staging environment though ^^

Ah, what do you need on the kernel side? I thought we'd had FUSE VFS idmap for a bit now (6.12 apparently).

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

I adapted the code from the patch you linked, which uses #if FUSE_VERSION >= FUSE_MAKE_VERSION(3, 18) conditions. FUSE 3.18 doesn’t seem to be a thing yet (and my testing environment running Trixie indeed only has 3.17).
So I guessed that the VFS patches didn’t land in mainline kernels.

@stgraber
Member

stgraber commented Nov 9, 2025

I adapted the code from the patch you linked, which uses #if FUSE_VERSION >= FUSE_MAKE_VERSION(3, 18) conditions. FUSE 3.18 doesn’t seem to be a thing yet (and my testing environment running Trixie indeed only has 3.17). So I guessed that the VFS patches didn’t land in mainline kernels.

That's odd because I'm definitely seeing the commits in my clone of Linus' tree.

@stgraber
Member

stgraber commented Nov 9, 2025

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

I remember the build breaking when removing the version condition. Checking libfuse source code, d393ffa85b0926374c8df543a9ffc81b1d0ce232 suggests some kind of idmap support, but struct fuse_ctx has not been patched (so we may need to partially apply https://github.com/mihalicyn/libfuse/tree/idmap_support).

I’m not comfortable with Incus/Zabbly providing patched libfuse, though, so maybe a nudge to upstream would be good.

@stgraber
Member

stgraber commented Nov 9, 2025

Yeah, I don't mind doing the patched libfuse dance in /opt/incus/lib like we do for some other libraries, but that's something I'd only really want to do after upstream has merged the needed support.

I pinged @mihalicyn about it.

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

And to think that it just started with “it would be nice to have NFS support” :D

@stgraber
Member

stgraber commented Nov 9, 2025

Yeah, that one became complicated pretty quick :)

@Foxboron
Contributor Author

Foxboron commented Nov 9, 2025

I took your word for it when you said it would "probably" be easy 🫠

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

It’s always nice when the issue you’ve been working on for 6 months is marked “Easy” (which, as the label description reminds us, means “Good for new contributors”) :)

@stgraber
Member

stgraber commented Nov 9, 2025

I took your word for it when you said it would "probably" be easy 🫠

Haha, yeah, the storage logic really should have been pretty easy: take the dir backend and make it act as clustered, similar to lvmcluster.

Now the underlying NFS protocol being annoying with idmap wasn't something I had seen coming ;)
It kinda makes me want to look at CIFS, see if we have the same problem over there...

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

It kinda makes me want to look at CIFS

Said… no one? ever?

@stgraber
Member

stgraber commented Nov 9, 2025

It kinda makes me want to look at CIFS

Said… no one? ever?

Well, in this case, the fact that it comes from more of a Windows world where UIDs/GIDs are not really a thing makes me hopeful that the protocol doesn't just carry the user creds and file details, but instead has the kernel module handle the checks and then tell the remote storage what to write.

If that's the case, then it should work fine here. NFS is only broken because it's too UNIX :)

@bensmrs
Contributor

bensmrs commented Nov 9, 2025

Ok, interesting. Now I’m curious too!

@stgraber
Member

@mihalicyn provided this one as an example of conversion

https://gitlab.com/virtio-fs/virtiofsd/-/merge_requests/245/diffs

@bensmrs
Contributor

bensmrs commented Nov 10, 2025

I’m confused. What question are you answering with this message?

@stgraber
Member

I’m confused. What question are you answering with this message?

How to get a FUSE filesystem to use the VFS idmap stuff without needing libfuse changes.

@bensmrs
Contributor

bensmrs commented Nov 11, 2025

Ok, I’ll see how this can be translated to my code then.

@bensmrs
Contributor

bensmrs commented Nov 13, 2025

So, after a frustrating attempt at making it work, I realized that even though the needed patches are in the main branch of libfuse, they’re not in any released version, so I need to compile libfuse myself. It also looks like Trixie’s stock kernel isn’t fit for these patches, so I also had to install the backported kernel. I should be nearing something that might work pretty soon (yeah, that doesn’t exude confidence).

@bensmrs
Contributor

bensmrs commented Nov 13, 2025

Well, I give up; I can’t get any of my attempts to work, even though the proper capabilities are advertised to and by FUSE. Either I’m missing some pretty obvious wiring, or something’s just not ready yet. I’ve left two branches on my fork in case someone wants to have a look.


Labels

API (Changes to the REST API), Documentation (Documentation needs updating)
