feat: Clarify Rootless Runtime Requirements #4022
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,6 +21,10 @@ Ingress exposes HTTP and HTTPS routes from outside the cluster to services withi | |
> **NOTE**: You may also want to consider using [Gateway API](https://gateway-api.sigs.k8s.io/) instead of Ingress. | ||
> Gateway API has an [Ingress migration guide](https://gateway-api.sigs.k8s.io/guides/migrating-from-ingress/). | ||
|
||
> **WARNING**: If you are using a [rootless container runtime], ensure your host is | ||
> properly configured before creating the KinD cluster. Most Ingress and Gateway controllers will | ||
Review comment: This paragraph does not seem meaningful; no program works unless it is properly configured.
Review comment: kind's installation instructions and quickstart assume a rootful container runtime. There is no additional configuration required beyond installing said runtime and the On Fedora (42) with podman (rootless),
Review comment: To me,
Review comment: I had to run Didn't you need that?
||
> not work if these steps are skipped. | ||
|
||
### Create Cluster | ||
|
||
#### Option 1: LoadBalancer | ||
|
@@ -139,3 +143,4 @@ curl localhost/bar | |
|
||
[LoadBalancer]: /docs/user/loadbalancer/ | ||
[Cloud Provider KIND]: /docs/user/loadbalancer/ | ||
[rootless container runtime]: /docs/user/rootless/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -160,6 +160,10 @@ More usage can be discovered with `kind create cluster --help`. | |
kind can auto-detect the [docker], [podman], or [nerdctl] installed and choose the available one. If you want to turn off the auto-detect, use the environment variable `KIND_EXPERIMENTAL_PROVIDER=docker`, `KIND_EXPERIMENTAL_PROVIDER=podman` or `KIND_EXPERIMENTAL_PROVIDER=nerdctl` to | ||
select the runtime. | ||
|
||
> **NOTE**: In some distributions (ex: Fedora), the container runtime operates in | ||
Review comment: No, the default mode depends on the runtime implementation, not on the host distribution.
Review comment: Thanks for this - I am not familiar with nerdctl's capabilities. My experience is Fedora + Podman, and wanted to hedge wrt Podman on other Linux distributions. Will correct in a follow-up commit.
Review comment: This is technically a feature of the packaging, AIUI, but these runtimes are consistently packaged in this regard, at least currently.
Review comment: In any case, I think referring to the runtime default probably makes more sense than discussing the packaging.
||
> [rootless mode](/docs/user/rootless) by default. Extra setup is needed for KinD clusters to be fully | ||
> functional. | ||
|
||
## Interacting With Your Cluster | ||
|
||
After [creating a cluster](#creating-a-cluster), you can use [kubectl][kubectl] | ||
|
@@ -501,4 +505,4 @@ kind, the Kubernetes cluster itself, etc. | |
[Private Registries]: /docs/user/private-registries | ||
[customize control plane with kubeadm]: https://kubernetes.io/docs/setup/independent/control-plane-flags/ | ||
[access multiple clusters]: https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/ | ||
[release notes]: https://github.com/kubernetes-sigs/kind/releases | ||
[release notes]: https://github.com/kubernetes-sigs/kind/releases |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -9,41 +9,58 @@ menu: | |||||
Starting with kind 0.11.0, [Rootless Docker](https://docs.docker.com/go/rootless/), [Rootless Podman](https://github.com/containers/podman/blob/master/docs/tutorials/rootless_tutorial.md) and [Rootless nerdctl](https://github.com/containerd/nerdctl/blob/main/docs/rootless.md) can be used as the node provider of kind. | ||||||
|
||||||
## Provider requirements | ||||||
|
||||||
Review comment: ?
Review comment: Markdown style consistency - I like having an empty line after a heading. This doesn't appear to impact the site rendering by Hugo (see the deploy preview).
Review comment: Indeed, if we ever decided to enable a markdown linter (highly unlikely) it would complain about there not being a blank line between headers, code blocks, etc. That said, unrelated changes to the file does make it slightly harder to review, but...
||||||
- Docker: 20.10 or later | ||||||
- Podman: 3.0 or later | ||||||
- nerdctl: 1.7 or later | ||||||
|
||||||
## Host requirements | ||||||
|
||||||
### cgroups v2 | ||||||
Review comment (suggested change): For consistency with the other occurrences
||||||
|
||||||
The host needs to be running with cgroup v2. | ||||||
Make sure that the result of the `docker info` command contains `Cgroup Version: 2`. | ||||||
If it prints `Cgroup Version: 1`, try adding `GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"` to `/etc/default/grub` and | ||||||
running `sudo update-grub` to enable cgroup v2. | ||||||
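
The cgroup version can also be checked without Docker by inspecting the filesystem type mounted at `/sys/fs/cgroup`. The following is a minimal sketch and not part of this PR: the `cgroup_version` helper is a hypothetical name, and the script assumes GNU `stat` and the conventional mount point.

```sh
# Map a filesystem type name to a cgroup version.
# A unified (v2) hierarchy reports "cgroup2fs"; v1 and hybrid setups report "tmpfs".
cgroup_version() {
  case "$1" in
    cgroup2fs) echo 2 ;;
    tmpfs)     echo 1 ;;
    *)         echo unknown ;;
  esac
}

# Query the filesystem type mounted at /sys/fs/cgroup (GNU coreutils stat).
fstype="$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo none)"
echo "cgroup version: $(cgroup_version "$fstype")"
```

A result of `1` or `unknown` suggests the host is not on the unified hierarchy that rootless kind needs.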
|
||||||
Also, depending on the host configuration, the following steps might be needed: | ||||||
Your host may also need to enable [cgroup delegation](https://systemd.io/CGROUP_DELEGATION/) for daemon-based controller runtimes. | ||||||
This is not required for daemonless runtimes, such as podman. Note that this procedure may | ||||||
Review comment: Untrue. What's the source of this misinformation? Is this from some hallucinating LLM?
Review comment: No, just a human trying to figure out how all this stuff works. I don't have this setting enabled, and my kind clusters (Fedora 42 + podman) seem to behave well once the other guidance in this article is adopted. My key questions are:
I don't think an LLM (Gemini in this case) is hallucinating when it claims "Podman itself doesn't have a hard requirement on systemd (it can run without it)." Correct me if I'm wrong: if Invoking podman containers through
$ cat /usr/lib/systemd/user/podman.service
[Unit]
Description=Podman API Service
Requires=podman.socket
After=podman.socket
Documentation=man:podman-system-service(1)
StartLimitIntervalSec=0
[Service]
Delegate=true
Type=exec
KillMode=process
Environment=LOGGING="--log-level=info"
ExecStart=/usr/bin/podman $LOGGING system service
[Install]
WantedBy=default.target
I would hope that rootless Docker and rootless nerdctl/containerd do similar things for their systemd services these days, in which case maybe the guidance here is obsolete?
Review comment: systemd seems to have begun to enable This is the matter of the default configuration of systemd, not of Docker/Podman/nerdctl.
Review comment: The delegation seems no longer needed to be configured manually, for systemd >= 252. Confirmed with a clean installation of Ubuntu 25.04 (systemd 257) with Rootless Docker.
||||||
[negatively impact performance](https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZMKLS7SHMRJLJ57NZCYPBAQ3UOYULV65/). | ||||||
|
||||||
- Create `/etc/systemd/system/user@.service.d/delegate.conf` with the following content, and then run `sudo systemctl daemon-reload`: | ||||||
To enable cgroup delegation, perform the following actions: | ||||||
|
||||||
```ini | ||||||
[Service] | ||||||
Delegate=yes | ||||||
``` | ||||||
1. As root, create the directory `/etc/systemd/system/user@.service.d/` if it does not already exist | ||||||
|
||||||
```sh | ||||||
sudo mdkir -p /etc/systemd/system/user@.service.d/ | ||||||
Review comment: typo: mdkir
Review comment: Isn't that obvious that mkdir is needed when the directory does not exist?
Review comment: Convenience for users - code blocks are easy to copy/paste.
Review comment: Yeah, we will get a lot of novice users that are not familiar with any of this. A lot of Linux users will run less common configurations without much experience. We occasionally get a bug from someone using a custom CPU scheduler that lacks cgroups .... etc.
||||||
``` | ||||||
2. As root, create the file `/etc/systemd/system/user@.service.d/delegate.conf` with the following content: | ||||||
|
||||||
```ini | ||||||
[Service] | ||||||
Delegate=yes | ||||||
``` | ||||||
|
||||||
3. Reload the systemd daemon: | ||||||
|
||||||
```sh | ||||||
sudo systemctl daemon-reload | ||||||
``` | ||||||
|
||||||
(This is not enabled by default because ["the runtime impact of | ||||||
[delegating the "cpu" controller] is still too | ||||||
high"](https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZMKLS7SHMRJLJ57NZCYPBAQ3UOYULV65/). | ||||||
Beware that changing this configuration may affect system | ||||||
performance.) | ||||||
4. If using docker, reload the user docker daemon: | ||||||
|
||||||
Please note that: | ||||||
```sh | ||||||
systemctl --user restart docker | ||||||
``` | ||||||
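
Before changing systemd configuration, it may be worth checking which controllers are already delegated to the user session, since the review discussion suggests that recent systemd versions no longer need manual delegation. The following is a sketch under assumptions: `missing_controllers` is a hypothetical helper, the list of wanted controllers (`cpu`, `cpuset`, `io`, `memory`, `pids`) is an assumption rather than something stated in this PR, and the path assumes a systemd user session on cgroup v2.

```sh
# Print the wanted controllers that are absent from a space-separated list ($1).
# The set of wanted controllers is an assumption made for illustration.
missing_controllers() {
  delegated=" $1 "
  missing=""
  for c in cpu cpuset io memory pids; do
    case "$delegated" in
      *" $c "*) ;;                      # controller is delegated
      *) missing="$missing $c" ;;       # controller is not delegated
    esac
  done
  echo "${missing# }"
}

# Controllers delegated to the current user's systemd instance.
f="/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers"
if [ -r "$f" ]; then
  echo "missing controllers: $(missing_controllers "$(cat "$f")")"
else
  echo "cannot read $f"
fi
```

An empty list after `missing controllers:` would suggest that the `delegate.conf` drop-in is not needed on that host.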
|
||||||
- `/etc/systemd/system/user@.service.d/` directory needs to be created if not already present on your host | ||||||
- If using Docker and it was already running when this step was done, a restart is needed for the changes to take | ||||||
effect | ||||||
{{< codeFromInline lang="bash" >}} | ||||||
systemctl --user restart docker | ||||||
{{< /codeFromInline >}} | ||||||
### Networking | ||||||
|
||||||
- Create `/etc/modules-load.d/iptables.conf` with the following content: | ||||||
Containers running in rootless mode are not typically loaded with host-level iptable modules. | ||||||
Review comment: Disagree with "typically"
Review comment: My Linux experience is almost exclusively with Fedora. I am not familiar with what kernel mods are loaded by default in other distributions.
Review comment: A clean installation of Ubuntu 25.04 seems to load
||||||
This breaks the behavior of most Ingress and Gateway controllers. | ||||||
Review comment: Not specific to these controllers
Review comment: Better to phrase as "This breaks the behavior of many networking components, such as Ingress and Gateway controllers"?
||||||
|
||||||
To load the iptable modules into the KinD containers, do the following: | ||||||
Review comment: No, kernel modules are loaded to the host kernel, not into "the KinD containers"
Review comment: My question - why isn't this procedure needed for rootful runtimes? How are these kernel modules getting loaded dynamically? As with the other modifications here, these are system-level changes that may have undesirable side effects. I think it's helpful if end users understand the "deeper why" behind these changes. For all I know, this is a feature gap that could be fixed somewhere in the stack.
Review comment: Because the modules are loaded on demand in the case of rootful
||||||
|
||||||
1. As root, create the file `/etc/modules-load.d/iptables.conf` with the following content: | ||||||
|
||||||
``` | ||||||
ip6_tables | ||||||
|
@@ -52,14 +69,62 @@ Also, depending on the host configuration, the following steps might be needed: | |||||
iptable_nat | ||||||
``` | ||||||
|
||||||
- If using podman, be aware that by default there is a [limit](https://docs.podman.io/en/v4.3/markdown/options/pids-limit.html#pids-limit-limit) to the number of pids that can be created. This can cause problems like nginx workers inside a container not spawning correctly. | ||||||
- If you want to disable this limit, edit your `containers.conf` file (generally located in `/etc/containers/containers.conf`). Note that this could cause things like pid exhaustion to happen on the host machine. Alternatively, change `0` to your desired new limit: | ||||||
2. Restart your system to ensure these changes take effect. | ||||||
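
Whether the modules are already loaded on the host kernel can be checked before editing `/etc/modules-load.d`. This is a minimal sketch: `module_listed` is a hypothetical helper, only the two module names visible in this diff hunk are checked, and modules compiled into the kernel do not appear in `/proc/modules`, so "not loaded" is not conclusive on its own.

```sh
# Succeeds if the module named $1 appears in a /proc/modules-style listing on stdin.
module_listed() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Check the host kernel's loaded modules (the names are taken from the diff above).
for m in ip6_tables iptable_nat; do
  if module_listed "$m" < /proc/modules 2>/dev/null; then
    echo "$m: loaded"
  else
    echo "$m: not listed (try: sudo modprobe $m)"
  fi
done
```

Running `sudo modprobe` for a missing module loads it immediately, while the `modules-load.d` file makes the change persist across reboots.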
|
||||||
|
||||||
### Increase PID Limits | ||||||
|
||||||
KinD nodes are represented as individual containers on their hosts. Runtimes such as podman set default | ||||||
[process id limits](https://docs.podman.io/en/v4.3/markdown/options/pids-limit.html#pids-limit-limit) | ||||||
that may be too low for the node or for a pod running on the node. The NGINX ingress controller is | ||||||
[particularly susceptible](https://github.com/kubernetes-sigs/kind/issues/3451) to this issue. | ||||||
|
||||||
To increase the PID limit, do the following: | ||||||
|
||||||
1. If using podman, edit your `containers.conf` file (generally located in | ||||||
`/etc/containers/containers.conf` or `~/.config/containers/containers.conf`) to increase the PIDs | ||||||
limit to a desired value (default 4096 on most systems). | ||||||
|
||||||
```ini | ||||||
[containers] | ||||||
pids_limit = 0 | ||||||
pids_limit = 65536 | ||||||
``` | ||||||
|
||||||
|
||||||
### Increase inotify Limits | ||||||
|
||||||
As documented in [known issues](/docs/user/known-issues/#pod-errors-due-to-too-many-open-files), pods may | ||||||
fail by reaching inotify watch and instance limits. Ingress controllers such as NGINX and Contour | ||||||
are particularly susceptible to this issue. | ||||||
|
||||||
To increase the inotify limits, do the following: | ||||||
|
||||||
1. As root, create a `.conf` file in `/etc/sysctl.d` that increases the `fs.inotify` max user settings: | ||||||
|
||||||
``` | ||||||
fs.inotify.max_user_watches = 524288 | ||||||
fs.inotify.max_user_instances = 512 | ||||||
``` | ||||||
|
||||||
2. Restart your system for these changes to take effect. | ||||||
Review comment: Why not
Review comment: Will add in a follow-up commit.
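
The current inotify limits can be read from `/proc/sys` and compared with the targets in the diff. This is a sketch only: `at_least` and `check_inotify` are hypothetical helper names, and the thresholds are the values proposed in this PR.

```sh
# Succeeds when the current value ($1) is at least the wanted value ($2).
at_least() { [ "$1" -ge "$2" ]; }

# Compare one fs.inotify setting ($1) against a target ($2) and print the result.
check_inotify() {
  cur="$(cat "/proc/sys/fs/inotify/$1" 2>/dev/null || echo 0)"
  if at_least "$cur" "$2"; then
    echo "fs.inotify.$1 = $cur (ok)"
  else
    echo "fs.inotify.$1 = $cur (below $2)"
  fi
}

check_inotify max_user_watches 524288
check_inotify max_user_instances 512
```

If a value is still below its target after the `.conf` file is in place, `sudo sysctl --system` reloads all sysctl configuration files, which is usually enough without a reboot.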
||||||
|
||||||
|
||||||
### Allow Unprivileged Binding to HTTP(S) Ports | ||||||
Review comment: Not specific to HTTP(S)
||||||
|
||||||
If you use the `extraPortMappings` method to provide ingress to your KinD cluster, you can allow | ||||||
the KinD container to bind to ports 80 and 443 on the host. User containers cannot bind to these | ||||||
ports by default as they are considered privileged. | ||||||
|
||||||
To allow a KinD node to bind to ports 80 and/or 443 on the host, do the following: | ||||||
|
||||||
1. As root, create a `.conf` file in `/etc/sysctl.d` that lowers the privileged port start number: | ||||||
|
||||||
``` | ||||||
net.ipv4.ip_unprivileged_port_start=80 | ||||||
``` | ||||||
|
||||||
2. Restart your system for these changes to take effect. | ||||||
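
To see whether the host already allows unprivileged binding to ports 80 and 443, the current threshold can be read directly. This is a hedged sketch: `can_bind_unprivileged` is a hypothetical helper, and the fallback of 1024 assumes the usual kernel default when the value cannot be read.

```sh
# Succeeds if an unprivileged process may bind port $1 when the threshold is $2.
can_bind_unprivileged() { [ "$1" -ge "$2" ]; }

# Read the first port number that does not require privileges.
start="$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)"

for port in 80 443; do
  if can_bind_unprivileged "$port" "$start"; then
    echo "port $port: bindable without privileges (threshold $start)"
  else
    echo "port $port: privileged (threshold $start)"
  fi
done
```

As with other sysctl settings, `sudo sysctl --system` typically applies a new `.conf` file without a reboot.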
|
||||||
|
||||||
## Restrictions | ||||||
|
||||||
The restrictions of Rootless Docker apply to kind clusters as well. | ||||||
|
Review comment: s/KinD/kind/g
Review comment: We use `kind` or, if avoiding codeblocks (perhaps start of a sentence), "KIND" but never "KinD". I thought we'd added this to https://kind.sigs.k8s.io/docs/contributing/development/#documentation but we haven't, we should. I've been pretty short on time recently.