@@ -8,11 +8,13 @@ subnets or associated infrastructure such as routers. The requirements are that:
4. At least one network on each node provides outbound internet access (either
directly, or via a proxy).

- Futhermore, it is recommended that the deploy host has an interface on the
- access network. While it is possible to e.g. use a floating IP on a login node
- as an SSH proxy to access the other nodes, this can create problems in recovering
- the cluster if the login node is unavailable and can make Ansible problems harder
- to debug.
+ Addresses on the "access network" are used as the `ansible_host` IPs.
+
+ It is recommended that the deploy host either has a direct connection to the
+ "access network" or jumps through a host on it which is not part of the appliance.
+ Using e.g. a floating IP on a login node as a jumphost creates problems in
+ recovering the cluster if the login node is unavailable and can make Ansible
+ problems harder to debug.

> [!WARNING]
> If home directories are on a shared filesystem with no authentication (such
@@ -29,8 +31,8 @@ the OpenTofu variables. These will normally be set in
need to be overridden for specific environments, this can be done via an OpenTofu
module as discussed [here](./production.md).

- Note that if an OpenStack subnet has a gateway IP defined then nodes with ports
- attached to that subnet will get a default route set via that gateway.
+ Note that if an OpenStack subnet has a gateway IP defined then by default nodes
+ with ports attached to that subnet get a default route set via that gateway.

## Single network
This is the simplest possible configuration. A single network and subnet is
@@ -77,8 +79,9 @@ vnic_types = {
## Additional networks on some nodes

This example shows how to modify variables for specific node groups. In this
- case a baremetal node group has a second network attached. As above, only a
- single subnet can have a gateway IP.
+ case a baremetal node group has a second network attached. Here "subnetA" must
+ have a gateway IP defined and "subnetB" must not, to avoid routing problems on
+ the multi-homed compute nodes.

```terraform
cluster_networks = [
@@ -109,3 +112,85 @@ compute = {
}
...
```
+
+ ## Multiple networks with non-default gateways
+
+ In some multiple network configurations it may be necessary to manage default
+ routes explicitly rather than have them created automatically from a subnet
+ gateway. This can be done using the tofu variable `gateway_ip`, which can be set
+ for the cluster and/or overridden on the compute and login groups. If this is set:
+ - a default route via that address will be created on the appropriate interface
+   during boot if it does not exist
+ - any other default routes will be removed
+
+ For example, the cluster configuration below has a "campus" network with a
+ default gateway which provides inbound SSH / ondemand access and outbound
+ internet attached only to the login nodes, and a "data" network attached to
+ all nodes. The "data" network has no gateway IP set on its subnet to avoid dual
+ default routes and routing conflicts on the multi-homed login nodes, but does
+ have outbound connectivity via a router:
+
+ ```terraform
+ cluster_networks = [
+   {
+     network = "data" # access network, CIDR 172.16.0.0/23
+     subnet = "data_subnet"
+   }
+ ]
+
+ login = {
+   interactive = {
+     nodes = ["login-0"]
+     extra_networks = [
+       {
+         network = "campus"
+         subnet = "campus_subnet"
+       }
+     ]
+   }
+ }
+ compute = {
+   general = {
+     nodes = ["compute-0", "compute-1"]
+   }
+   gateway_ip = "172.16.0.1" # Router interface
+ }
+ ```
+
+ If there is no default route at all (either from a subnet gateway or from
+ `gateway_ip`) then a dummy route is created via the access network interface to
+ ensure [correct](https://docs.k3s.io/installation/airgap#default-network-route)
+ `k3s` operation.
+
+ When using a subnet with no default gateway, OpenStack's nameserver for the
+ subnet may refuse lookups. External nameservers can be defined using the
+ [resolv_conf](../ansible/roles/resolv_conf/README.md) role.
+
+ ## Proxies
+
+ If some nodes have no outbound connectivity via any network, the cluster can
+ be configured to deploy a [squid proxy](https://www.squid-cache.org/) on a node
+ with outbound connectivity. Assuming the `compute` and `control` nodes have no
+ outbound connectivity and the `login` node does, the minimal configuration for
+ this is:
+
+ ```yaml
+ # environments/$SITE/inventory/groups:
+ [squid:children]
+ login
+ [proxy:children]
+ control
+ compute
+ ```
+
+ ```yaml
+ # environments/$SITE/inventory/group_vars/all/squid.yml:
+ # these are just examples
+ squid_cache_disk: 1024 # MB
+ squid_cache_mem: '12 GB'
+ ```
+
+ Note that name resolution must still be possible and may require defining a
+ nameserver which is directly reachable from the node using the
+ [resolv_conf](../ansible/roles/resolv_conf/README.md)
+ role.
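
The default-route rules described under "Multiple networks with non-default gateways" can be sketched as a small planning function. This is only an illustration of the documented behaviour, not the appliance's implementation: the names `DefaultRoute` and `plan_default_routes`, the interface names, and the use of `via=None` to stand for the dummy route are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DefaultRoute:
    """Hypothetical model of one default route on a node."""
    via: Optional[str]  # next-hop address; None stands for the dummy route
    dev: str            # interface the route uses


def plan_default_routes(
    existing: List[DefaultRoute],
    access_dev: str,
    gateway_ip: Optional[str] = None,
    gateway_dev: Optional[str] = None,
) -> Tuple[List[DefaultRoute], List[DefaultRoute]]:
    """Return (routes_to_add, routes_to_remove) for a node.

    Assumption: if no interface is given for gateway_ip, the access
    network interface is used.
    """
    to_add: List[DefaultRoute] = []
    to_remove: List[DefaultRoute] = []
    if gateway_ip is not None:
        wanted = DefaultRoute(via=gateway_ip, dev=gateway_dev or access_dev)
        # Create the route via gateway_ip only if it does not already exist.
        if wanted not in existing:
            to_add.append(wanted)
        # Any other default routes are removed.
        to_remove = [route for route in existing if route != wanted]
    elif not existing:
        # No default route from a subnet gateway or gateway_ip:
        # add a dummy route on the access network interface.
        to_add.append(DefaultRoute(via=None, dev=access_dev))
    return to_add, to_remove


if __name__ == "__main__":
    # A node that picked up a default route from a subnet gateway,
    # with gateway_ip set to the router interface from the example.
    current = [DefaultRoute(via="10.0.0.1", dev="eth1")]
    print(plan_default_routes(current, access_dev="eth0", gateway_ip="172.16.0.1"))
```

Keeping the decision separate from the commands that change routes makes the three cases (explicit `gateway_ip`, existing subnet gateway route, no default route at all) easy to check in isolation.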