47 changes: 47 additions & 0 deletions nix/installer.nix
@@ -25,11 +25,58 @@ in
  documentation.enable = false;
  documentation.man.man-db.enable = false;

  # reduce closure size through package set crafting
  # where there's no otherwise globally effective
  # config setting available
  # TODO: some are candidates for a long-term upstream solution
  nixpkgs.overlays = [
    (final: prev: {
      # save ~12MB by not bundling manpages
      coreutils-full = prev.coreutils;
Comment on lines +34 to +35
Member

Also question here: Do we actually care about this? We are not shipping manpages afaik and they are hopefully in a separate output?

Author

> they are hopefully in a separate output?

Unfortunately not, at least in nixos-23.11:

  postInstall = optionalString (isCross && !minimal) ''
    rm $out/share/man/man1/*
    cp ${buildPackages.coreutils-full}/share/man/man1/* $out/share/man/man1
  ''
  # du: 8.7 M locale + 0.4 M man pages
  + optionalString minimal ''
    rm -r "$out/share"
  '';

minimal is false on -full

      # save ~16MB by making them minimal
      util-linux = prev.util-linux.override {
        nlsSupport = false;
        ncursesSupport = false;
        systemdSupport = false;
      };
      # save ~6MB by removing one bash
      bashInteractive = prev.bash;
      # saves ~25MB
      systemd = prev.systemd.override {
        pname = "systemd-slim";
        withDocumentation = false;
        withCoredump = false;
        withFido2 = false;
        withRepart = false;
        withMachined = false;
        withRemote = false;
        withTpm2Tss = false;
        withLibBPF = false;
        withAudit = false;
        withCompression = false;
        withImportd = false;
        withPortabled = false;
        withSysupdate = false;
        withHomed = false;
        withLocaled = false;
        withPolkit = false;
        # withQrencode = false;
        # withVmspawn = false;
        withPasswordQuality = false;
      };
    })
  ];
  systemd.coredump.enable = false;


  environment.systemPackages = [
    # for zapping of disko
    pkgs.jq
    # for copying extra files of nixos-anywhere
    pkgs.rsync
    # for installing nixos via nixos-anywhere
    config.system.build.nixos-enter
    config.system.build.nixos-install
  ];

  imports = [
3 changes: 3 additions & 0 deletions nix/kexec-installer/kexec-run.sh
@@ -74,6 +74,9 @@ if ! "$SCRIPT_DIR/kexec" --load "$SCRIPT_DIR/bzImage" \
  exit 1
fi

sync; echo 3 > /proc/sys/vm/drop_caches
echo "current available memory: $(free -h | awk '/^Mem/ {print $7}')"
Member

Doesn't the Linux kernel do this automatically anyway if it needs memory?

Author
@blaggacao May 19, 2024

I could very well imagine that yes, although I didn't verify it and don't intend to stake that claim either.

The main reason I put this here is reporting: especially during tests, we can see how much memory was effectively free prior to switching the kernel.

Not sure whether it is useful otherwise.

Member

Ok. But you are already printing the available memory, which automatically subtracts any memory claimed by page cache or dirty pages.
So there is no need to flush the caches, which might even contain the initrd that we are about to kexec into.
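
To make this point concrete, here is a minimal sketch (not part of this PR) that reads `MemFree` and `MemAvailable` from `/proc/meminfo` and prints both next to each other. `meminfo_kb` and `report_memory` are hypothetical helper names, and the optional file argument is an assumption added only so the snippet can be tried against a saved copy of meminfo.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not part of the PR.

# Print the value (in kB) of one field from a /proc/meminfo-style file.
meminfo_kb() {
  local field="$1" file="${2:-/proc/meminfo}"
  awk -v key="${field}:" '$1 == key { print $2 }' "$file"
}

# Print free and available memory and the difference between them.
# "available" is the kernel's estimate and already counts reclaimable
# page cache, so dropping caches first should not be needed to read it.
report_memory() {
  local file="${1:-/proc/meminfo}"
  local free_kb avail_kb
  free_kb="$(meminfo_kb MemFree "$file")"
  avail_kb="$(meminfo_kb MemAvailable "$file")"
  echo "free: ${free_kb} kB, available: ${avail_kb} kB, difference: $((avail_kb - free_kb)) kB"
}
```

Calling `report_memory` without arguments reads the live `/proc/meminfo`; the `available` column that `free` prints is based on the same `MemAvailable` field.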

Member

Here we could even print a warning if we are below X MB available memory.

Author
@blaggacao May 19, 2024

> Here we could even print a warning if we are below X MB available memory.

Yeah, I had that idea, too, today: outright refuse to proceed because it's potentially not recoverable, if you have no IPMI access.
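
A rough sketch of what such a guard could look like, assuming it sits in kexec-run.sh right after the memory report. `check_memory_floor` is a hypothetical name, and the 1024 MB in the usage comment is only a placeholder since no value for X was agreed on in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical guard, not part of the PR.
# $1: available memory in MB
# $2: minimum required memory in MB (the "X" from the discussion)
check_memory_floor() {
  local avail_mb="$1" min_mb="$2"
  if (( avail_mb < min_mb )); then
    echo "only ${avail_mb} MB available, need at least ${min_mb} MB; refusing to kexec" >&2
    return 1
  fi
}

# Possible use in kexec-run.sh (1024 is an arbitrary placeholder):
#   avail_mb="$(free -m | awk '/^Mem/ {print $7}')"
#   check_memory_floor "$avail_mb" 1024 || exit 1
```

Failing before `kexec -e` keeps the old system running, which matters on machines without IPMI access.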

Author

> Ok. But you are already printing the available memory, which will automatically subtract any memory claimed by page cache or dirty pages.
> So no need to flush the caches, which might even contain the initrd that we are about to kexec into.

Iirc, I did some A/B testing and saw a small ~30MB difference, but yeah, I agree that it shouldn't be necessary.

I did notice, however, on a slightly tangential note, that at the RAM limit, `kexec --load` kept working while `kexec -e` then failed. But I guess that's just due to what stage 1 or the squashfs ultimately pulls into RAM.


# Disconnect our background kexec from the terminal
echo "machine will boot into nixos in 6s..."
if test -e /dev/kmsg; then
5 changes: 3 additions & 2 deletions nix/kexec-installer/module.nix
@@ -1,7 +1,7 @@
 { config, lib, modulesPath, pkgs, ... }:
 let
-  restore-network = pkgs.writers.writePython3 "restore-network" { flakeIgnore = [ "E501" ]; }
-    ./restore_routes.py;
+
+  restore-network = pkgs.writers.writeBash "restore-network" ./restore_routes.sh;
Member

Out of interest, what's the saving from removing Python?

Author

I don't remember exactly. It was relatively significant though; I think it may even have been in the ballpark of 150MB. It was, indeed, the lowest-hanging fruit. I could have gone with python minimal, but since I was trying to size this down as much as possible, I thought I'd also save the additional dozens of MB that python minimal would have left us with.

Author
@blaggacao May 19, 2024

However, the effect on the RAM was rather small (maybe 30-60MB or so?), ostensibly due to somewhat efficient caching and on-demand decompression of the squashfs, while having a (much!) smaller impact while compressed.


   # does not link with iptables enabled
   iprouteStatic = pkgs.pkgsStatic.iproute2.override { iptables = null; };
@@ -56,6 +56,7 @@ in
   environment.etc.is_kexec.text = "true";

   systemd.services.restore-network = {
+    path = [pkgs.jq];
     before = [ "network-pre.target" ];
     wants = [ "network-pre.target" ];
     wantedBy = [ "multi-user.target" ];
118 changes: 0 additions & 118 deletions nix/kexec-installer/restore_routes.py

This file was deleted.

121 changes: 121 additions & 0 deletions nix/kexec-installer/restore_routes.sh
@@ -0,0 +1,121 @@
#!/usr/bin/env bash
Member

cc @Lassulus for review of this.


# filter_interfaces function
filter_interfaces() {
  # This function takes a list of network interfaces as input and filters
  # out loopback interfaces, interfaces without a MAC address, and addresses
  # with a "link" scope or marked as dynamic (from DHCP or router
  # advertisements). The filtered interfaces are returned one by one on stdout.
  local network=("$@")

  for net in "${network[@]}"; do
    local link_type="$(jq -r '.link_type' <<< "$net")"
    local address="$(jq -r '.address // ""' <<< "$net")"
    local addr_info="$(jq -r '.addr_info | map(select(.scope != "link" and (.dynamic | not)))' <<< "$net")"
    local has_dynamic_address=$(jq -r '.addr_info | any(.dynamic)' <<< "$net")

    # echo "Link Type: $link_type -- Address: $address -- Has Dynamic Address: $has_dynamic_address -- Addr Info: $addr_info"

    if [[ "$link_type" != "loopback" && -n "$address" && ("$addr_info" != "[]" || "$has_dynamic_address" == "true") ]]; then
      net=$(jq -c --argjson addr_info "$addr_info" '.addr_info = $addr_info' <<< "$net")
      echo "$net" # "return"
    fi
  done
}

# filter_routes function
filter_routes() {
  # This function takes a list of routes as input and filters out routes
  # with protocols "dhcp", "kernel", or "ra". The filtered routes are
  # returned one by one on stdout.
  local routes=("$@")

  for route in "${routes[@]}"; do
    local protocol=$(jq -r '.protocol' <<< "$route")
    if [[ $protocol != "dhcp" && $protocol != "kernel" && $protocol != "ra" ]]; then
      echo "$route" # "return"
    fi
  done
}

# generate_networkd_units function
generate_networkd_units() {
  # This function takes the filtered interfaces and routes, along with a
  # directory path. It generates systemd-networkd unit files for each interface,
  # including the configured addresses and routes. The unit files are written
  # to the specified directory with the naming convention 00-<ifname>.network.
  local -n interfaces=$1
  local -n routes=$2
  local directory="$3"

  mkdir -p "$directory"

  for interface in "${interfaces[@]}"; do
    local ifname=$(jq -r '.ifname' <<< "$interface")
    local address=$(jq -r '.address' <<< "$interface")
    local addresses=$(jq -r '.addr_info | map("Address = \(.local)/\(.prefixlen)") | join("\n")' <<< "$interface")
    local route_sections=()

    for route in "${routes[@]}"; do
      local dev=$(jq -r '.dev' <<< "$route")
      # quote the right-hand side so it is compared literally, not as a glob pattern
      if [[ $dev == "$ifname" ]]; then
        local route_section="[Route]"
        local dst=$(jq -r '.dst' <<< "$route")
        if [[ $dst != "default" ]]; then
          route_section+="\nDestination = $dst"
        fi
        local gateway=$(jq -r '.gateway // ""' <<< "$route")
        if [[ -n $gateway ]]; then
          route_section+="\nGateway = $gateway"
        fi
        route_sections+=("$route_section")
      fi
    done

    # the heredoc body stays at column 0: <<- strips leading tabs only, not spaces
    local unit=$(cat <<-EOF
[Match]
MACAddress = $address

[Network]
DHCP = yes
LLDP = yes
IPv6AcceptRA = yes
MulticastDNS = yes

$addresses
$(printf '%s\n' "${route_sections[@]}")
EOF
)
    echo -e "$unit" > "$directory/00-$ifname.network"
  done
}

# main function
main() {
  if [[ $# -lt 4 ]]; then
    echo "USAGE: $0 addresses routes-v4 routes-v6 networkd-directory" >&2
    # exit 1
    return 1
  fi

  local addresses
  readarray -t addresses < <(jq -c '.[]' "$1") # Read JSON data into array

  local v4_routes
  readarray -t v4_routes < <(jq -c '.[]' "$2")

  local v6_routes
  readarray -t v6_routes < <(jq -c '.[]' "$3")

  local networkd_directory="$4"

  local relevant_interfaces
  readarray -t relevant_interfaces < <(filter_interfaces "${addresses[@]}")

  local relevant_routes
  readarray -t relevant_routes < <(filter_routes "${v4_routes[@]}" "${v6_routes[@]}")

  generate_networkd_units relevant_interfaces relevant_routes "$networkd_directory"
}

main "$@"
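
For review purposes, the route handling in restore_routes.sh can be approximated without jq. The sketch below is not part of the PR: `keep_route` and `render_route` are hypothetical helpers that take plain strings instead of the JSON objects the script parses, and the addresses come from documentation ranges. It mirrors the two rules visible in the script: routes with protocol `dhcp`, `kernel`, or `ra` are dropped, and a `[Route]` section omits `Destination` for `default` and `Gateway` when none is set.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not part of the PR.

# Return 0 if a route with this protocol should be kept, mirroring the
# dhcp/kernel/ra exclusion in filter_routes.
keep_route() {
  case "$1" in
    dhcp | kernel | ra) return 1 ;;
    *) return 0 ;;
  esac
}

# Print one [Route] section. "default" destinations and empty gateways
# are left out, as in generate_networkd_units.
render_route() {
  local dst="$1" gateway="$2"
  printf '[Route]\n'
  if [[ $dst != "default" ]]; then
    printf 'Destination = %s\n' "$dst"
  fi
  if [[ -n $gateway ]]; then
    printf 'Gateway = %s\n' "$gateway"
  fi
}

# Each entry is "protocol destination gateway"; the gateway may be empty.
for entry in "static default 192.0.2.1" "dhcp default 192.0.2.254" "static 198.51.100.0/24 "; do
  read -r protocol dst gateway <<< "$entry"
  if keep_route "$protocol"; then
    render_route "$dst" "$gateway"
  fi
done
```

Using `printf` per line avoids relying on `echo -e` to expand `\n` sequences stored in a variable, which is one possible alternative to the string concatenation used in the script.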
3 changes: 3 additions & 0 deletions nix/kexec-installer/test.nix
@@ -137,6 +137,9 @@ makeTest' {
    node1.succeed('/root/kexec/kexec --version >&2')
    node1.succeed('/root/kexec/run >&2')

    # the kexec script will sleep 6s before doing anything, so do we here.
    time.sleep(6)

    # wait for kexec to finish
    while ssh(["true"], check=False).returncode == 0:
        print("Waiting for kexec to finish...")