/gcbrun
Could you add an image test in https://github.com/google/go-tpm-tools/blob/main/launcher/image/test? We may also need a new test workload container that can catch the signal.
alexmwu
left a comment
This needs image tests. See https://github.com/google/go-tpm-tools/blob/main/launcher/image/test/README.md.
Any reason we can't do this from the launcher? https://github.com/containerd/containerd/blob/v1.7.30/process.go#L33-L54
We decided to not rely on the launcher because "container process and the runner process are mostly detached, we want the container process to continue running even the runner process get terminated for whatever reasons." (from the internal bug)
In other words, we consider the case when the launcher is not running.
Create a systemd service to send SIGTERM to running containerd tasks during shutdown/service stop to allow for graceful termination.
Also delete the comments added in the previous commit
When all workloads handle SIGTERM before the timeout, the system can stop earlier.
Add a new integration test to verify that a workload receives SIGTERM for a graceful shutdown.
Map substitutions to environment variables in `script` blocks to prevent them from evaluating to empty strings and causing exit code 2 errors in `gcloud` commands.
Also, increase the timeout to 10 minutes
Running a UDP listener inside a Confidential Space (CS) VM is complex because the hardened environment restricts direct access to host devices (like `/dev/ttyS0` used to dump logs) and prevents standard troubleshooting of container networking. To simplify the test, this commit switches the monitor to a standard GCE VM. The VM dynamically selects the latest x86_64 Debian image and the smallest available machine type meeting the minimum requirements of 2 vCPUs and 1 GB of memory (equivalent to `e2-micro`).
Dynamic lookup failed because some machine families (like `e4`) are not available within the project due to quota limits.
Pass `tee-cmd` as an array to fix the parsing error.
- Added `allow_cmd_override` label to the workload Dockerfile to permit `tee-cmd` usage.
- Redirected the `container-cleanup.service` output to `/dev/ttyS0` for serial console visibility.
…se power button press
As the hardened image lacks `systemd-logind`, a new service, the power button listener, takes over the job of watching `/dev/input/eventX`, which `logind` previously handled. When it detects a power button press, or a VM stop, it triggers systemd to stop services, including the listener itself. Then the service's ExecStop script sends SIGTERM to all containers.
9cb862b to
df36b2c
re: the comment about the image test
During end-to-end testing, we found that the previous approach (solely using an ExecStop script) did not work in the hardened image because nothing was watching for power button press events. This is typically handled by `systemd-logind`, which the hardened image lacks.
To address this issue, we introduced the power button listener service, which identifies the correct event device, listens to it, and sends a power-off event to systemd so that it can propagate the shutdown. The listener service also receives this event from systemd and executes the ExecStop script, which in turn sends SIGTERM to all containers.
Create a systemd service to send SIGTERM to running containerd tasks during shutdown/service stop to allow for graceful termination.