Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,7 @@ ensure_python_library_installed() {
cleanup() {
echo "Stopping everything…"
trap - INT TERM # prevent re-entrancy
pkill -9 -f "disagg_proxy_p2p_nccl_xpyd.py"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using SIGKILL (-9) should be a last resort, as it terminates the process immediately without allowing it to perform any cleanup operations. The disagg_proxy_p2p_nccl_xpyd.py script runs a Quart server, which can handle a graceful shutdown on SIGTERM (the default signal for pkill) to properly close network sockets and other resources.

Forcibly killing the server with SIGKILL might leave resources in an inconsistent state, potentially causing issues like "Address already in use" errors if you run the example again quickly.

I recommend removing the -9 flag to allow the process to shut down gracefully.

Suggested change
pkill -9 -f "disagg_proxy_p2p_nccl_xpyd.py"
pkill -f "disagg_proxy_p2p_nccl_xpyd.py"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

kill -- -$$ # negative PID == "this whole process-group"
wait # reap children so we don't leave zombies
exit 0
Expand Down