NAS-140250 / 26.0.0-BETA.1 / Fix keepalived boot deadlock in configure_addresses_impl (by bmeagherix) #18440

Merged
yocalebo merged 1 commit into release/26.0.0-BETA.1 from NAS-140250-26.0.0-BETA.1 on Mar 11, 2026


@bugclerk (Contributor)

ix-netif.service is ordered Before=network-pre.target, but keepalived is ordered After=network-online.target. Starting keepalived from configure_addresses_impl (called via ix-netif.service) caused systemd to hold the queued start job for ~95s, until network-online.target was eventually satisfied after ix-netif.service completed: a structural deadlock.
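
To make the ordering problem concrete, here is a toy model of the constraints described above. It is only a sketch, not systemd's job engine: the `ORDERED_AFTER` table and both helper functions are hypothetical, and the two `network.target` edges are assumed to match the stock systemd target ordering.

```python
# Toy model of the unit ordering; NOT how systemd schedules jobs internally.
# Each key is a unit, each value is the set of units it is ordered After=.
ORDERED_AFTER = {
    # From the PR text: ix-netif.service has Before=network-pre.target.
    "network-pre.target": {"ix-netif.service"},
    # Assumed stock ordering between the network targets.
    "network.target": {"network-pre.target"},
    "network-online.target": {"network.target"},
    # From the PR text: keepalived has After=network-online.target.
    "keepalived.service": {"network-online.target"},
}


def waits_for(unit, other, graph=ORDERED_AFTER, _seen=None):
    """Return True if `unit` is transitively ordered after `other`."""
    seen = set() if _seen is None else _seen
    for dep in graph.get(unit, ()):
        if dep == other:
            return True
        if dep not in seen:
            seen.add(dep)
            if waits_for(dep, other, graph, seen):
                return True
    return False


def blocking_start_would_stall(caller, started):
    """A blocking start issued from inside `caller` stalls when `started`
    cannot begin until `caller` itself has finished."""
    return waits_for(started, caller)


if __name__ == "__main__":
    print(blocking_start_would_stall("ix-netif.service", "keepalived.service"))
```

In this model, keepalived.service is transitively ordered after ix-netif.service, so a start request issued while ix-netif.service is still running can only sit in the queue.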

Fix by guarding the keepalived START behind the ix-netif completion sentinel. If keepalived is already running, RELOAD as before. If it is not running and the sentinel exists (i.e. we are in a post-boot interface.sync call), START it. If the sentinel does not exist, we are in the early boot call and skip keepalived entirely; it will be started once the network is online.
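
The three-way decision above can be sketched roughly as follows. This is an illustration, not the code from the commit: `KeepalivedAction`, `choose_keepalived_action`, and the sentinel path value are hypothetical names chosen for the example, and the real `NETIF_COMPLETE_SENTINEL` value is whatever middlewared/utils/interface.py defines.

```python
import os
from enum import Enum

# Hypothetical placeholder path; the real constant is defined in
# middlewared/utils/interface.py and may differ.
NETIF_COMPLETE_SENTINEL = "/run/example/ix-netif-complete"


class KeepalivedAction(Enum):
    RELOAD = "reload"  # keepalived already running: reload as before
    START = "start"    # not running, ix-netif finished: safe to start
    SKIP = "skip"      # early boot: leave keepalived alone for now


def choose_keepalived_action(keepalived_running, sentinel_path=NETIF_COMPLETE_SENTINEL):
    """Pick the keepalived action according to the rules in the description."""
    if keepalived_running:
        return KeepalivedAction.RELOAD
    if os.path.exists(sentinel_path):
        # Sentinel present: this is a post-boot interface.sync call.
        return KeepalivedAction.START
    # Sentinel absent: early boot call made via ix-netif.service.
    return KeepalivedAction.SKIP
```

Keeping the decision in a pure function that takes the running state and the sentinel path as inputs makes each branch easy to exercise without touching systemd.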

Move NETIF_COMPLETE_SENTINEL from smb_/constants.py to the more appropriate middlewared/utils/interface.py and update importers accordingly.

Original PR: #18437

(cherry picked from commit 825ab34)

@yocalebo merged commit 67db6cc into release/26.0.0-BETA.1 on Mar 11, 2026
1 check passed
@yocalebo deleted the NAS-140250-26.0.0-BETA.1 branch on March 11, 2026 at 19:41
@bugclerk (Contributor, Author)

This PR has been merged and conversations have been locked.
If you would like to discuss more about this issue please use our forums or raise a Jira ticket.

@truenas locked as resolved and limited conversation to collaborators on Mar 11, 2026