
Simplify Quiescence and prepare for splice integration #4007


Merged: 3 commits into lightningdevkit:main from 2025-08-splice-quiescent, Aug 20, 2025

Conversation

TheBlueMatt
Collaborator

@TheBlueMatt TheBlueMatt commented Aug 13, 2025

Mostly small tweaks to our quiescence logic, but ultimately integrates splicing with quiescence.

IMO it's important that we allow quiescence-init while disconnected, but the only awkward part of the API is that we can't cancel a splice once it's started (except by FC'ing the channel). I started writing a cancel API but then realized it's not possible, because the cancel only actually takes effect once the ChannelManager is persisted, which may be a while. The other option we could consider is dropping the pending splice on restart and giving the user an API to see whether a splice happened or not (I guess via the ChannelReady event?) and telling them to start again.

WDYT @wpaulino and @jkczyz?

Updated to not touch splicing yet, just simplify the quiescence logic and prep for splice integration.

@TheBlueMatt TheBlueMatt requested a review from wpaulino August 13, 2025 01:34
@ldk-reviews-bot

ldk-reviews-bot commented Aug 13, 2025

👋 Thanks for assigning @wpaulino as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.


codecov bot commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 92.36641% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.64%. Comparing base (9066f62) to head (c7e4887).
⚠️ Report is 38 commits behind head on main.

Files with missing lines               Patch %   Lines
lightning/src/ln/channel.rs             86.66%   6 Missing ⚠️
lightning/src/ln/quiescence_tests.rs    95.29%   3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4007      +/-   ##
==========================================
- Coverage   88.82%   88.64%   -0.18%     
==========================================
  Files         175      174       -1     
  Lines      127767   127719      -48     
  Branches   127767   127719      -48     
==========================================
- Hits       113486   113221     -265     
- Misses      11712    12021     +309     
+ Partials     2569     2477      -92     
Flag Coverage Δ
fuzzing ?
tests 88.64% <92.36%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@wpaulino
Contributor

IMO its important that we allow quiescence-init while disconnected, but the only awkward part of the API is we can't cancel a splice once its started (except by FC'ing the channel). I started writing a cancel API but then realized it's not possible because it's only actually cancelled once the ChannelManager is persisted which may be a while. The other option we could consider is dropping the pending splice on restart and giving the user an API to see whether a splice happened or not (I guess it's via the ChannelReady event?) and telling them to start again.

I was envisioning we'd have a SplicePending event that gets emitted after the tx_signatures exchange, and a SpliceFailed event for any failed/aborted attempts prior to said exchange. I think we could support a cancel API (either disconnect or send tx_abort) until the user calls back with funding_transaction_signed and rely on the SpliceFailed event as the response, but canceling after funding_transaction_signed wouldn't be possible so we'd have to force close.

Supporting quiescence throughout disconnects makes sense, but we may want to have a fixed number of retries or a timer so that we don't go on forever?
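
For illustration, a rough sketch of what such event variants might look like; the variant names and fields here are hypothetical, not the API added by this PR or any shipped LDK event:

```rust
/// Hypothetical sketch of the events discussed above; nothing here is LDK API.
#[allow(dead_code)]
enum SpliceEventSketch {
    /// After `tx_signatures` have been exchanged: the splice transaction is pending
    /// confirmation and can no longer be aborted without force-closing.
    SplicePending { channel_id: [u8; 32], splice_txid: [u8; 32] },
    /// The negotiation was aborted (via disconnect or `tx_abort`) before signatures
    /// were exchanged; the channel continues on its previous funding.
    SpliceFailed { channel_id: [u8; 32] },
}
```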

@TheBlueMatt
Collaborator Author

I was envisioning we'd have a SplicePending event that gets emitted after the tx_signatures exchange, and a SpliceFailed event for any failed/aborted attempts prior to said exchange. I think we could support a cancel API (either disconnect or send tx_abort) until the user calls back with funding_transaction_signed and rely on the SpliceFailed event as the response, but canceling after funding_transaction_signed wouldn't be possible so we'd have to force close.

This still has the persistence issue, though. The user could track that they no longer want a splice in a channel and refuse to sign, which always works (and maybe that is simply what we should do, and document that users should rely on it), but we can't provide a built-in "cancel" function that is actually guaranteed to cancel. I think maybe this is fine, though; we don't really need to provide a way to cancel upgrading a channel type (just let it happen when the peer finally comes online?), so the other use of quiescence is fine, I guess?

Supporting quiescence throughout disconnects makes sense, but we may want to have a fixed number of retries or a timer so that we don't go on forever?

You mean because the other end may be rejecting the splice for some reason (why would they do that?)?
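
For illustration of the "refuse to sign" approach above, a minimal application-side sketch; none of this is LDK API, it only shows that withholding the splice signatures is the one cancellation that always works:

```rust
use std::collections::HashSet;

/// Hypothetical application-side tracker of channel ids whose pending splice the
/// user no longer wants. Nothing here is LDK API.
struct SpliceCanceller {
    cancelled: HashSet<[u8; 32]>,
}

impl SpliceCanceller {
    fn cancel(&mut self, channel_id: [u8; 32]) {
        self.cancelled.insert(channel_id);
    }

    /// Consulted when it's time to hand the splice funding signatures back to LDK
    /// (e.g. before calling `funding_transaction_signed`, whose exact signature is
    /// not shown here). If the user has since cancelled, we simply never sign, so
    /// the splice cannot complete.
    fn should_sign(&self, channel_id: &[u8; 32]) -> bool {
        !self.cancelled.contains(channel_id)
    }
}
```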

@wpaulino
Contributor

The user could track that they no longer want a splice in a channel and refuse to sign, which always works (and maybe is simply what we should do and document that users should rely on it) but we can't provide a built-in "cancel" function that actually is guaranteed to cancel.

Even then, I don't think it's possible to cancel because the counterparty may have already provided its tx_signatures and is now expecting ours. The channel would remain quiescent until they're exchanged. If we want to support this, I think we'd have to block sending our commitment_signed until the user calls back with funding_transaction_signed? It's safe/possible to abort the negotiation while commitment_signed hasn't been exchanged.

You mean because the other end may be rejecting the splice for some reason (why would they do that?)?

Could be that the peer is not around long enough to finish the negotiation before disconnecting again. Or maybe they require confirmed inputs (we don't support this at the moment), and we keep providing an unconfirmed one.

@TheBlueMatt
Collaborator Author

TheBlueMatt commented Aug 14, 2025

Even then, I don't think it's possible to cancel because the counterparty may have already provided its tx_signatures and is now expecting ours. The channel would remain quiescent until they're exchanged. If we want to support this, I think we'd have to block sending our commitment_signed until the user calls back with funding_transaction_signed? It's safe/possible to abort the negotiation while commitment_signed hasn't been exchanged.

Hmmmmmm. Maybe we do that then? Not supporting splice init during disconnection seems like a pretty terrible API, but not supporting cancel isn't really an option either. This is more complicated (storing the user's signatures for later use, I guess, though we already have to support not sending commitment_signed until the monitor completes), but it seems not crazy bad for a much better API. It also doesn't have to happen to enable splicing and get started testing, just to ship.

Could be that the peer is not around long enough to finish the negotiation before disconnecting again.

I imagine in this case we want to keep retrying until it succeeds or the user cancels.

Or maybe they require confirmed inputs (we don't support this at the moment), and we keep providing an unconfirmed one.

Should the peer not send a TxAbort in this case and we'll return the channel to normal operation? Or are they supposed to send a warning and disconnect?

@TheBlueMatt
Collaborator Author

In any case, note that the quiescence action is taken (consumed) when we act on it, so we'll actually only ever try any action once. If it fails, we let it fail and won't retry. I suppose in the "peer disconnected due to network disruption" case we'd really like to retry, but I'm not sure it's worth trying to implement that and track a failure counter.
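
A minimal sketch of those "taken once" semantics, assuming the action lives in the `Option<_>` the diff below calls `quiescent_action`; the enum and method names here are invented for illustration:

```rust
// Illustrative only: the field name mirrors the diff below, the rest is invented.
enum QuiescentActionSketch { Splice }

struct ChannelSketch {
    quiescent_action: Option<QuiescentActionSketch>,
}

impl ChannelSketch {
    /// Called once both sides have exchanged `stfu`. `take()` consumes the action,
    /// so even if the attempt fails there is nothing left to retry automatically.
    fn on_quiescence_reached(&mut self) {
        if let Some(action) = self.quiescent_action.take() {
            match action {
                QuiescentActionSketch::Splice => { /* kick off the splice here */ }
            }
        }
    }
}
```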

@TheBlueMatt TheBlueMatt self-assigned this Aug 14, 2025
Contributor

@wpaulino wpaulino left a comment


Basically LGTM

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@wpaulino wpaulino requested a review from jkczyz August 14, 2025 23:24
@ldk-reviews-bot

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-splice-quiescent branch from 76210a8 to b15949b on August 18, 2025 13:55
@TheBlueMatt TheBlueMatt changed the title from "Integrate Splicing with Quiescence" to "Simplify Quiescence and prepare for splice integration" on Aug 18, 2025
@wpaulino
Contributor

LGTM after squash

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-splice-quiescent branch from b15949b to 4a47f19 on August 18, 2025 20:48
@TheBlueMatt
Collaborator Author

Squashed without further changes.

@TheBlueMatt TheBlueMatt requested a review from wpaulino August 18, 2025 21:07
wpaulino previously approved these changes Aug 18, 2025

In the case where we prepare to initiate quiescence, but cannot yet
send our `stfu` because we're waiting on some channel operations to
settle, and our peer ultimately sends their `stfu` before we can,
we would detect this case and, if we were able, send an `stfu`
which would allow us to send "something fundamental" first.

While this is a nifty optimization, it's a bit overkill - the chance
that both we and our peer decide to attempt something fundamental
at the same time is pretty low, and, worse, it required additional
state tracking.

We simply remove this optimization here, simplifying the quiescence
state machine a good bit.
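
Roughly, the removed behavior amounts to the following (an illustrative sketch with invented names, not LDK's real state machine):

```rust
// Illustrative sketch only; these are not LDK's real types or field names.
struct QuiescenceStateSketch {
    /// We want quiescence (e.g. to run an action) but haven't been able to send our
    /// `stfu` yet because channel operations are still settling.
    awaiting_quiescence: bool,
    /// The counterparty's `stfu` arrived before we sent ours.
    peer_sent_stfu: bool,
}

/// Who acts as the quiescence initiator once both sides have sent `stfu`.
fn initiator(state: &QuiescenceStateSketch) -> &'static str {
    if state.peer_sent_stfu {
        if state.awaiting_quiescence {
            // Removed optimization: we used to race to send our own `stfu` anyway
            // and claim the initiator role so our queued action could run first.
            // That needed extra state for a rare case, so this PR drops it.
        }
        "counterparty"
    } else {
        "us"
    }
}
```
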
When we initiate quiescence, it should always be because we're
trying to accomplish something (in the short term only splicing).
In order to actually do that thing, we need to store the
instructions for it somewhere the splicing logic knows to look
once we reach quiescence.

Here we add a simple enum which will eventually store such actions.
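
A minimal sketch of the shape this takes; the field name matches the `quiescent_action` referenced in the review comment further down, while the enum name and its variant are illustrative, since this commit only adds the scaffolding:

```rust
/// Illustrative sketch: the `Splice` variant and its contents are hypothetical;
/// this commit only introduces the container that later splice work will populate.
#[allow(dead_code)]
enum QuiescentActionSketch {
    /// Instructions for a splice to carry out once the channel is quiescent
    /// (contribution, feerate, etc. elided).
    Splice { /* splice instructions */ },
}

#[allow(dead_code)]
struct ChannelContextSketch {
    /// Set when the user requests an action that needs quiescence; consulted once
    /// both `stfu`s have been exchanged.
    quiescent_action: Option<QuiescentActionSketch>,
}
```
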
There are a number of things in LDK where we've been lazy and not
allowed the user to initiate an action while a peer is
disconnected. While it may be accurate in the sense that the action
cannot be started while the peer is disconnected, it is terrible
dev UX - these actions can fail without the developer being at
fault and the only way for them to address it is just try again.

Here we fix this dev UX shortcoming for splicing, keeping any
queued post-quiescent actions around when a peer disconnects and
retrying the action (and quiescence generally) when the peer
reconnects.
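
As a rough sketch of that flow (the disconnect half mirrors the snippet quoted in the review comment below; the reconnect half and all other names here are assumptions):

```rust
// Illustrative sketch; `quiescent_action` and the awaiting-quiescence flag mirror
// names visible in the diff below, everything else is invented for this example.
#[derive(Default)]
struct ChannelSketch {
    quiescent_action: Option<()>, // stand-in for the queued post-quiescence action
    awaiting_quiescence: bool,
    stfu_sent: bool,
}

impl ChannelSketch {
    fn on_peer_disconnected(&mut self) {
        // The in-flight quiescence handshake is discarded on disconnect, but the
        // queued action is deliberately kept so the attempt survives.
        self.stfu_sent = false;
        if self.quiescent_action.is_some() {
            self.awaiting_quiescence = true;
        }
    }

    fn on_peer_reconnected(&mut self, can_send_stfu: bool) {
        // Retry: once the channel is ready to send `stfu` again, propose quiescence
        // so the queued action can run when both sides are quiescent.
        if self.awaiting_quiescence && can_send_stfu {
            self.stfu_sent = true;
        }
    }
}
```
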
@TheBlueMatt
Collaborator Author

Rebased to address a conflict.

Comment on lines +12864 to +12868
if self.quiescent_action.is_some() {
    // If we're trying to get quiescent to do something, try again when we
    // reconnect to the peer.
    channel_state.set_awaiting_quiescence();
}
Contributor


Duplicate logic for handling quiescent_action during disconnection

There appears to be duplicate logic for handling the quiescent_action state during disconnection in two places:

  1. Lines 8209-8212:

     if self.quiescent_action.is_some() {
         // If we were trying to get quiescent, try again after reconnection.
         self.context.channel_state.set_awaiting_quiescence();
     }

  2. This code block (lines 12864-12867):

     if self.quiescent_action.is_some() {
         // If we're trying to get quiescent to do something, try again when we
         // reconnect to the peer.
         channel_state.set_awaiting_quiescence();
     }

This duplication increases maintenance burden and risk of inconsistencies. Consider extracting this logic into a helper method that can be called from both locations to ensure consistent behavior.

Suggested change
-if self.quiescent_action.is_some() {
-    // If we're trying to get quiescent to do something, try again when we
-    // reconnect to the peer.
-    channel_state.set_awaiting_quiescence();
-}
+if self.quiescent_action.is_some() {
+    // If we're trying to get quiescent to do something, try again when we
+    // reconnect to the peer.
+    self.handle_quiescent_action_on_disconnect(channel_state);
+}

Spotted by Diamond


Collaborator Author


It's one line of code... I'm not convinced this is worth it.

@TheBlueMatt TheBlueMatt added the weekly goal Someone wants to land this this week label Aug 19, 2025
@ldk-reviews-bot

🔔 2nd Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@wpaulino
Contributor

Merging with just my approval as the changes here are pretty straightforward and well covered by tests.

@wpaulino wpaulino merged commit 9724d19 into lightningdevkit:main Aug 20, 2025
36 of 44 checks passed