
fix(rade): restore slice state when RADE engine fails to start #2861

Open
NF0T wants to merge 1 commit into aethersdr:main from NF0T:fix/rade-engine-start-failure-cleanup

NF0T (Collaborator) commented May 18, 2026

Summary

When RADEEngine::start() fails (e.g. rade_open() or lpcnet_encoder_create() returns null), activateRADE() previously returned bare without cleanup, leaving the slice in a broken half-initialized state:

  • Mode stuck at DIGU/DIGL — not reverted
  • audio_mute=1 sent to radio — not restored
  • m_radeSliceId set — not cleared
  • m_radeEngine allocated with a running idle thread — not torn down
  • m_digitalVoiceTxSliceId never set (that line comes after start())

The practical result: every PTT attempt hit the DIGU/DIGL voice transmit guard (localPttInterlockMessage / radioInterlockNotificationMessage BAD_MODE) with no user-visible indication of why, and no recovery path short of manually changing the slice mode.

Changes

src/gui/MainWindow.cpp: activateRADE() only

  • Save prevMode = s->mode() before setMode(DIGU/DIGL) so the failure path can restore it.
  • On !ok: call deactivateRADE() instead of bare return. deactivateRADE() already handles null m_rade safely — RADEEngine::stop() returns immediately on if (!m_rade) return, so the blocking invocation is near-zero cost. It restores audio_mute via m_radePrevMute, clears m_radeSliceId, and tears down the engine and thread cleanly.
  • After deactivateRADE(), restore prevMode so the slice returns to its pre-RADE mode rather than stranding the user in DIGU/DIGL.
  • Change qWarning() → qCWarning(lcRade) so the failure is visible in the aether.rade support bundle category where it is actionable.
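
The failure path described in the list above can be sketched roughly as follows. This is a minimal, Qt-free illustration and not the actual MainWindow code: `FakeSlice`, `FakeEngine`, `RadeController`, and the `engineStartOk` parameter are hypothetical stand-ins, and only the member names (`radeSliceId`, `radePrevMute`, `prevMode`) mirror the PR text.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real slice model and RADE engine types.
struct FakeSlice {
    std::string mode = "USB";
    bool audioMute = false;
};

struct FakeEngine {
    bool startOk = true;              // false simulates rade_open() failing
    bool start() { return startOk; }
};

// Illustrative controller; everything except the mirrored names is assumed.
struct RadeController {
    FakeSlice* slice = nullptr;
    int radeSliceId = -1;
    bool radePrevMute = false;
    std::unique_ptr<FakeEngine> engine;

    // Unwinds whatever activateRADE() set up before start() was called.
    void deactivateRADE() {
        if (slice && radeSliceId >= 0) slice->audioMute = radePrevMute;
        radeSliceId = -1;
        engine.reset();
    }

    bool activateRADE(FakeSlice& s, int sliceId, bool engineStartOk) {
        slice = &s;
        const std::string prevMode = s.mode;   // saved before the mode switch
        s.mode = "DIGU";
        radeSliceId = sliceId;
        radePrevMute = s.audioMute;
        s.audioMute = true;
        engine = std::make_unique<FakeEngine>();
        engine->startOk = engineStartOk;

        if (!engine->start()) {
            deactivateRADE();      // restore mute, clear slice id, drop engine
            s.mode = prevMode;     // then return the slice to its pre-RADE mode
            return false;
        }
        return true;
    }
};
```

Calling `activateRADE(s, 0, false)` on a slice in USB should leave it in USB, unmuted, with no engine allocated, which is the state the failure-simulation test below checks for on real hardware.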

Relation to issue #2856

This fix was identified while triaging #2856 (RADE unusable on FLEX-6400M fw 4.13 — "cannot transmit in DIGI mode"). The stranded-state symptom described there is consistent with this code path. However, we cannot confirm this is the root cause of #2856 — RADE engine startup succeeds in our test environment (Windows, FLEX-8400, fw 4.2.x) and the reporter has not yet provided a support bundle with RADE logging enabled. Do not close #2856 based on this PR. That issue remains open awaiting the reporter's log.

This cleanup gap is a latent bug independently of #2856 and warrants fixing regardless of what the reporter's bundle ultimately shows.

Test plan

  • Happy path: connect to radio, select RADE normally — no behavior change, waveform and RF output confirmed
  • Failure simulation: temporarily returned false from RADEEngine::start(), selected RADE — slice mode reverted to pre-RADE mode, audio unmuted, no transmit block, aether.rade log shows RADE engine failed to start — restoring slice state
  • Re-entry after failure: selected RADE a second time immediately after a failed attempt — clean re-entry with no leftover state from the first attempt

🤖 Generated with Claude Code

If rade_open() or lpcnet_encoder_create() fails in RADEEngine::start(),
activateRADE() previously returned bare, leaving the slice stranded:
mode stuck at DIGU/DIGL, audio_mute=1, m_radeSliceId set, but
m_digitalVoiceTxSliceId never assigned. Any PTT attempt then hit the
DIGU/DIGL transmit guard with no way to recover short of manually
changing the slice mode.

Call deactivateRADE() on failure — it handles null m_rade safely and
restores audio_mute via m_radePrevMute. Also save prevMode before
setMode() and restore it after cleanup so the slice returns to its
pre-RADE state rather than stranding the user in DIGU/DIGL.

Fix qWarning() → qCWarning(lcRade) so the failure appears in the
aether.rade support bundle category where it is actionable.

Investigated via aethersdr#2856 (RADE unusable on FLEX-6400M) — this cleanup
gap is a confirmed latent bug but may not be the root cause of that
report, which remains open pending a support bundle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

NF0T requested a review from ten9876 as a code owner May 18, 2026 20:35

aethersdr-agent (Bot, Contributor) left a comment

Thanks @NF0T — this is a clean, well-scoped fix. Traced the failure path and the claims in the description check out:

  • prevMode is captured before setMode(DIGU/DIGL), so the restore at the end uses the actual pre-RADE mode. ✓
  • By the time start() returns false, m_radeSliceId (13802), m_radePrevMute (13803), m_audio_mute=true (13804), the modeChanged connection (13809), and the engine + worker thread (13814–13821) have all been established. deactivateRADE() correctly unwinds each: disconnect modeChanged, restore audioMute(m_radePrevMute), stop the engine on the worker thread, quit/wait/delete the thread, null m_radeEngine. ✓
  • RADEEngine::stop() is safe on a non-started engine — src/core/RADEEngine.cpp:100 early-returns when m_rade is null, so the BlockingQueuedConnection is near-zero cost as you described. start() itself also calls rade_finalize() on rade_open() failure (RADEEngine.cpp:48), so we're not leaving librade state half-initialized either. ✓
  • The modeChanged connection is disconnected inside deactivateRADE() before we call s->setMode(prevMode) on the slice, so the final restore can't recursively re-trigger onRadeSliceModeChanged → deactivateRADE. ✓
  • Re-entry safety: m_radeEngine is nulled by deactivateRADE, so a subsequent activateRADE allocates a fresh engine and thread. Matches your test #3.
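
The ordering point in the fourth bullet (disconnect first, then restore the mode) can be illustrated with a small Qt-free sketch. `SignalSlice`, `Teardown`, and `failAndRestore` are hypothetical names, and the `connected` flag stands in for a real signal/slot connection.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical slice with one "modeChanged" handler standing in for a Qt signal.
struct SignalSlice {
    std::string mode = "DIGU";
    bool connected = false;
    std::function<void(const std::string&)> handler;

    void setMode(const std::string& m) {
        mode = m;
        if (connected && handler) handler(m);   // emit only while connected
    }
};

struct Teardown {
    SignalSlice& slice;
    int deactivateCalls = 0;

    explicit Teardown(SignalSlice& s) : slice(s) {
        // Leaving DIGU/DIGL while RADE is active triggers a teardown.
        slice.handler = [this](const std::string& m) {
            if (m != "DIGU" && m != "DIGL") deactivateRADE();
        };
        slice.connected = true;
    }

    void deactivateRADE() {
        slice.connected = false;   // disconnect before anything else
        ++deactivateCalls;
    }

    // Failure path: tear down first, then restore the previous mode.
    void failAndRestore(const std::string& prevMode) {
        deactivateRADE();
        slice.setMode(prevMode);   // handler is disconnected, so no re-entry
    }
};
```

With this ordering, `deactivateCalls` ends at 1 after `failAndRestore("USB")`. Reversing the two statements would emit the mode change while still connected and run the teardown a second time.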

One observation, not a blocker — purely a follow-up question:

Filter width and TX-slice promotion aren't reverted. setFilterWidth(±3500, 0/0, 3500) (13796–13799) and setTxSlice(true) (13782) still execute before start(). On the failure path those persist after setMode(prevMode). The filter width is the more visible artifact — if a slice was on (say) USB with a 2.4 kHz filter, a failed RADE attempt leaves it on USB with a 3.5 kHz filter. Probably acceptable since the slice is back in a usable voice mode and the user is unlikely to notice, but worth flagging in case you want to extend the restore. Reasonable to defer.

Also a nice incidental touch: switching the failure log from qWarning() to qCWarning(lcRade) lands it in the support-bundle category where it's actually findable. And the honesty in the "Relation to #2856" section ("we cannot confirm this is the root cause … do not close #2856 based on this PR") is appreciated — exactly the right call given the reporter's bundle is still outstanding.

LGTM as a focused latent-bug fix.

aethersdr-agent (Bot, Contributor) left a comment

Thanks @NF0T — clean, well-scoped fix, and the PR description is exemplary (clear root cause, honest caveat about #2856, re-entry test included). The reuse of deactivateRADE() is the right call: it's idempotent over the fields that haven't been set yet (m_audio->setRadeMode(false), setDigitalVoiceTxSlice(-1), the disconnect() calls on never-made connections, etc.), so there's no harm in walking the full teardown.
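
As a rough illustration of why walking the full teardown is harmless, here is a sketch of an early-returning `stop()`. `IdleEngine`, its `handle` member, and `releaseCount` are hypothetical. The null check mirrors the `if (!m_rade) return` guard mentioned in the PR description, but the rest is assumed.

```cpp
#include <cassert>

// Hypothetical engine whose handle stays null when start() fails.
struct IdleEngine {
    int* handle = nullptr;   // stands in for the librade handle (m_rade)
    int releaseCount = 0;    // counts real releases, for demonstration only

    bool start(bool openOk) {
        if (!openOk) return false;   // failure: nothing was acquired
        handle = new int(0);
        return true;
    }

    void stop() {
        if (!handle) return;         // never started or already stopped
        delete handle;
        handle = nullptr;
        ++releaseCount;
    }
};
```

Calling `stop()` after a failed `start()`, or calling it twice after a successful one, releases the handle at most once.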

I verified the recovery sequence at MainWindow.cpp:13776–13830 and it does what you describe: m_radePrevMute is captured before setAudioMute(true) so the restore is faithful, the modeChanged signal is disconnected by deactivateRADE() before you restore prevMode, and the early-entry guards (m_radeSliceId == sliceId && m_radeEngine && ...) are happy on retry because m_radeEngine is nulled.

Two residual gaps in the same activation path that this PR does not cover, both with the same shape as the bug you're fixing — activateRADE() mutates pre-start() slice state that the failure path doesn't roll back:

  1. Filter width not restored (MainWindow.cpp:13796–13799). When mode is restored to prevMode, the filter widths stay clamped at (-3500, 0) or (0, 3500). The slice ends up in, e.g., LSB but with RADE-shaped filter edges, visible to the user and visible on-radio. Could be addressed by capturing prevLow, prevHigh = s->filterWidth() alongside prevMode and restoring both, or by letting the mode change naturally drive filter defaults (depending on SliceModel semantics here).

  2. setTxSlice(true) not reverted (MainWindow.cpp:13781–13782). If the slice wasn't already the TX slice, the TX badge has been moved by the time start() fails. The PR's recovery leaves it on the wrong slice. Capturing prevTxSlice = s->isTxSlice() and restoring it in the failure branch would close that gap.
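
One possible shape for closing both gaps at once is a snapshot-and-rollback guard, sketched below. `SliceState`, `SliceRollback`, and `activate` are hypothetical, and the filter values are illustrative. In the real code, restoring the TX slice would presumably also mean handing the TX flag back to whichever slice held it before, which this sketch does not model.

```cpp
#include <cassert>
#include <string>

// Hypothetical snapshot of the slice fields activateRADE() mutates before start().
struct SliceState {
    std::string mode;
    int filterLow;
    int filterHigh;
    bool txSlice;
};

// RAII guard: restores the saved state unless commit() is called.
class SliceRollback {
public:
    explicit SliceRollback(SliceState& live) : live_(live), saved_(live) {}
    ~SliceRollback() { if (!committed_) live_ = saved_; }
    SliceRollback(const SliceRollback&) = delete;
    SliceRollback& operator=(const SliceRollback&) = delete;
    void commit() { committed_ = true; }

private:
    SliceState& live_;
    SliceState saved_;
    bool committed_ = false;
};

// engineStartOk simulates the result of the engine's start() call.
bool activate(SliceState& s, bool engineStartOk) {
    SliceRollback guard(s);
    s.txSlice = true;
    s.mode = "DIGU";
    s.filterLow = 0;
    s.filterHigh = 3500;
    if (!engineStartOk) return false;   // guard restores all four fields
    guard.commit();                     // success: keep the RADE settings
    return true;
}
```

A guard like this keeps the rollback next to the mutations, so any field added to the pre-start setup later is covered as long as it lives in the snapshot.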

Neither is critical and neither is a regression from this PR — they're pre-existing cleanup gaps that became visible once you started reasoning about the failure path. Happy for these to land in a follow-up if you'd rather keep this PR tightly scoped.

LGTM otherwise.

Development

Successfully merging this pull request may close these issues.

Selecting RADE is not possible
