Skip to content

Fix stopDispatcherThreads() to block until dispatcher threads have actually terminated#1149

Draft
Copilot wants to merge 3 commits intomasterfrom
copilot/fix-thread-dispatcher-stop-issue
Draft

Fix stopDispatcherThreads() to block until dispatcher threads have actually terminated#1149
Copilot wants to merge 3 commits intomasterfrom
copilot/fix-thread-dispatcher-stop-issue

Conversation

Copy link
Contributor

Copilot AI commented Feb 25, 2026

  • Understand the issue: stopDispatcher() sets stopped = true immediately, making the wait loop exit before the dispatcher thread actually finishes
  • Remove premature stopped = true from stopDispatcher() so isStopped() reflects real termination
  • Add thread-tracking join mechanism to ThreadAdapter: capture runnerThread in run(), signal startedLatch, then awaitTermination() calls Thread.join() — guarantees isAlive() == false on return (fixes Windows/JDK21 race)
  • Update stopDispatcherThreads() to snapshot dispatchers (not use live view) and use awaitTermination() instead of polling isStopped()
  • Add regression test testStopDispatcherThreadsIsBlocking: enqueue slow-processing message, call stop, assert thread is dead immediately after stop returns (no polling needed)
  • All 7 tests pass
Original prompt

Create a fix for issue #1128 so that when stop() returns, the ThreadPerSessionEventHandlingStrategy message dispatcher threads are actually stopped (no further message processing occurs).

Constraints / requirements:

  • Open a PR against the master branch.
  • Do NOT change behavior such that END_OF_STREAM is prevented from being delivered to quickfixSession.next(); keep the existing semantics where the END_OF_STREAM message is passed to quickfixSession.next(message).

Observed bug:

  • ThreadPerSessionEventHandlingStrategy.stopDispatcherThreads() attempts to wait for dispatchers to stop by polling MessageDispatchingThread::isStopped.
  • MessageDispatchingThread.stopDispatcher() currently sets stopping = true and stopped = true immediately after enqueuing END_OF_STREAM, so isStopped() can return true before the dispatcher thread has actually terminated.
  • This makes stop() effectively non-blocking with respect to message processing: messages may continue to be processed after stop() returns.

Implementation goals:

  1. Ensure stopDispatcherThreads() blocks until each dispatcher thread has completed its doRun() (or until a reasonable timeout, if needed).
  2. Make isStopped() reflect real termination rather than a stop request.
  3. Avoid iterating on a live view of dispatchers.values() in a way that can cause concurrent modification surprises; snapshot dispatchers to stop.
  4. Provide a join/await mechanism for ThreadPerSessionEventHandlingStrategy.ThreadAdapter so callers can wait for thread completion even when backed by an Executor.
  5. Add or update tests to reliably reproduce the issue (e.g., enqueue messages and slow processing, call stop, assert no more processing happens after stop returns; also assert QF/J dispatcher threads are gone).

Likely files:

  • quickfixj-core/src/main/java/quickfix/mina/ThreadPerSessionEventHandlingStrategy.java
  • quickfixj-core/src/test/java/... (add/modify tests for thread-per-session stop behavior)

Acceptance criteria:

  • After stopDispatcherThreads() returns (and therefore after stop() which invokes it), no dispatcher threads continue to process messages.
  • Tests pass and cover the regression.

This pull request was created from Copilot chat.


🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

…ads have actually terminated

- Remove premature `stopped = true` from `stopDispatcher()` so `isStopped()` only returns
  true when `doRun()` fully completes, not when a stop was merely requested.
- Add `CountDownLatch doneLatch` to `ThreadAdapter` and count it down in the `run()` finally
  block; expose `awaitTermination(timeout, unit)` so callers can join the thread even when
  backed by an external Executor.
- Rewrite `stopDispatcherThreads()` to snapshot the dispatcher map (avoids live-view
  concurrent-modification surprises) and call `awaitTermination()` on each dispatcher
  instead of polling `isStopped()`.
- Add regression test `testStopDispatcherThreadsIsBlocking` that enqueues a slow-processing
  message, calls `stopDispatcherThreads()`, then immediately asserts (no extra sleep) that
  the dispatcher thread is dead.

Co-authored-by: chrjohn <6644028+chrjohn@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix stop() method for ThreadPerSessionEventHandlingStrategy Fix stopDispatcherThreads() to block until dispatcher threads have actually terminated Feb 25, 2026
Copilot AI requested a review from chrjohn February 25, 2026 16:55
@chrjohn chrjohn changed the title Fix stopDispatcherThreads() to block until dispatcher threads have actually terminated Fix stopDispatcherThreads() to block until dispatcher threads have actually terminated Feb 25, 2026
@chrjohn
Copy link
Member

chrjohn commented Feb 25, 2026

@copilot the test fails on windows with jdk21, but probably it is a general problem

ThreadPerSessionEventHandlingStrategyTest.testStopDispatcherThreadsIsBlocking:374 Dispatcher thread must be dead after stopDispatcherThreads() returns

The previous implementation called CountDownLatch.countDown() in the run()
finally block, but this fires BEFORE Thread.isAlive() transitions to false.
On Windows/JDK21 the gap is large enough to cause the test assertion to fail.

Replace the done-latch approach with thread tracking + Thread.join():
- Capture runnerThread = Thread.currentThread() at the top of run()
- Signal startedLatch so awaitTermination() knows the thread reference is set
- awaitTermination() now calls runnerThread.join(remaining) which blocks until
  the thread has truly terminated, guaranteeing isAlive() == false on return.

Co-authored-by: chrjohn <6644028+chrjohn@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Feb 25, 2026

@copilot the test fails on windows with jdk21, but probably it is a general problem

ThreadPerSessionEventHandlingStrategyTest.testStopDispatcherThreadsIsBlocking:374 Dispatcher...

Fixed in dcf45b2. The root cause was a race between CountDownLatch.countDown() (which fired in the finally block of run()) and Thread.isAlive() transitioning to false — on Windows/JDK21 that gap was observable.

Replaced the done-latch with thread tracking + Thread.join():

  • run() now captures runnerThread = Thread.currentThread() and counts down a startedLatch before doing any work
  • awaitTermination() first awaits the startedLatch (handles the case where it's called before the thread even starts), then calls runnerThread.join(remaining) — which by JVM spec guarantees isAlive() == false when it returns

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants