ctrueden commented Aug 20, 2025

This patch improves ndv's image showing functions to work more seamlessly when called from non-main threads, especially on macOS.

Below is an MCVE script that illustrates the problem before this patch. Without this patch, commenting out the @ensure_main_thread decorator causes the program to crash, whereas with this patch, the program works with or without the decorator in place.

Note that the patch itself was written by Claude.ai, not me, so I won't be offended by any harsh criticism. 😆

import sys
import threading
import time

from qtpy.QtWidgets import QApplication


def create_and_show_random_image() -> "ndv.ArrayViewer":
    from ndv import ArrayViewer
    import numpy
    from ndv.views._app import ensure_main_thread

    @ensure_main_thread  # Comment out this decorator to observe the crash.
    def show(data, **kwargs):
        viewer = ArrayViewer(data)
        viewer.show()
        return viewer

    data = numpy.random.random([300, 200])
    retval = show(data)
    return retval.result() if hasattr(retval, 'result') else retval


def do_stuff_later():
    print("I'll do it in a sec...")
    time.sleep(1)
    print("Time to show some images!")
    for _ in range(3):
        print("Showing an image...")
        viewer = create_and_show_random_image()
        viewers.append(viewer)
        print(f"Viewers -> {viewers}")
    print("All images shown.")


# Create QApplication on main thread.
print("Creating Qt application...")
app = QApplication(sys.argv)

viewers = []

threading.Thread(target=do_stuff_later).start()

print("Starting Qt loop...")
app.exec()

print("That's it! Go home to your pets!")

ctrueden and others added 2 commits August 20, 2025 16:59
Notes from Claude.ai:

I've made the ArrayViewer class thread-safe by implementing the
following changes:

  1. Thread-safe Constructor
     (src/ndv/controllers/_array_viewer.py:100-108)

     - Moved GUI component initialization to a separate
       _init_gui_components() method.
     - Queue GUI operations to the main thread using
       _app.ndv_app().call_in_main_thread().
     - Block until operations complete to maintain synchronous behavior.

  2. Thread-safe View Synchronization
     (src/ndv/controllers/_array_viewer.py:106-108)

     - Queue _fully_synchronize_view() to main thread since it contains
       GUI operations like create_sliders().

  3. Thread-safe Show/Hide/Close Methods
     (src/ndv/controllers/_array_viewer.py:216-233)

     - Modified show(), hide(), and close() methods to queue
       set_visible() calls to main thread.
     - Maintain synchronous behavior by blocking until operations
       complete.

The fix ensures that all GUI-sensitive operations are properly queued
to the main thread while maintaining the existing synchronous API.
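
The queue-and-block pattern described in these notes can be illustrated without any GUI toolkit. The following is a minimal, Qt-free sketch, not ndv's actual implementation: `MainThreadDispatcher`, `submit`, and `pump` are hypothetical names, and a real Qt program would rely on the Qt event loop rather than a hand-written pump.

```python
import queue
import threading
from concurrent.futures import Future


class MainThreadDispatcher:
    """Run callables on the thread that created this object (hypothetical helper)."""

    def __init__(self):
        self._owner = threading.get_ident()
        self._tasks = queue.SimpleQueue()

    def submit(self, func, *args, **kwargs):
        """Return a Future; run inline if already on the owner thread, else queue."""
        future = Future()
        if threading.get_ident() == self._owner:
            self._run(future, func, args, kwargs)
        else:
            self._tasks.put((future, func, args, kwargs))
        return future

    @staticmethod
    def _run(future, func, args, kwargs):
        try:
            future.set_result(func(*args, **kwargs))
        except BaseException as exc:  # hand the error back to the caller
            future.set_exception(exc)

    def pump(self, timeout=0.05):
        """Execute at most one queued task; must be called on the owner thread."""
        try:
            item = self._tasks.get(timeout=timeout)
        except queue.Empty:
            return False
        self._run(*item)
        return True


def demo():
    """Have a worker request a call on the owner thread and block on the result."""
    dispatcher = MainThreadDispatcher()
    seen = []

    def worker():
        future = dispatcher.submit(threading.get_ident)
        seen.append(future.result(timeout=5))  # blocks, keeping the API synchronous

    thread = threading.Thread(target=worker)
    thread.start()
    while thread.is_alive():
        dispatcher.pump()
    thread.join()
    # The callable ran on the owner thread, not on the worker thread.
    return seen[0] == threading.get_ident()
```

Blocking on `future.result()` is what preserves the synchronous API, and it is also where a deadlock can arise if the owner thread is itself waiting on the worker.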
@codecov

codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.90%. Comparing base (e3e94d3) to head (73393a8).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/ndv/controllers/_array_viewer.py | 66.66% | 4 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (66.66%) is below the target coverage (85.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #211      +/-   ##
==========================================
- Coverage   85.92%   85.90%   -0.02%     
==========================================
  Files          46       46              
  Lines        5202     5209       +7     
==========================================
+ Hits         4470     4475       +5     
- Misses        732      734       +2     

@tlambert03
Member

Having a look into this today. (Context, for anyone reading this: we are trying to get ndv and other Qt/Python apps working with Fiji's new experimental Python runtime support.)

As I played around with this a bit, it does work... but I have to say, more generally, that the model of running everything on a non-main thread is going to be so unusual to most libraries that I think this might be more of a "core" Fiji problem than we want it to be. I don't imagine it will work well to have to tell every library anyone wants to use in Fiji to use something like the ensure_main_thread decorator from superqt for every function that might touch the GUI. The assumption for most libraries that have a GUI component, of course, is that they'll be started on the main thread. Wrapping every potentially GUI-touching function inside such a decorator might have other performance and/or signal-related consequences if not done carefully (including deadlock or other nasties).

So, let's keep playing with this (and other patterns) in the context of Fiji's Python launcher.

@ctrueden
Author

ctrueden commented Aug 23, 2025

My thinking is that in both Python and Java worlds, programmers rarely if ever expect to see a native crash—only exceptions. If we decide not to make show and friends thread-safe in ndv, then my next suggestion is to add a thread check instead, which fails fast with an exception if we aren't on the correct thread, rather than crashing the program outright, which in my view is a bug in the vast majority of circumstances.
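
A rough sketch of such a fail-fast check, using only the standard library. `require_main_thread` and `show_viewer` are hypothetical names, not part of ndv. Note the assumption that `threading.main_thread()` identifies the GUI thread: in an embedded setup like Fiji's launcher that may not hold, and a Qt-specific check would more likely compare `QThread.currentThread()` against the application object's thread.

```python
import functools
import threading


def require_main_thread(func):
    """Raise RuntimeError instead of letting a GUI call proceed off the main thread."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        current = threading.current_thread()
        if current is not threading.main_thread():
            raise RuntimeError(
                f"{func.__name__}() must be called from the main thread, "
                f"but was called from {current.name!r}."
            )
        return func(*args, **kwargs)

    return wrapper


@require_main_thread
def show_viewer():
    # Stand-in for GUI work such as creating and showing a window.
    return "shown"


def call_from_worker():
    """Call show_viewer() on a worker thread; return the exception raised, if any."""
    errors = []

    def target():
        try:
            show_viewer()
        except RuntimeError as exc:
            errors.append(exc)

    worker = threading.Thread(target=target, name="worker")
    worker.start()
    worker.join()
    return errors[0] if errors else None
```

The caller gets an ordinary Python exception with a readable message, which can be caught or logged, rather than a native abort.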

I agree with what you're saying about ndv not being the only library that's going to be tougher to invoke from Fiji. This is also an issue in Java, where you have to queue GUI-manipulating functions to the EDT via EventQueue.invokeLater and the like, and I assume it is likewise an issue for Qt GUI manipulation after the Qt event loop starts. It's just a thing people need to know to do when they are working with GUI code. That said, we do make an effort in SciJava Common's UserInterface implementations to be thread-safe for functions such as ij.ui().show(object) and ij.ui().showUI(); it's easy to check the thread and do the right thing in both cases, so why not?

@tlambert03
Member

tlambert03 commented Aug 24, 2025

My thinking is that in both Python and Java worlds, programmers rarely if ever expect to see a native crash—only exceptions. If we decide not to make show and friends thread-safe in ndv, then my next suggestion is to add a thread check instead, which fails fast with an exception if we aren't on the correct thread, rather than crashing the program outright, which in my view is a bug in the vast majority of circumstances.

sys.excepthook

With this phrasing, I just want to double check that you know about the whole sys.excepthook thing and how Qt plays into it. People are often surprised, when they use QApplication.exec(), to find that Python exceptions no longer propagate as exceptions but instead crash out to the terminal. But that's not mandatory... it's due to the fact that all Python bindings for Qt (both PySide and PyQt) call qFatal on unhandled Python exceptions encountered during the event loop. You can (and most long-lived applications do) modify sys.excepthook to prevent crashing out on exceptions. For example:
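
A minimal sketch of such a hook, using only the standard library. This is an illustration of the idea rather than the snippet originally posted in this comment; `install_excepthook` and its `stream` parameter are hypothetical names.

```python
import sys
import traceback


def install_excepthook(stream=None):
    """Install a hook that reports unhandled exceptions; return the previous hook."""
    previous = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        # Report the error and return normally, so the event loop keeps running.
        out = stream if stream is not None else sys.stderr
        out.write("Unhandled exception (application keeps running):\n")
        traceback.print_exception(exc_type, exc_value, exc_tb, file=out)

    sys.excepthook = hook
    return previous
```

It would be installed once, before entering the event loop, for example with `install_excepthook()` just ahead of `app.exec()`. Keeping the returned previous hook makes it possible to restore or chain to it later.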

So, if you're going to "own" the main event loop, and if you're going to commit to using Qt's event loop, then fiji.py should probably also install a custom excepthook.

app.exec()

I think it's also worth discussing the call to app.exec() in fiji.py.

That might be a bit too presumptuous if you want to support Python in Fiji more generally. There are many different event loop models in Python, and plenty of Python modules that require an event loop don't use Qt (the main alternatives being asyncio, anyio, trio, and other structured concurrency libraries).

If you call app.exec(), then I think we are basically stuck running all application logic in another thread, and all downstream libraries will likely need to implement some sort of "call-on-main-thread" workaround. But you could also have your own concept of an event loop, similar to how IPython does. For example, the way IPython supports Qt-based apps (i.e. %gui qt) is not to put them on another thread; instead, they never actually start a blocking Qt app with app.exec(), but rather make clever use of QtCore.QEventLoop and QSocketNotifier (on *nix) or QTimer (on Windows) to run both user code and their own prompt on the main thread. They pump the event loop manually and exit back out repeatedly on user input in the REPL, making sure that everything runs on the main thread and shares time with their own REPL event loop.

So... Fiji might also try having some magic method for picking an event loop and going back and forth?

... but it's possible that AWT/Swing will make this situation harder here than it is over in IPython...

I tried playing around briefly with alternatives to calling app.exec() in fiji.py (e.g., polling app.processEvents()), but I was unable to get the Java GUI to ever show. I was a bit surprised to find that without entering app.exec(), the call to org.scijava.launcher.Splash.show() blocked indefinitely... and I guess I'm now seeing some of the expectations of the native macOS event loop that you were talking about last week. However, it looks like QtCore.QEventLoop.exec() does make it to the native OS loop (via processEvents() inside of exec() -> eventDispatcher.loadRelaxed()->processEvents() -> QCocoaEventDispatcher::processEvents, which I think drives the NSApp event loop). So that's something to try.

I will say that Qt apps are noticeably slower/laggier under IPython's %gui qt than with the standard event loop, so it's not "free"... but it might be a mode to consider.

Anyway, just some thoughts. Will keep playing when I have a moment!

@ctrueden
Author

ctrueden commented Nov 13, 2025

Thanks @tlambert03, I didn't know about sys.excepthook and Qt crashing out the process on exception! That's useful.

I tried adding a sys.excepthook handler in fiji.py:

# Install custom exception handler to prevent Qt crashes from
# unhandled Python exceptions. This is important because Qt's
# default behavior is to call qFatal and crash the app.
original_excepthook = sys.excepthook

def fiji_excepthook(exc_type, exc_value, exc_traceback):
    # Log the exception.
    _logger.error(
        "Unhandled exception:",
        exc_info=(exc_type, exc_value, exc_traceback),
    )
    # Call original handler (prints to stderr).
    original_excepthook(exc_type, exc_value, exc_traceback)
sys.excepthook = fiji_excepthook

but it still crashes when Qt operations happen on the wrong thread.

uncaught NSException crash
WARNING: QObject::installEventFilter(): Cannot filter events for objects in a different thread.
[WARNING:vispy] QObject::installEventFilter(): Cannot filter events for objects in a different thread.
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow should only be instantiated on the main thread!'
*** First throw call stack:
(
	0   CoreFoundation                      0x00000001912a0770 __exceptionPreprocess + 176
	1   libobjc.A.dylib                     0x0000000190d7e418 objc_exception_throw + 88
	2   CoreFoundation                      0x00000001912bc2c0 _CFBundleGetValueForInfoKey + 0
	3   AppKit                              0x00000001962c58d4 -[NSWindow _initContent:styleMask:backing:defer:contentView:] + 260
	4   AppKit                              0x00000001962c5cdc -[NSWindow initWithContentRect:styleMask:backing:defer:] + 48
	5   AppKit                              0x00000001962c5c90 -[NSWindow initWithContentRect:styleMask:backing:defer:screen:] + 24
	6   libqcocoa.dylib                     0x000000010e9cf508 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 236776
	7   libqcocoa.dylib                     0x000000010e9a52bc _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 64156
	8   libqcocoa.dylib                     0x000000010e99d3a4 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 31620
	9   libqcocoa.dylib                     0x000000010e99cdb4 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 30100
	10  QtGui                               0x000000010e2404f4 _ZN14QWindowPrivate6createEb + 464
	11  QtWidgets                           0x000000010ef76144 _ZN14QWidgetPrivate6createEv + 1028
	12  QtWidgets                           0x000000010ef735f4 _ZN7QWidget6createEybb + 376
	13  QtWidgets                           0x000000010f0ec964 _ZNK8QMenuBar15initStyleOptionEP20QStyleOptionMenuItemPK7QAction + 1704
	14  QtWidgets                           0x000000010f0ec55c _ZNK8QMenuBar15initStyleOptionEP20QStyleOptionMenuItemPK7QAction + 672
	15  QtWidgets                           0x000000010f0ecdc4 _ZN8QMenuBarC1EP7QWidget + 216
	16  QtWidgets                           0x000000010f0ad6e8 _ZNK11QMainWindow7menuBarEv + 68
	17  QtWidgets.abi3.so                   0x000000010db48538 _ZL24meth_QMainWindow_menuBarP7_objectS0_ + 88
	18  libpython3.12.dylib                 0x000000010534f5a0 cfunction_call + 308
	19  libpython3.12.dylib                 0x00000001052f1e90 _PyObject_MakeTpCall + 312
	20  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	21  libpython3.12.dylib                 0x00000001052f22a0 _PyObject_FastCallDictTstate + 160
	22  libpython3.12.dylib                 0x0000000105377e40 slot_tp_init + 312
	23  libpython3.12.dylib                 0x000000010536ede0 type_call + 468
	24  libpython3.12.dylib                 0x00000001052f1e90 _PyObject_MakeTpCall + 312
	25  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	26  libpython3.12.dylib                 0x00000001052f22a0 _PyObject_FastCallDictTstate + 160
	27  libpython3.12.dylib                 0x0000000105377e40 slot_tp_init + 312
	28  libpython3.12.dylib                 0x000000010536ede0 type_call + 468
	29  libpython3.12.dylib                 0x00000001052f2c88 _PyObject_Call + 176
	30  libpython3.12.dylib                 0x000000010541c034 _PyEval_EvalFrameDefault + 23276
	31  libpython3.12.dylib                 0x0000000105415644 PyEval_EvalCode + 276
	32  libpython3.12.dylib                 0x00000001054120dc builtin_exec + 1488
	33  libpython3.12.dylib                 0x00000001053501cc cfunction_vectorcall_FASTCALL_KEYWORDS + 144
	34  libpython3.12.dylib                 0x00000001052f2a20 PyObject_Vectorcall + 88
	35  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	36  libpython3.12.dylib                 0x00000001052f61f4 method_vectorcall + 536
	37  _jpype.cpython-312-darwin.so        0x0000000105b3e170 Java_org_jpype_proxy_JPypeProxy_hostInvoke + 196
	38  ???                                 0x0000000132268a88 0x0 + 5136353928
	39  ???                                 0x0000000132264f70 0x0 + 5136338800
	40  ???                                 0x0000000132265420 0x0 + 5136340000
	41  ???                                 0x0000000132265420 0x0 + 5136340000
	42  ???                                 0x0000000132265420 0x0 + 5136340000
	43  ???                                 0x000000013226562c 0x0 + 5136340524
	44  ???                                 0x000000013226517c 0x0 + 5136339324
	45  ???                                 0x0000000132264f70 0x0 + 5136338800
	46  ???                                 0x0000000132265420 0x0 + 5136340000
	47  ???                                 0x0000000132264f70 0x0 + 5136338800
	48  ???                                 0x0000000132265420 0x0 + 5136340000
	49  ???                                 0x000000013226562c 0x0 + 5136340524
	50  ???                                 0x000000013226517c 0x0 + 5136339324
	51  ???                                 0x000000013226562c 0x0 + 5136340524
	52  ???                                 0x000000013226517c 0x0 + 5136339324
	53  ???                                 0x0000000132260144 0x0 + 5136318788
	54  libjvm.dylib                        0x0000000112fd9444 _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP10JavaThread + 992
	55  libjvm.dylib                        0x0000000112fd8374 _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP10JavaThread + 320
	56  libjvm.dylib                        0x0000000112fd8440 _ZN9JavaCalls12call_virtualEP9JavaValue6HandleP5KlassP6SymbolS6_P10JavaThread + 100
	57  libjvm.dylib                        0x00000001130aa10c _ZL12thread_entryP10JavaThreadS0_ + 156
	58  libjvm.dylib                        0x0000000112fedb74 _ZN10JavaThread17thread_main_innerEv + 152
	59  libjvm.dylib                        0x0000000113529d60 _ZN6Thread8call_runEv + 200
	60  libjvm.dylib                        0x000000011334b108 _ZL19thread_native_entryP6Thread + 280
	61  libsystem_pthread.dylib             0x00000001911b0c08 _pthread_start + 136
	62  libsystem_pthread.dylib             0x00000001911abba8 thread_start + 8
)
2025-11-13 13:21:30.839 fiji-macos-arm64[91819:13258507] Apple AWT Internal Exception: NSWindow should only be instantiated on the main thread!
2025-11-13 13:21:30.839 fiji-macos-arm64[91819:13258507] trace: (
	0   CoreFoundation                      0x00000001912a0770 __exceptionPreprocess + 176
	1   libobjc.A.dylib                     0x0000000190d7e418 objc_exception_throw + 88
	2   CoreFoundation                      0x00000001912bc2c0 _CFBundleGetValueForInfoKey + 0
	3   AppKit                              0x00000001962c58d4 -[NSWindow _initContent:styleMask:backing:defer:contentView:] + 260
	4   AppKit                              0x00000001962c5cdc -[NSWindow initWithContentRect:styleMask:backing:defer:] + 48
	5   AppKit                              0x00000001962c5c90 -[NSWindow initWithContentRect:styleMask:backing:defer:screen:] + 24
	6   libqcocoa.dylib                     0x000000010e9cf508 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 236776
	7   libqcocoa.dylib                     0x000000010e9a52bc _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 64156
	8   libqcocoa.dylib                     0x000000010e99d3a4 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 31620
	9   libqcocoa.dylib                     0x000000010e99cdb4 _ZN20QCocoaSystemTrayIcon13emitActivatedEv + 30100
	10  QtGui                               0x000000010e2404f4 _ZN14QWindowPrivate6createEb + 464
	11  QtWidgets                           0x000000010ef76144 _ZN14QWidgetPrivate6createEv + 1028
	12  QtWidgets                           0x000000010ef735f4 _ZN7QWidget6createEybb + 376
	13  QtWidgets                           0x000000010f0ec964 _ZNK8QMenuBar15initStyleOptionEP20QStyleOptionMenuItemPK7QAction + 1704
	14  QtWidgets                           0x000000010f0ec55c _ZNK8QMenuBar15initStyleOptionEP20QStyleOptionMenuItemPK7QAction + 672
	15  QtWidgets                           0x000000010f0ecdc4 _ZN8QMenuBarC1EP7QWidget + 216
	16  QtWidgets                           0x000000010f0ad6e8 _ZNK11QMainWindow7menuBarEv + 68
	17  QtWidgets.abi3.so                   0x000000010db48538 _ZL24meth_QMainWindow_menuBarP7_objectS0_ + 88
	18  libpython3.12.dylib                 0x000000010534f5a0 cfunction_call + 308
	19  libpython3.12.dylib                 0x00000001052f1e90 _PyObject_MakeTpCall + 312
	20  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	21  libpython3.12.dylib                 0x00000001052f22a0 _PyObject_FastCallDictTstate + 160
	22  libpython3.12.dylib                 0x0000000105377e40 slot_tp_init + 312
	23  libpython3.12.dylib                 0x000000010536ede0 type_call + 468
	24  libpython3.12.dylib                 0x00000001052f1e90 _PyObject_MakeTpCall + 312
	25  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	26  libpython3.12.dylib                 0x00000001052f22a0 _PyObject_FastCallDictTstate + 160
	27  libpython3.12.dylib                 0x0000000105377e40 slot_tp_init + 312
	28  libpython3.12.dylib                 0x000000010536ede0 type_call + 468
	29  libpython3.12.dylib                 0x00000001052f2c88 _PyObject_Call + 176
	30  libpython3.12.dylib                 0x000000010541c034 _PyEval_EvalFrameDefault + 23276
	31  libpython3.12.dylib                 0x0000000105415644 PyEval_EvalCode + 276
	32  libpython3.12.dylib                 0x00000001054120dc builtin_exec + 1488
	33  libpython3.12.dylib                 0x00000001053501cc cfunction_vectorcall_FASTCALL_KEYWORDS + 144
	34  libpython3.12.dylib                 0x00000001052f2a20 PyObject_Vectorcall + 88
	35  libpython3.12.dylib                 0x000000010541b038 _PyEval_EvalFrameDefault + 19184
	36  libpython3.12.dylib                 0x00000001052f61f4 method_vectorcall + 536
	37  _jpype.cpython-312-darwin.so        0x0000000105b3e170 Java_org_jpype_proxy_JPypeProxy_hostInvoke + 196
	38  ???                                 0x0000000132268a88 0x0 + 5136353928
	39  ???                                 0x0000000132264f70 0x0 + 5136338800
	40  ???                                 0x0000000132265420 0x0 + 5136340000
	41  ???                                 0x0000000132265420 0x0 + 5136340000
	42  ???                                 0x0000000132265420 0x0 + 5136340000
	43  ???                                 0x000000013226562c 0x0 + 5136340524
	44  ???                                 0x000000013226517c 0x0 + 5136339324
	45  ???                                 0x0000000132264f70 0x0 + 5136338800
	46  ???                                 0x0000000132265420 0x0 + 5136340000
	47  ???                                 0x0000000132264f70 0x0 + 5136338800
	48  ???                                 0x0000000132265420 0x0 + 5136340000
	49  ???                                 0x000000013226562c 0x0 + 5136340524
	50  ???                                 0x000000013226517c 0x0 + 5136339324
	51  ???                                 0x000000013226562c 0x0 + 5136340524
	52  ???                                 0x000000013226517c 0x0 + 5136339324
	53  ???                                 0x0000000132260144 0x0 + 5136318788
	54  libjvm.dylib                        0x0000000112fd9444 _ZN9JavaCalls11call_helperEP9JavaValueRK12methodHandleP17JavaCallArgumentsP10JavaThread + 992
	55  libjvm.dylib                        0x0000000112fd8374 _ZN9JavaCalls12call_virtualEP9JavaValueP5KlassP6SymbolS5_P17JavaCallArgumentsP10JavaThread + 320
	56  libjvm.dylib                        0x0000000112fd8440 _ZN9JavaCalls12call_virtualEP9JavaValue6HandleP5KlassP6SymbolS6_P10JavaThread + 100
	57  libjvm.dylib                        0x00000001130aa10c _ZL12thread_entryP10JavaThreadS0_ + 156
	58  libjvm.dylib                        0x0000000112fedb74 _ZN10JavaThread17thread_main_innerEv + 152
	59  libjvm.dylib                        0x0000000113529d60 _ZN6Thread8call_runEv + 200
	60  libjvm.dylib                        0x000000011334b108 _ZL19thread_native_entryP6Thread + 280
	61  libsystem_pthread.dylib             0x00000001911b0c08 _pthread_start + 136
	62  libsystem_pthread.dylib             0x00000001911abba8 thread_start + 8
)
libc++abi: terminating due to uncaught exception of type NSException
./fiji: line 55: 91819 Abort trap: 6           "$launcher" "$@"

Here is what Claude said about it:

The issue is that sys.excepthook only handles unhandled exceptions that propagate to the top level. It doesn't catch exceptions that occur inside Qt callbacks or other frameworks' event loops.

The crash you're seeing is happening because:

  1. Qt widget creation (QMenuBar/QMainWindow) is being called from a non-main thread (the Fiji thread)
  2. This triggers an NSException at the Objective-C level on macOS
  3. NSException is not a Python exception, so sys.excepthook never sees it
  4. The app crashes before Python's exception handling even gets involved

The real problem is that your architecture has Fiji running in a background thread while Qt runs on the main thread, but Java/Swing code is trying to create Qt widgets from the Fiji thread. This violates the macOS/Qt threading requirement.

The sys.excepthook approach won't help here because:

  • The exception is happening at the C/Objective-C level, not Python
  • It's a threading violation, not a recoverable Python exception
  • Even if caught, you can't "fix" the threading issue after the fact

Solutions:

  1. Best approach: Keep using @ensure_main_thread decorator where needed - this is the correct solution. It ensures Qt operations happen on the main thread.
  2. If you want broader protection, you could patch Qt object creation to automatically dispatch to main thread, but this is complex and fragile.
  3. sys.excepthook is still useful for other cases (like pure Python exceptions in Qt slots), so keep it, but understand its limitations: it can't save you from threading violations at the native level.

Pushing Claude a bit more, we tried some other approaches:

  1. Using PyObjC or ctypes, install an Objective-C exception handler to catch NSExceptions (like threading violations) that would otherwise crash.

    ❌ Still crashes, due to internal exception handling in Java AWT:

    The handler is set up correctly, but it's not being called. The key issue is in the stack trace:

    2025-11-13 13:38:59.170 fiji-macos-arm64[93160:13275157] Apple AWT Internal Exception: NSWindow should only be instantiated on the main thread!
    

    "Apple AWT Internal Exception" - this means the Java AWT layer is catching the NSException first and handling it internally, preventing it from reaching your NSSetUncaughtExceptionHandler.

    The problem is that the exception is being caught and re-thrown by Java's AWT code, which then calls abort() directly, bypassing the normal exception handling mechanism.

  2. Patch threading.Thread.run to catch Python exceptions on any thread.

    # Patch threading.Thread.run to catch all exceptions in threads,
    # including those from Java->Python callbacks via JPype.
    original_thread_run = threading.Thread.run

    def patched_thread_run(self):
        try:
            original_thread_run(self)
        except Exception:
            # Log the exception but don't let it crash the thread.
            _logger.error(
                f"Exception in thread {self.name}:",
                exc_info=True,
            )
            # Also print to stderr for visibility.
            traceback.print_exc()

    # Install the patched method; without this assignment the patch has no effect.
    threading.Thread.run = patched_thread_run

    ❌ Still crashes. The crash comes from the JVM, not from a Python thread.

  3. Try noticing SIGABRT by registering a callback via signal.signal:

    # Define our signal handler.
    SIGHANDLER = ctypes.CFUNCTYPE(None, ctypes.c_int)
    
    def sigabrt_handler(signum):
        """Handle SIGABRT without crashing."""
        _logger.error("Caught SIGABRT - preventing crash")
        print("*** Caught SIGABRT (NSException threading violation)", file=sys.stderr)
        # Don't exit - just return and hope for the best.
    
    _sigabrt_callback = SIGHANDLER(sigabrt_handler)
    
    # Install the handler using signal.signal().
    # This is simpler than sigaction and should work.
    signal.signal(signal.SIGABRT, lambda sig, frame: sigabrt_handler(sig))

    ❌ Still crashes, because:

    The SIGABRT handler isn't being called because the exception is being thrown inside a Java thread (see the stack trace - it's in libjvm.dylib), and Python signal handlers only work on the main Python thread.

    Since the crash is happening because abort() is called by the C++ runtime (libc++abi: terminating due to uncaught exception), we need to intercept it at a lower level. The fundamental problem is that we can't easily catch NSExceptions thrown from JVM threads.

  4. Install a C++ terminate handler using std::set_terminate. Because std::set_terminate is a C++ function, ctypes has to look it up by its mangled symbol name, so the code looks like this:

    # The symbol is mangled as _ZSt13set_terminatePFvvE on macOS/Linux.
    libcxx._ZSt13set_terminatePFvvE.restype = ctypes.c_void_p
    libcxx._ZSt13set_terminatePFvvE.argtypes = [TERMINATE_HANDLER]
    libcxx._ZSt13set_terminatePFvvE(_cpp_terminate_callback)
    _logger.debug("Installed C++ terminate handler")

    ✅ WORKS! The complete function is 118 lines of Python code, but it does the trick.

Here's what I ended up with:

https://github.com/fiji/fiji/blob/f1460d8d01d092fe93d0ffc325db469c155af538/config/fiji.py

Still crashes when you forget the @ensure_main_thread, but not before explaining why it's crashing, and how to avoid it next time. 🤷

@marktsuchida

Very interesting -- I didn't know that the C++ terminate handler gets called for ObjC exceptions (I guess it makes sense given the shared Itanium ABI).

std::set_terminate() has the usual problem of "what happens if somebody else also changes the (global) handler?", but of course if anybody should, it should be the app, not a library, so doing it in Fiji makes sense.

One thing to keep in mind is that std::terminate() is called in several different situations (not just unhandled exceptions) in C++ (exception thrown while unwinding the stack, noexcept function threw exception, exception escaped thread, joinable thread destroyed, mutex destroyed while locked, terminate handler threw, pure virtual function called, etc.).

The default terminate handler (depending on the platform, but definitely on macOS) prints some useful information, like the type and message of the uncaught exception, if that was the cause, or whatever else the cause was. So it might be good to call the default terminate handler instead of (or before) just exiting (and btw sys.exit() seems dangerous because it may run atexit handlers among other things).

std::set_terminate() returns a function pointer to the previous terminate handler, so it can be stored and called. (And it looks like the default handler already prints the type and message for ObjC exceptions, and even adds a stack trace.)

I'd also note that the ObjC NSInternalInconsistencyException is used for non-threading-related issues, too. An example I remember from many years ago is if the data source for a UI table gets the math wrong. Not to mention generic NSException which could be a lot of things (somewhat analogous to java.lang.Error).

So it might be nice to print a hint that GUI threading could be a possible cause, but maybe not sound too definite, lest it mislead the reader.

I also don't know what the chances are of the Python interpreter being in a consistent state when the terminate handler is called. If std::terminate() is called on a C++ thread that never acquired the GIL, while another thread is hung or crashed holding the GIL, I believe it will deadlock. Even if std::terminate() is called on the same thread that currently holds the GIL, it will likely find the interpreter in the wrong state (interpreter has called out into C while the terminate handler tries to run Python code without having returned from that C call).

Probably best if production terminate handlers are implemented in C or C++ and kept minimal so that nothing further can go wrong.
