Skip to content

Conversation

@prsadhuk
Copy link
Collaborator

@prsadhuk prsadhuk commented Nov 10, 2025

NPE is seen while accessing transient "scenePeer" variable between reads..
Fix is made to store it in a temp variable rather than reading it twice since the value can change between successive reads in many places it is accessed.
Also some debug logs added to be enabled via jfxpanel.debug property


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8255248: NullPointerException in JFXPanel due to race condition in HostContainer (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1968/head:pull/1968
$ git checkout pull/1968

Update a local copy of the PR:
$ git checkout pull/1968
$ git pull https://git.openjdk.org/jfx.git pull/1968/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1968

View PR using the GUI difftool:
$ git pr show -t 1968

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1968.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 10, 2025

👋 Welcome back psadhukhan! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 10, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Ready for review label Nov 10, 2025
@mlbridge
Copy link

mlbridge bot commented Nov 10, 2025

Webrevs

debugPrint = "true".equalsIgnoreCase(debugStr);
}

protected static void debug_println(String str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javadoc complains about this new public API. Or is it a temporary debugging thing? Can it be declared private?

If it is a permanent thing, it incurs a string concatenation overhead even when disabled. Use lambdas instead? Alternatively (and faster), one needs to check if debug printout is enabled on each use inline:

if(DEBUG) {
  debug_println("JFXPanel Thread " + Thread.currentThread().getName() + " isFXUserThread " + Toolkit.getToolkit().isFxUserThread());
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intend to keep it as permanent to enable logging of flow and thread context viewing.
Updated to keep the method private

String debugStr = System.getProperty(JFXPANEL_DEBUG);

debugPrint = "true".equalsIgnoreCase(debugStr);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion:

private static final boolean DEBUG = Boolean.getBoolean("jfxpanel.debug");

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@kevinrushforth kevinrushforth self-requested a review November 10, 2025 20:35
@kevinrushforth
Copy link
Member

@prsadhuk Can you provide your evaluation as to why this is the right fix in light of the comments on the earlier PR #1178 and the discussion in JBS issue JDK-8255248?

Reviewers: @kevinrushforth @andy-goryachev-oracle

Also, @mstr2 and @hjohn might want to weigh in since they had comments on #1178

/reviewers 2

@openjdk
Copy link

openjdk bot commented Nov 10, 2025

@kevinrushforth
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@prsadhuk
Copy link
Collaborator Author

I couldn't find anything adverse w.r.t thread usage using the added debug logs and I think it's standard and proven practice to store global (and in this case transient) variable into temp variable and use it to prevent
inadvertent modification of the global variable and also avoid variable state change causing issue during context switching
especially in multi-threaded environments
and it is consistent with the approach taken for other swing-interop class like SwingNode

private static final String debugPrefix = "JFXPanel:>> ";

private static void debug_println(String str) {
if (DEBUG) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, I was not sufficiently clear.
This method does not need the conditional. The conditionals are needed in every place that calls here.
In other words, we don't want to incur the string concatenation overhead if DEBUG is false.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will that be too much overhead? We had used the same in jdk having DEBUG check in one place...but I will modify it for FX..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then you might want to fix all these places in JDK!

The code you currently have now (after 115001f ) has minimum runtime overhead - basically a boolean check.

Anything else consumes more CPU and memory at runtime (we don't have a pre-processor in java thankfully), so if(DEBUG) { debug_print(...); } is the best, albeit a bit verbose option.

Event the modern logging facades incur more overhead with lambdas or formats, since one needs to actually enter the logging function to check whether the level/logger is enabled. I mean logger.debug("param {}", param); or logger.debug("param {}", () -> param());

public JFXPanel() {
super();

debug_println("JFXPanel Thread " + Thread.currentThread().getName() + " isFXUserThread " + Toolkit.getToolkit().isFxUserThread());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, here (and elsewhere debug_println() is called) we should have

if(DEBUG) {
          debug_println("JFXPanel Thread " + Thread.currentThread().getName() + " isFXUserThread " + Toolkit.getToolkit().isFxUserThread());
}

As a separate note, you may want to consider removing some noise from debugging output and only print when "unexpected" condition occurs, such as (thread != fx), right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess different method will have different unexpected condition so I kept it generic to crosscheck if calling thread are indeed what it should be..

@hjohn
Copy link
Collaborator

hjohn commented Nov 13, 2025

Sorry, what exactly are we trying to do here? Is the problem that intractable that we must add a million debug lines that will be shipped to everyone in the hope someone will debug the problem on their machine? This stuff belongs on some local branch, not be considered for inclusion in FX IMHO.

@andy-goryachev-oracle
Copy link
Contributor

Is the problem that intractable that we must add a million debug lines

This is a good point - perhaps we ought to create a follow-up ticket to remove the debug prints once we determine they served their purpose?

Copy link
Contributor

@andy-goryachev-oracle andy-goryachev-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest changes look good to me.
I am also curious why we need these debug prints in the first place.

@mstr2
Copy link
Collaborator

mstr2 commented Nov 13, 2025

Implementing a system with shared state that is concurrently accessed by different threads is difficult, and it requires rigorous analysis because it's almost impossible to empirically verify that code is not racy. Neither the original implementation nor the changes in this PR inspire confidence that this analysis has happened. It seems like we are presented yet again with a supposed fix without a clear explanation of why this is the right approach.

Just by looking at JFXPanel I can see that stagePeer is accessed in the same unsafe way as scenePeer. Now, I don't suggest that it can be "fixed" by making local copies of the reference, this might just as well obfuscate the problem even further. As it stands, I think this patch should not be integrated into JavaFX before an appropriate analysis of the problem has been done. An appropriate analysis is not pointing out that "an NPE happens because another thread set a shared field to null"; this is just a consequence of a racing system that the JVM managed to elucidate because it checked the reference for us.

As a side note, I don't understand the point of all the logging. If these methods have thread invariants (which you imply in your comment), then enforce those invariants. If they don't, what's the point?

Copy link
Member

@kevinrushforth kevinrushforth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug print logging is OK for debugging purposes to help validate your analysis, as long as you realize that it isn't a substitute for analysis. I do agree with John and Michael that keeping the debug statements isn't needed (or wanted) as part of the product, so I'd like to see it removed once this has proceeded to the point that we are otherwise ready to approve it.

On that note, there are still unanswered questions about the thread safety of the JFXPanel class. We do know that, with the exception of setScene (and by extension, getScene()), all public methods are specified to be called on the EDT (the AWT thread). setScene checks the thread and always calls setSceneImpl on the FX thread to do the "heavy lifting". The question then is around the rest of the implementation methods, some of which are called on the FX thread and some on the EDT.

The NPE being hit by the test program is coming because setScene(null), which is called on the FX app thread, will lead to calling HostContainer::setEmbeddedStage(null) and HostContainer::setEmbeddedScene(null). Those methods will set the stagePeer and scenePeer fields to null. Meanwhile, a repaint operation or a mouse move or ... on the EDT could be trying to access the scene peer.

Locally capturing the stagePeer and scenePeer fields in methods that are called on the EDT so that it doesn't change out from under us will prevent the NPE, but doesn't guarantee thread safety. In the case where it prevents the NPE, we go one to call a method on a scene peer that is no longer being used. This might be OK or it might not be.

As long as this doesn't make things worse (and I don't see how it would), we could consider taking some variant of your proposed fix as a "workaround" to solve the NPE, but I'd like to understand the problem better. If we do take this fix, we would need to file a follow-on bug to fix the root cause (which could involve some design work to ensure thread safety without introducing deadlocks).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rfr Ready for review

Development

Successfully merging this pull request may close these issues.

5 participants