Skip to content

Conversation

sspitsyn
Copy link
Contributor

@sspitsyn sspitsyn commented Jul 17, 2025

If JVMTI StopThread is done when the thread is in certain various states (but not all states), after the async exception is delivered and handled, hotspot leaves the thread's interrupted flag set. The end result is the next time the thread does something like Thread.sleep(), it will immediately get an InterruptedException.

The fix is to clear the interrupted flag in the JavaThread::handle_async_exception() after an async pending exception has been set to be thrown with the set_pending_exception().

There are a couple of concerns with this fix which would be nice to sort out with reviewers:

  1. The proposed fix may clear the interrupt state when it was already set prior to the issuing of the StopThread() (this concern was raised by @dholmes-ora in a comment of this JBS issue)
  2. The impacted code path is shared between the class InstallAsyncExceptionHandshakeClosure used by the JVMTI StopThread implementation and the class ScopedAsyncExceptionHandshakeClosure used by the ScopedMemoryAccess

I feel that clearing the interrupted flag byt the JavaThread::handle_async_exception() is a right thing to do even though it was set before the call to JavaThread::install_async_exception(). Also, it has to be done for both StopThread and ScopedMemoryAccess.

The fix also includes minor tweaks of the test StopThreadTest to make the issue reproducible with it.

Testing:

  • Mach5 tiers 1-6 are passed
  • Ran the updated reproducer test hotspot/jtreg/serviceability/jvmti/vthread/StopThreadTest

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8306324: StopThread results in thread being marked as interrupted, leading to unexpected InterruptedException (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26365/head:pull/26365
$ git checkout pull/26365

Update a local copy of the PR:
$ git checkout pull/26365
$ git pull https://git.openjdk.org/jdk.git pull/26365/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 26365

View PR using the GUI difftool:
$ git pr show -t 26365

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26365.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 17, 2025

👋 Welcome back sspitsyn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 17, 2025

@sspitsyn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8306324: StopThread results in thread being marked as interrupted, leading to unexpected InterruptedException

Reviewed-by: pchilanomate, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 58 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 17, 2025
@openjdk
Copy link

openjdk bot commented Jul 17, 2025

@sspitsyn The following labels will be automatically applied to this pull request:

  • hotspot-runtime
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Jul 17, 2025

@dholmes-ora
Copy link
Member

I need to dive into the code tomorrow but I can't help think that doing an actual interrupt is not really what is needed, we just need to unpark the thread if it is blocked ...

@AlanBateman
Copy link
Contributor

AlanBateman commented Jul 17, 2025

I need to dive into the code tomorrow but I can't help think that doing an actual interrupt is not really what is needed, we just need to unpark the thread if it is blocked ...

I think we also need to step back and write down examples of where JVMTI StopThread is useful. The main use-case may be in the debugger where a thread is suspended at a breakpoint and the user wants to throw an exception to exercise some code path and exception handling. So I think it may be less about wakeup. Also if the target thread is in Object.wait then it can't continue until it re-enters the monitor so it's never been guaranteed to wakeup immediately. Although it may be unpalatable, I think we should look at having StopThread fail in these cases.

@pchilano
Copy link
Contributor

I also don’t see the purpose of setting the interrupted flag if all we want is for the target to wake up if it’s in one of the blocking calls. From the 3 cases in JavaThread::interrupt(), the only one where I see setting that flag is needed is for Thread.sleep because we check it to bail out, otherwise we would just keep sleeping for the remaining time. But maybe we could also add a has_async_exception_condition() check to bailout.

@dholmes-ora
Copy link
Member

But maybe we could also add a has_async_exception_condition() check to bailout.

Yes.

BTW I don't think this is in any way new. I think Thread.stop would have had the same issue. Issuing the internal interrupt was just a convenient way to unblock all the interruptible blocking points - and as Alan notes, monitor acquisition is not interruptible. We likely did not care that this could make the interrupt state appear odd if the thread continued after the async (ThreadDeath) exception. Not sure it is really that different for the debugger - if we have stopped the thread (not just suspended it) then we shouldn't really be expecting it to continue as normal afterwards.

@zhaosongzs
Copy link
Member

/touch

@openjdk
Copy link

openjdk bot commented Jul 18, 2025

@zhaosongzs The pull request is being re-evaluated and the inactivity timeout has been reset.

@sspitsyn
Copy link
Contributor Author

Thank you for the comments and suggestions!
I'll give it a try to get rid of the interrupt flag setting.

@AlanBateman
Copy link
Contributor

@sspitsyn It might be useful to reach out to the IDEs to see what they are doing. From a quick test with IntelliJ then it appears to invoke both StopThread and InterruptThread when "Exception to Throw" is used. In that case, it means that Thread.sleep will wakeup, even if StopThread doesn't interrupt.

@sspitsyn
Copy link
Contributor Author

@sspitsyn It might be useful to reach out to the IDEs to see what they are doing. From a quick test with IntelliJ then it appears to invoke both StopThread and InterruptThread when "Exception to Throw" is used. In that case, it means that Thread.sleep will wakeup, even if StopThread doesn't interrupt.

Good suggestion, thanks.

@sspitsyn
Copy link
Contributor Author

I've implemented and pushed the suggestion from Patricio. The mach5 tiers 1-6 are clean.
I'm not sure about correctness of the tweak in the JavaThread::sleep_nanos():

@@ -2122,6 +2117,9 @@ bool JavaThread::sleep_nanos(jlong nanos) {
   jlong nanos_remaining = nanos;
 
   for (;;) {
+    if (has_async_exception_condition()) {
+      return false;
+    }
     // interruption has precedence over timing out
     if (this->is_interrupted(true)) {
       return false;

The mach5 tiers 1-6 tests are all passed without this tweak.

@AlanBateman
Copy link
Contributor

@sspitsyn It might be useful to reach out to the IDEs to see what they are doing. From a quick test with IntelliJ then it appears to invoke both StopThread and InterruptThread when "Exception to Throw" is used. In that case, it means that Thread.sleep will wakeup, even if StopThread doesn't interrupt.

Good suggestion, thanks.

It would put the onus on the debugger to interrupt, which I think is the right thing to do. it would remove the interrupt from JavaThread::install_async_exception and would mean no change to JavaThread::sleep_nanos.

@dholmes-ora
Copy link
Member

FWIW the way this used to work in the past for the blocking methods is that "stop" would install the pending-async-exception and interrupt the thread to unblock it. The thread doing a sleep/park/wait would see that it was interrupted and proceed to throw InterruptedException, and in the process clear the interrupt flag (as per the spec). But when we attempted to return to Java from _thread_in_vm we would see the pending-async-exception and replaced the pending IE with the the async one (ThreadDeath usually).

But if the thread was not blocked in an interruptible blocking method then it would have the interrupt state set. So we basically have always behaved this way and I'm wondering what is driving us to change this behaviour now?

FWIW I think the fix is reasonable to avoid messing with the interrupt flag, but the fact Alan seems to want the "stop" to not interrupt at all makes me wonder how stop would then actually work?

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Jul 29, 2025

It would put the onus on the debugger to interrupt, which I think is the right thing to do. it would remove the interrupt from JavaThread::install_async_exception and would mean no change to JavaThread::sleep_nanos.

Thank you for the suggestion. I've tested it and found that a couple of tests are failed including one JCK test. So, at a minimum this approach is going to be more complicated, it would require a supporting JDI update, consultation with the IDE vendors, CSR and a release note. Also, I kind of share the David's concern above. So, I'm thinking if it is okay to separate this effort from the current fix. I can file an enhancement if it makes sense and worth it.
As I see, the tweak in JavaThread::sleep_nanos() is not really needed, so I'll remove it now.

@AlanBateman
Copy link
Contributor

FWIW I think the fix is reasonable to avoid messing with the interrupt flag, but the fact Alan seems to want the "stop" to not interrupt at all makes me wonder how stop would then actually work?

I don't think the current proposed change (which drop's setting the interrupt flag in install_async_exception) will cause a target thread blocked in sleep to wakeup. A target thread blocked in JavaThread::sleep_nanos will wakeup from park_nanos but will just park again with the remaining time. I assume this is the test failures that Serguei mentions.

As you noted, the bug is the side effect of setting the interrupt status of a target thread that is not in a InterruptedException throwing method. It may be that the "Throw Exception" feature in debuggers isn't used much and nobody has noticed.

If we can reach out the debugger/IDE maintainers then we might be able to a bit more insight into how this is used. In some debuggers you can't use "Throw Exception" (= StopThread) when the target is blocked in the native code or in the VM. When suspended at a breakpoint then it looks like IntelliJ calls both ThreadReference::stop and ThreadReference::interrupt but I can't tell if the latter is conditional or not.

@dholmes-ora
Copy link
Member

I don't think the current proposed change (which drop's setting the interrupt flag in install_async_exception) will cause a target thread blocked in sleep to wakeup. A target thread blocked in JavaThread::sleep_nanos will wakeup from park_nanos but will just park again with the remaining time. I assume this is the test failures that Serguei mentions.

Right but that is why there was a tweak to sleep_nanos:

if (has_async_exception_condition()) {
      return false;
}

But that tweak has now been removed hence this fix no longer maintains the functionality of StopThread.

And again this issue of leaving the interrupt flag set has existed "forever".

@sspitsyn
Copy link
Contributor Author

But that tweak has now been removed hence this fix no longer maintains the functionality of StopThread.

Sorry, this was removed by a mistake. I incorrectly interpreted some code. Restored now.

@sspitsyn
Copy link
Contributor Author

I don't think the current proposed change (which drop's setting the interrupt flag in install_async_exception) will cause a target thread blocked in sleep to wakeup. A target thread blocked in JavaThread::sleep_nanos will wakeup from park_nanos but will just park again with the remaining time.

Thank you checking this. I've restored the tweak in the JavaThread::sleep_nanos().

I assume this is the test failures that Serguei mentions.

The test failures I mentioned were after an attempt to remove the following lines from the JavaThread::install_async_exception():

  oop vt_oop = vthread();
  if (vt_oop == nullptr || !vt_oop->is_a(vmClasses::BaseVirtualThread_klass())) {
    // Interrupt thread so it will wake up from a potential wait()/sleep()/park()
    this->interrupt();
  }

In order to remove the above a corresponding update in the jdwp/debugger is additionally needed to invoke the JVMTI ThreadInterrupt().

@AlanBateman
Copy link
Contributor

The latest update to have JavaThread::sleep_nanos to bail out if there is async exception looks okay. Combined it means that a StopThread when the target platform thread is in Thread.sleep and Object.wait will cause both to wakeup and throw the exception, without the side effect of setting the interrupt status. So I think it's a good approach.

@AlanBateman
Copy link
Contributor

And again this issue of leaving the interrupt flag set has existed "forever".

Right, and mostly surfaced when updating JVMTI to support virtual threads.

The related issue is that JVMTI InterruptThread doesn't cause a platform thread to wakeup from interruptible I/O operations. It works when the target is a virtual thread because it's an upcall to Thread::interrupt, not so for platform threads.

Copy link
Contributor

@pchilano pchilano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, just a few comments below.

@@ -2118,6 +2117,9 @@ bool JavaThread::sleep_nanos(jlong nanos) {
jlong nanos_remaining = nanos;

for (;;) {
if (has_async_exception_condition()) {
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above JavaThread::sleep_nanos should be updated about this case.
Also, back in JVM_SleepNanos we should probably skip throwing IE for this case (even though it will be overwritten by the async one later).

Copy link
Contributor Author

@sspitsyn sspitsyn Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, addressed now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mach5 tiers 1-6 are good with the latest pushes.

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this to a good place.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 31, 2025
Copy link
Contributor

@pchilano pchilano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Serguei, looks good to me.

@@ -2893,15 +2893,16 @@ JVM_ENTRY(void, JVM_SleepNanos(JNIEnv* env, jclass threadClass, jlong nanos))
} else {
ThreadState old_state = thread->osthread()->get_state();
thread->osthread()->set_state(SLEEPING);
if (!thread->sleep_nanos(nanos)) { // interrupted
if (!thread->sleep_nanos(nanos)) { // interrupted or async exception was installed
// An asynchronous exception could have been thrown on
// us while we were sleeping. We do not overwrite those.
if (!HAS_PENDING_EXCEPTION) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not for this bug but we have this HAS_PENDING_EXCEPTION check here and further up but I don't see how we can have a pending exception when calling this method. Based on the comment here seems we just wanted to check the async ones as added now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we always have HAS_PENDING_EXCEPTION == true if async exception was installed?
If so, then this newly added check is not really needed:

           if (!thread->has_async_exception_condition()) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we process the async exception handshake operation, HAS_PENDING_EXCEPTION will be false. The only way for HAS_PENDING_EXCEPTION to be true would be if we already entered the method with a pending exception, but I don’t see how that is possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that once-upon-a-time we would install any async exception in the ThreadBlockInVM destructor.

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Aug 1, 2025

Thank you for review and suggestions, Alan and Patricio!

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Aug 1, 2025

David, do you have any concerns or suggestions? Should I wait for your approval?

@sspitsyn
Copy link
Contributor Author

sspitsyn commented Aug 2, 2025

/integrate

@openjdk
Copy link

openjdk bot commented Aug 2, 2025

Going to push as commit e801e51.
Since your change was applied there have been 80 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 2, 2025
@openjdk openjdk bot closed this Aug 2, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 2, 2025
@openjdk
Copy link

openjdk bot commented Aug 2, 2025

@sspitsyn Pushed as commit e801e51.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@sspitsyn sspitsyn deleted the d6 branch August 2, 2025 05:18
@dholmes-ora
Copy link
Member

David, do you have any concerns or suggestions? Should I wait for your approval?

Sorry I was out of action for a couple of days. I'm happy to see the change to sleep_nanos restored. The change seems fine in the current state.

Though I have to wonder if there is any other non-directly-interruptible native blocking I/O operation that directly checks the interrupt flag and thus would see the stop request that way. Need to see if any Java library native code uses JNI to check the interrupt state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants