[GR-67169] Support JFR emergency dumps on out of memory #11530
roberttoyonaga
wants to merge
16
commits into
oracle:master
from
roberttoyonaga:emergency-dump
…mp purge in-flight data bug. checkstyle gate fixes cleanup
1b11e12
to
3d1e62b
Compare
6b0208c
to
905d59d
Compare
905d59d
to
847d6fb
Compare
zapster
reviewed
Jul 7, 2025
substratevm/src/com.oracle.svm.core/src/com/oracle/svm/core/jfr/JfrEmergencyDumpFeature.java
Labels
native-image
native-image-jfr
OCA Verified
All contributors have signed the Oracle Contributor Agreement.
redhat-interest
Related issue: #10600
Summary
One of JFR's primary goals is to provide insight into crashes such as an out-of-memory error (OOME). JFR data may be useful for investigating an OOME. For example, JFR's CPU and allocation profiling can help locate problem areas. JFR's garbage collection events and thread data could also help with diagnosing problems.
Currently, it's possible to receive heap dumps on out of memory (OOM), but this is not yet possible for JFR. OpenJDK has this feature and we should try to implement it in Native Image too.
Goals
Non-Goals
Details
This PR can be broken into two main parts: (1) making JFR flushing allocation free and (2) creating the emergency dump file.
(1) Making JFR flushing allocation free
Many small changes had to be made to make the JFR flushing procedure allocation free:
- `JfrChunkFileWriter#writeString` was adapted to use native memory.
- `JfrSerializer` classes pre-initialize a small amount of data while in hosted mode.
- Loops were changed from the `for (Object name : names)` format to the `for (int i=0; i<names.length; i++)` format.
- The `SecondsNanos` class was made into a `RawStructure` so it could be allocated on the stack.
Larger changes to the `JfrTypeRepository` and `JfrSymbolRepository` were also required. The general procedure used by the `JfrTypeRepository` remains the same, but we cannot use the `Package`, `Module`, and `ClassLoader` classes directly because their methods may allocate and they are not pinned objects referenceable from `AbstractUninterruptibleHashtable`. To work around this, I've made `JfrTypeInfo` `RawStructure`s corresponding to each of these Java classes (`PackageInfoRaw`, `ModuleInfoRaw`, etc.). Some type data, such as package names, must be manually computed to avoid allocation (see `setPackageNameAndLength`). In some cases, serialization of symbols to native memory buffers must happen earlier (in `JfrTypeRepository` instead of `JfrSymbolRepository`) in order to avoid allocating new Java `String`s. The `JfrSymbolRepository` has been modified accordingly to cache pointers to serialized data rather than `String` objects. The regular Java hash maps have been replaced with new implementations of `AbstractUninterruptibleHashtable` as well.
One large obstacle was that `JfrTypeRepository#collectTypeInfo` originally needed to walk the image heap and allocate a list of loaded classes. That process is not easy to make allocation free. To work around this, I experimented with pre-allocating the loaded class list at startup but found that this negatively affected startup times. My solution was to make the `JfrTypeRepository` function more like the other JFR repositories in SubstrateVM by maintaining previous/current `epochData`. Specifically, during event emission, `JfrTypeRepository#getClassId` now caches the class constant data used by events. Types used by JFR are stored in previous/current epoch data hash tables. This uses somewhat more memory than the old approach, but it avoids allocation and is consistent with the other JFR repositories in SubstrateVM. Because the approach is lazy, it also avoids the startup penalty of pre-allocation.
A small bug in `JfrTypeRepository` was fixed. The bootstrap classloader was originally not being serialized to chunks. HotSpot gives this classloader the reserved ID of 0 and serializes it if it was tagged during the epoch.
(2) Creating the emergency dump file
New classes implementing this support: `JfrEmergencyDumpSupport`, `JfrEmergencyDumpFeature`, `PosixJfrEmergencyDumpSupport`. I have tried to keep the components and logic as similar as possible to the HotSpot class `JfrEmergencyDump` found in jfrEmergencyDump.cpp.
After the emergency dump flush has completed, the JFR disk repository directory is scanned. The names of chunk files are gathered and sorted (which also implicitly sorts them chronologically). Each chunk file in the sorted list is then copied to the emergency dump snapshot.
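The scan, sort, and copy steps described above can be sketched in ordinary Java. This is only an illustration of the control flow, not the code in this PR: it uses `java.nio.file` and heap-allocated collections, which the real implementation has to avoid on the OOM path. The class name `ChunkConcatSketch`, the `*.jfr` glob, and the assumption that chunk file names begin with a timestamp (so that lexicographic order matches chronological order) are all hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: concatenates chunk files from a repository directory
// into one snapshot file. Uses the Java heap freely, unlike the real code.
public final class ChunkConcatSketch {

    // Gather the names of all chunk files (assumed to end in ".jfr") and sort them.
    // Assumption: names start with a timestamp, so this order is also chronological.
    static List<String> sortedChunkNames(Path repository) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(repository, "*.jfr")) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    // Append the bytes of each chunk file, in sorted order, to the snapshot file.
    // The snapshot is expected to live outside the scanned set of chunk files.
    static void writeSnapshot(Path repository, Path snapshot) throws IOException {
        try (OutputStream out = Files.newOutputStream(snapshot)) {
            for (String name : sortedChunkNames(repository)) {
                Files.copy(repository.resolve(name), out);
            }
        }
    }
}
```

A caller would pass the disk repository directory and the target dump path, for example `ChunkConcatSketch.writeSnapshot(repositoryDir, dumpFile)`, where both paths are placeholders.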
A lot of the work in `PosixJfrEmergencyDumpSupport` involves handling/creating filenames as C strings. Similar to HotSpot JFR, a pre-allocated native memory path buffer is used as a temporary place to construct filenames and paths.
HotSpot JFR uses quicksort to handle sorting chunk filenames. In SubstrateVM, a Java quicksort implementation has been added to `GrowableWordArrayAccess` to sort chunk files while avoiding using the Java heap.
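To illustrate the idea of an in-place quicksort over word-sized values, here is a minimal sketch. It is not the code added to `GrowableWordArrayAccess`: a plain `long[]` stands in for the native word array, and the `WordComparator` interface and `WordQuickSortSketch` class are hypothetical names. In the real setting the values would be pointers to C strings and the comparator would compare the strings they point to.

```java
// Hypothetical sketch: in-place quicksort over word-sized values.
// A long[] stands in for a native word array; no buffers are allocated while sorting.
public final class WordQuickSortSketch {

    // Hypothetical comparator over raw word values (for example, pointers to C strings).
    @FunctionalInterface
    interface WordComparator {
        int compare(long a, long b);
    }

    // Sort the first 'count' entries of 'words' in ascending comparator order.
    static void sort(long[] words, int count, WordComparator cmp) {
        quickSort(words, 0, count - 1, cmp);
    }

    private static void quickSort(long[] words, int lo, int hi, WordComparator cmp) {
        while (lo < hi) {
            int p = partition(words, lo, hi, cmp);
            // Recurse into the smaller side and loop on the larger side,
            // which keeps the recursion depth logarithmic in the range size.
            if (p - lo < hi - p) {
                quickSort(words, lo, p - 1, cmp);
                lo = p + 1;
            } else {
                quickSort(words, p + 1, hi, cmp);
                hi = p - 1;
            }
        }
    }

    // Lomuto partition using the middle element as the pivot.
    private static int partition(long[] words, int lo, int hi, WordComparator cmp) {
        int mid = lo + (hi - lo) / 2;
        swap(words, mid, hi);
        long pivot = words[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (cmp.compare(words[i], pivot) < 0) {
                swap(words, i, store);
                store++;
            }
        }
        swap(words, store, hi);
        return store;
    }

    private static void swap(long[] words, int i, int j) {
        long tmp = words[i];
        words[i] = words[j];
        words[j] = tmp;
    }
}
```

Passing `count` separately mirrors a growable array whose backing storage may be larger than the number of stored entries. Note that a lambda or method reference used as the comparator is itself a heap object in ordinary Java, so an allocation-free variant would need a statically available comparator.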