-
-
Notifications
You must be signed in to change notification settings - Fork 121
feat(stdlib): Memory leak checker #984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Updated test snapshots.
|
Forgot to mention. I made the CI fail intentionally for now, just to see how it looks on GitHub when a test fails for memory leaks detected. 🤭 |
|
Hi @cician Just wanted to say thanks for a great contribution and I look forward to looking through it! |
- The LeakDebugSummaryMode option wasn't respected so tests were noisy and confusing. - Added filters to exclude invocation of hooks and of the function to be tested. - Added detection of deallocations of pre-existing objects. - Added detailed reporting of ref. counting issues on pre-existing objects. - Fixed decRef issues in printing object contents and added special case for printing strings. - Commented out the failing test for Buffer (the leak in question has not been fixed yet).
peblair
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally had some time to give this a look. @cician this is incredible stuff! I've left some first-pass comments, and I'm curious to see what the other core team members have to say.
| // buffer is not being flushed? If the debugging hooks are active then | ||
| // behave ad if ignoreZeros was true to continue executing and give the | ||
| // debugging hooks an opportunity to report the issue. | ||
| if (WasmI32.ne(_DEBUG_HOOK_DECREF, 0n)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting. I wonder if we could get around this by adding a call to the flush syscall in the if (WasmI32.ne(_DEBUG_HOOK_DECREF, 0n)) expression? 🤔
stdlib/runtime/runtimeDebug.gr
Outdated
|
|
||
| let tagToTypeName = (tag: Number) => { | ||
| match (tag) { | ||
| 1 => "string", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should use the constants from runtime/unsafe/tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use them to avoid @disableGC. Should I convert tagToTypeName to no gc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't considered that! In that case, I think a comment is fine for now (maybe also add a comment to tags.gr so we can remember to keep this in sync?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this function to use @disableGC. It's not pretty, but probably still better than not using the constants.
stdlib/runtime/runtimeDebug.gr
Outdated
| // totals can create confusion. The simplest solution would be to get the tag | ||
| // of mallocated objects, but they don't yet have a tag associated. So I | ||
| // have to compensate again by adding up the objects that have been | ||
| // mallocated but never freed. Which simply means the ones in "allocations" | ||
| // Map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part loses me a little. Can you maybe elaborate a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this comment is pretty confusing. I'll rewrite it.
What I mean is that to correctly present the summary of incRefs/decRefs by tag, it's not enough to just track incRefs/decRefs themselves because incRef doesn't get called for newly allocated objects. They start with ref. count of 1 set directly in GC.malloc. But it also isn't as simple as adding some code in the hook for GC.malloc because freshly allocated objects don't have a tag yet. The tag is normally set after (like in dataStructures.allocateString and similar functions). So what I actually do is track the lifetime of single objects in the allocations map. Only when the object is freed I add to the mallocPlusFreeByTag map to present along with incRefs/decRefs by tag. If an object is not freed then it's left in the allocations map. So I group these leaked objects by tag as well and present the totals in the summary as "malloc without free".
stdlib/runtime/runtimeDebug.gr
Outdated
| let mallocPlusFreeByTag = Map.make(): (Map.Map<Number, Number>) | ||
|
|
||
| @disableGC | ||
| let rec readTag = (userPtrAsNumber: Number) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine why this is the case, but it may be worth a comment explaining why these functions use Number instead of WasmI32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment explaining the choice for Number.
| if (Set.contains(userPtr, allocations)) { | ||
| // A new allocation, but for an object already allocated inside the debug block, but never removed. | ||
| print( | ||
| "⚠ PROBLEM DETECTED: unexpected malloc at address 0x" ++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of wish we had colored output here hehe
(not really requesting this, but just a thought)
| } | ||
|
|
||
| if (Map.contains(userPtr, incRefsOnPreExistingObjects) || | ||
| Map.contains(userPtr, decRefsOnPreExistingObjects)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the formatter output? If so, cc @marcusroberts (I think that Map should be indented here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is.
Co-authored-by: Philip Blair <[email protected]>
Co-authored-by: Philip Blair <[email protected]>
Co-authored-by: Philip Blair <[email protected]>
ospencer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool! And awesome that we've already seen this in action 🙂
A couple of thoughts from me but no big blockers :)
One thing to consider— @phated when we split the runtime from the stdlib, will this be annoying to refactor? I think this makes sense to live in the runtime, but since it uses stdlibs it might be hard to break out, and I'm not sure where it really should live then. I feel like we can worry about it when we actually make that change, but figured I'd grab your thoughts.
| let mut _DEBUG_HOOK_DECREF = 0n | ||
| let mut _DEBUG_HOOK_MALLOC = 0n | ||
| let mut _DEBUG_HOOK_FREE = 0n | ||
| let mut _DEBUG_HOOK_RUNNING = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me that the _DEBUG value above doesn't have anything to do with these values. I wonder if there's any renaming we can do to make that more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe just move it to the commented code below? Currently it's never used, its for the debugging functions that are all commented out. I agree though that leak checker code can get mixed up with these.
| @@ -0,0 +1,712 @@ | |||
| import WasmI32 from "runtime/unsafe/wasmi32" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filename is a little confusing vs the current runtime/unsafe/debug file. We can't combine them because runtime/debug needs to run in runtime mode—maybe we rename this file to memoryDebug.gr or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memoryDebug.gr sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I move it outside of the runtime folder as well?
This was causing memory leaks in the leak checker.
…rain-lang#773 as well as for improved readability.
|
Probably not a big deal, but the leak checker currently reports leaks when converting numbers to strings due to buffers allocated and intentionally never released in the numberUtils module (_POWERS10, _DIGITS, _HEX_DIGITS etc). It can be a bit confusing when you get a resulting like this: One solution would be to simply call a few functions preemptively to force initialize these buffers, but maybe we want to give them a proper reserved heap tag? Something like "cache" or "lookup table". Otherwise, if it used malloc directly instead of |
With #1003 fixed and a fix in the checker itself it's now mostly leak free. There are still leaks elsewhere in the stdlib that make it leak indirectly when working on Sets, Maps and Arrays to print the summary. If you're wondering how I leak check the leak checker, I'm just running it in an infinite loop and watching if the memory continues to rise indefinitely in the task manager. |
|
We'll want to land #1410 before we land this to minimize the effect on compiled modules. |
This PR adds a tool that is intended to help solve issues in code with manual memory management.
I've added a new module called
runtimeDebug, but I'm open to suggestions if another name or placing it in an existing one is more appropriate. For example there's adebugmodule, but I don't know what its purpose is.How it works
The changes to the
gcmodule has added support to setup hooks called on each incRef, decRef, malloc and free. Otherwise the changes here are limited to expose these hooks and remove them.The
runtimeDebugexposes two functions:ensureNoLeaks, intended for leak checks in unit testing. It only prints anything if leaks are actually found.leakDebugBlock, intended to be used when debugging gets tricky and more details are needed. It can print every incRef, decRef, malloc and free with addresses, tags and even object contents as strings (when safe-ish to do so). This part is inspired by gc.setDebug functionally that is currently commented out.When the hooks are connected, internally there's a bit of a delicate dance on hot ash, since
runtimeDebugis using managed memory and even GC, while GC itself calls into incRef/decRef/malloc/free. To accomplish this, the hooks are inhibited temporarily.A few details
I've used raw Wasmi32 pointers for function hooks. Maybe I should use something else?
box? Or actual function type?While the hook functions use WasmI32 pointers for tracked objects, I convert them internally to Number for use in GC land. May have used Int32 instead, but I chose Number hoping to reduce allocations at least for programs with small enough heap.
Limitations and drawbacks
Nesting
ensureNoLeaks/leakDebugBlockis not supported, but not prevented.Per function/code block based testing may not work for debugging some cases covered by the old gc.setDebug functionality. In particular when runtime modules themselves are starting up.
ensureNoLeaksis obviously useful for detecting memory leaks, but may not be all that helpful for other issues like use after free, overwritten object headers or infinite loops in malloc.free (all things experience by yours truly). It may help to runleakDebugBlockwith printIncrefsAndDecRefs=true, printMallocAndFree=true and printObjectContents=false.It adds a slight overhead to memory management for all programs, even if the hooks aren't hooked up. I haven't done any comparisons yet. Hopefully not much, just checking if the hook function is set in a variable. Some sort of conditional compilation would be nice in this case.
Finally, it's not able to leak check the leak checker, so I don't guarantee it doesn't leak memory.