Description
Status Quo
We have this thing called symbolic-minidump::cfi::AsciiCfiWriter, which outputs Breakpad ASCII STACK records for all the debug formats that we support. We use it as part of symbolicator, and afaik Mozilla uses it as part of their dump_cfi utility.
Problem Statement
This ASCII format has a couple of shortcomings:
- It needs to be parsed, either ahead-of-time, or lazily.
- Parsing it is super slow, and needs to happen every damn time.
- It might even end up being larger than the actual unwind info.
- It is low fidelity and a bad common denominator; for example it does not support some DWARF operations like "set return addr to 0; aka end of stack".
- Did I mention it's a text format that needs to be parsed?
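To make the parsing cost concrete, here is a rough sketch of what a lookup against the text format involves. It only handles `STACK CFI INIT <address> <size> <rules>` records and ignores the delta records that follow them. The names `CfiInit`, `parse_cfi_init` and `find_unwind_rules` are made up for illustration and are not part of symbolic or Breakpad.

```rust
/// One `STACK CFI INIT` record: a code range plus its still-unparsed rule text.
#[derive(Debug, PartialEq)]
struct CfiInit {
    address: u64,
    size: u64,
    rules: String,
}

/// Parses a single `STACK CFI INIT <hex addr> <hex size> <rules>` line.
/// Returns `None` for every other kind of line.
fn parse_cfi_init(line: &str) -> Option<CfiInit> {
    let rest = line.strip_prefix("STACK CFI INIT ")?;
    let mut parts = rest.splitn(3, ' ');
    let address = u64::from_str_radix(parts.next()?, 16).ok()?;
    let size = u64::from_str_radix(parts.next()?, 16).ok()?;
    let rules = parts.next()?.to_string();
    Some(CfiInit { address, size, rules })
}

/// Finds the rules covering `addr` by scanning the file from the top.
/// Without a pre-built index, every lookup pays for this linear scan.
fn find_unwind_rules(file: &str, addr: u64) -> Option<String> {
    file.lines()
        .filter_map(parse_cfi_init)
        .find(|record| addr >= record.address && addr - record.address < record.size)
        .map(|record| record.rules)
}
```

Even after the right record is found, the rule string itself still has to be tokenized and evaluated before a frame can be unwound.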
Proposed Solution
So I was thinking for quite some time about an "indexed" format that I don’t need to parse from beginning to end every single time, but can quickly look up unwind info based on instruction offset.
Also, while working on #549 I thought that converting the unwind instructions into this bad intermediate text format is a bad fit, since it would be a lot nicer to just execute the unwind operations.
Long story short, how about we have a serialized, mmap-able format similar to SymCaches, with something like the following layout, in pseudocode:
struct CfiCache {
    ranges: BTreeMap<usize, UnwindInfo>, // instruction addr => unwind info
}
enum UnwindInfo {
    Breakpad(String), // same as now, just tiny indexed snippets so you don't need to parse the whole file ahead of time
    WindowsX64(goblin::pe::exception::UnwindInfo), // well, a reference to a binary representation of the raw info
    Dwarf(gimli::read::CallFrameInstructionIter), // again, binary representation of the raw DWARF unwind info
    Compact(symbolic_debuginfo::macho::compact::CompactCfiOpIter), // again, same for Apple's format
    // etc, whatever other formats there are
}
The proposed CfiCache / UnwindInfo would implement minidump_processor::symbols::SymbolProvider (or at least walk_frame) to just execute the provided unwind info directly, without needing to go through that horrible intermediate format.
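As a rough illustration of the "look up by instruction offset" part, here is a minimal in-memory sketch using only the standard library. It is not the proposed serialized format: the payloads are placeholders for the real goblin / gimli types, and storing a range length next to each entry is my own assumption so that addresses between functions do not match the preceding entry.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the proposed `UnwindInfo`; the payloads are
/// placeholders, not the real goblin / gimli / symbolic types.
#[derive(Debug, Clone, PartialEq)]
enum UnwindInfo {
    Breakpad(String),
    RawBytes(Vec<u8>),
}

/// Hypothetical in-memory version of the proposed `CfiCache`.
struct CfiCache {
    /// Range start address => (range length, unwind info).
    ranges: BTreeMap<u64, (u64, UnwindInfo)>,
}

impl CfiCache {
    fn new() -> Self {
        CfiCache { ranges: BTreeMap::new() }
    }

    fn insert(&mut self, start: u64, len: u64, info: UnwindInfo) {
        self.ranges.insert(start, (len, info));
    }

    /// Returns the unwind info whose range contains `addr`, if any.
    /// This is a single ordered-map lookup instead of a full-file parse.
    fn lookup(&self, addr: u64) -> Option<&UnwindInfo> {
        // Last entry that starts at or before `addr`.
        let (start, (len, info)) = self.ranges.range(..=addr).next_back()?;
        if addr - start < *len {
            Some(info)
        } else {
            None
        }
    }
}
```

A serialized, mmap-able variant would presumably replace the `BTreeMap` with a sorted array of ranges and a binary search, but the lookup logic stays the same.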
Open Questions
Since we are not the only users of this code, I would have some questions especially for external users: (hello @Gankra, @gabrielesvelto, etc)
- Pls give feedback on the proposal ;-)
- Would an "unwind info -> breakpad ASCII" converter still be useful in that scenario? Can we just remove that completely?
- Would you expect to create such an unwinder directly from an object file, without needing to go through an intermediate format/struct?
- How transparent / opaque should this format be? Is it sufficient to have "object file -> (opaque intermediate format) -> .unwind(caller frame) -> Option<callee frame>"; as in: having unwind be the only public API it has? Or would you expect to have access to the underlying raw unwind info (raw DWARF bytes; whatever)?
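To make the "opaque" option from the last question more tangible, here is a toy sketch in which `unwind` is the only public method and the stored rules stay private. Everything in it is hypothetical: `Frame`, `StackMemory` and `OpaqueUnwinder` are made-up names, and the single rule form it supports (CFA = sp + offset, return address stored 8 bytes below the CFA) is a simplification of real unwind info.

```rust
use std::collections::BTreeMap;

/// Minimal register state for one stack frame (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Frame {
    pub ip: u64,
    pub sp: u64,
}

/// A captured chunk of stack memory starting at address `base`.
pub struct StackMemory {
    base: u64,
    bytes: Vec<u8>,
}

impl StackMemory {
    /// Reads a little-endian u64 at `addr`, or `None` if out of bounds.
    fn read_u64(&self, addr: u64) -> Option<u64> {
        let start = addr.checked_sub(self.base)? as usize;
        let end = start.checked_add(8)?;
        let slice = self.bytes.get(start..end)?;
        let mut buf = [0u8; 8];
        buf.copy_from_slice(slice);
        Some(u64::from_le_bytes(buf))
    }
}

/// Opaque unwinder: callers never see the rules, only `unwind`.
pub struct OpaqueUnwinder {
    /// Range start => (range length, CFA offset from sp). Private on purpose.
    rules: BTreeMap<u64, (u64, u64)>,
}

impl OpaqueUnwinder {
    /// Takes one frame and returns the next one up the stack, if recoverable.
    pub fn unwind(&self, frame: Frame, stack: &StackMemory) -> Option<Frame> {
        let (start, (len, cfa_offset)) = self.rules.range(..=frame.ip).next_back()?;
        if frame.ip - start >= *len {
            return None; // no unwind info covers this instruction
        }
        let cfa = frame.sp.checked_add(*cfa_offset)?;
        let return_address = stack.read_u64(cfa.checked_sub(8)?)?;
        if return_address == 0 {
            return None; // treat a zero return address as end of stack
        }
        Some(Frame { ip: return_address, sp: cfa })
    }
}
```

The transparent alternative would additionally expose the per-range raw data, at the cost of tying users to the underlying unwind formats.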