We currently register JIT'd unwind-info with libunwind on Darwin by calling
libunwind's __register_frame and __deregister_frame functions (or
__unw_add_dynamic_eh_frame_section and __unw_remove_dynamic_eh_frame_section
where available). However, this registration scheme has some drawbacks:
(1) It only supports registration of DWARF eh-frames and not Darwin's preferred
compact-unwind format. This problem becomes acute on Darwin/arm64 machines (e.g. Apple Silicon Macs) where the compiler only emits compact-unwind by default (falling back to DWARF only if the frame info cannot be encoded in compact-unwind). This currently prevents the JIT from handling exceptions (and from unwinding in general) in any object files not specifically compiled to target the JIT on Darwin/arm64.
(2) It provides no way to describe the sub-architecture for a given frame,
which is necessary in order to properly handle arm64e pointer authentication. Until recently, this issue made it impossible to catch exceptions thrown in JIT'd code on Apple Silicon Macs, even when that code had been compiled to target the JIT (see https://github.com/llvm/llvm-project/issues/49036).
(3) The data structures used to hold the registered unwind info are hand-rolled
(presumably to avoid introducing a dependency on libcxx) and quite simple, so not particularly efficient for lookup.
This commit adds support for a new callback-based lookup scheme for unwind
info that was inspired by the _dyld_find_unwind_sections SPI that
libunwind uses to find unwind-info in non-JIT'd frames. From
llvm-project/libunwind/src/AddressSpace.hpp:

    struct dyld_unwind_sections
    {
      const struct mach_header*   mh;
      const void*                 dwarf_section;
      uintptr_t                   dwarf_section_length;
      const void*                 compact_unwind_section;
      uintptr_t                   compact_unwind_section_length;
    };
    extern bool _dyld_find_unwind_sections(void *, dyld_unwind_sections *);

During unwinding, libunwind calls _dyld_find_unwind_sections both to find
unwind section addresses and to identify the sub-architecture for frames (via
the MachO header pointed to by the mh field).
This commit introduces two new libunwind SPI functions:

    struct unw_dynamic_unwind_sections {
      unw_word_t dso_base;
      unw_word_t dwarf_section;
      size_t dwarf_section_length;
      unw_word_t compact_unwind_section;
      size_t compact_unwind_section_length;
    };

    typedef int (*unw_find_dynamic_unwind_sections)(
        unw_word_t addr, struct unw_dynamic_unwind_sections *info);

    // Returns 0 if successfully registered, -1 to indicate too many
    // registrations.
    extern int __unw_add_find_dynamic_unwind_sections(
        unw_find_dynamic_unwind_sections find_dynamic_unwind_sections);

    // Returns 0 if successfully deregistered, -1 to indicate no such
    // registration.
    extern int __unw_remove_find_dynamic_unwind_sections(
        unw_find_dynamic_unwind_sections find_dynamic_unwind_sections);

These can be used to register (and deregister) callbacks that have a similar
signature to _dyld_find_unwind_sections. During unwinding, if
_dyld_find_unwind_sections returns false (indicating that no frame info
was found by dyld) then registered callbacks are run in registration order until
either the unwind info is found or the end of the list is reached.
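
The registration and lookup order described above can be modelled roughly as follows. This is a self-contained sketch, not libunwind's implementation: every name in it (word_t, add_finder, remove_finder, find_sections, find_a, find_b, MAX_FINDERS) is hypothetical, the capacity of 8 is made up, and it assumes a callback returns nonzero when it finds unwind info, which the text above does not specify. It also omits the locking that a real registry would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for unw_word_t and unw_dynamic_unwind_sections. */
typedef uintptr_t word_t;

struct dynamic_unwind_sections {
  word_t dso_base;
  word_t dwarf_section;
  size_t dwarf_section_length;
  word_t compact_unwind_section;
  size_t compact_unwind_section_length;
};

/* Assumption: nonzero return means "found", zero means "not found". */
typedef int (*find_sections_fn)(word_t addr,
                                struct dynamic_unwind_sections *info);

enum { MAX_FINDERS = 8 }; /* assumed capacity, not libunwind's real limit */

static find_sections_fn finders[MAX_FINDERS];
static size_t num_finders = 0;

/* Returns 0 if registered, -1 if the table is full. */
int add_finder(find_sections_fn fn) {
  assert(fn != NULL);
  if (num_finders == MAX_FINDERS)
    return -1;
  finders[num_finders++] = fn;
  return 0;
}

/* Returns 0 if deregistered, -1 if fn was never registered. */
int remove_finder(find_sections_fn fn) {
  for (size_t i = 0; i < num_finders; ++i) {
    if (finders[i] == fn) {
      /* Shift later entries down so registration order is preserved. */
      for (size_t j = i + 1; j < num_finders; ++j)
        finders[j - 1] = finders[j];
      --num_finders;
      return 0;
    }
  }
  return -1;
}

/* Runs callbacks in registration order; stops at the first one that
   reports success. Returns 1 if any callback found info, else 0. */
int find_sections(word_t addr, struct dynamic_unwind_sections *info) {
  for (size_t i = 0; i < num_finders; ++i)
    if (finders[i](addr, info))
      return 1;
  return 0;
}

/* Two illustrative callbacks claiming made-up, overlapping ranges. */
int find_a(word_t addr, struct dynamic_unwind_sections *info) {
  if (addr < 0x1000 || addr >= 0x2000)
    return 0;
  *info = (struct dynamic_unwind_sections){.dso_base = 0x1000};
  return 1;
}

int find_b(word_t addr, struct dynamic_unwind_sections *info) {
  if (addr < 0x1800 || addr >= 0x3000)
    return 0;
  *info = (struct dynamic_unwind_sections){.dso_base = 0x1800};
  return 1;
}
```

With find_a registered before find_b, an address in the overlap (e.g. 0x1900) is answered by find_a, since the first callback to report success ends the search.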
With this commit, and by implementing the find-unwind-info callback in the ORC
runtime in LLVM, we can address all three issues above: (1) compact unwind
ranges for JIT'd objects can be constructed and communicated to libunwind via
the new callback, (2) the subarchitecture for the given frame can be identified
by including a pointer to a JIT'd MachO header, and (3) the ORC runtime can use
STL or other complex data structures to enable efficient address-based lookup
for unwind info (a custom IntervalMap data type was recently introduced in
ab59185fbfb as a first step in this direction).
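
As a rough illustration of point (3), a callback could resolve addresses by binary search over a sorted table of code ranges. This is only a sketch under stated assumptions: it is not the ORC runtime's IntervalMap, the type and function names (jit_unwind_sections, jit_code_range, find_jit_unwind_sections) and all addresses are invented for the example, and it again assumes that a nonzero return means "found".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of unw_dynamic_unwind_sections. */
struct jit_unwind_sections {
  uintptr_t dso_base;
  uintptr_t dwarf_section;
  size_t dwarf_section_length;
  uintptr_t compact_unwind_section;
  size_t compact_unwind_section_length;
};

/* One JIT'd code range [start, end) and the unwind info covering it. */
struct jit_code_range {
  uintptr_t start;
  uintptr_t end;
  struct jit_unwind_sections sections;
};

/* Made-up table: sorted by start address, ranges do not overlap. */
static const struct jit_code_range jit_ranges[] = {
    {0x10000, 0x20000, {0x10000, 0, 0, 0x1f000, 0x100}},
    {0x40000, 0x48000, {0x40000, 0, 0, 0x47000, 0x80}},
    {0x80000, 0x90000, {0x80000, 0x8e000, 0x200, 0x8f000, 0x100}},
};

/* Binary search for the range containing addr. Returns 1 and fills *info
   if found, 0 otherwise (assumed callback convention). */
int find_jit_unwind_sections(uintptr_t addr,
                             struct jit_unwind_sections *info) {
  size_t lo = 0;
  size_t hi = sizeof(jit_ranges) / sizeof(jit_ranges[0]);
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (addr < jit_ranges[mid].start) {
      hi = mid;
    } else if (addr >= jit_ranges[mid].end) {
      lo = mid + 1;
    } else {
      *info = jit_ranges[mid].sections;
      return 1;
    }
  }
  return 0;
}
```

Lookup here is logarithmic in the number of registered ranges, but a real runtime would also need to handle insertion, removal, and concurrent access.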
Explicitly out of scope for this commit is any change to registration on
non-Darwin platforms. While a callback-based lookup scheme may be useful on
other platforms it is not currently necessary for basic functionality the way
that it is on Darwin/arm64 (at least not to my knowledge).
Should document the return value, particularly since success/failure is not the same as other functions returning int in this file.