This is an archive of the discontinued LLVM Phabricator instance.

[libunwind] On Darwin, add a callback-based lookup scheme for JIT'd unwind info.
ClosedPublic

Authored by lhames on Jan 19 2023, 9:29 PM.

Details

Summary

We currently register JIT'd unwind-info with libunwind on Darwin by calling
libunwind's __register_frame and __deregister_frame functions (or
__unw_add_dynamic_eh_frame_section and __unw_remove_dynamic_eh_frame_section
where available). However, this registration scheme has some drawbacks:

(1) It only supports registration of DWARF eh-frames and not Darwin's preferred
compact-unwind format. This problem becomes acute on Darwin/arm64 machines
(e.g. Apple Silicon Macs) where the compiler only emits compact-unwind by
default (falling back to DWARF only if the frame info cannot be encoded in
compact-unwind). This currently prevents the JIT from handling exceptions
(and from unwinding in general) in any object files not specifically
compiled to target the JIT on Darwin/arm64.

(2) It provides no way to describe the sub-architecture for a given frame,
which is necessary in order to properly handle arm64e pointer
authentication. This issue made it impossible to catch exceptions thrown
in JIT'd code on Apple Silicon Macs until recently, even when they had been
compiled to target the JIT. (See
https://github.com/llvm/llvm-project/issues/49036).

(3) The data structures used to hold the registered unwind info are hand-rolled
(presumably to avoid introducing a dependency on libcxx) and quite simple,
so not particularly efficient for lookup.

This commit adds support for a new callback-based lookup scheme for unwind
info that was inspired by the _dyld_find_unwind_sections SPI that
libunwind uses to find unwind-info in non-JIT'd frames. From
llvm-project/libunwind/src/AddressSpace.hpp:

struct dyld_unwind_sections {
  const struct mach_header*   mh;
  const void*                 dwarf_section;
  uintptr_t                   dwarf_section_length;
  const void*                 compact_unwind_section;
  uintptr_t                   compact_unwind_section_length;
};

extern bool _dyld_find_unwind_sections(void *, dyld_unwind_sections *);

During unwinding, libunwind calls _dyld_find_unwind_sections both to find
unwind section addresses and to identify the subarchitecture for frames (via the
MachO header pointed to by the mh field).
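
To make the "subarchitecture via the header" point concrete, here is a minimal sketch of reading the CPU subtype out of a 64-bit MachO header. It does not claim to show how libunwind itself inspects the header. DemoMachHeader64 and demo_is_arm64e are hypothetical names; the struct is a local stand-in for struct mach_header_64, and the constants are copied by value from <mach-o/loader.h> and <mach/machine.h> so the sketch builds without Darwin headers.

```cpp
#include <cstdint>

// Local stand-in with the same field order as struct mach_header_64.
struct DemoMachHeader64 {
  std::uint32_t magic;
  std::int32_t cputype;
  std::int32_t cpusubtype;
  std::uint32_t filetype;
  std::uint32_t ncmds;
  std::uint32_t sizeofcmds;
  std::uint32_t flags;
  std::uint32_t reserved;
};

constexpr std::uint32_t kDemoMagic64 = 0xfeedfacf;      // MH_MAGIC_64
constexpr std::int32_t kDemoCpuTypeArm64 = 0x0100000c;  // CPU_TYPE_ARM64
constexpr std::uint32_t kDemoSubtypeMask = 0xff000000;  // CPU_SUBTYPE_MASK
constexpr std::uint32_t kDemoSubtypeArm64e = 2;         // CPU_SUBTYPE_ARM64E

// Returns true if the header describes an arm64e image. The top byte of
// cpusubtype carries capability bits, so it is masked off before comparing.
constexpr bool demo_is_arm64e(const DemoMachHeader64 &mh) {
  return mh.magic == kDemoMagic64 && mh.cputype == kDemoCpuTypeArm64 &&
         (static_cast<std::uint32_t>(mh.cpusubtype) & ~kDemoSubtypeMask) ==
             kDemoSubtypeArm64e;
}
```

A JIT that supplies a header pointer for its own code would need that header's cputype and cpusubtype fields to be filled in accurately for a check like this to be meaningful.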

This commit introduces two new libunwind SPI functions:

struct unw_dynamic_unwind_sections {
  unw_word_t dso_base;
  unw_word_t dwarf_section;
  size_t     dwarf_section_length;
  unw_word_t compact_unwind_section;
  size_t     compact_unwind_section_length;
};

typedef int (*unw_find_dynamic_unwind_sections)(
    unw_word_t addr, struct unw_dynamic_unwind_sections *info);

// Returns 0 if successfully registered, -1 to indicate too many
// registrations.
extern int __unw_add_find_dynamic_unwind_sections(
    unw_find_dynamic_unwind_sections find_dynamic_unwind_sections);

// Returns 0 if successfully deregistered, -1 to indicate no such
// registration.
extern int __unw_remove_find_dynamic_unwind_sections(
    unw_find_dynamic_unwind_sections find_dynamic_unwind_sections);

These can be used to register (and deregister) callbacks that have a similar
signature to _dyld_find_unwind_sections. During unwinding, if
_dyld_find_unwind_sections returns false (indicating that no frame info
was found by dyld) then registered callbacks are run in registration order until
either the unwind info is found or the end of the list is reached.
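
The fallback order described above can be sketched with a small stand-alone model. This is not libunwind's implementation: demo_add_finder, demo_remove_finder, demo_find_sections, the cap of 8 registrations, and the two toy callbacks are all hypothetical, and the sketch assumes a callback returns nonzero when it has filled in the info struct. The struct layout and callback typedef are copied from the summary, with unw_word_t replaced by a local alias so the code builds without libunwind headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using unw_word_t = std::uintptr_t; // stand-in; libunwind defines the real type

struct unw_dynamic_unwind_sections {
  unw_word_t dso_base;
  unw_word_t dwarf_section;
  std::size_t dwarf_section_length;
  unw_word_t compact_unwind_section;
  std::size_t compact_unwind_section_length;
};

typedef int (*unw_find_dynamic_unwind_sections)(
    unw_word_t addr, unw_dynamic_unwind_sections *info);

// Hypothetical registry: a bounded list kept in registration order.
static std::vector<unw_find_dynamic_unwind_sections> demo_finders;
constexpr std::size_t kDemoMaxFinders = 8; // made-up limit

// Returns 0 on success, -1 if the registry is full.
int demo_add_finder(unw_find_dynamic_unwind_sections fn) {
  assert(fn && "callback must not be null");
  if (demo_finders.size() >= kDemoMaxFinders)
    return -1;
  demo_finders.push_back(fn);
  return 0;
}

// Returns 0 on success, -1 if fn was never registered.
int demo_remove_finder(unw_find_dynamic_unwind_sections fn) {
  for (auto it = demo_finders.begin(); it != demo_finders.end(); ++it) {
    if (*it == fn) {
      demo_finders.erase(it);
      return 0;
    }
  }
  return -1;
}

// Runs callbacks in registration order and stops at the first hit.
bool demo_find_sections(unw_word_t addr, unw_dynamic_unwind_sections *info) {
  for (auto fn : demo_finders)
    if (fn(addr, info))
      return true;
  return false;
}

// Toy callbacks covering made-up, partly overlapping address ranges.
int demo_finder_a(unw_word_t addr, unw_dynamic_unwind_sections *info) {
  if (addr < 0x1000 || addr >= 0x2000)
    return 0;
  *info = {0x1000, 0, 0, 0x1f00, 0x100};
  return 1;
}

int demo_finder_b(unw_word_t addr, unw_dynamic_unwind_sections *info) {
  if (addr < 0x1800 || addr >= 0x3000)
    return 0;
  *info = {0x1800, 0, 0, 0x2f00, 0x100};
  return 1;
}
```

With both toy callbacks registered in the order a, b, a lookup at 0x1900 is answered by demo_finder_a even though demo_finder_b also covers that address, which is the "registration order" behavior the paragraph describes.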

With this commit, and by implementing the find-unwind-info callback in the ORC
runtime in LLVM, we can address all three issues above: (1) compact unwind
ranges for JIT'd objects can be constructed and communicated to libunwind via
the new callback, (2) the subarchitecture for the given frame can be identified
by including a pointer to a JIT'd MachO header, and (3) the ORC runtime can use
STL or other complex data structures to enable efficient address-based lookup
for unwind info (a custom IntervalMap data type was recently introduced in
ab59185fbfb as a first step in this direction).
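
As an illustration of point (3), address-based lookup over non-overlapping ranges can be done in logarithmic time with an ordered map keyed by range start. The sketch below uses std::map rather than the ORC runtime's own IntervalMap, and DemoUnwindInfo and DemoUnwindRangeMap are hypothetical names, so treat it as one possible shape for the lookup rather than the runtime's actual data structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical payload: what a find-unwind-info callback would report.
struct DemoUnwindInfo {
  std::uintptr_t dso_base;
  std::uintptr_t compact_unwind_section;
  std::size_t compact_unwind_section_length;
};

class DemoUnwindRangeMap {
public:
  // Records that code in [start, end) is described by info.
  // Assumes the caller never inserts overlapping ranges.
  void insert(std::uintptr_t start, std::uintptr_t end, DemoUnwindInfo info) {
    assert(start < end && "empty or inverted range");
    ranges_[start] = Entry{end, info};
  }

  // Returns the info for the range containing addr, or nullptr if none.
  const DemoUnwindInfo *find(std::uintptr_t addr) const {
    auto it = ranges_.upper_bound(addr); // first range starting after addr
    if (it == ranges_.begin())
      return nullptr;
    --it; // last range starting at or before addr
    return addr < it->second.end ? &it->second.info : nullptr;
  }

private:
  struct Entry {
    std::uintptr_t end;
    DemoUnwindInfo info;
  };
  std::map<std::uintptr_t, Entry> ranges_; // keyed by range start
};
```

A registered callback could consult a structure like this and copy the matching entry into the info struct it was handed. Any locking needed to make that safe during unwinding is left out of the sketch.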

Explicitly out of scope for this commit is any change to registration on
non-Darwin platforms. While a callback-based lookup scheme may be useful on
other platforms it is not currently necessary for basic functionality the way
that it is on Darwin/arm64 (at least not to my knowledge).

Event Timeline

lhames created this revision. Jan 19 2023, 9:29 PM
lhames requested review of this revision. Jan 19 2023, 9:29 PM

I've committed the LLVM ORC and ORC runtime sides of this in 3507df9c20a4. If you check out that commit or later and then build libunwind with this patch, you should see the new registration pathway kick in on Darwin:

% ORC_RT_DEBUG=1 llvm-jitlink -orc-runtime=/path/to/built-llvm/lib/clang/16/lib/darwin/liborc_rt_osx.a  <object file(s)>
__unw_add/remove_find_dynamic_unwind_sections available. Using callback-based frame info lookup.
...

Sadly this will only work on x86-64 Macs for now, as the ptrauth changes for arm64e haven't been upstreamed yet. I'm working on that, but don't have a timeline for it yet. This commit should still flow into libunwind releases, however, addressing https://github.com/llvm/llvm-project/issues/49036 properly (the existing released fix is unreliable) and opening a path to compact-unwind support.

I'm not that familiar with development practices for libunwind, but is it possible to write a test for this?

libunwind/src/libunwind_ext.h
71

Should document the return value, particularly since success/failure is not the same as other functions returning int in this file.

76

Would be nice to have doc comments for this; maybe just a lightly edited version of this paragraph from your description?

These can be used to register (and deregister) callbacks that have a similar
signature to _dyld_find_unwind_sections. During unwinding, if
_dyld_find_unwind_sections returns false (indicating that no frame info
was found by dyld) then registered callbacks are run in registration order until
either the unwind info is found or the end of the list is reached.

lhames updated this revision to Diff 490929. Jan 20 2023, 11:44 AM

Expand comments for new SPI.

lhames updated this revision to Diff 490930. Jan 20 2023, 11:47 AM

Rebase to include the original work in the updated diff.

pete accepted this revision. Jan 20 2023, 12:37 PM

LGTM

lhames marked an inline comment as done. Jan 20 2023, 1:17 PM

I'm not that familiar with development practices for libunwind, but is it possible to write a test for this?

libunwind doesn't have tests for the JIT registration paths -- to exercise them it would need a way to produce executable code with valid headers and unwind info that was outside any range known to the system loader.

I think we're better off testing this in the ORC runtime -- we could have tests that look for the new registration paths and then verify that they're used.

libunwind/src/libunwind_ext.h
76

Yep -- good idea. I've included more detailed comments in the latest diff.

lhames marked an inline comment as done.
lhames added a reviewer: ab. Jan 23 2023, 5:54 PM

Here's a potential testcase, for anyone who wants to try this out. You'll need to build LLVM, compiler-rt, and libunwind:

C++ source for testcase:

int main(int argc, char *argv[]) {
  try {
    throw 42;
  } catch (int X) {
    return X;
  }
  return 0;
}

This can be compiled and run under the llvm-jitlink test tool on Darwin with:

% clang++ -c -o eh.o eh.cpp
% DYLD_LIBRARY_PATH=/path/to/llvm-build/lib ORC_RT_DEBUG=1 llvm-jitlink -orc-runtime=/path/to/llvm-build/lib/clang/16/lib/darwin/liborc_rt_osx.a eh.o
__unw_add/remove_find_dynamic_unwind_sections available. Using callback-based frame info lookup.
...

You can verify that the exception was caught and handled correctly by printing the return code:

% echo $?
42

Are there any further comments on this patch? If not I'll go ahead and land it based on Pete's approval.

This revision was not accepted when it landed; it landed in state Needs Review. Feb 10 2023, 2:37 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

I was wondering if it is feasible to add a regression test in clang-repl, possibly now with the right callback?