Page MenuHomePhabricator

[lld-macho] Initial support for EH Frames

Authored by int3 on Apr 8 2022, 7:27 PM.




llvm-mc generates unwind info in both compact unwind and DWARF
formats. LLD already handles the compact unwind format; this diff gets
us close to handling the DWARF format properly.


It's not quite done yet, but I figure it's worth getting this reviewed
and landed first as it's shaping up to be a fairly large code change.

Known limitations of the current code:

Since the feature is not ready for real use yet, I've gated it behind a
flag that only gets toggled on during test suite runs. With most of the
new code disabled, we see just a hint of perf regression, so I don't
think it'd be remiss to land this as-is:

           base           diff           difference (95% CI)
sys_time   1.926 ± 0.168  1.979 ± 0.117  [  -1.2% ..   +6.6%]
user_time  3.590 ± 0.033  3.606 ± 0.028  [  +0.0% ..   +0.9%]
wall_time  7.104 ± 0.184  7.179 ± 0.151  [  -0.2% ..   +2.3%]
samples    30             31


Like compact unwind entries, EH frames are also represented as regular
ConcatInputSections that get pointed to via Defined::unwindEntry. This
allows them to be handled generically by e.g. the MarkLive and ICF
code. (But note that unlike compact unwind subsections, EH frame
subsections do end up in the final binary.)

In order to make EH frames "look like" a regular ConcatInputSection,
some processing is required. First, we need to split the __eh_frame
section along EH frame boundaries rather than along symbol boundaries.
We do this by decoding the length field of each EH frame. Second, the
abs-ified relocations need to be turned into regular Relocs.

Next Steps

In order to support EH frames on ARM targets, we will either have to
teach LLD how to handle EH frames with explicit relocs, or we can try to
make llvm-mc emit abs-ified relocs for ARM as well. I'm hoping to do
the latter as I think it will make the LLD implementation both simpler
and faster to execute.


The obj-file-with-stabs.s test had to be updated as the previous
version would trip assertion errors in the code. It appears that in our
attempt to produce a minimal YAML test input, we created a file with
invalid EH frame data. I've fixed this by re-generating the YAML and not
doing any hand-pruning of it.

Diff Detail

Event Timeline

int3 created this revision.Apr 8 2022, 7:27 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 8 2022, 7:27 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
int3 requested review of this revision.Apr 8 2022, 7:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2022, 7:27 PM

Is there another EH frame parser in-tree? LLVM or LLDB?

int3 added a comment.Apr 9 2022, 10:03 AM

There is DWARFDataExtractor, but I don't need to handle the full generality of the DWARF format with the perf overhead that entails. LLD-ELF has its own custom DWARF parser too.

int3 updated this revision to Diff 422696.Apr 13 2022, 5:03 PM

fix alignment

int3 updated this revision to Diff 423728.Apr 19 2022, 1:49 PM

more comments

int3 updated this revision to Diff 424002.Apr 20 2022, 12:56 PM
int3 edited the summary of this revision. (Show Details)

remove -dead_strip dependency

Roger added a subscriber: Roger.Apr 21 2022, 2:57 PM

Nice! I noticed that the tests in eh-frame.s only contain failing cases. Are there already success cases tested elsewhere?


Why is there a copy of *off for errors? Could we not use *off in the case of errors?


In the other functions, you use a > comparison. Why do you use an equality check here?


Is there no need to check that *off + 1 < data.size() (+ 1 to have room for the null symbol)? I think we could potentially segfault if *off is too large?


I could go either way, but what do you think about a comment like this just to make things obvious to the reader?


I'm a bit surprised why we want to use it++ instead of just it given that we're in a for loop that already increments it for us. Could we have a comment explaining this behavior?

int3 marked 5 inline comments as done.Apr 21 2022, 4:44 PM

I noticed that the tests in eh-frame.s only contain failing cases. Are there already success cases tested elsewhere?

There's eh-frame.s and invalid/eh-frame.s. I'm guessing you only looked at the latter :)


I thought it'd be a bit clearer to have the error message point at the start of a mis-encoded length field, rather than in the middle of the value.

Also for line 35 below, when saying that a CIE/FDE extends past the end, I think it would make sense to point to the start of the CIE/FDE, rather than in the middle


I went with something a bit more terse. Note that man strnlen already says

The strnlen() function returns
either the same result as strlen() or maxlen, whichever is smaller.

but -- I realized belatedly that this entire function is wrong. I was not checking the right maxlen... fixing + adding test

int3 updated this revision to Diff 424328.Apr 21 2022, 4:45 PM
int3 marked 2 inline comments as done.

address comments

Roger accepted this revision.Apr 22 2022, 11:18 AM
This revision is now accepted and ready to land.Apr 22 2022, 11:18 AM
int3 added a subscriber: oontvoo.Apr 22 2022, 1:01 PM
int3 added inline comments.

it appears that this cast is asserting when running on obj-file-with-stabs.yaml. Unfortunately it's a bit hard to debug when we only have the yaml without the source that generated it originally. @oontvoo do you happen to have a link to the original source of that file?

oontvoo added inline comments.Apr 26 2022, 7:46 AM

Yes - this was produced with the following:

echo "int main() {return 1;}" >
clang -c -g -o test.o test.c
ld -r -o test2.o test.o;

The trick is ld -r which kept the debug stuff (stabs) in the output .o file.

int3 updated this revision to Diff 425321.Apr 26 2022, 3:23 PM
int3 edited the summary of this revision. (Show Details)

fix obj-file-with-stabs.s test

This revision was landed with ongoing or failed builds.Wed, Jun 8, 8:41 PM
This revision was automatically updated to reflect the committed changes.
thakis added a comment.Thu, Jun 9, 4:23 AM

This breaks tests everywhere where assertions are enabled:

dyung added a subscriber: dyung.Thu, Jun 9, 5:23 AM

This breaks tests everywhere:

I believe the follow-up commit 977d62c33e3343a394777c1754682761eebb66cd is the one ultimately causing the current problems, but this commit does seem to cause at least one problem on its own:

int3 added a comment.Thu, Jun 9, 5:44 AM

Ugh, sorry about that

kstoimenov added a subscriber: kstoimenov.EditedMon, Jun 13, 1:16 PM

Looks like this might be causing some msan bots to fail:

******************** TEST 'lld :: MachO/icf-safe.ll' FAILED ********************
Command Output (stderr):
ld64.lld: error: invalid FDE relocation in __eh_frame