This is an archive of the discontinued LLVM Phabricator instance.

[MC] Omit DWARF unwind info if compact unwind is present where eligible
ClosedPublic

Authored by int3 on Mar 22 2022, 1:45 PM.

Details

Summary

Previously, omitting unnecessary DWARF unwinds was only done in two
cases:

  • For Darwin + aarch64, if no DWARF unwind info is needed for all the functions in a TU, then the __eh_frame section would be omitted entirely. If any one function needed DWARF unwind, then MC would emit DWARF unwind entries for all the functions in the TU.
  • For watchOS, MC would omit DWARF unwind on a per-function basis, as long as compact unwind was available for that function.

This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
--emit-dwarf-unwind= which can toggle between always,
no-compact-unwind (only emit DWARF when CU cannot be emitted for a
given function), and the target platform default. no-compact-unwind
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.

Motivation: I'm working on adding support for __eh_frame to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.

Addendum: It turns out that there was a latent bug in the X86
backend when OmitDwarfIfHaveCompactUnwind is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.

For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
UNWIND_X86_MODE_DWARF -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the UNWIND_X86_MODE_DWARF CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize UNWIND_X86_MODE_DWARF if it is absent, so there
is no impact to the final binary size.

Diff Detail

Event Timeline

int3 created this revision.Mar 22 2022, 1:45 PM
int3 requested review of this revision.Mar 22 2022, 1:45 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 22 2022, 1:45 PM

Previously, omitting unnecessary DWARF unwinds was only done for watchOS,

I have looked at compact unwind descriptors in 2020-11 to write https://maskray.me/blog/2020-11-08-stack-unwinding#compact-unwind-descriptors
I vaguely remember there is a place where DWARF unwind info is enabled for all but few environments (like watchOS). Where is it?
My conjecture is that it may be related to the fact that the compact unwind descriptor does not support async unwind tables well and there may be cases needing it.

int3 added inline comments.Mar 22 2022, 1:58 PM
llvm/lib/MC/MCObjectFileInfo.cpp
67–68

@MaskRay

I vaguely remember there is a place where DWARF unwind info is enabled for all but few environments (like watchOS). Where is it?

here :)

int3 added a subscriber: Restricted Project.Mar 23 2022, 9:08 AM

I'm currently doing that via MCContext::getGenDwarfForAssembly...

This isn't my wheelhouse, but at first glance GenDwarfForAssembly looks like it might control all Dwarf sections (including debug ones). If that's the case you might need to introduce something eh-frame specific.

Previously, omitting unnecessary DWARF unwinds was only done for
watchOS, but I've heard that it might be possible to do this for all
archs.

Sounds like this was for bincompat with older unwinders when compact-unwind was first released. I think we can turn this off by default now.

I believe the root cause is that libunwind is unable to load compact
unwind dynamically. (I believe this limitation applies not just to MCJIT but to
ORC and the regular interpreter as well -- @lhames, would you be able to
confirm?)

That's right. Adding dynamic registration for compact unwind is pretty high on my todo list. I'm hoping to get to it within the next month or so. We should still have a way to force eh-frames though.

clang/unittests/Interpreter/ExceptionTests/InterpreterExceptionTest.cpp
48 ↗(On Diff #417390)

I guess this is just a placeholder? The final commit should include more detail.

llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
131–132

Is this needed? With your change, registerEHFramesInProcess should be a no-op if Size == 0.

llvm/tools/llc/llc.cpp
697–700 ↗(On Diff #417390)

This option hand-off feels very manual. I wonder if we should add a propagateTargetOptionsToMC function, but maybe the set of options that it would apply to is small enough not to bother for now?

I would add something like this to LLVMTargetMachine::addPassesToEmitMC too, for the the case where the caller passes in a null MCContext*.

int3 added a comment.Mar 25 2022, 1:03 PM

I'm currently working on the eh_frame support in LLD itself, but will circle back to this diff in a week or so. Thanks for the feedback!

clang/unittests/Interpreter/ExceptionTests/InterpreterExceptionTest.cpp
48 ↗(On Diff #417390)

yep, just a placeholder

llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
131–132

might not be yeah. I wasn't sure whether the extra entry in EHFrames would be an issue, but I will double check

llvm/tools/llc/llc.cpp
697–700 ↗(On Diff #417390)

thanks for the pointers! I'll look into that

int3 marked an inline comment as done.Apr 18 2022, 9:48 PM
int3 added inline comments.
llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
131–132

looks like it isn't necessary. removing

llvm/tools/llc/llc.cpp
697–700 ↗(On Diff #417390)

actually, I'm not sure I understood your suggestion... which class do you envision propagateTargetOptionsToMC being defined in? How would it make things "less manual"?

int3 updated this revision to Diff 423532.Apr 18 2022, 9:49 PM
  • Use a separate ForceDwarfUnwindInfo flag
  • Fix the remaining tests
int3 retitled this revision from [MC][RFC] Omit DWARF unwind info if compact unwind is present for all archs to [MC] Omit DWARF unwind info if compact unwind is present for all archs.Apr 19 2022, 12:55 PM
int3 edited the summary of this revision. (Show Details)

This is ready for review, please take a look 🙏

int3 updated this revision to Diff 424044.Apr 20 2022, 3:10 PM
int3 edited the summary of this revision. (Show Details)

fix x86 backend's omission of DWARF unwind

int3 added a comment.Apr 25 2022, 11:54 AM

bump @lhames @JDevlieghere would either of y'all be able to review this?

lhames added inline comments.Apr 26 2022, 2:26 PM
llvm/tools/llc/llc.cpp
697–700 ↗(On Diff #417390)

Oh! TargetOptions has an MCTargetOptions member already. If GenDwarfUnwindInfo === ForceDwarfUnwindInfo, could you just sink ForceDwarfUnwindInfo into MCTargetOptions?

int3 updated this revision to Diff 425408.Apr 26 2022, 8:35 PM
int3 marked an inline comment as done.

use MCTargetOptions

Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2022, 8:35 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
int3 added inline comments.Apr 26 2022, 8:38 PM
llvm/lib/CodeGen/MachineModuleInfo.cpp
69

MCTargetOptions isn't passed directly to MCObjectFileInfo; it only gets an MCContext. MCContext already has a member pointer to MCTargetOptions, but it seemed not to be initialized in a number of cases, hence the need for this change.

llvm/tools/llc/llc.cpp
697–700 ↗(On Diff #417390)

alright, I think I got it to work :)

davide requested changes to this revision.EditedApr 27 2022, 5:16 PM

Compact unwind was developed as a goal to replace the DWARF unwind. Some apps did not work if the OS was missing DWARF unwind, so we kept both for Intel. The binary compatibility issue is non-existent on arm64, but it could be still a thing on x86_64.
If a function cannot be encoded in CU, we need the compiler to emit both (where the CU is a magic value that points to the dwarf unwind info for that function).

tl;dr: this is "a nice cleanup", and potentially the bincompat issue might be gone, but the cost of breaking programs in the wild is IMHO too high.

This revision now requires changes to proceed.Apr 27 2022, 5:16 PM
int3 added a comment.Apr 27 2022, 7:09 PM

Thanks for the feedback!

The binary compatibility issue is non-existent on arm64, but it could be still a thing on x86_64.

Can we enable OmitDwarfIfHaveCompactUnwind by default on arm64 then? And have it be opt-in just for x86_64?

Thanks for the feedback!

The binary compatibility issue is non-existent on arm64, but it could be still a thing on x86_64.

Can we enable OmitDwarfIfHaveCompactUnwind by default on arm64 then? And have it be opt-in just for x86_64?

I think it's fine.

davide accepted this revision.Apr 27 2022, 7:10 PM

LGTM

This revision is now accepted and ready to land.Apr 27 2022, 7:10 PM
int3 updated this revision to Diff 425944.Apr 28 2022, 7:06 PM

add --no-force-dwarf-unwind-info; only omit by default on aarch64

int3 updated this revision to Diff 427396.May 5 2022, 11:08 AM
int3 retitled this revision from [MC] Omit DWARF unwind info if compact unwind is present for all archs to [MC] Omit DWARF unwind info if compact unwind is present where eligible.
int3 edited the summary of this revision. (Show Details)

update

int3 updated this revision to Diff 427398.May 5 2022, 11:11 AM
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

this code doesn't execute if clang is passed an assembly file instead of a .c file, so this option doesn't have the desired effect on assembly inputs. I'm not sure what's the right way to tackle this, or if this behavior inconsistency is acceptable

smeenai added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

It seems unfortunate to have that inconsistency. From what I can tell, clang/tools/driver/cc1as_main.cpp looks like it might be the rough equivalent of this for the integrated assembler?

int3 added inline comments.May 5 2022, 12:24 PM
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

that's what I'd thought too, but I set a breakpoint on cc1as_main & ExecuteAssemblerImpl and then ran clang -c foo.s; neither breakpoint triggered

smeenai added inline comments.May 5 2022, 12:50 PM
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

Hmm, interesting. If you run with -###, is -cc1as being invoked in-process or out of process? Idk if the in-process cc1as has a different entry point.

smeenai added inline comments.May 5 2022, 12:53 PM
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

Nah, looks like in-process cc1as should be calling cc1as_main as well: https://github.com/llvm/llvm-project/blob/98616cfc02613d98964588fac6494ec7583c495f/clang/tools/driver/driver.cpp#L319

(If it's out of process you'd need to debug the cc1as process and not the driver process, of course, but I'd be surprised if you end up with an out of process cc1as)

int3 added inline comments.May 5 2022, 12:57 PM
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

ah yeah I guess that's it. I'd enabled settings set target.process.follow-fork-mode child but that doesn't seem to have the desired effect; but modifying cc1as_main itself shows that it is indeed being called.

int3 updated this revision to Diff 427509.May 5 2022, 7:11 PM

make things work under cc1as too

int3 marked 3 inline comments as done.May 5 2022, 7:14 PM
int3 added inline comments.
clang/lib/CodeGen/BackendUtil.cpp
456 ↗(On Diff #427398)

alright this works now, thanks for the pointers!

clang/tools/driver/cc1as_main.cpp
323 ↗(On Diff #427509)

this is a bit of code duplication, but the same approach is used to have OPT_fembed_bitcode_EQ accepted by both the integrated assembler as will as the CompilerInvocation.

lhames accepted this revision.Jun 10 2022, 11:17 AM
This revision was landed with ongoing or failed builds.Jun 12 2022, 7:04 AM
This revision was automatically updated to reflect the committed changes.
int3 added a comment.Jun 12 2022, 7:49 AM

Buildbots gave me a bunch of errors of the form

/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/llvm-mc: error: unable to get target for 'arm64-apple-macos11.0', see --version and --triple.

I'm wondering if I'm just missing a REQUIRES line, but then again I don't see any other tests under MC/MachO that use REQUIRES...

Or should I be using darwin instead of macos? Does it matter?

thakis added a subscriber: thakis.Jun 12 2022, 12:33 PM

I think it should work if you put the test in llvm/test/MC/MachO/AArch64 instead of in llvm/test/MC/MachO, because of:

% cat llvm/test/MC/MachO/AArch64/lit.local.cfg 
if not 'AArch64' in config.root.targets:
    config.unsupported = True
int3 added a comment.Jun 12 2022, 2:14 PM

d'oh, I see it now. And of course the parent llvm/test/MC/MachO directory is gated to x86-only. Thanks!

uabelho added a subscriber: uabelho.EditedJun 13 2022, 12:46 AM

Hi,

I noticed that on one of our downstream (not public) buildbots the

clang/test/Driver/femit-dwarf-unwind.s

test seems to fail rather randomly.
So if I run

clang -target x86_64-apple-macos11.0 -c ../clang/test/Driver/femit-dwarf-unwind.s -o foo.o
llvm-objdump --macho --dwarf=frames foo.o

a couple of times I sometimes get

.debug_frame contents:


.eh_frame contents:

and sometimes I get

.debug_frame contents:


.eh_frame contents:

00000000 00000014 00000000 CIE
  Format:                DWARF32
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     10

  DW_CFA_def_cfa: reg7 +8
  DW_CFA_offset: reg16 -8
  DW_CFA_nop:
  DW_CFA_nop:

  CFA=reg7+8: reg16=[CFA-8]

00000018 0000001c 0000001c FDE cie=00000000 pc=00000000...00000001
  Format:       DWARF32
  DW_CFA_def_cfa_offset: +8
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

  0x0: CFA=reg7+8: reg16=[CFA-8]

On another machine I don't seem to get that problem though, even if I compile in the same way so I really don't know what the difference is or why it is happening at all.
Any ideas? Anything anyone else is seeing?

EDIT:
If I add

-femit-dwarf-unwind=default

it seems to pass all the time, so I don't know if the problem is that the -femit-dwarf-unwind flag doesn't get a proper default value if not set explicitly?

int3 added a comment.Jun 13 2022, 3:42 AM

Oh dear. Hmm does running the test with ASAN enabled hit any memory issues?

int3 added a comment.Jun 13 2022, 3:42 AM

Actually let me try it now

int3 added inline comments.Jun 13 2022, 3:50 AM
clang/tools/driver/cc1as_main.cpp
323–329 ↗(On Diff #436220)

still building ASAN but this looks sus -- if no flag gets passed, we never initialize EmitDwarfUnwind before using it on line 381

also it occurred to me that if the buildbots haven't flagged anything, ASAN probably can't detect this. Let me just put up a patch

uabelho added inline comments.Jun 13 2022, 3:52 AM
clang/tools/driver/cc1as_main.cpp
323–329 ↗(On Diff #436220)

We have bots running ASAN on this without complaining.

A patch sounds good! Thanks!

Compact unwind was developed as a goal to replace the DWARF unwind. Some apps did not work if the OS was missing DWARF unwind, so we kept both for Intel. The binary compatibility issue is non-existent on arm64, but it could be still a thing on x86_64.
If a function cannot be encoded in CU, we need the compiler to emit both (where the CU is a magic value that points to the dwarf unwind info for that function).

tl;dr: this is "a nice cleanup", and potentially the bincompat issue might be gone, but the cost of breaking programs in the wild is IMHO too high.

Do you think clang could enable it by default even on intel if the deployment target is "new enough"?

no-compact-unwind is particularly useful for newer x86_64 platforms: we don't want to omit DWARF unwind for x86_64 in general due to possible backwards compat issues, but we should make it possible for people to opt into this behavior if they are only targeting newer platforms.

Shouldn't we make default imply no-compact-unwind when the target triple OS version is high enough? (Compact-unwind was added in macOS 10.6, correct? That's lower than anyone realistically targets these days...)