Page MenuHomePhabricator

[NFC] Refactor how CFI move sections are represented in AsmPrinter
AcceptedPublic

Authored by scott.linder on Mar 20 2020, 12:10 PM.

Details

Summary

AsmPrinter didn't directly represent the possibility of needing both
.eh_frame and .debug_frame, even though it is possible with
-force-dwarf-frame-section. Switch to representing it as bitflags, and
just cache the set of sections needed for the module rather than
handling the no-EH case specially.

It seems like this could all get moved to MachineFunction, especially
considering MachineFunction::needsFrameMoves should really be
referencing it, but there is some dependency on Function which prevents
that so I stopped here.

Diff Detail

Unit TestsFailed

TimeTest
320 mslinux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp

Event Timeline

scott.linder created this revision.Mar 20 2020, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2020, 12:10 PM

Add a missing prototype

Rebase and fix a callsite

Rebase and ping

MaskRay added a subscriber: asl.EditedDec 11 2020, 1:40 PM

Thanks for refactoring it:) I dug up a bit of history. @asl introduced the variable shouldEmitMoves in rL123474 and espindola introduced the member function needsCFIMoves.
I cannot find "CFI move section" in standards so it is probably not a standard term. Should we use a more appropriate term to best capture the intention?

Thanks for refactoring it:) I dug up a bit of history. @asl introduced the variable shouldEmitMoves in rL123474 and espindola introduced the member function needsCFIMoves.
I cannot find "CFI move section" in standards so it is probably not a standard term. Should we use a more appropriate term to best capture the intention?

I don't know if there is a generally accepted term to encompass .debug_frame+.eh_frame? Maybe just "CFI section"? Or "section containing CFI"?

I can change the shortlog message to:

[NFC] Refactor AsmPrinter tracking of CFI sections

And the comment to:

// Flags representing which sections to emit CFI to.
dblaikie added inline comments.Dec 11 2020, 8:48 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
974

Naively, it looks to me like this API change would allow for emission of debug_frame and eh_frame simultaneously. Is that understanding correct? If it is, is that outcome useful/intended? I'd have thought one or the other would suffice? (if you have eh_frame and its drawback of being ALLOC - you don't gain anything by also emmitting debug_frame, do you?)

scott.linder added inline comments.Dec 14 2020, 6:57 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
974

Yes, that's correct. It is still NFC, as that was the behavior before, it just wasn't represented here. See below:

else if (Asm->TM.Options.ForceDwarfFrameSection)
      Asm->OutStreamer->emitCFISections(true, true);

IIRC this was the confusion that led me to the refactoring to begin with. It also generalizes isCFIMoveForDebugging and documents what it actually is.

As to whether it makes sense to emit both, I don't know, but it is intended (emphasis mine):

I've modified the patch so that the new flag will ensure the cfi instructions are actually present to be emitted as well. I went ahead and renamed the flag -gdwarf-frame too, to better reflect that it's dealing with the debug information you'd otherwise get with -g, and is meant to specifically put the information in a .debug_frame section and not .eh_frame.

Currently, two things signal for need for cfi: exceptions (via the function's needsUnwindTableEntry()), and debug (via the machine module information's hasDebugInfo()). At frame lowering, both trigger the same thing. But when the assembly printer decides on which section to use, needsUnwindTableEntry() is checked first and triggers the need for .eh_frame, while hasDebugInfo() is checked afterwards for whether .debug_frame is needed. So .debug_frame is only present when any level of debug is requested, and no functions need unwinding for exceptions.

It wouldn't be appropriate to change either needsUnwindTableEntry() or hasDebugInfo(), so I've added a check for my flag alongside them. Because the same logic is used in multiple places, I've wrapped all three checks into one function to try and clean things up slightly. When deciding on which section to emit, the new flag means .debug_frame is produced instead of nothing. If .eh_frame would have been needed, rather than replace it, the new flag simply emits both .debug_frame and .eh_frame.

The end result is that -gdwarf-frame should only provide a .debug_frame section as additional information, without otherwise modifying anything. The existing -funwind-tables (and -fasynchronous-unwind-tables) flag can be used to provide similar information, but because it takes the exception angle, it alters function attributes and ultimately produces .eh_frame instead.

MaskRay accepted this revision.Dec 14 2020, 10:09 AM

The refactoring makes sense (did not notice that it was posted a couple of months ago!) . This dropped a confusing isCFIMoveForDebugging and removed the following block which confused me a bit.

for (auto &F: M.getFunctionList()) {
      // If the module contains any function with unwind data,
      // .eh_frame has to be emitted.
      // Ignore functions that won't get emitted.
      if (!F.isDeclarationForLinker() && F.needsUnwindTableEntry()) {
        isCFIMoveForDebugging = false;
        break;
      }
    }

As a potential follow-up, I do want the CFIMoveTypes concept to be better documented and probably rename it if it does not convey the intended meaning to readers.

The code does allow both sections due to ForceDwarfFrameSection. Please give some time to @dblaikie who has commented.

This revision is now accepted and ready to land.Dec 14 2020, 10:09 AM
dblaikie accepted this revision.Dec 17 2020, 10:55 AM
dblaikie added a subscriber: ostannard.
dblaikie added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
974

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file? (is that a useful thing? why is that supported?)

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

scott.linder added inline comments.Dec 17 2020, 11:40 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
974

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file?

Correct, for example:

~/code/llvm-project:main$ git show --no-patch --oneline
d5700fdf1045 (HEAD -> main, origin/master, origin/main, origin/HEAD) [gn build] Port 6eff12788ee
~/code/llvm-project:main$ echo 'void f() {}' | release/bin/clang -x c - -fforce-dwarf-frame -S -o - | grep cfi_sections
        .cfi_sections .eh_frame, .debug_frame
~/code/llvm-project:main$ echo 'void f() {}' | release/bin/clang -x c - -fforce-dwarf-frame -c -o - | release/bin/llvm-dwarfdump -debug-frame - | grep -A5 contents
.debug_frame contents:

00000000 00000014 ffffffff CIE
  Format:                DWARF32
  Version:               4
  Augmentation:          ""
--
.eh_frame contents:

00000000 00000014 00000000 CIE
  Format:                DWARF32
  Version:               1
  Augmentation:          "zR"

(is that a useful thing? why is that supported?)

The first thing that comes to my mind is if you have a language runtime which requires .eh_frame (i.e. for EH) and some other tool which only understands .debug_frame. The runtime cannot use .debug_frame (or I suppose it could if it were marked ALLOC and the runtime were willing to do some more relocation), and the tool would have to be taught how to use .eh_frame (this seems reasonable, but so does just putting in both tables). It would be convenient in that case to just emit both.

To answer the converse question of "is it useful to forbid this? why would we not support this?" I would say that it might cause confusion (both for users and for tools), the tables may somehow disagree (although that seems like it would constitute a bug in LLVM), and it adds some amount of complexity in that we are changing an enum of mutually exclusive things to orthogonal flags.

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

974

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file? (is that a useful thing? why is that supported?)

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

974

Whoops, I must have hit reply twice; disregard this response.

As a potential follow-up, I do want the CFIMoveTypes concept to be better documented and probably rename it if it does not convey the intended meaning to readers.

Since the AsmPrinter essentially consumes the CFI instructions and emits the parsed information, in DWARF CFI format, into either .debug_frame or .eh_frame or both, I think may be it makes sense to rename CFIMoveTypes on the following lines?

// Flags representing which section, containing DWARF CFI unwind information, to be generated i.e. .eh_frame or .debug_frame.
  enum CFIUnwindSection : unsigned { CFI_None = 0, CFI_EH_Frame = 1, CFI_Debug_Frame = 2 };
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
974

Naively, it looks to me like this API change would allow for emission of debug_frame and eh_frame simultaneously. Is that understanding correct? If it is, is that outcome useful/intended? I'd have thought one or the other would suffice? (if you have eh_frame and its drawback of being ALLOC - you don't gain anything by also emmitting debug_frame, do you?)

Yes, both .eh_frame and .debug_frame can be emitted into the same output file. AFAIU this ability is primarily needed for the following reasons.

  1. PowerPC uses different register numbering in .debug_frame and .eh_frame. I think, this is (one of) the reason why GCC also has the capability to generate both .eh_frame and .debug_frame into the same output file.
  1. Erstwhile LLVM toolchain used the system assembler/linker, typically the GCC's assembler/linker, and also I think the LLVM's assembler/linker strives to be a drop in replacement for GCC ones. And this naturally forces LLVM to support GAS assembler directives and their syntax.
  1. A strict DWARF compliant debugger needs .debug_frame which doesn't have exception handling features and this forces generating .eh_frame as well if we also need exception handling support in those cases.