This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Refactor how CFI section types are represented in AsmPrinter
ClosedPublic

Authored by RamNalamothu on Mar 20 2020, 12:10 PM.

Details

Summary

In terms of readability, the enum CFIMoveType didn't better document what it
intends to convey i.e. the type of CFI section that gets emitted.

Diff Detail

Event Timeline

scott.linder created this revision.Mar 20 2020, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2020, 12:10 PM

Add a missing prototype

Rebase and fix a callsite

Rebase and ping

MaskRay added a subscriber: asl.EditedDec 11 2020, 1:40 PM

Thanks for refactoring it:) I dug up a bit of history. @asl introduced the variable shouldEmitMoves in rL123474 and espindola introduced the member function needsCFIMoves.
I cannot find "CFI move section" in standards so it is probably not a standard term. Should we use a more appropriate term to best capture the intention?

Thanks for refactoring it:) I dug up a bit of history. @asl introduced the variable shouldEmitMoves in rL123474 and espindola introduced the member function needsCFIMoves.
I cannot find "CFI move section" in standards so it is probably not a standard term. Should we use a more appropriate term to best capture the intention?

I don't know if there is a generally accepted term to encompass .debug_frame+.eh_frame? Maybe just "CFI section"? Or "section containing CFI"?

I can change the shortlog message to:

[NFC] Refactor AsmPrinter tracking of CFI sections

And the comment to:

// Flags representing which sections to emit CFI to.
dblaikie added inline comments.Dec 11 2020, 8:48 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1037–1042

Naively, it looks to me like this API change would allow for emission of debug_frame and eh_frame simultaneously. Is that understanding correct? If it is, is that outcome useful/intended? I'd have thought one or the other would suffice? (if you have eh_frame and its drawback of being ALLOC - you don't gain anything by also emmitting debug_frame, do you?)

scott.linder added inline comments.Dec 14 2020, 6:57 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1037–1042

Yes, that's correct. It is still NFC, as that was the behavior before, it just wasn't represented here. See below:

else if (Asm->TM.Options.ForceDwarfFrameSection)
      Asm->OutStreamer->emitCFISections(true, true);

IIRC this was the confusion that led me to the refactoring to begin with. It also generalizes isCFIMoveForDebugging and documents what it actually is.

As to whether it makes sense to emit both, I don't know, but it is intended (emphasis mine):

I've modified the patch so that the new flag will ensure the cfi instructions are actually present to be emitted as well. I went ahead and renamed the flag -gdwarf-frame too, to better reflect that it's dealing with the debug information you'd otherwise get with -g, and is meant to specifically put the information in a .debug_frame section and not .eh_frame.

Currently, two things signal for need for cfi: exceptions (via the function's needsUnwindTableEntry()), and debug (via the machine module information's hasDebugInfo()). At frame lowering, both trigger the same thing. But when the assembly printer decides on which section to use, needsUnwindTableEntry() is checked first and triggers the need for .eh_frame, while hasDebugInfo() is checked afterwards for whether .debug_frame is needed. So .debug_frame is only present when any level of debug is requested, and no functions need unwinding for exceptions.

It wouldn't be appropriate to change either needsUnwindTableEntry() or hasDebugInfo(), so I've added a check for my flag alongside them. Because the same logic is used in multiple places, I've wrapped all three checks into one function to try and clean things up slightly. When deciding on which section to emit, the new flag means .debug_frame is produced instead of nothing. If .eh_frame would have been needed, rather than replace it, the new flag simply emits both .debug_frame and .eh_frame.

The end result is that -gdwarf-frame should only provide a .debug_frame section as additional information, without otherwise modifying anything. The existing -funwind-tables (and -fasynchronous-unwind-tables) flag can be used to provide similar information, but because it takes the exception angle, it alters function attributes and ultimately produces .eh_frame instead.

MaskRay accepted this revision.Dec 14 2020, 10:09 AM

The refactoring makes sense (did not notice that it was posted a couple of months ago!) . This dropped a confusing isCFIMoveForDebugging and removed the following block which confused me a bit.

for (auto &F: M.getFunctionList()) {
      // If the module contains any function with unwind data,
      // .eh_frame has to be emitted.
      // Ignore functions that won't get emitted.
      if (!F.isDeclarationForLinker() && F.needsUnwindTableEntry()) {
        isCFIMoveForDebugging = false;
        break;
      }
    }

As a potential follow-up, I do want the CFIMoveTypes concept to be better documented and probably rename it if it does not convey the intended meaning to readers.

The code does allow both sections due to ForceDwarfFrameSection. Please give some time to @dblaikie who has commented.

This revision is now accepted and ready to land.Dec 14 2020, 10:09 AM
dblaikie accepted this revision.Dec 17 2020, 10:55 AM
dblaikie added a subscriber: ostannard.
dblaikie added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1037–1042

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file? (is that a useful thing? why is that supported?)

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

scott.linder added inline comments.Dec 17 2020, 11:40 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1037–1042

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file?

Correct, for example:

~/code/llvm-project:main$ git show --no-patch --oneline
d5700fdf1045 (HEAD -> main, origin/master, origin/main, origin/HEAD) [gn build] Port 6eff12788ee
~/code/llvm-project:main$ echo 'void f() {}' | release/bin/clang -x c - -fforce-dwarf-frame -S -o - | grep cfi_sections
        .cfi_sections .eh_frame, .debug_frame
~/code/llvm-project:main$ echo 'void f() {}' | release/bin/clang -x c - -fforce-dwarf-frame -c -o - | release/bin/llvm-dwarfdump -debug-frame - | grep -A5 contents
.debug_frame contents:

00000000 00000014 ffffffff CIE
  Format:                DWARF32
  Version:               4
  Augmentation:          ""
--
.eh_frame contents:

00000000 00000014 00000000 CIE
  Format:                DWARF32
  Version:               1
  Augmentation:          "zR"

(is that a useful thing? why is that supported?)

The first thing that comes to my mind is if you have a language runtime which requires .eh_frame (i.e. for EH) and some other tool which only understands .debug_frame. The runtime cannot use .debug_frame (or I suppose it could if it were marked ALLOC and the runtime were willing to do some more relocation), and the tool would have to be taught how to use .eh_frame (this seems reasonable, but so does just putting in both tables). It would be convenient in that case to just emit both.

To answer the converse question of "is it useful to forbid this? why would we not support this?" I would say that it might cause confusion (both for users and for tools), the tables may somehow disagree (although that seems like it would constitute a bug in LLVM), and it adds some amount of complexity in that we are changing an enum of mutually exclusive things to orthogonal flags.

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

1037–1042

Sorry, I don't have enough context here, so it's still a bit confusing:

"It is still NFC, as that was the behavior before, it just wasn't represented here. " - you're saying that prior to this patch/currently in LLVM, it was possible that both eh_frame and debug_frame were emitted into the same output file? (is that a useful thing? why is that supported?)

@dcandler @ostannard - do you folks have some context you could share here?

But, at least as far as the immediate/blocking issues for this current proposed patch are concerned - yeah, I guess this API change clarifies the existing behavior/intent. Thanks!

1037–1042

Whoops, I must have hit reply twice; disregard this response.

As a potential follow-up, I do want the CFIMoveTypes concept to be better documented and probably rename it if it does not convey the intended meaning to readers.

Since the AsmPrinter essentially consumes the CFI instructions and emits the parsed information, in DWARF CFI format, into either .debug_frame or .eh_frame or both, I think may be it makes sense to rename CFIMoveTypes on the following lines?

// Flags representing which section, containing DWARF CFI unwind information, to be generated i.e. .eh_frame or .debug_frame.
  enum CFIUnwindSection : unsigned { CFI_None = 0, CFI_EH_Frame = 1, CFI_Debug_Frame = 2 };
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1037–1042

Naively, it looks to me like this API change would allow for emission of debug_frame and eh_frame simultaneously. Is that understanding correct? If it is, is that outcome useful/intended? I'd have thought one or the other would suffice? (if you have eh_frame and its drawback of being ALLOC - you don't gain anything by also emmitting debug_frame, do you?)

Yes, both .eh_frame and .debug_frame can be emitted into the same output file. AFAIU this ability is primarily needed for the following reasons.

  1. PowerPC uses different register numbering in .debug_frame and .eh_frame. I think, this is (one of) the reason why GCC also has the capability to generate both .eh_frame and .debug_frame into the same output file.
  1. Erstwhile LLVM toolchain used the system assembler/linker, typically the GCC's assembler/linker, and also I think the LLVM's assembler/linker strives to be a drop in replacement for GCC ones. And this naturally forces LLVM to support GAS assembler directives and their syntax.
  1. A strict DWARF compliant debugger needs .debug_frame which doesn't have exception handling features and this forces generating .eh_frame as well if we also need exception handling support in those cases.
RamNalamothu commandeered this revision.Apr 23 2021, 9:28 PM
RamNalamothu added a reviewer: scott.linder.

Scott, I will take this forward. Hope that is fine.

Renaming the state variables for better representation.

Does it look better now, after the renaming?

Refactor how CFI move sections

Since 'move' is coined term which is not accurate, just avoid using it.

Can you rephrase why we need both .eh_frame & .debug_frame?

See also D100251. The logic here is subtle. I enumerate {-g,-g0} x {-fno-exceptions,-fexceptions} x {-fno-asynchronous-unwind-tables,-fasynchronous-unwind-tables} and compare the result with GCC to ensure the behaviors are reasonable.

llvm/include/llvm/CodeGen/AsmPrinter.h
210

Just use CFISection. You can consider enum class.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1044–1045

Drop braces around one-line simple statements.

Address review comments.

RamNalamothu marked 2 inline comments as done.EditedApr 24 2021, 8:22 AM
RamNalamothu added a subscriber: t-tye.

Refactor how CFI move sections

Since 'move' is coined term which is not accurate, just avoid using it.

Ah, I had fixed it in the commit message but not on Phabricator.

Can you rephrase why we need both .eh_frame & .debug_frame?

Capability of emitting both sections was already there and this patch just maintained the status quo.

Probably the following was adding the confusion. I have changed it little bit in the latest diff update.

/// Cached bitwise OR of CFISection flags for all functions in the module.
unsigned ModuleCFISections = CFISection::None;

And on why emit both sections, as it was mentioned earlier also, this is all that comes to my mind.

In D67216#1705839, @dcandler wrote:
I've modified the patch so that the new flag will ensure the cfi instructions are actually present to be emitted as well. I went ahead and renamed the flag -gdwarf-frame too, to better reflect that it's dealing with the debug information you'd otherwise get with -g, and is meant to specifically put the information in a .debug_frame section and not .eh_frame.

Currently, two things signal for need for cfi: exceptions (via the function's needsUnwindTableEntry()), and debug (via the machine module information's hasDebugInfo()). At frame lowering, both trigger the same thing. But when the assembly printer decides on which section to use, needsUnwindTableEntry() is checked first and triggers the need for .eh_frame, while hasDebugInfo() is checked afterwards for whether .debug_frame is needed. So .debug_frame is only present when any level of debug is requested, and no functions need unwinding for exceptions.

It wouldn't be appropriate to change either needsUnwindTableEntry() or hasDebugInfo(), so I've added a check for my flag alongside them. Because the same logic is used in multiple places, I've wrapped all three checks into one function to try and clean things up slightly. When deciding on which section to emit, the new flag means .debug_frame is produced instead of nothing. If .eh_frame would have been needed, rather than replace it, the new flag simply emits both .debug_frame and .eh_frame.

The end result is that -gdwarf-frame should only provide a .debug_frame section as additional information, without otherwise modifying anything. The existing -funwind-tables (and -fasynchronous-unwind-tables) flag can be used to provide similar information, but because it takes the exception angle, it alters function attributes and ultimately produces .eh_frame instead.

@scott.linder wrote:
The first thing that comes to my mind is if you have a language runtime which requires .eh_frame (i.e. for EH) and some other tool which only understands .debug_frame. The runtime cannot use .debug_frame (or I suppose it could if it were marked ALLOC and the runtime were willing to do some more relocation), and the tool would have to be taught how to use .eh_frame (this seems reasonable, but so does just putting in both tables). It would be convenient in that case to just emit both.

@RamNalamothu wrote:
Yes, both .eh_frame and .debug_frame can be emitted into the same output file. AFAIU this ability is primarily needed for the following reasons.

  1. PowerPC uses different register numbering in .debug_frame and .eh_frame. I think, this is (one of) the reason why GCC also has the capability to generate both .eh_frame and .debug_frame into the same output file.
  1. Erstwhile LLVM toolchain used the system assembler/linker, typically the GCC's assembler/linker, and also I think the LLVM's assembler/linker strives to be a drop in replacement for GCC ones. And this naturally forces LLVM to support GAS assembler directives and their syntax.
  1. A strict DWARF compliant debugger needs .debug_frame which doesn't have exception handling features and this forces generating .eh_frame as well if we also need exception handling support in those cases.
RamNalamothu retitled this revision from [NFC] Refactor how CFI move sections are represented in AsmPrinter to [NFC] Refactor how CFI section types are represented in AsmPrinter.Apr 24 2021, 8:39 AM
RamNalamothu edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Apr 24 2021, 11:00 AM
This revision was automatically updated to reflect the committed changes.

I noticed that this is not quite NFC. In ThinLTO this may change .eh_frame output to .debug_frame output. I'll need to look more closely.

MaskRay reopened this revision.Apr 26 2021, 3:18 PM

I am sorry but I reverted this in e01c666b136e3f97587b22a2029f31e5c36a0e71.

D76519 was not quite NFC. If we see a CFISection::Debug function before a
CFISection::EH one (-fexceptions -fno-asynchronous-unwind-tables), we may
incorrectly pick CFISection::Debug and emit a `.cfi_sections .debug_frame`.
We should use .eh_frame instead.
This revision is now accepted and ready to land.Apr 26 2021, 3:18 PM
MaskRay added inline comments.Apr 26 2021, 3:19 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
359–360
ModuleCFISection = getFunctionCFISectionType(F);
if (ModuleCFISection == AsmPrinter::CFISection::EH)
  break;
RamNalamothu added inline comments.Apr 26 2021, 8:55 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
359–360

Ah!

ModuleCFISection = getFunctionCFISectionType(F);
if (ModuleCFISection == AsmPrinter::CFISection::EH)
  break;

But wouldn't this give incorrect results if we see CFISection::Debug first and then a CFISection::None function as the last one i.e. without a CFISection::EH in between?
That is what I was trying to avoid in my original patch.

May be the following is a better one?

if (ModuleCFISection == AsmPrinter::CFISection::EH)
  break;

if (getFunctionCFISectionType(F) != AsmPrinter::CFISection::None)
  ModuleCFISection = getFunctionCFISectionType(F);

Address review comments.

RamNalamothu marked an inline comment as done.Apr 27 2021, 4:10 AM
MaskRay accepted this revision.Apr 27 2021, 8:29 AM

Thanks!

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
359–360

If you reorder if (ModuleCFISection == CFISection::EH) after if (getFunctionCFISectionType(F) != CFISection::None), the loop will exit in the same iteration, not the next iteration.

Implemented the suggested improvement.

RamNalamothu marked an inline comment as done.Apr 27 2021, 1:08 PM
This revision was landed with ongoing or failed builds.Apr 27 2021, 8:34 PM
This revision was automatically updated to reflect the committed changes.