Page MenuHomePhabricator

Call Frame Information (CFI) Handling for Basic Block Sections
Needs ReviewPublic

Authored by tmsriram on May 14 2020, 6:26 PM.

Details

Reviewers
dblaikie
wmi
Summary

Presenting this for review on behalf of @amharc who is the original author of this patch. This patch handles CFI with basic block sections, which unlike DebugInfo does not support ranges. The DWARF standard explicitly requires emitting separate CFI Frame Descriptor Entries for each contiguous fragment of a function. Thus, the CFI information for all callee-saved registers (possibly including the frame pointer, if necessary) have to be emitted along with redefining the Call Frame Address (CFA), viz. where the current frame starts.

Diff Detail

Event Timeline

tmsriram created this revision.May 14 2020, 6:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 14 2020, 6:26 PM
davidxl edited reviewers, added: wmi; removed: davidxl.May 19 2020, 9:54 AM
davidxl added a subscriber: davidxl.
wmi added inline comments.May 20 2020, 5:51 PM
llvm/lib/CodeGen/AsmPrinter/DwarfException.h
70–71

They are new functions marked as override, but there are no virtual functions added in the base class.

llvm/lib/CodeGen/CFIInstrInserter.cpp
310–316

Can we refactor it a little so the case to generate createDefCfa can be separated out like following? It is easier to understand when ForceFullCFA is true, only createDefCfa will be used to set up CFA info.

if (PrevMBBInfo->OutgoingCFAOffset != MBBInfo.IncomingCFAOffset &&
    PrevMBBInfo->OutgoingCFARegister != MBBInfo.IncomingCFARegister || ForceFullCFA) {
  MF.addFrameInst(MCCFIInstruction::createDefCfa...
  ...
} else if (...) {
...
377–379

Better move the snippet upwards and early return for ForceFullCFA in case later people add continue before the snippet since it is expected to call emitCalleeSavedFrameMoves for every BB except the first BB.

tmsriram updated this revision to Diff 265654.May 21 2020, 6:01 PM
tmsriram marked 4 inline comments as done.

Code refactoring in CFIInstrInserter.cpp

wmi added inline comments.May 28 2020, 10:23 AM
llvm/lib/MC/MCDwarf.cpp
1501 ↗(On Diff #265654)

Could you explain a little about how you decide which instructions can be deduped? I see the code is comparing the label of each CFI instruction with the label of the first CFI instruction in the frame. What is the meaning of the label? When those CFI intructions are created in CFIInstrInserter, the labels are initialized to nullptr.

tmsriram updated this revision to Diff 273599.Thu, Jun 25, 11:38 PM

Split deduplication out of base CFI patch.

This patch now only contains the basic support needed to make CFI correct with basic block sections. CFI FDE deduplication will be added as a separate child patch to make it easy to review.

tmsriram edited the summary of this revision. (Show Details)Thu, Jun 25, 11:42 PM
tmsriram added subscribers: sanjoy, vstefanovic.
tmsriram added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfException.h
70–71

It is in the parent patch, D78851 in AsmPrinterHandler.h just like beginFragment.

(don't have any particular feedback on most of the patch - CFI stuff isn't something I'm especially familiar with - just a couple of optional thoughts on simplifying/clarifying the tests)

llvm/test/DebugInfo/X86/basicblock-sections-cfi.ll
4–10 ↗(On Diff #273599)

I /might/ find this simpler:

void f1();
void f2();
void f3(bool b) {
  if (b)
    f1();
  else
    f2();
}

But no big deal either way - maybe some more exposition on why 1 CIE and 4FDEs are expected? (I guess 1 CIE shares some common data across the 4 FDEs which are for the 4 basic blocks? Could it be fewer, what about "void f1(); void f2(bool b) { if (b) f1(); }" is that still adequate to test this functionality?)

llvm/test/DebugInfo/X86/basicblock-sections-cfiinstr.ll
23–30 ↗(On Diff #273599)

Seems like a surprisingly large amount of computation - is it there for a reason? needed to push some optimization or layout decisions? Could it all use the same operation (just all multiplication, for instance) or is the different operations significant? (Well, I guess they have to differ between the two branches - but could all be the same within each one?) does it need 12 parameters? Could it be fewer & use a function call?

(etc, etc - simple test case, maybe some comments describing what's significant about the features of it that are needed to demonstrate the desired behavior, etc)

tmsriram marked an inline comment as done.Fri, Jun 26, 1:21 PM
tmsriram added inline comments.
llvm/test/DebugInfo/X86/basicblock-sections-cfi.ll
4–10 ↗(On Diff #273599)

@amharc

Thanks for taking a look, I will change the code.

The FDEs cannot be fewer, one is needed per section. This is because FDEs do not support ranges like debug info and it is only low-high PC. In theory, each section could be in arbitrary non-contiguous locations and hence conservatively we need four.

We have a CFI dedup that will reduce the size of each FDE and move the common parts out to the CIE. I split that out of this patch to keep this simple.

Could it be fewer, what about "void f1(); void f2(bool b) { if (b) f1(); }" is that still adequate to test this functionality?)

This will not capture the deduping well because we need atleast 4 basic blocks to have duplicates. The entry and the exit block have other CFI instructions.

dblaikie added inline comments.Fri, Jun 26, 2:02 PM
llvm/test/DebugInfo/X86/basicblock-sections-cfi.ll
4–10 ↗(On Diff #273599)

We have a CFI dedup that will reduce the size of each FDE and move the common parts out to the CIE. I split that out of this patch to keep this simple.

Then I'd generally suggest keeping this test simple too - though I can understand the benefit of having the diff on the test just demonstrate how things got better in the later change (rather than introducing test changes or new tests that demonstrate the new behavior). Your call.

tmsriram marked 2 inline comments as done.Fri, Jun 26, 3:37 PM
tmsriram added inline comments.
llvm/test/DebugInfo/X86/basicblock-sections-cfi.ll
4–10 ↗(On Diff #273599)

Sure, Iwill make it simple and evolve it when submitting the dedup patch.

llvm/test/DebugInfo/X86/basicblock-sections-cfiinstr.ll
23–30 ↗(On Diff #273599)

It was done so that more callee-saved registers are used and when more callee saved registers are used cfi_offset directives are needed for it. The .s looks like this for a basic block that does the computation:

_Z7computebiiiiiiiiiiii.1: # %if.then
.cfi_startproc
.cfi_def_cfa %rbp, 16
.cfi_offset %rbx, -48
.cfi_offset %r12, -40
.cfi_offset %r14, -32
.cfi_offset %r15, -24
.cfi_offset %rbp, -16

Each basic block that goes in a different section must emit cfi directives for callee-saved registers. The parameters is to make sure the caller saved registers are taken and the callee saved registers are forced so that we can check that the cfi emission indeed works for callee saved registers.

tmsriram updated this revision to Diff 274249.Mon, Jun 29, 2:50 PM
tmsriram marked 2 inline comments as done.

Simplify CFI test.

tmsriram updated this revision to Diff 274324.Mon, Jun 29, 8:52 PM

Simplify cfiinstr test .ll file