This is an archive of the discontinued LLVM Phabricator instance.

Debug Info Support for Basic Block Sections
ClosedPublic

Authored by tmsriram on Apr 24 2020, 8:26 PM.

Details

Summary

This patch adds basic debug info support with basic block sections supported in D68063 and D73674.

The original author of this patch is @amharc and I am presenting this for review on his behalf.

This patch uses ranges for debug information when a function contains basic block sections rather than using [lowpc, highpc]. This is also the first in a series of patches for debug info and does not contain the support for linker relaxation. That will be done as a follow up patch.

Diff Detail

Event Timeline

tmsriram created this revision.Apr 24 2020, 8:26 PM
tmsriram updated this revision to Diff 260681.Apr 28 2020, 10:04 AM

Update patch with missing file (MachineBasicBlock.h) and simplify test.

Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2020, 10:04 AM
tmsriram updated this revision to Diff 263889.May 13 2020, 4:47 PM

Rebase and add a test for fission.

This is currently only for "basic block sections", right? No linker relaxation, etc? (so symbol differences within a bb section (even using symbols at the very end of the bb section, after the trailing jump) are valid/knowable at compile-time/don't change during linking)

llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
215–223

Is there a test case covering this? Rough sense of how much this hurts debug info quality?

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
593–598

Would this fall out naturally/produce the same answer as the loop below? If so, I'd probably let it happen that way rather than special casing it.

599–610

Probably add a comment describing what's happening here (similar to the comment above about the continuous range case)

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2118–2126

If you remove the &MBB != &MF->front() case, would the function begin/end entry fall out/be produced by the loop? (or is "MBBSectionRanges" only valid if "hasBBSections"? Maybe it'd be handy if it always contained the ranges - just one range when "hasBBSections" is false?)

llvm/test/DebugInfo/X86/basicblock-sections_1.ll
14–17

This might be a case where testing the assembly would be appropriate - to check which symbols and symbolic expressions are used for the ranges/high/low rather than only that is "some range".

tmsriram updated this revision to Diff 270527.Jun 12 2020, 3:00 PM
tmsriram marked 6 inline comments as done.

Rebase and address reviewer comments which are code refactoring and enhancing the test case.

llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
215–223

I am still working on getting a test case for this. I will iterate on this shortly.

llvm/test/DebugInfo/X86/basicblock-sections_1.ll
14–17

I have added checks for the assembly. Please let me know if there is something specific you want to check here.

dblaikie added inline comments.Jun 15 2020, 7:27 PM
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
402–412

This is a bit of a duplicate of the code in DwarfDebug::endFunctionImpl - perhaps it could be refactored into a reusable function in some way?

402–414

Does this code produce the correct behavior when !BBSections? If so, I don't think it's worth the special-case for !BBSections - happy to just let it flow out from this code.

597

Probably "Otherwise, low/high PC can be used."

llvm/test/DebugInfo/X86/basicblock-sections_1.ll
44–45

Probably put this up along with the NO-SECTIONS dwarfdump testing.

Also - probably best to test the .debug_info section rather than the .debug_abbrev section. You can add -v to dump which will print forms and sections:

no sections:

DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000 ".text")
DW_AT_high_pc [DW_FORM_data4] ({{.*}})

basic block sections:

DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
DW_AT_ranegs [DW_FORM_sec_offset] (
                 [0x0000000000000000, {{.*}}) ".text._Z2f1v"
                 [0x0000000000000000, {{.*}}) ".text._Z2f2v")

or something like that (I mean that sample was from function sections, but bb sections should work similarly - easier with unique section names to identify which ones are there)

tmsriram updated this revision to Diff 271576.Jun 17 2020, 10:08 PM
tmsriram marked 5 inline comments as done.

Address reviewer comments :

+ Refactor code
+ Fix test to use --debug-info

llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
402–412

It is really small now. deduping complicates it as one is a simple list and the other is DwarfCompileUnit::addRange.

dblaikie added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
598–609

Hmm, the existence of this code makes me think that this means LLVM debug info emission depends on block order. If some optimization did interesting things to block order it could break/change the semantics of the DWARF emitted.

That's perhaps a big enough bug that it's out of scope to deal with here (@aprantl @JDevlieghere @probinson - just FYI to keep in mind), but hopefully a general solution to this problem would remove the need for this code here. (& then no need for MBBSectionRanges to be a map, probably).

Maybe include a FIXME above this code to explain that?

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2122–2123

Could we iterate MBBSectionRanges (it looks like it's built per-function?) instead of iterating BBs and looking up MBBSectionRanges? Looks like /maybe/ this code could be:

for (const auto &R : Asm->MBBSectionRanges)
  TheCU.addRange(R.second.BeginLabel, R.second.EndLabel);

(similarly for the code in updateSubprogramScopeDIE)

tmsriram marked an inline comment as done.Jun 18 2020, 4:13 PM
tmsriram added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2122–2123

IndexedMap does not support begin() end() iterators, so this won't work right? This is a good idea but I might have to find a different Map data structure that is viable. A DenseMap variant looks good as section IDs can be mapped to a small dense set of integers, but DenseMap does not support this either?

dblaikie added inline comments.Jun 18 2020, 4:39 PM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2122–2123

Ah, I see - yeah, I think it'd probably be fine to just use a DenseMap<MCSection*, LabelRangeThing>, but I can see the nice-ness of the IndexedMap.

Are the section IDs unique just to the BB sections of one function? Otherwise it seems inefficient to have a map that's as big as all the sections in the output object file to be able to do indexed lookup into, compared to a denser map data structure - but perhaps it's still a small enough number that flat/indexed lookup is better.

In any case, it probably wouldn't be too hard to add begin/end filter_iterators to IndexedMap.

DenseMap should support iteration (implemented in DenseMapBase) - but has the problem of unreliable ordering (even for non-pointer keys, like ints - we shouldn't depend on the ordering of a hash data structure), so that'd make the output unstable, which isn't any good.

So maybe adding filtering iteration for IndexedMap is the right way to go - though I still wouldn't'd mind knowing just how big this IndexedMap gets - hmmm, I guess from the way it's built "getNumBlockIDs" means it is function-locally-unique, not whole-output-file-unique, which means it shouldn't be just as big as it needs to be. Sounds good.

tmsriram marked an inline comment as done.Jun 19 2020, 12:21 AM
tmsriram added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2122–2123

Yes, the section IDs correspond to the BB section of one function. Worst case, each BB goes into one section and gets a entry in the map. I do not have a average case number with section clusters and inter-procedural layout, but is probably like one-third the number of basic blocks if I have to hazard a guess.

I can look at adding begin/end iterators to Indexed Map. It looks like it is using a vector underneath which is pre-allocated in size. If I have to scan the vector fully then it is probably not more efficient than what we have right now as it is still linear in the number of basic blocks and not sections, but I can take a closer look.

tmsriram updated this revision to Diff 272903.Jun 23 2020, 8:11 PM
tmsriram marked 4 inline comments as done.

Use MapVector for MBBSectionRanges instead of IndexedMap.

MapVector guarantees fast lookups and iterations and preserves the insertion order.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2122–2123

TLDR; MapVector seems to work great for this use case, lookups and iterations are fast. Deletions are slow but we don' do this.

+ IndexedMap is not what we want. Maybe I just don't get it but it is basically a vector underneath and the mapping of key->integers is left to the user. If it is sparse, iterating becomes an issue as we have to go over non-existent elements. It also requires early resizing. My feeling is IndexedMap was never made to be iterated.
+ MapVector is still off DenseMap and this seems to work perfect here. A plain vector would do if we solve the block order problem which you noted.

tmsriram updated this revision to Diff 274939.Jul 1 2020, 3:24 PM
tmsriram marked 2 inline comments as done.

Remove check for skipping debug info in DebugHandlerBase.cpp as it is not needed.

llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
215–223

I removed these lines and did a full bootstrap with debuginfo and assertions enabled with basic block sections and see no issues. These lines were added before we got a lot of debug info right and this seems no longer necessary.

dblaikie accepted this revision.Jul 1 2020, 4:28 PM

Looks good - thanks!

This revision is now accepted and ready to land.Jul 1 2020, 4:28 PM
This revision was automatically updated to reflect the committed changes.