[DWARF] - Take relocations in account when extracting ranges from .debug_ranges
ClosedPublic

Authored by grimar on Apr 19 2017, 8:27 AM.

Details

Summary

I found this when investigated "Bug 32319 - .gdb_index is broken/incomplete" for LLD.

When we have object file with .debug_ranges section it may be filled with zeroes.
Relocations are exist in file to relocate this zeroes into real values later, but until that
a pair of zeroes is treated as terminator. And DWARF parser thinks there is no ranges at all
when I am trying to collect address ranges for building .gdb_index.

Solution implemented in this patch is to take relocations in account when parsing ranges.
FWIW DWARFFormValue::extractValue() already do that for calculating DW_AT_low_pc/DW_AT_high_pc,
when ranges are not involved. So solution seems to be consistent with existent code.

Diff Detail

Repository
rL LLVM
grimar created this revision.Apr 19 2017, 8:27 AM
grimar retitled this revision from [DWARF] - Take relocations in account when exctracting ranges from .debug_ranges to [DWARF] - Take relocations in account when extracting ranges from .debug_ranges.Apr 19 2017, 8:32 AM
dblaikie added inline comments.Apr 19 2017, 8:35 AM
include/llvm/DebugInfo/DWARF/DWARFContext.h
234 ↗(On Diff #95752)

I don't think there is a ranges.dwo section, is there? The ranges section is in the .o file, always. (the DWARF5 rnglists can go in the .dwo, though)

lib/DebugInfo/DWARF/DWARFDebugRangeList.cpp
25–31 ↗(On Diff #95752)

Should/can/could this be refactored to use the same code as the case you mentioned was already in place for DW_FORM_addr (high/low pc)?

test/DebugInfo/dwarfdump-ranges-unrelocated.s
13–15 ↗(On Diff #95752)

Would 2 functions (also probably simpler for them to be void and empty functions) would suffice to give the CU a DW_AT_ranges rather than high/low pc?

emaste added a subscriber: emaste.Apr 19 2017, 9:56 AM
grimar updated this revision to Diff 95908.Apr 20 2017, 2:36 AM

Thanks for review, David ! All comments were addressed.

include/llvm/DebugInfo/DWARF/DWARFContext.h
234 ↗(On Diff #95752)

There is no ranges.dwo in my testcase, but I had to change this API too, because I changed DWARFUnitSectionBase::parseImpl (please see DWARFUnit.h) to take const DWARFSection *RS.

I think I can't change return type for getRangeSection() without changing return type for getRangeDWOSection(),
because DWARFUnitSectionBase::parse() pushes getRangeSection(), but DWARFUnitSectionBase::parseDWO() pushes C.getRangeDWOSection() for RS.

lib/DebugInfo/DWARF/DWARFDebugRangeList.cpp
25–31 ↗(On Diff #95752)

Done. There was many places that could use some helper method like getAddress() instead of inlined applying relocations to values. All of them now uses getRelocatedValue() introduced in this patch.

test/DebugInfo/dwarfdump-ranges-unrelocated.s
13–15 ↗(On Diff #95752)

Yes, fixed. I added nop by myself to asm code though, thats to show that ranges are different explicitly.
I leaved single empty function to show we can dump ranges [0x0, 0x0] correctly now.
Previous diff was not able to do that, though that was not a real problem probably, LLD anyways supposed to use
LoadedObjectInfo::getSectionLoadAddress, so values would never be [0x0, 0x0] even for empty method, not sure about other possible users though.

I suggest to split refactoring change that introduced getRelocatedValue() to different patch, because it touches many
places and landing it separatelly will allow to reduce amount of changes in this patch (after rebasing) and simplify the review probably.
I'll post it very soon.

dblaikie added inline comments.Apr 20 2017, 10:07 AM
include/llvm/DebugInfo/DWARF/DWARFContext.h
234 ↗(On Diff #95752)

Looks like a bug in the code - since there is no range.dwo section this probably causes llvm-dwarfdump to be less good than it could be. If parseDWOCompileUnits passed getRangeSection, then at least when dumping an unsplit split-dwarf file (ie: the object file that comes out of the clang frontend before objcopy is run on it to split the .dwo sections) it would dump ranges correctly, I think.

If you like - you could send a separate patch befoer or after this one to clean that up (create a small example object file with ranges in it - show that changing that call in parseDWOCompileUnits does cause the ranges to be correctly dumped - and then also strip out all the ranges.dwo stuff)

Ah, not a bug, because RangeDWOSection is the same as RangeSection... just confusing.

The support for rnglists.dwo (which I'm guessing is what the original comment that setup RangesDWO was referring to)will look very different from all of this anyway, I think.

lib/DebugInfo/DWARF/DWARFDebugRangeList.cpp
48 ↗(On Diff #95908)

Isn't this applying the relocation twice? The entry's Start/EndAddress are already relocated above so there shouldn't be any need to look up relocations again here, should there?

This is to handle the case where a function wasn't selected (comdat, etc) & so its base address resolves to zero by the linker - and specifically when such a function was empty anyway? (eg: "int f()" - absolutely no instructions, so begin/end are the same address)? /maybe/ that's correct/useful, but I doubt it. I'm not sure how important it is to support zero-length ranges in a range list... maybe for debugging weird DWARF output, etc.

grimar updated this revision to Diff 96117.Apr 21 2017, 3:20 AM
  • Rebased.
  • Addressed review comments.
lib/DebugInfo/DWARF/DWARFDebugRangeList.cpp
48 ↗(On Diff #95908)

Isn't this applying the relocation twice? The entry's Start/EndAddress are already relocated above so there shouldn't be >any need to look up relocations again here, should there?

That not always enough, I tried to resolve corner case here. If we have asm code like one from testcase:

.section .text.foo1,"ax",@progbits
.Lfunc_begin0:
.Lfunc_end0:

Then if I dump result object file with llvm-dwarfdump, start/end address are [0x0, 0x0] both before and after relocations applied above. Because address of section is zero and relocation just adds zero size of range to end address, what leaves is as zero.
So dumping prints end marker for a entry. I think the only way to handle that is check that we do not have relocations that points to end marker [0x0, 0x0].

This is to handle the case where a function wasn't selected (comdat, etc) & so its base address resolves to zero by the >linker - and specifically when such a function was empty anyway? (eg: "int f()" - absolutely no instructions, so begin/end >are the same address)? /maybe/ that's correct/useful, but I doubt it.

So linker is not involved, I just noticed this when dumped testcase object. I am not sure how much is it useful in general.
(its not what I need for LLD and not intention of this patch).
Just supposed that would be more correct for llvm-dwarfdump than report <end of list> in output for a "valid" range.

I'm not sure how important it is to support zero->length ranges in a range list... maybe for debugging weird DWARF >output, etc.

Yeah, I am not sure also. I removed that change from patch and modified testcase to have no zero length ranges for now.

dblaikie accepted this revision.Apr 21 2017, 10:22 AM

Looks good - check if the test case would be simpler with -gmlt rather than -gsplit-dwarf, and use that if so, otherwise commit whenever you're ready.

test/DebugInfo/dwarfdump-ranges-unrelocated.s
10 ↗(On Diff #96117)

I'd probably not use -gsplit-dwarf in this test case, it doesn't seem to be important/relevant (though I suppose it'll make the DWARF longer as it'll now include all the subprogram DIEs, etc) - maybe -gmlt would suffice?

This revision is now accepted and ready to land.Apr 21 2017, 10:22 AM
This revision was automatically updated to reflect the committed changes.
grimar added inline comments.Apr 24 2017, 3:33 AM
test/DebugInfo/dwarfdump-ranges-unrelocated.s
10 ↗(On Diff #96117)

Yes, you right, -gmlt is suffice. And it makes output easier for reducing into current testcase.
Thanks !