This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
ClosedPublic

Authored by grimar on May 15 2017, 3:44 AM.

Details

Summary

This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.May 15 2017, 3:44 AM
grimar retitled this revision from [DWARF] - Make collectAddressRanges() return senction index in addition to Low/High PC to [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC.
aprantl added inline comments.May 15 2017, 9:42 AM
include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h
28 ↗(On Diff #98973)

I'm assuming that this is the "official" name used in the specification? Then this name is fine, otherwise I would prefer to call this field SectionIndex.

48 ↗(On Diff #98973)

If you find the time, it would be nice to convert all these to /// in a separate commit.

lib/DebugInfo/DWARF/DWARFContext.cpp
949 ↗(On Diff #98973)

return {Ret, RSec->getIndex()};

1124 ↗(On Diff #98973)

{}

lib/DebugInfo/DWARF/DWARFUnit.cpp
355 ↗(On Diff #98973)

(--B)->second.first:

Would it make sense to replace some std::pairs with structs for better readability?

lib/Object/COFFObjectFile.cpp
298 ↗(On Diff #98973)

Why does this return -1 and the others 0 for not found? Should it return an optional?

dblaikie edited edge metadata.May 15 2017, 10:14 AM

+1 to Rafael

include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h
25–27 ↗(On Diff #98973)

This should probably be called DWARFAddressRange since it represents more than a single address (or possibly no addresses, if the range is empty) - and I think that's the right terminology in DWARF (that a contiguous range is a 'range' and a series of ranges is a 'range list').

+1 to Rafael's comment about separating the renaming/un-pair-ification from the semantic change/addition. Also probably the addition of the RelocAddrEntry class would be good to be done in a separate preliminary commit too.

grimar updated this revision to Diff 99261.May 17 2017, 2:21 AM
  • Addressed review comments.
include/llvm/DebugInfo/DWARF/DWARFDebugRangeList.h
25–27 ↗(On Diff #98973)

Done in rL303163.

28 ↗(On Diff #98973)

No, "SecNdx" was something from my head. Does not seem that specification mentions section index for address range in any way. Renamed to SectionIndex here and below.

48 ↗(On Diff #98973)

Done in rL303241.

lib/DebugInfo/DWARF/DWARFContext.cpp
949 ↗(On Diff #98973)

I changed code around, made Ret to be pair like it probably should be to match return value and this issue was gone.

1124 ↗(On Diff #98973)

Done.

lib/DebugInfo/DWARF/DWARFUnit.cpp
355 ↗(On Diff #98973)

I think yes, I'll try to take a look while I am here changing DWARF code around.

lib/Object/COFFObjectFile.cpp
298 ↗(On Diff #98973)

Others ? I am returning -1 in all getSectionIndex() implementations because 0 can be an index of valid section in theory. Since numeration starts from zero.

I would not return Optional. Because that would be an overcomplication of code I think. If we pass
some existent DataRefImpl Sec, then we should be able to find it's index.

I probably can't imagine code that would have real benefits from Optional. We return uint64_t in
getSectionAddress() and getSectionSize() below, so it is also a consistent solution.

As an althernative solution I was thinking about just placing llvm_unreachable() here and then store section address only for ELF objects in getSymbolInfo(). Would it be better ?

rafael added inline comments.May 17 2017, 11:12 AM
include/llvm/DebugInfo/DWARF/DWARFRelocMap.h
20 ↗(On Diff #99261)

Please rebase, as Width was removed.

include/llvm/Object/ELFObjectFile.h
652 ↗(On Diff #99261)

I think you can assert in here.

This function is passed a section, so whatever found it must have looked in the section table.

lib/Object/COFFObjectFile.cpp
297 ↗(On Diff #99261)

This should be trivial to implement:

toSec(Sec) - SectionTable.

lib/Object/MachOObjectFile.cpp
1824 ↗(On Diff #99261)

This is just "return Sec.d.a"

lib/Object/WasmObjectFile.cpp
748 ↗(On Diff #99261)

This is just Sec.d.a;

grimar updated this revision to Diff 99413.May 18 2017, 2:21 AM
  • Addressed review comments.
include/llvm/DebugInfo/DWARF/DWARFRelocMap.h
20 ↗(On Diff #99261)

Done.

include/llvm/Object/ELFObjectFile.h
652 ↗(On Diff #99261)

Done.

lib/Object/COFFObjectFile.cpp
297 ↗(On Diff #99261)

Cool, thanks ! I just do not know anything about non-ELF so did not even try to implement getSectionIndex() for those types.

lib/Object/MachOObjectFile.cpp
1824 ↗(On Diff #99261)

Done.

dblaikie added inline comments.May 19 2017, 10:45 AM
include/llvm/DebugInfo/DWARF/DWARFFormValue.h
50 ↗(On Diff #99413)

Should this go inside the union, with the uval (wrapping the two together in a struct (anonymous would be OK if that's possible, but I'm not sure if anonymous structs are portable here, etc))? (I mean, it doesn't change the size of this structure, I guess - but seems like it'd make the intent clearer with regard to which elements are 'active' together, etc)

62 ↗(On Diff #99413)

Presumably this should assert that the type is an address form? Though I guess the other accessors don't have such assertions, so maybe that's OK.

include/llvm/Object/ELFObjectFile.h
653 ↗(On Diff #99413)

Yeah, as Rafael said - 'assert(false)' isn't appropriate in LLVM (llvm_unreachable is used instead - though as you point out, this isn't 100% adhered to/fixed/etc but generally should be the goal)

As for the error handling, llvm::Error has a special function for "ignore any errors here": llvm::consumeError(SectionsOrErr). Alternatively, you could handle the error with an assert:

handleAllErrors(std::move(SectionsOrErr.getError()), [](const ErrorInfoBase &) { llvm_unreachable("There should always be sections here, since a section was passed into getSectionIndex in the first place"); });

Or the like.

656 ↗(On Diff #99413)

Use -> rather than (*x).y

lib/DebugInfo/DWARF/DWARFContext.cpp
911 ↗(On Diff #99413)

seems like it might be cheap enough to introduce a struct here with named parameters, rather than a std::pair. I mean it's file local so the reader doesn't have to go far to check what the first/second are, but equally - seems pretty cheap to avoid them needing to (& might make the implementation easier to read too)

Actually regarding the implementation - might be easier to have two named variables, and stuff them into the result (be it a std::pair, or named struct) at the end.

1114 ↗(On Diff #99413)

should be able to use -> here:

SymInfoOrErr->first
1121 ↗(On Diff #99413)

and here

grimar updated this revision to Diff 99862.May 23 2017, 1:29 AM
  • Addressed review comments.
include/llvm/DebugInfo/DWARF/DWARFFormValue.h
50 ↗(On Diff #99413)

Your idea looks reasonable to me, but unfortunately use of unnamed struct looks to be not portable solution, at least I am using MSVS 2015 and it reports:
warning C4201: nonstandard extension used: nameless struct/union
when I am trying to do that :(

And use of named struct sure works here, but requires much more changes in code (particullary in DWARFFormValue.cpp) and I probably would not do that.

62 ↗(On Diff #99413)

Yes, here I would also be consistent and go without assert.

I actually was thinking about removing setters/getters here. Not sure I understand why do we need getRawUValue()/setUValue() pair for example. But that would be different patch.

include/llvm/Object/ELFObjectFile.h
653 ↗(On Diff #99413)

Done. I used handleAllErrors and a shorter message because I really do not expect anyone see it..

656 ↗(On Diff #99413)

Done.

lib/DebugInfo/DWARF/DWARFContext.cpp
911 ↗(On Diff #99413)

Done. I agree that sometimes std::pairs looks weak in compare with struct for readability. Since here value is used for caching too, it is looks better probably to have named values.

1114 ↗(On Diff #99413)

Done.

1121 ↗(On Diff #99413)

Done.

dblaikie accepted this revision.May 25 2017, 11:52 AM

Seems good to me.

This revision is now accepted and ready to land.May 25 2017, 11:52 AM

Seems good to me.

Thanks ! I'll be happy to land it once LLVM svn gets back alive. It is down at least few hours already for me :`(

This revision was automatically updated to reflect the committed changes.

Had a bot failure in COFF code when tried to commit today, it revealed one more issue, that can be probably fixed easily.
My suggestion how to fix is below in comment.

llvm/trunk/lib/DebugInfo/DWARF/DWARFContext.cpp
1006

I suggest to do

Ret.SectionIndex = (RSec != Obj.section_end()) ? RSec->getIndex() : -1LL;

here ? (motivation is in comment below)
So return -1LL for absolute values. At first I thought about Optional value,
but it seems an overkill here for me.

llvm/trunk/lib/Object/COFFObjectFile.cpp
297

This is the reason of BB fail:

******************** TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED ********************
...
LLVM ERROR: Section was outside of section table.
********************

Code where is fails is:

  const coff_section *Addr = reinterpret_cast<const coff_section*>(Ref.p);

#ifndef NDEBUG
  // Verify that the section points to a valid entry in the section table.
  if (Addr < SectionTable || Addr >= (SectionTable + getNumberOfSections()))
    report_fatal_error("Section was outside of section table.");

And reason is that during call (from getSymbolInfo()):

Ret.SectionIndex = RSec->getIndex();

RSec was still Obj.section_end(), what looks fine for absolute values.

grimar reopened this revision.May 26 2017, 7:13 AM
This revision is now accepted and ready to land.May 26 2017, 7:13 AM
grimar requested review of this revision.May 26 2017, 7:13 AM
grimar edited edge metadata.
grimar updated this revision to Diff 100411.May 26 2017, 7:56 AM
  • Updated diff in according to my latest suggestion.
  • Also contains fix for testcase compilation I had to perform.

LGTM

include/llvm/DebugInfo/DWARF/DWARFRelocMap.h
20 ↗(On Diff #100411)

Document that it is -1 if the relocation points to an absolute symbol.

lib/DebugInfo/DWARF/DWARFContext.cpp
966 ↗(On Diff #100411)

Document that it is -1 if the symbol is absolute.

This revision was automatically updated to reflect the committed changes.

This seems to have caused a failure in the SystemZ build bot:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/8750

/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/test/tools/sanstats/elf.test:28:10: error: expected string not found in input
# CHECK: /tmp{{[/\\]}}f.c:1 f1 cfi-vcall 1
         ^
<stdin>:1:1: note: scanning from here
<invalid>:0 f1 cfi-vcall 1
^