This is an archive of the discontinued LLVM Phabricator instance.

[dsymutil] Add preliminary support for DWARF 5.
ClosedPublic

Authored by JDevlieghere on Jan 8 2021, 11:10 AM.

Details

Summary

Currently dsymutil will silently fail when processing binaries with
Dwarf 5 debug info. This patch adds rudimentary support for Dwarf 5 in
dsymutil.

  • Recognize relocations in the debug_addr section.
  • Recognize (a subset of) Dwarf 5 form values.
  • Emits valid Dwarf 5 compile unit header chains.

To simplify things (and avoid having to emit indexed sections) I decided
to emit the relocated addresses directly in the debug info section.

  • DW_FORM_strx gets relocated and rewritten to DW_FORM_strp
  • DW_FORM_addrx gets relocated and rewritten to DW_FORM_addr

Obviously there's a lot of work left, but this should be a step in the
right direction.

rdar://62345491

Diff Detail

Event Timeline

JDevlieghere requested review of this revision.Jan 8 2021, 11:10 AM
JDevlieghere created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 8 2021, 11:10 AM
JDevlieghere edited the summary of this revision. (Show Details)Jan 8 2021, 11:15 AM
avl added a comment.Jan 8 2021, 2:14 PM

If I understood correctly, It looks like this change is not enough to work properly for DWARF5.
Currently, relocation is checked using offsets to the attribute:

bool DwarfLinkerForBinary::AddressManager::hasLiveAddressRange(
    const DWARFDie &DIE, CompileUnit::DIEInfo &MyInfo) {
  const auto *Abbrev = DIE.getAbbreviationDeclarationPtr();

  Optional<uint32_t> LowPcIdx = Abbrev->findAttributeIndex(dwarf::DW_AT_low_pc);
  if (!LowPcIdx)
    return false;

  uint64_t Offset = DIE.getOffset() + getULEB128Size(Abbrev->getCode());
  uint64_t LowPcOffset, LowPcEndOffset;
  std::tie(LowPcOffset, LowPcEndOffset) =
      getAttributeOffsets(Abbrev, *LowPcIdx, Offset, *DIE.getDwarfUnit());

  return hasValidRelocationAt(LowPcOffset, LowPcEndOffset, MyInfo);   <<<< check relocation inside debug_info
}

i.e. we check whether relocation exists between LowPcOffset, LowPcEndOffset of dwarf::DW_AT_low_pc inside debug_info table.

If we would read relocations for "debug_addr" table instead of the "debug_info" table then we would check the wrong relocations. To work properly we would need to recognize offset inside "debug_addr". Something like this(I did not check whether it could be compiled or whether it is full enough):

Optional<uint32_t> LowPcIdx = Abbrev->findAttributeIndex(dwarf::DW_AT_low_pc);
if (!LowPcIdx)
  return false;

if ( Abbrev->getFormByIndex(*LowPcIdx) == DW_FORM_addr ) {
  uint64_t Offset = DIE.getOffset() + getULEB128Size(Abbrev->getCode());
  uint64_t LowPcOffset, LowPcEndOffset;
  std::tie(LowPcOffset, LowPcEndOffset) =
    getAttributeOffsets(Abbrev, *LowPcIdx, Offset, *DIE.getDwarfUnit());

  return hasValidRelocationAt(LowPcOffset, LowPcEndOffset, MyInfo); <<<< check relocations inside debug_info
} else ( Abbrev->getFormByIndex(*LowPcIdx) == DW_FORM_addrx) {
  Optional<DWARFFormValue> OffsetValue = DIE.find(dwarf::DW_AT_low_pc);
  if (OffsetValue) {
    uint64_t LowPcOffset = OffsetValue->getRawUValue();
    uint64_t LowPcEndOffset = LowPcOffset + 4;
    return hasValidRelocationAt(LowPcOffset, LowPcEndOffset, MyInfo); <<<< check relocations inside debug_addr
  }
}

i.e. in case when we do "findValidRelocs(Section, Obj, DMO)" for "debug_addr" then we need to use other offsets(LowPcOffset/LowPcEndOffset) inside hasLiveMemoryLocation/hasLiveAddressRange.

JDevlieghere retitled this revision from [dsymutil] Support finding relocations in debug_addr for DWARF5 to [dsymutil] Add preliminary support for DWARF 5..
JDevlieghere edited the summary of this revision. (Show Details)
In D94323#2487899, @avl wrote:

If I understood correctly, It looks like this change is not enough to work properly for DWARF5.

Yeah, I knew that when I uploaded the patch but I was planning to put up a series of smaller patches but then I changed my mind. I went with a slightly different approach than what you suggested, but the idea is the essentially the same. At first I tried to shoehorn them into the ValidRelocs list but the separate vector simplifies things a lot.

JDevlieghere added inline comments.Jan 8 2021, 5:26 PM
llvm/lib/DWARFLinker/DWARFStreamer.cpp
231

Note to self: remove this when updating the diff.

avl added inline comments.Jan 11 2021, 4:41 AM
llvm/include/llvm/DWARFLinker/DWARFLinker.h
88–89
llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
695–697

it is also necessary to update how dwarf::DW_AT_location is handled for debug_addr. It looks like we need to parse location expression, extract addresses, and check whether it points to live code. I think it needs comment in this review if it is supposed to be done in some follow-up review.

713

I would propose to use single unified searching function here. i.e. to use the same AddressManager::hasValidRelocationAt() for both tables(debug_info and debug_abbrev). Like in this review - D93106 .

avl added inline comments.Jan 11 2021, 4:58 AM
llvm/tools/dsymutil/DwarfLinkerForBinary.h
173

nit: probably it would be better to name argument not Addr, but DebugAddrOffset or something like that. To make it clear that specified operand is not an address but offset in the debug_addr table.

JDevlieghere updated this revision to Diff 315884.
JDevlieghere marked 3 inline comments as done.
avl added inline comments.Jan 12 2021, 2:40 AM
llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
614–621

hasValidDebugAddrRelocationAt() and hasValidDebugInfoRelocationAt() have different implementations - first uses binary search and second sequentially searches from the first till last. My previous comment was about using one implementation for both cases. But, let it would be done with D93106.

776

it looks like we need to add ValidReloc.Addend here, the same as in applyValidRelocs():

return It->Mapping->getValue().BinaryAddress + It->Addend;

Also, not for this review, but as a general thing: probably it would be better to make relocateIndexedAddr() to be more general(since it appears in AddressManager interface). i.e. make relocateIndexedAddr to be able to relocate any address, not only one from debug_addr section:

class AddressManager {
public:

uint64_t relocateAddr(const DWARFFormValue&);
}

llvm/tools/dsymutil/DwarfLinkerForBinary.h
154

The comment is about debug_addr section but the function works with debug_info section.

163
/// This function must be called with offsets in strictly ascending order
/// because it never looks back at relocations it already 'went past'.

above comment is old. This function could not be called with offsets in strictly ascending order. DIEs from debug_info are evaluated in strictly ascending order, but addresses which they are referred may be in any order. The implementation of hasValidDebugAddrRelocationAt uses lower_bound search function and does not relate on order of offsets.

it looks like comments for hasValidDebugAddrRelocationAt and hasValidDebugInfoRelocationAt somehow mixed up.

avl added inline comments.Jan 12 2021, 2:50 AM
llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
775

It looks incorrect to return offset in debug_addr section as the value of relocated address. It might be better to make return value Expected and return Error in case offset is not found.

JDevlieghere marked 2 inline comments as done.

Address @avl's review comments.

JDevlieghere marked an inline comment as done.Jan 12 2021, 9:20 AM
JDevlieghere added inline comments.
llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
614–621

Ah, I misunderstood. Yeah I figured we could unify this after that patch lands.

avl accepted this revision.Jan 12 2021, 12:42 PM

LGTM with small comment inside.

llvm/tools/dsymutil/DwarfLinkerForBinary.h
152–163
This revision is now accepted and ready to land.Jan 12 2021, 12:42 PM
This revision was automatically updated to reflect the committed changes.
JDevlieghere marked an inline comment as done.