This is an archive of the discontinued LLVM Phabricator instance.

Add a DWARF transformer class that converts DWARF to GSYM.
Closed, Public

Authored by clayborg on Feb 11 2020, 4:19 PM.

Details

Summary

The DWARF transformer is added as a class so it can be unit tested fully.

The transformer converts DWARF to GSYM format and handles many special cases for functions:

  • omit functions in compile units with 4 byte addresses whose address is UINT32_MAX (dead stripped)
  • omit functions in compile units with 8 byte addresses whose address is UINT64_MAX (dead stripped)
  • omit any functions whose high PC is <= low PC (dead stripped)
  • StringTable builder doesn't copy strings, so we need to make backing copies of strings but only when needed. Many strings come from sections in object files and won't need to have backing copies, but some do.
  • When a function doesn't have a mangled name, store the fully qualified name, built by traversing the parent decl context DIEs. If we don't do this, some functions might appear in the GSYM as "erase" instead of "std::vector<int>::erase".
  • omit any functions whose address isn't in the optional TextRanges member variable of DwarfTransformer. This allows object files to register address ranges that are known to be valid code ranges, and it helps omit functions that should have been dead stripped but only had their low PC values set to zero. In that case many functions all appear at address zero, and we can omit them by checking that each function falls into a good address range in the object file. Many compilers produce this situation when the DWARF has a DW_AT_low_pc with a DW_FORM_addr and a DW_AT_high_pc with a DW_FORM_data4 holding the offset from the low PC. The linker then can't write the same address to both the high and low PC, since there is only a relocation for the DW_AT_low_pc, so many linkers just zero out the low PC.
  • converts DWARF line tables to GSYM
  • converts DWARF inline info into GSYM inline information
  • if a function doesn't have any line entries, add one entry for the start address and use the DW_AT_decl_file and DW_AT_decl_line to at least point to something. Some DWARF is made using the assembler and doesn't have line info.
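
The omission rules and the qualified-name fallback from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: FuncAddrs, TextRanges, shouldOmit and qualifiedName are hypothetical names, and the real DwarfTransformer works on DWARF DIEs rather than on plain structs.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the address info of one function DIE.
struct FuncAddrs {
  uint64_t LowPC;
  uint64_t HighPC;
  uint8_t AddrSize; // address size of the compile unit in bytes: 4 or 8
};

// Hypothetical stand-in for the optional text ranges: half-open
// [start, end) address ranges that are known to contain valid code.
using TextRanges = std::vector<std::pair<uint64_t, uint64_t>>;

// Returns true if the function should be left out of the GSYM data.
// Text may be null, meaning no text ranges were registered.
bool shouldOmit(const FuncAddrs &F, const TextRanges *Text) {
  // Dead stripped: low PC set to the maximum value for the address size.
  if (F.AddrSize == 4 && F.LowPC == UINT32_MAX)
    return true;
  if (F.AddrSize == 8 && F.LowPC == UINT64_MAX)
    return true;
  // Dead stripped: empty or inverted address range.
  if (F.HighPC <= F.LowPC)
    return true;
  // If text ranges are known, the start address must fall inside one.
  if (Text) {
    for (const auto &R : *Text)
      if (F.LowPC >= R.first && F.LowPC < R.second)
        return false;
    return true;
  }
  return false;
}

// Builds a fully qualified name from the enclosing scope names
// (outermost first) and the function's own name.
std::string qualifiedName(const std::vector<std::string> &ParentScopes,
                          const std::string &Name) {
  std::string Result;
  for (const auto &Scope : ParentScopes) {
    Result += Scope;
    Result += "::";
  }
  Result += Name;
  return Result;
}
```

With a text range of [0x1000, 0x2000), a function whose low PC was zeroed by the linker is omitted, while the same function is kept when no text ranges are registered.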

Event Timeline

clayborg created this revision. Feb 11 2020, 4:19 PM
clayborg edited the summary of this revision. Feb 11 2020, 5:17 PM

That's a huge patch and it's difficult for me to tell whether all new code is covered by tests. Ideally the GSYM reader dump functions, for example, would be a separate patch, etc...

llvm/include/llvm/DebugInfo/GSYM/DwarfTransformer.h
27

A doxygen comment explaining what DwarfTransformer is doing would be useful here.

llvm/include/llvm/MC/StringTableBuilder.h
69

I believe the more common pattern is to write this as return StringIndexMap.count(S)
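
As a rough illustration of the suggested pattern, here is a minimal sketch of a membership test written with count(). It is an assumption-laden stand-in: StringIndex and its members are hypothetical, and std::unordered_map is used in place of whatever map type StringTableBuilder actually holds.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a string table's string-to-offset map.
class StringIndex {
  std::unordered_map<std::string, std::size_t> StringIndexMap;

public:
  void add(const std::string &S, std::size_t Offset) {
    StringIndexMap.emplace(S, Offset);
  }

  // For a map with unique keys, count() returns 0 or 1, so the result
  // converts directly to bool without a find()/end() comparison.
  bool contains(const std::string &S) const {
    return StringIndexMap.count(S);
  }
};
```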

llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
2

Nit: no need to mark a .cpp file as C++.

71

Unless you need this for re-entrancy, you can use a uint32_t & and avoid the second lookup below.
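
The single-lookup idea can be sketched as below. This is a hypothetical example, not the code under review: IndexCache, getIndexTwoLookups and getIndexOneLookup are made-up names, std::unordered_map stands in for the real container, and 0 is assumed to mean "no index assigned yet".

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical cache from a key to a 32-bit index; 0 means "not assigned".
using IndexCache = std::unordered_map<std::string, uint32_t>;

// Two lookups: find() to test for the key, then operator[] to store.
uint32_t getIndexTwoLookups(IndexCache &Cache, const std::string &Key,
                            uint32_t &Next) {
  auto It = Cache.find(Key);
  if (It != Cache.end())
    return It->second;
  uint32_t Index = Next++;
  Cache[Key] = Index;
  return Index;
}

// One lookup: operator[] value-initializes a missing entry to 0 and
// returns a reference, which is then filled in place.
uint32_t getIndexOneLookup(IndexCache &Cache, const std::string &Key,
                           uint32_t &Next) {
  uint32_t &Slot = Cache[Key];
  if (Slot == 0)
    Slot = Next++;
  return Slot;
}
```

The re-entrancy caveat matters when the code that computes the value can insert into the same map while the reference is held. std::unordered_map keeps element references valid across insertions, but LLVM's DenseMap does not, so there a held reference could dangle.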

clayborg updated this revision to Diff 244459. Feb 13 2020, 9:25 AM

Fix all issues from Adrian.

clayborg marked 4 inline comments as done. Feb 13 2020, 9:27 AM

I tried to keep the logging out of this patch, but the logging is used by the DwarfTransformer code to log things to output streams.

aprantl accepted this revision. Feb 13 2020, 10:43 AM
This revision is now accepted and ready to land. Feb 13 2020, 10:43 AM
This revision was automatically updated to reflect the committed changes.

commit 22d63b631892fe5c2e0c8062ccf954c71c77b0dd (HEAD -> master, origin/master, origin/HEAD)
Author: Greg Clayton <gclayton@fb.com>
Date: Thu Feb 13 11:35:43 2020 -0800

Fix buildbots by not using "and" and "not".

commit e8e97b28cd8eab836375603b5cae97696248c9b6 (HEAD -> master, origin/master, origin/HEAD)
Author: Greg Clayton <gclayton@fb.com>
Date: Thu Feb 13 11:43:07 2020 -0800

Fix buildbots that create shared libraries from GSYM library by adding a dependency on LLVMDebugInfoDWARF.

arcfilter () {
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:/' | git commit --amend -F -
}

The arcfilter shell function above can strip unneeded Phabricator metadata tags from the git commit description.

commit e8e97b28cd8eab836375603b5cae97696248c9b6 (HEAD -> master, origin/master, origin/HEAD)

Mentioning D74450 in the commit should be sufficient to associate the commit with the Differential. No need to post a separate comment.