This is an archive of the discontinued LLVM Phabricator instance.

[BOLT][DWARF] Implement new mechanism for DWARFRewriter
ClosedPublic

Authored by ayermolo on Jul 21 2022, 4:27 PM.

Details

Summary

This revision implement new mechanism for DWARFRewriter.
In the new mechanism, we adopt the same way with DWARFLinker did. By parsing Debug information into IR, we are allowed to handle debug information more flexible. Now the debug information updating process relies on IR and IR will be written out to binary once the updating finished.

A new class was added: DIEBuilder. This class is responsible for parsing debug information and raising it to the IR level.
This class is also used to write out the .debug_info and .debug_abbrev sections.
Since we output brand new Abbrev section we won't need to always convert low_pc/high_pc into ranges.
When conversion does happen we can also remove low_pc entry.

Followup patches:
https://reviews.llvm.org/D151906
https://reviews.llvm.org/D151908
https://reviews.llvm.org/D151909

Diff Detail

Event Timeline

zr33 created this revision.Jul 21 2022, 4:27 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
zr33 requested review of this revision.Jul 21 2022, 4:27 PM
Herald added a project: Restricted Project. · View Herald Transcript
zr33 retitled this revision from 1. add DIEBuilder 2. refine DIEBilder to avoid construct all DWARFDie 3. Implement New mechanism in updateUnitDebugInfo 4. Clean the code to Implement new mechanism for DWARFRewriter.Jul 21 2022, 4:28 PM
zr33 retitled this revision from Implement new mechanism for DWARFRewriter to [DWARF][BOLT] Implement new mechanism for DWARFRewriter.
zr33 edited the summary of this revision. (Show Details)
ayermolo added inline comments.Jul 21 2022, 5:38 PM
bolt/test/X86/shared-abbrev.s
105–106

just remove this.

zr33 updated this revision to Diff 446683.Jul 21 2022, 6:28 PM

change the non-null parameter from pointer to reference.

Please update the summary to be more descriptive.

Move bolt/test/X86/shared-abbrev.s into a separate diff.

zr33 added a comment.Jul 22 2022, 12:10 AM

Move bolt/test/X86/shared-abbrev.s into a separate diff.

I will create a new revision for that test, however, this diff should also keep it, because I fix the abbrev offset to 0.

zr33 updated this revision to Diff 446737.Jul 22 2022, 12:51 AM
zr33 marked an inline comment as done.

make some parameters const

ayermolo added inline comments.Jul 22 2022, 1:51 PM
bolt/include/bolt/Rewrite/DIEBuilder.h
55 ↗(On Diff #446737)
58 ↗(On Diff #446737)
78 ↗(On Diff #446737)
79 ↗(On Diff #446737)
81 ↗(On Diff #446737)

Is this necessary? I don't think this implementation is multi threaded. Right?

108 ↗(On Diff #446737)
124 ↗(On Diff #446737)

Should DWARFExpression be by reference?

155 ↗(On Diff #446737)
158 ↗(On Diff #446737)

Can DU be null? If not can you make it a const reference?
Tangental thought. Would it make sense to have a separate unordered map to look up of DWARFUnit pointer to uniqueID?
Instead of doing linear search every single time this function is invoked.
The uniqueID can be an index in this vector.
So wrapper that ads *DU to this vector and stores size() -1 in the unordered map as ID, keyed by ptr address.
Something like that?

172 ↗(On Diff #446737)
172 ↗(On Diff #446737)

Maybe just turn this in to print out of BOLT-ERROR?

bolt/lib/Rewrite/DIEBuilder.cpp
37 ↗(On Diff #446737)
66 ↗(On Diff #446737)

If DU can't be null, pass by reference.

ayermolo added inline comments.Jul 22 2022, 3:19 PM
bolt/lib/Rewrite/DIEBuilder.cpp
120 ↗(On Diff #446737)

When will DwarfContext be null?

180 ↗(On Diff #446737)

Where is UnitDIEId actually used?

304 ↗(On Diff #446737)
382 ↗(On Diff #446737)

Does this need to be a unique_ptr? Can we just store a struct itself?

ayermolo added inline comments.Jul 22 2022, 3:23 PM
bolt/lib/Rewrite/DWARFRewriter.cpp
759

As discussed can you move this to a separate commit + test?

zr33 updated this revision to Diff 447022.Jul 22 2022, 6:53 PM
zr33 marked 17 inline comments as done.

Adopt new Unit ID lookup approach.
fix names

ayermolo added inline comments.Jul 25 2022, 6:19 PM
bolt/lib/Rewrite/DWARFRewriter.cpp
998

Not sure this logic is correct.

The DW_OP_addrx operation has a single operand that encodes an unsigned
15 LEB128

zr33 added inline comments.Jul 25 2022, 10:06 PM
bolt/lib/Rewrite/DWARFRewriter.cpp
998

This part is create a DIE IR for DW_AT_location. In this process, we keep tracking the size of new expressions.

  1. We first get the new Index and calculate the LEB128 format for this index.
  2. Append the opcode into IR.
  3. Append the the LEB128 format index into IR.

After iterating all the expression, update the size of IR.

zr33 updated this revision to Diff 448213.EditedJul 27 2022, 5:05 PM
  1. Rewrite finding TypeDIE part. It used to find by traversing the tree and find according to the Attribute name. However, this is not accurate because a type unit can have multiple Attributes which can be TypeDIE, but there is only 1 can represent the Type Unit. Currently, the TypeDIE is record during IR construction.
  1. Re-calibrate the cross-CU reference in case there are type units in .debug_types section.
  1. clean up code and fix nits.
zr33 edited the summary of this revision. (Show Details)Jul 27 2022, 5:32 PM
zr33 updated this revision to Diff 449730.Aug 3 2022, 11:30 AM
zr33 marked an inline comment as done.

optimize the Type unit part and rebase.

zr33 updated this revision to Diff 450677.Aug 7 2022, 5:05 PM

rebase and add error message for new typeunit search

ayermolo commandeered this revision.Sep 13 2022, 2:54 PM
ayermolo edited reviewers, added: zr33; removed: ayermolo.

Taking over this from @zr33 as his internship has ended.

zr33 removed a reviewer: zr33.Sep 14 2022, 5:03 PM
ayermolo updated this revision to Diff 460508.Sep 15 2022, 2:15 PM

rebase, nits. Future changes will be on subsequent commits.

ayermolo retitled this revision from [DWARF][BOLT] Implement new mechanism for DWARFRewriter to [BOLT][DWARF] Implement new mechanism for DWARFRewriter.Jun 1 2023, 10:38 AM
ayermolo edited the summary of this revision. (Show Details)Jun 1 2023, 10:43 AM
ayermolo updated this revision to Diff 527516.Jun 1 2023, 11:07 AM

fixed Optional, after rebase

Let's add a better summary with the description of the new rewriter classes.

ayermolo updated this revision to Diff 533342.Jun 21 2023, 11:09 AM

rebase + fix for upstream change https://reviews.llvm.org/D147270

ayermolo updated this revision to Diff 534719.Jun 26 2023, 12:57 PM

rebase + moved DIEBuilder under Core

ayermolo edited the summary of this revision. (Show Details)Jul 5 2023, 4:52 PM
maksfb accepted this revision.Jul 6 2023, 12:02 PM
This revision is now accepted and ready to land.Jul 6 2023, 12:02 PM
ayermolo closed this revision.Jul 6 2023, 4:58 PM

Hi,

This commit introduces a null pointer dereference in a unit test, so I'm going to revert it together with all dependent commits:

$ lldb -- ./unittests/DebugInfo/DWARF/DebugInfoDWARFTests --gtest_filter=DWARFDebugInfo.TestDwarfToFunctions
This is google-lldb.
Help: http://go/lldb. File a bug: http://go/lldb-bug.
(lldb) target create "./unittests/DebugInfo/DWARF/DebugInfoDWARFTests"
Current executable set to '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64).
(lldb) settings set -- target.run-args  "--gtest_filter=DWARFDebugInfo.TestDwarfToFunctions"
(lldb) r
Process 2771099 launched: '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64)
Note: Google Test filter = DWARFDebugInfo.TestDwarfToFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from DWARFDebugInfo
[ RUN      ] DWARFDebugInfo.TestDwarfToFunctions
Process 2771099 stopped
* thread #1, name = 'DebugInfoDWARFT', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x20)
    frame #0: 0x0000555555c0d41c DebugInfoDWARFTests`llvm::DWARFUnitHeader::getVersion(this=0x0000000000000018) const at DWARFUnit.h:89:33
   86     bool applyIndexEntry(const DWARFUnitIndex::Entry *Entry);
   87     uint64_t getOffset() const { return Offset; }
   88     const dwarf::FormParams &getFormParams() const { return FormParams; }
-> 89     uint16_t getVersion() const { return FormParams.Version; }
   90     dwarf::DwarfFormat getFormat() const { return FormParams.Format; }
   91     uint8_t getAddressByteSize() const { return FormParams.AddrSize; }
   92     uint8_t getRefAddrByteSize() const { return FormParams.getRefAddrByteSize(); }
(lldb) up
frame #1: 0x000055555810bc59 DebugInfoDWARFTests`llvm::DWARFUnit::getVersion(this=0x0000000000000000) const at DWARFUnit.h:323:47
   320    const dwarf::FormParams &getFormParams() const {
   321      return Header.getFormParams();
   322    }
-> 323    uint16_t getVersion() const { return Header.getVersion(); }
   324    uint8_t getAddressByteSize() const { return Header.getAddressByteSize(); }
   325    uint8_t getRefAddrByteSize() const { return Header.getRefAddrByteSize(); }
   326    uint8_t getDwarfOffsetByteSize() const {
(lldb) up
frame #2: 0x00005555581953a3 DebugInfoDWARFTests`llvm::DWARFFormValue::getAsSectionedAddress(Value=0x00007fffffffd2e8, Form=0, U=0x0000000000000000) at DWARFFormValue.cpp:642:51
   639
   640  std::optional<object::SectionedAddress> DWARFFormValue::getAsSectionedAddress(
   641      const ValueType &Value, const dwarf::Form Form, const DWARFUnit *U) {
-> 642    if (!doesFormBelongToClass(Form, FC_Address, U->getVersion()))
   643      return std::nullopt;
   644    bool AddrOffset = Form == dwarf::DW_FORM_LLVM_addrx_offset;
   645    if (Form == DW_FORM_GNU_addr_index || Form == DW_FORM_addrx ||
(lldb) p U
(const llvm::DWARFUnit *) nullptr

Please also note that https://github.com/llvm/llvm-project/commit/460a2244430fae192298a5fd9fa2a269e540e8c1 was landed with an incorrect Phabricator link.

This commit also needed numerous fixup commits, which is a good signal that this commit chain needs better testing before landing.

This comment was removed by ayermolo.

Hi,

This commit introduces a null pointer dereference in a unit test, so I'm going to revert it together with all dependent commits:

$ lldb -- ./unittests/DebugInfo/DWARF/DebugInfoDWARFTests --gtest_filter=DWARFDebugInfo.TestDwarfToFunctions
This is google-lldb.
Help: http://go/lldb. File a bug: http://go/lldb-bug.
(lldb) target create "./unittests/DebugInfo/DWARF/DebugInfoDWARFTests"
Current executable set to '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64).
(lldb) settings set -- target.run-args  "--gtest_filter=DWARFDebugInfo.TestDwarfToFunctions"
(lldb) r
Process 2771099 launched: '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64)
Note: Google Test filter = DWARFDebugInfo.TestDwarfToFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from DWARFDebugInfo
[ RUN      ] DWARFDebugInfo.TestDwarfToFunctions
Process 2771099 stopped
* thread #1, name = 'DebugInfoDWARFT', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x20)
    frame #0: 0x0000555555c0d41c DebugInfoDWARFTests`llvm::DWARFUnitHeader::getVersion(this=0x0000000000000018) const at DWARFUnit.h:89:33
   86     bool applyIndexEntry(const DWARFUnitIndex::Entry *Entry);
   87     uint64_t getOffset() const { return Offset; }
   88     const dwarf::FormParams &getFormParams() const { return FormParams; }
-> 89     uint16_t getVersion() const { return FormParams.Version; }
   90     dwarf::DwarfFormat getFormat() const { return FormParams.Format; }
   91     uint8_t getAddressByteSize() const { return FormParams.AddrSize; }
   92     uint8_t getRefAddrByteSize() const { return FormParams.getRefAddrByteSize(); }
(lldb) up
frame #1: 0x000055555810bc59 DebugInfoDWARFTests`llvm::DWARFUnit::getVersion(this=0x0000000000000000) const at DWARFUnit.h:323:47
   320    const dwarf::FormParams &getFormParams() const {
   321      return Header.getFormParams();
   322    }
-> 323    uint16_t getVersion() const { return Header.getVersion(); }
   324    uint8_t getAddressByteSize() const { return Header.getAddressByteSize(); }
   325    uint8_t getRefAddrByteSize() const { return Header.getRefAddrByteSize(); }
   326    uint8_t getDwarfOffsetByteSize() const {
(lldb) up
frame #2: 0x00005555581953a3 DebugInfoDWARFTests`llvm::DWARFFormValue::getAsSectionedAddress(Value=0x00007fffffffd2e8, Form=0, U=0x0000000000000000) at DWARFFormValue.cpp:642:51
   639
   640  std::optional<object::SectionedAddress> DWARFFormValue::getAsSectionedAddress(
   641      const ValueType &Value, const dwarf::Form Form, const DWARFUnit *U) {
-> 642    if (!doesFormBelongToClass(Form, FC_Address, U->getVersion()))
   643      return std::nullopt;
   644    bool AddrOffset = Form == dwarf::DW_FORM_LLVM_addrx_offset;
   645    if (Form == DW_FORM_GNU_addr_index || Form == DW_FORM_addrx ||
(lldb) p U
(const llvm::DWARFUnit *) nullptr

Please also note that https://github.com/llvm/llvm-project/commit/460a2244430fae192298a5fd9fa2a269e540e8c1 was landed with an incorrect Phabricator link.

This commit also needed numerous fixup commits, which is a good signal that this commit chain needs better testing before landing.

Please specify OS and architecture that this is failing for. Not sure what OS version this was failing for you on?

Hi,

This commit introduces a null pointer dereference in a unit test, so I'm going to revert it together with all dependent commits:

$ lldb -- ./unittests/DebugInfo/DWARF/DebugInfoDWARFTests --gtest_filter=DWARFDebugInfo.TestDwarfToFunctions
This is google-lldb.
Help: http://go/lldb. File a bug: http://go/lldb-bug.
(lldb) target create "./unittests/DebugInfo/DWARF/DebugInfoDWARFTests"
Current executable set to '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64).
(lldb) settings set -- target.run-args  "--gtest_filter=DWARFDebugInfo.TestDwarfToFunctions"
(lldb) r
Process 2771099 launched: '/usr/local/google/home/dmitrig/llvm2/build-debug/unittests/DebugInfo/DWARF/DebugInfoDWARFTests' (x86_64)
Note: Google Test filter = DWARFDebugInfo.TestDwarfToFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from DWARFDebugInfo
[ RUN      ] DWARFDebugInfo.TestDwarfToFunctions
Process 2771099 stopped
* thread #1, name = 'DebugInfoDWARFT', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x20)
    frame #0: 0x0000555555c0d41c DebugInfoDWARFTests`llvm::DWARFUnitHeader::getVersion(this=0x0000000000000018) const at DWARFUnit.h:89:33
   86     bool applyIndexEntry(const DWARFUnitIndex::Entry *Entry);
   87     uint64_t getOffset() const { return Offset; }
   88     const dwarf::FormParams &getFormParams() const { return FormParams; }
-> 89     uint16_t getVersion() const { return FormParams.Version; }
   90     dwarf::DwarfFormat getFormat() const { return FormParams.Format; }
   91     uint8_t getAddressByteSize() const { return FormParams.AddrSize; }
   92     uint8_t getRefAddrByteSize() const { return FormParams.getRefAddrByteSize(); }
(lldb) up
frame #1: 0x000055555810bc59 DebugInfoDWARFTests`llvm::DWARFUnit::getVersion(this=0x0000000000000000) const at DWARFUnit.h:323:47
   320    const dwarf::FormParams &getFormParams() const {
   321      return Header.getFormParams();
   322    }
-> 323    uint16_t getVersion() const { return Header.getVersion(); }
   324    uint8_t getAddressByteSize() const { return Header.getAddressByteSize(); }
   325    uint8_t getRefAddrByteSize() const { return Header.getRefAddrByteSize(); }
   326    uint8_t getDwarfOffsetByteSize() const {
(lldb) up
frame #2: 0x00005555581953a3 DebugInfoDWARFTests`llvm::DWARFFormValue::getAsSectionedAddress(Value=0x00007fffffffd2e8, Form=0, U=0x0000000000000000) at DWARFFormValue.cpp:642:51
   639
   640  std::optional<object::SectionedAddress> DWARFFormValue::getAsSectionedAddress(
   641      const ValueType &Value, const dwarf::Form Form, const DWARFUnit *U) {
-> 642    if (!doesFormBelongToClass(Form, FC_Address, U->getVersion()))
   643      return std::nullopt;
   644    bool AddrOffset = Form == dwarf::DW_FORM_LLVM_addrx_offset;
   645    if (Form == DW_FORM_GNU_addr_index || Form == DW_FORM_addrx ||
(lldb) p U
(const llvm::DWARFUnit *) nullptr

Please also note that https://github.com/llvm/llvm-project/commit/460a2244430fae192298a5fd9fa2a269e540e8c1 was landed with an incorrect Phabricator link.

This commit also needed numerous fixup commits, which is a good signal that this commit chain needs better testing before landing.

That is why it's part of the stack. Original diff was done by someone else. I commandeered it. All the fixes are in subsequent commit to separate work done. Thus I didn't squash into one commit.
I tried it locally, and for me it passes. Either on full stack or on this diff.

Although looking at stack trace I see where it goes wrong.
I can change
if (!doesFormBelongToClass(Form, FC_Address, U->getVersion()))
to
if (!U&&!isFormClass(FC_Address) || !doesFormBelongToClass(Form, FC_Address, U->getVersion()))
which should restore to old behavior in case U is nullptr.

OK, I was able to repro on mac.

ayermolo reopened this revision.Jul 7 2023, 4:56 PM
This revision is now accepted and ready to land.Jul 7 2023, 4:56 PM
ayermolo updated this revision to Diff 538302.Jul 7 2023, 5:24 PM

fixed lldb nullpr exception + rebase

This revision was landed with ongoing or failed builds.Jul 10 2023, 2:42 PM
This revision was automatically updated to reflect the committed changes.