Page MenuHomePhabricator

[lldb] DWZ 1/9: Pass main DWARFUnit * along DWARFDIEs
Changes PlannedPublic

Authored by jankratochvil on Feb 7 2021, 9:34 PM.

Details

Summary

DWZ needs to know origin CU from where are current DW_TAG_Partial_Unit DIEs for example for current language or to generate UID. One needs to access both symbol files as the one where the DIE really comes from still needs to be used for .debug_abbrev, .debug_str and .debug_line.

All DWZ patches are also applied at: git clone -b dwz git://git.jankratochvil.net/lldb
There is a copy: git clone -b dwz https://github.com/jankratochvil/llvm-project

Diff Detail

Event Timeline

jankratochvil created this revision.Feb 7 2021, 9:34 PM
jankratochvil requested review of this revision.Feb 7 2021, 9:34 PM
jankratochvil retitled this revision from DWZ 01/08: Pass main DWARFUnit * along DWARFDIEs to [lldb] [nfc] DWZ 01/08: Pass main DWARFUnit * along DWARFDIEs.Feb 7 2021, 9:49 PM
jankratochvil retitled this revision from [lldb] [nfc] DWZ 01/08: Pass main DWARFUnit * along DWARFDIEs to [lldb] DWZ 01/08: Pass main DWARFUnit * along DWARFDIEs.Feb 7 2021, 9:53 PM
Harbormaster completed remote builds in B88242: Diff 322025.
labath added a comment.Feb 8 2021, 5:26 AM

I stopped looking after the first file, as the introduction of DWARF-ness to the CompileUnit class is a show-stopper. These classes are supposed to be independent of the actual format used to represent the data, and having them know anything about DWARF breaks that.
What's the reason for that? We already have a way to obtain the DWARFCompileUnit given a (generic) CompileUnit from within DWARF code (SymbolFileDWARF::GetDWARFCompileUnit). Given that your changes are (should be?) inside DWARF code, it's not clear to me why is that not sufficient (?)

jankratochvil planned changes to this revision.Feb 8 2021, 5:59 AM

I will try to update it, I agree with your comment, thanks for the review.
I agree this first patch needs to be decided first before looking at the other parts 02..08.

It breaks testsuite if both dwz and libc++ are installed. That is a bug of dwz itself, not a problem of this dwz-support-for-LLDB: Multifile drops DW_TAG_namespace::DW_AT_export_symbols

jankratochvil retitled this revision from [lldb] DWZ 01/08: Pass main DWARFUnit * along DWARFDIEs to [lldb] DWZ 1/9: Pass main DWARFUnit * along DWARFDIEs.Mar 17 2021, 3:53 PM

OSX compatibility: DWARFCompileUnit::GetMainUnit

Ping, it would be nice to know if there is some plan/schedule for patch review of this series. I understand it is all voluntary, thanks.

I would love to do something about this, but currently I have no idea when I would be able to take on a review of this size. I know it's not what you wanted to hear. Sorry. :(

That's sure fine, thanks for the reply. Great there is at least the plan to review it in the future. I understand you are busy.

If this is going to be plumbed through everywhere (ie: a DWARFDIE is mostly useless without both its lexical DWARFUnit and its imported-into DWARFUnit) then maybe it makes more sense to add this DWARFUnit* to the DWARFBaseDie type alongside the lexical DWARFUnit?

Much of this patch requires all places that took a DWARFDie object now also must take an extra parameter. Is there really no way to have DWARFDie be able to discover the main unit via its contained DWARFUnit? DWARFDie objects are transient, so they can contain more information that just what they currently contain:

class DWARFDie {
  DWARFUnit *U = nullptr;
  const DWARFDebugInfoEntry *Die = nullptr;
};

Can we add a new "DWARFUnit *MU;" to the DWARFDie class? Is there really not way to ask the contained "U" DWARFUnit member for the main unit?

(similar idea)

Is there really not way to ask the contained "U" DWARFUnit member for the main unit?

There is no such way because one DWARFUnit can be used by multiple (in practice by two - DW_AT_language C plus C++) main units. You did ask it in D73206#1837138.

Can we add a new "DWARFUnit *MU;" to the DWARFDie class?

Been there, done that. It is the option 1 from text below. It cannot be applied (it was against the DWZ patch that time), just for an idea how it did work: https://people.redhat.com/jkratoch/dwz-DWARFUnitPair-2019-10-30.patch

 class DWARFBaseDIE {
-  DWARFUnit *m_cu;
+  DWARFUnitPair m_cu;
   DWARFDebugInfoEntry *m_die;
 };
+class DWARFUnitPair {
...
+  DWARFUnit *m_cu;
+  // For non-DWZ setups it is either equal to 'm_cu' or nullptr if 'm_cu' is a DWARFTypeUnit.
+  DWARFCompileUnit *m_main_cu;
+};

It was discussed in D73206 where was originally a different initial comment:


[...] proposing multiple possibilities how to satisfy DWZ's need for DWARFDIE::m_main_cu, could you choose one?

  1. DWARFDIE 16B->24B does not matter as DWARFDIE is not stored anywhere - maybe one could use this patch in such case.
  2. PointerUnion for DWARFDIE::m_cu so that DWARFDIE remains 16B without DWZ.
class DWARFBaseDIE {
  llvm::PointerUnion<DWARFUnit *, DWARFUnitPair *> m_cu;
  DWARFDebugInfoEntry *m_die;
};
class DWARFUnitPair {`
  DWARFUnit *m_cu;
  DWARFCompileUnit *m_main_cu;
};
  1. (removed as I do not find nowadays enough to store just DW_LANG_* instead of the main unit pointer.)

The option 2 is fine as the double dereference performance impact (if any) affects only LLDB run on DWZ systems. And still LLDB with this little performance hit is much faster than the only existing debugger GDB for DWZ systems.

@labath did disagree in D73206#1836018 regarding the merge with LLVM DWARF. I cannot much talk about the merge as I do not know much the LLVM counterpart and so I haven't yet made any plan in my mind how to merge it.

Thanks for the comments.

Just a rebase on top of trunk.

I would be fine with DWARFDie getting bigger to 24B. These objects are used temporarily and not used as part of the storage of all of a DWARFUnit's DIEs. DWARFUnit objects store an array of DWARFDebugInfoEntry objects and those are what we care about not getting bigger.

jankratochvil planned changes to this revision.Apr 23 2021, 12:50 AM

I would be fine with DWARFDie getting bigger to 24B.

Great, so I can return back to 2019.

These objects are used temporarily and not used as part of the storage of all of a DWARFUnit's DIEs.

This is what I also found in D73206. There was originally a different initial comment:


I have found DWARFDIE is mostly used only for parameters and autovariables. I found (by grep -r '<.*DWARF\(Base\|\)DIE' lldb) only two cases of DWARFDIE in permanent structures:

  • DWARFASTParserClang::m_decl_ctx_to_die refactored in this patch
  • ElaboratingDIEIterator::m_worklist but I think it will never grow to any more DWARFDIEs than a few so it is worse to refactor it to DIERef which is slower and due to few entries it does not save any memory space.

For the DWARFASTParserClang::m_decl_ctx_to_die refactoring I will make it a standalone patch in a new series.

DWARFUnit objects store an array of DWARFDebugInfoEntry objects and those are what we care about not getting bigger.

The problem is I still need to increase element size of various indexes, such as:

-  typedef llvm::DenseMap<const DWARFDebugInfoEntry *, clang::Decl *>
+  typedef llvm::DenseMap<std::pair<DWARFUnit *, const DWARFDebugInfoEntry *>, clang::Decl *>
       DIEToDeclMap;

I was thinking I could templatize it so that there are two variants of the code and it would switch to the larger + (a bit) slower variant only when it detects DWZ file. I understand it is bad to affect LLDB performance by the marginally used DWZ feature.