This is an archive of the discontinued LLVM Phabricator instance.

[dsymutil] Mark ODR canonical DIEs after processing the whole CU
Needs ReviewPublic

Authored by JDevlieghere on Dec 14 2022, 5:11 PM.

Details

Reviewers
avl
friss
aprantl
Summary

Currently, we've marking DIEs as ODR canonical while looking for DIEs to keep. The current approach works because we do the marking after the parent walk completes, which corresponds to when we're done processing a chain of DIEs. Given the complexity of the wordlist algorithm this is far from obvious when reading the code. I don't believe this actually has to be part of the algorithm that looks for DIEs to keep. In my opinion it would be less fragile and easier to understand if we did the marking of ODR candidates after processing the whole CU. That's what this patch implements.

Diff Detail

Event Timeline

JDevlieghere created this revision.Dec 14 2022, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2022, 5:11 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
JDevlieghere requested review of this revision.Dec 14 2022, 5:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2022, 5:11 PM
avl added a comment.Dec 15 2022, 7:23 AM

Does it cure the problem from D138176?

avl added a comment.Dec 15 2022, 8:58 AM

DIE could be marked as ODR canonical candidate(and also kept) from the another compile unit. If we do markDIEsAsODRCanonical right after the current unit is processed then we will miss DIE marked from another compile unit. And then this DIE would not be deduplcated.

CU1:
   0x10 DIE type "class1"   // will not be deduplicated and will be left inplace, after CU1 is processed this DIE is not marked as Kept and then is not marked as ODR canonical candidate
CU2:
   DIE ref to 0x10               // 0x10 DIE is marked as Kept but is not marked as ODR canonical candidate
CU3:
  0x40 DIE type "class1"   // will not be deduplicated and will be left inplace, after CU3 is processed this DIE is marked as Kept and is marked as ODR canonical candidate.
  DIE ref to 0x40

If markDIEsAsODRCanonical would be called after all compile uints are processed:

for (auto &CurrentUnit : OptContext.CompileUnits) {
  lookForDIEsToKeep(*OptContext.File.Addresses,
                    OptContext.File.Addresses->getValidAddressRanges(),
                    OptContext.CompileUnits,
                    CurrentUnit->getOrigUnit().getUnitDIE(),
                    OptContext.File, *CurrentUnit, 0);
}
for (auto &CurrentUnit : OptContext.CompileUnits) {
  markDIEsAsODRCanonical(CurrentUnit->getOrigUnit().getUnitDIE(),
                         *CurrentUnit);
}

Then deduplication would not be done for all these compile units, since it is based on ODRCanonical information which is not ready while lookForDIEsToKeep is working:

void DWARFLinker::lookForRefDIEsToKeep(
..
      // If the referenced DIE has a DeclContext that has already been
      // emitted, then do not keep the one in this CU. We'll link to
      // the canonical DIE in cloneDieReferenceAttribute.
      //
      // FIXME: compatibility with dsymutil-classic. UseODR shouldn't
      // be necessary and could be advantageously replaced by
      // ReferencedCU->hasODR() && CU.hasODR().
      //
      // FIXME: compatibility with dsymutil-classic. There is no
      // reason not to unique ref_addr references.
      if (AttrSpec.Form != dwarf::DW_FORM_ref_addr &&
          isODRAttribute(AttrSpec.Attr) && Info.Ctxt &&
          Info.Ctxt->hasCanonicalDIE())  // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        continue;    
...