This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Split TypeServerSource and extend type index map lifetime
ClosedPublic

Authored by rnk on Sep 15 2020, 7:03 PM.

Details

Summary

Extending the lifetime of these type index mappings does increase memory
usage (+2% in my case), but it decouples type merging from symbol
merging. This is a pre-requisite for two changes that I have in mind:

  • parallel type merging: speeds up slow type merging
  • defered symbol merging: avoid heap allocating (relocating) all symbols

This eliminates CVIndexMap and moves its data into TpiSource. The maps
are also split into a SmallVector and ArrayRef component, so that the
ipiMap can alias the tpiMap for /Z7 object files, and so that both maps
can simply alias the PDB type server maps for /Zi files.

Splitting TypeServerSource establishes that all input types to be merged
can be identified with two 32-bit indices:

  • The index of the TpiSource object
  • The type index of the record

This is useful, because this information can be stored in a single
64-bit atomic word to enable concurrent hashtable insertion.

Diff Detail

Unit TestsFailed

Event Timeline

rnk created this revision.Sep 15 2020, 7:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2020, 7:03 PM
Herald added subscribers: jfb, arphaman. · View Herald Transcript
rnk requested review of this revision.Sep 15 2020, 7:03 PM

I like a lot the change, it makes the API and the ownership more clear. But I am wondering: why not simply remove the CVIndexMap type? Do we need this extra indirection now? We could simply use a TpiSource API, instead of doing the if (indexMap->isTypeServerMap) in PDB.cpp? e.g. class TpiSource { ... virtual ArrayRef<TypeIndex> indexMap()/typeMap(); }.

rnk updated this revision to Diff 292351.Sep 16 2020, 2:31 PM
  • eliminate CVIndexMap entirely
rnk updated this revision to Diff 292362.Sep 16 2020, 3:14 PM
  • remove debugging noinline
rnk added a comment.Sep 16 2020, 3:16 PM

Thanks, looks like I can remove it. I made things more uniform by having ipiMap always point to something, so if we have an item reference, always use ipiMap.

rnk retitled this revision from [PDB] Split TypeServerSource and extend CVIndexMap lifetime to [PDB] Split TypeServerSource and extend type index map lifetime.Sep 16 2020, 3:16 PM
rnk edited the summary of this revision. (Show Details)

Just a few more things and we should be good to go:

lld/COFF/PDB.cpp
191–192

Can we only have TpiSource * here and infer file?

276

Don't we always expect a source at this point?

755

source isn't checked here. Perhaps assert(source) in the DebugSHandler constructor?

921–923

...otherwise if you don't want the if, it is worth adding a line to explain in which situation the source doesn't have a file.

923

No need to pass down both file and source.

931

I liked it better with this if as it is self-explantory. Is there another situation where we should return here? The TypeServerIpiSource is never returned, right?

rnk updated this revision to Diff 292564.Sep 17 2020, 10:39 AM
  • Ensure all objs with symbols have TpiSources
lld/COFF/PDB.cpp
191–192

Almost, but there are ObjFiles which lack a corresponding TpiSource, and we must add their symbols. So TpiSource is optional (may be null) and ObjFile is required (reference). I checked, and this case is tested in tree.

276

I wish. Perhaps we should fix this by making a source for all objs, even if they lack types to avoid the null check.

755

Huh, I thought I added a check above and had it warn and return. Must have gotten lost.

921–923

I can make the comment more explicit: PDB sources don't have symbols and don't have an object file, so only add symbols if the source is for an object file.

923

I think we can eliminate that if we make empty TpiSources for objects with symbols but no types, I'll try that.

931

Actually, we do come here for IPI sources. We iterate over all TpiSource objects and call addDebug on them, so if we keep this if, it will need two conditions. IMO the new if handles it more neatly: only object files have symbols, so if this is a source for an object file, add symbols.

aganea accepted this revision.Sep 17 2020, 11:30 AM

Looks great, thanks!

lld/COFF/InputFiles.cpp
793 ↗(On Diff #292564)

I am fine without this, I simply forgot about this case, where we had symbols but no types. Would that affect your tpi-index in D87805? This empty TpiSource will also receive an index and will be scheduled for doing, essentially, nothing? I am fine if you want to revert to an extra if below, whichever you think is the best.

lld/COFF/PDB.cpp
931

Agreed.

This revision is now accepted and ready to land.Sep 17 2020, 11:30 AM
rnk added a comment.Sep 17 2020, 11:36 AM

Thanks!

lld/COFF/InputFiles.cpp
793 ↗(On Diff #292564)

I think I like this version better: it's one less nullable pointer to check for later. There aren't many of these object files without types, they are mostly assembly files .debug$F FPO data, so the extra memory usage shouldn't be a concern.

This revision was landed with ongoing or failed builds.Sep 17 2020, 11:53 AM
This revision was automatically updated to reflect the committed changes.