This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstrRef][NFC] Use DenseMaps where we can, use more efficient ValueIDNum repr
Closed, Public

Authored by jmorse on Oct 22 2021, 10:08 AM.

Details

Summary

This is a "misc" patch, making various tweaks to speed InstrRefBasedLDV up. I've been testing it on the compile-time-tracking project website, and it gets back something like 2% of instructions retired. There's still a non-trivial cost to instruction referencing; this patch eases the matter. Things done:

  • Adjusting the default sizes of some densemaps / smallvectors, and calling reserve() early when we know how large they should be,
  • Allowing LocIdx (the sequential location numbers) to be densemap keys,
  • Allowing ValueIDNums to be densemap keys (see the DenseMapInfo sketch after this list),
  • Adjusting the representation of ValueIDNum so that its bitfields share a union with a uint64_t.
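
As referenced above, here is a minimal sketch of what letting a custom value type act as a DenseMap key involves: a DenseMapInfo specialization supplying empty and tombstone keys, a hash, and an equality check. The names (MyValueID, its Packed field, the sentinel values) are illustrative stand-ins, not the actual InstrRefBasedImpl definitions.

```cpp
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseMapInfo.h"
#include <cstdint>

// Illustrative value type standing in for ValueIDNum / LocIdx.
struct MyValueID {
  uint64_t Packed; // All fields packed into one word.
  bool operator==(const MyValueID &O) const { return Packed == O.Packed; }
};

namespace llvm {
template <> struct DenseMapInfo<MyValueID> {
  // Two sentinel values the map reserves for empty and erased buckets; they
  // must never collide with a real key.
  static inline MyValueID getEmptyKey() { return {~0ULL}; }
  static inline MyValueID getTombstoneKey() { return {~0ULL - 1ULL}; }
  static unsigned getHashValue(const MyValueID &V) {
    return DenseMapInfo<uint64_t>::getHashValue(V.Packed);
  }
  static bool isEqual(const MyValueID &A, const MyValueID &B) {
    return A == B;
  }
};
} // end namespace llvm

// Usage: reserve() up front when the final size is known, avoiding rehashes.
void buildMap(unsigned NumEntries) {
  llvm::DenseMap<MyValueID, unsigned> ValueToSlot;
  ValueToSlot.reserve(NumEntries);
}
```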

The latter is the most interesting: ValueIDNum is supposed to be a value type, but clang doesn't condense the comparison functions down to a single value comparison. Maybe I was doing something wrong, but explicitly placing the bitfields in a union with a uint64_t lets us do that single comparison manually.
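
A hedged sketch of that union trick follows; the field names and widths are illustrative, not the real ValueIDNum layout.

```cpp
#include <cstdint>

// Illustrative only: bitfields and a single 64-bit integer overlap in a
// union, so equality and ordering compare one word instead of three fields.
// Reading the whole-word view after writing the bitfields leans on the usual
// union-aliasing behaviour compilers provide, as the summary describes.
class PackedValueID {
  union {
    struct {
      uint64_t BlockNo : 20; // Defining basic block.
      uint64_t InstNo : 20;  // Instruction number within the block.
      uint64_t LocNo : 24;   // Machine location number.
    } Fields;
    uint64_t AsU64; // Whole-value view; the three widths sum to 64 bits.
  } U;

public:
  PackedValueID(uint64_t Block, uint64_t Inst, uint64_t Loc) {
    U.Fields.BlockNo = Block;
    U.Fields.InstNo = Inst;
    U.Fields.LocNo = Loc;
  }

  // Single integer comparisons the compiler has no trouble folding.
  bool operator==(const PackedValueID &O) const {
    return U.AsU64 == O.U.AsU64;
  }
  bool operator<(const PackedValueID &O) const {
    return U.AsU64 < O.U.AsU64;
  }

  uint64_t asU64() const { return U.AsU64; }
};
```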

Replacing other containers with dense ones is fairly straightforward. I have to introduce an intermediate variable during an assignment, because DenseMap[A] = DenseMap[B] isn't always safe if one of those operator[] calls causes the container to reallocate.
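
To illustrate the hazard (with placeholder names, not code from the patch): the reference produced by operator[] on the right-hand side can be left dangling when the left-hand operator[] has to grow the table, so copying through a local is the safe pattern.

```cpp
#include "llvm/ADT/DenseMap.h"

void copyEntry(llvm::DenseMap<unsigned, unsigned> &Map, unsigned DstKey,
               unsigned SrcKey) {
  // Unsafe: Map[DstKey] may insert a new element and rehash, invalidating the
  // reference returned by Map[SrcKey].
  //   Map[DstKey] = Map[SrcKey];

  // Safe: read the source value into a local first, then assign.
  unsigned Tmp = Map.lookup(SrcKey); // value, or default if SrcKey is absent
  Map[DstKey] = Tmp;
}
```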

Strangest of all: replacing the std::map in TransferTracker::loadInLocs with a DenseMap causes a performance loss, so I haven't done it. I still need to dig into why std::map is faster in this case.

Diff Detail

Event Timeline

jmorse created this revision. Oct 22 2021, 10:08 AM
jmorse requested review of this revision. Oct 22 2021, 10:08 AM
Herald added a project: Restricted Project. Oct 22 2021, 10:08 AM
Orlando accepted this revision. Oct 25 2021, 5:35 AM

LGTM - please clang-format.

Strangest of all: replacing the std::map in TransferTracker::loadInLocs with a DenseMap causes a performance loss

Some maybe-useless speculation: Perhaps the current ValueIDNum hash is poor / causing a high number of collisions? std::map (as opposed to DenseMap and std::unordered_map) is ordered. Maybe in this case, the ordered search is quicker than a hash table lookup, or something like that?

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
119

backed -> packed?

1029

Curious if you have tried experimenting with this hash function at all?

This revision is now accepted and ready to land. Oct 25 2021, 5:35 AM
jmorse added inline comments. Oct 25 2021, 5:53 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
1029

I tried the standard llvm hash_code, which caused CTMark to be slower. Looking at the assembly, hash_code adds three or four instructions, and that's all. This comparison is for the transition from hash_code to no hash_code, and shows a speedup:

http://llvm-compile-time-tracker.com/compare.php?from=d6553eab88c71e1d797163295750f2d1bc3d86ba&to=cfe24102d48687ccb7fbcc8311488bb3a87010bb&stat=instructions

IMO, the distribution is (seemingly) good enough already.
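
For concreteness, here is a hedged sketch of the two kinds of hash being compared above. The function names are placeholders, the packed-uint64_t representation is taken from the summary, and the exact hash the patch keeps may differ in detail.

```cpp
#include "llvm/ADT/DenseMapInfo.h"
#include "llvm/ADT/Hashing.h"
#include <cstdint>

// Variant that was tried and reverted: run the packed word through
// llvm::hash_value, which costs a few extra instructions per lookup.
unsigned hashViaHashCode(uint64_t Packed) {
  return static_cast<unsigned>(llvm::hash_value(Packed));
}

// Variant of the kind that was kept: hash the packed word directly with the
// plain integer DenseMapInfo hash, which the linked measurements show is
// cheaper while distributing well enough in practice.
unsigned hashPackedWord(uint64_t Packed) {
  return llvm::DenseMapInfo<uint64_t>::getHashValue(Packed);
}
```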

This revision was landed with ongoing or failed builds. Oct 25 2021, 10:08 AM
This revision was automatically updated to reflect the committed changes.