This is an archive of the discontinued LLVM Phabricator instance.

[FoldingSet] Move compare to header (NFC).
Needs ReviewPublic

Authored by fhahn on Dec 7 2022, 1:00 PM.

Details

Summary

Moving the compare operator implementations to the header gives a slight
speedup. Not sure if the speedups are worthwhile, but I noticed this
when looking at folding set performance in general.

NewPM-O3: -0.04%
NewPM-ReleaseThinLTO: -0.03%
NewPM-ReleaseLTO-g: -0.01

https://llvm-compile-time-tracker.com/compare.php?from=8fa81cfad47c827ea1b75afe1f32861d4a7bad37&to=3c5e5ebd834106e62407e79d80ced0a49eafc4b7&stat=instructions:u

Diff Detail

Event Timeline

fhahn created this revision.Dec 7 2022, 1:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2022, 1:00 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
fhahn requested review of this revision.Dec 7 2022, 1:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2022, 1:00 PM

Out of curiosity: do you have any speedup when building with LTO? Looks like something LTO should handle gracefully...

fhahn added a comment.Dec 8 2022, 8:39 AM

Out of curiosity: do you have any speedup when building with LTO? Looks like something LTO should handle gracefully...

Sure, this is not needed when building with LTO, but many builds don't use LTO :)

yurai007 added a comment.EditedDec 9 2022, 3:00 AM

looking at folding set performance in general

Just as a side note regarding further improvements in future - when I was working on related patches some months ago I noted 2 extra ideas which could be worth exploring:

[1] faster hashing related to FoldingSet* classes family. Symbols like llvm::FoldingSetBase::FindNodeOrInsertPos and llvm::hashing::detail::hash_short are often seen as hot in profiler output (at least were - couple of months ago). There is long-standing task in my queue to try xxHash or even CRC just for FoldingSet and see if it helps.

[2] removing (somehow) indirection related to FoldingSet::NodeEquals in context of following callers: FunctionProtoTypes/PointerTypes/ElaboratedTypes/ParenTypes Profile functions. IIRC that indirection made inlining impossible which mattered for projects with compile time highly dominated by frontend. Moving compare operators in this change may help for NodeEquals but I believe there is still some room for improvement.

I'm not sure about [2], but I think that [1] should bring some value even for builds with LTO=ON.