This is an archive of the discontinued LLVM Phabricator instance.

[llvm-reduce] optimize extractFromModule functions
ClosedPublic

Authored by dwightguth on Oct 28 2021, 2:10 PM.

Details

Summary

The extractBasicBlocksFromModule, extractInstrFromModule, and other
similar functions previously performed very poorly when the number of
such elements in the program to reduce was very high. Previously, we
were creating the set which caches elements to keep by looping through
all elements in the module and adding them to the set. However, since
std::set is an ordered set, this introduces a massive amount of
rebalancing if the order of elements in the program and the order of
their pointers in memory are not the same.

The solution is straightforward: first put all the elements to be kept
in a vector, then use the constructor for std::set which takes a pair of
iterators over a collection. This constructor is optimized to avoid
doing unnecessary work when initializing large sets.

Also in this change, we pass BBsToKeep set to functions
replaceBranchTerminator and removeUninterestingBBsFromSwitch as a const
reference rather than passing it by value. This ought to prevent the
need to copy the collection each time these functions are called, which
is expensive if the collection is large.

Diff Detail

Event Timeline

dwightguth requested review of this revision.Oct 28 2021, 2:10 PM
dwightguth created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2021, 2:10 PM

removed stray change made by mistake

does std::unordered_set help?

I tried changing it to an unordered set without any of the other changes originally, but it still had the same problem of a lot of time spent inserting into the hash table. I eventually abandoned that approach because my intuition was that a hash table where the keys are addresses in memory would probably end up poorly balanced a lot of the time.

If you want I can try out the combination of pushing first into a vector and also creating a hash set from the vector, but I didn't go that route because I had already largely addressed the performance bottleneck in this function with the diff as you see it now, and it didn't seem worth the risk that it might be badly balanced in some cases when the majority of the execution time was now taking place elsewhere.

aeubanks accepted this revision.Oct 28 2021, 3:57 PM

adding some comments saying that this is for performance reasons would be good

This revision is now accepted and ready to land.Oct 28 2021, 3:57 PM

add comments

I addressed your suggestion that we add comments. I do not have commit access yet; can you please commit it if you're happy with the diff?

This revision was landed with ongoing or failed builds.Oct 29 2021, 10:08 AM
This revision was automatically updated to reflect the committed changes.