This is an archive of the discontinued LLVM Phabricator instance.

[GlobalDCE] Use DenseMap instead of unordered_multimap for GVDependencies.
ClosedPublic

Authored by mzolotukhin on Oct 13 2017, 5:54 PM.

Details

Summary

std::unordered_multimap happens to be very slow when the number of elements
grows large. On one of our internal applications we observed a 17x compile time
improvement from changing it to DenseMap.

Diff Detail

Repository
rL LLVM

Event Timeline

mzolotukhin created this revision.Oct 13 2017, 5:54 PM
davide edited edge metadata.Oct 13 2017, 6:00 PM

Generally switching to generic containers to LLVM ones lead to improvements, but 17x seems quite large to me (not complaining).
I wonder if you can add a test (maybe to CTMark?) to make sure this doesn't regress.

Generally switching to generic containers to LLVM ones lead to improvements, but 17x seems quite large to me (not complaining).

17x probably comes from the test being huge: when we're building with LTO, we're getting a ~50Mb bitcode file.

I wonder if you can add a test (maybe to CTMark?) to make sure this doesn't regress.

I thought about that too. But I thought maybe we should create a compile time regression test-suite instead, which will be separate from CTMark. That is, I definitely agree that it'd be great to be able to check if tests don't regress, but I would also like to keep CTMark runs fast. It's discussable though.
Another problem with this particular test is that I cannot disclose it as is, but I guess if we just generate a huge file with tons of globals and functions that would also do.

Michael

Hi,

Is the patch ok to commit?

Michael

davide accepted this revision.Oct 17 2017, 3:13 PM
davide added inline comments.
include/llvm/Transforms/IPO/GlobalDCE.h
38 ↗(On Diff #119002)

Why 4 ?

This revision is now accepted and ready to land.Oct 17 2017, 3:13 PM

Thanks!

include/llvm/Transforms/IPO/GlobalDCE.h
38 ↗(On Diff #119002)

No particular reason. Any advice on a more scientific option to choose this value? :)

When it comes to this I generally try many of them until I find something reasonable. I'm pretty sure 4 is a reasonable value, but I was wondering whether you actually ran some tests on some different values before picking "the best" (for some definition of). Unlikely it's going to matter in any way, FWIW.

This revision was automatically updated to reflect the committed changes.