This is an archive of the discontinued LLVM Phabricator instance.

[NewPM] Don't skip SCCs not in current RefSCC
ClosedPublic

Authored by aeubanks on Mar 17 2022, 2:18 PM.

Details

Summary

With D107249 I saw huge compile time regressions on a module (150s ->
5700s). This turned out to be due to a huge RefSCC in
the module. As we ran the function simplification pipeline on functions
in the SCCs in the RefSCC, some of those SCCs would be split out to
their RefSCC, a child of the current RefSCC. We'd skip the remaining
SCCs in the huge RefSCC because the current RefSCC is now the RefSCC
just split out, then revisit the original huge RefSCC from the
beginning. This happened many times because many functions in the
RefSCC were optimizable to the point of becoming their own RefSCC.

This patch makes it so we don't skip SCCs not in the current RefSCC so
that we split out all the child RefSCCs on the first iteration of
RefSCC. When we split out a RefSCC, we invalidate the original RefSCC
and add the remainder of the SCCs into a new RefSCC in
RCWorklist. This happens repeatedly until we finish visiting all
SCCs, at which point there is only one valid RefSCC in
RCWorklist from the original RefSCC containing all the SCCs that
were not split out, and we visit that.

For example, in the newly added test cgscc-refscc-mutation-order.ll,
we'd previously run instcombine in this order:
f1, f2, f1, f3, f1, f4, f1

Now it's:
f1, f2, f3, f4, f1

This can cause more passes to be run in some specific cases,
e.g. if f1<->f2 gets optimized to f1<-f2, we'd previously run f1, f2;
now we run f1, f2, f2.

This improves kimwitu++ compile times by a lot (12-15% for various -O3 configs):
https://llvm-compile-time-tracker.com/compare.php?from=2371c5a0e06d22b48da0427cebaf53a5e5c54635&to=00908f1d67400cab1ad7bcd7cacc7558d1672e97&stat=instructions

Diff Detail

Event Timeline

aeubanks created this revision.Mar 17 2022, 2:18 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: hiraditya. · View Herald Transcript
aeubanks requested review of this revision.Mar 17 2022, 2:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2022, 2:18 PM
aeubanks planned changes to this revision.Mar 17 2022, 2:23 PM
aeubanks edited the summary of this revision. (Show Details)Mar 17 2022, 2:47 PM
aeubanks edited the summary of this revision. (Show Details)
aeubanks added a reviewer: asbirlea.
nikic added a reviewer: nikic.Mar 17 2022, 3:05 PM
asbirlea accepted this revision.Mar 17 2022, 4:39 PM

As discussed offline, the potential issue with this change is whether we encounter a pathological case in the "opposite direction".

Seeing how so far we have a concrete case where the current traversal order leads to the compile time explosion, and the benchmark in the compiler tracker also benefits from the change, I'm in favor of moving forward with the change and revisit if we encounter a benchmark exhibiting the high compile times after this patch.

This revision is now accepted and ready to land.Mar 17 2022, 4:39 PM
This revision was landed with ongoing or failed builds.Mar 18 2022, 2:16 PM
This revision was automatically updated to reflect the committed changes.