This is an archive of the discontinued LLVM Phabricator instance.

[SimpleLoopUnswitch] Skip non-trivial unswitching of cold functions
ClosedPublic

Authored by drcut on Sep 4 2022, 10:01 AM.

Details

Summary

In the current main branch, all cold loops will not be applied non-trivial unswitch. As reported in D129599, skipping these cold loops will incur regression in SPEC benchmark.
Thus, instead of skipping cold loops, now only skipping loops in cold functions.

Diff Detail

Event Timeline

drcut created this revision.Sep 4 2022, 10:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2022, 10:01 AM
drcut requested review of this revision.Sep 4 2022, 10:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2022, 10:01 AM
This revision is now accepted and ready to land.Sep 6 2022, 6:05 AM
drcut added a comment.Sep 6 2022, 7:37 AM

@alexgatea Would you please also evaluate this patch on SPEC to see whether there are other regressions? Thanks

aeubanks accepted this revision.Sep 6 2022, 10:52 AM

I don't think we need to worry about SPEC too much, just resolve regressions

llvm/test/Transforms/SimpleLoopUnswitch/PGO-nontrivial-unswitch2.ll
72–73

looks like some of these aren't used? I think running this through opt without any passes should remove unused unnamed metadata

I don't think we need to worry about SPEC too much, just resolve regressions

I agree. And anyway, there was only one performance degradation observed due to the original commit, and I verified that this patch fixes it, so it should be all good.

drcut updated this revision to Diff 458306.Sep 6 2022, 4:09 PM

Use opt to remove useless profiling data

This revision was landed with ongoing or failed builds.Sep 6 2022, 4:13 PM
This revision was automatically updated to reflect the committed changes.
drcut marked an inline comment as done.

This patch is provoking some extreme compile time when we have a large cold function with many loop nests, since it repeatedly reanalyzes the coldness of the entire function, which involves walking all BBs (and all instructions in the case of SamplePGO). Based on the example given in D129599 for the SPEC degradation, we should only need to check all loops in the loop nest, not the entire function. I mailed D146383 with that fix.