This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Add a run of LoopUnroll
ClosedPublic

Authored by jmolloy on Jan 8 2016, 7:29 AM.

Details

Summary

Loop trip counts can often be resolved during LTO. We should obviously be unrolling small loops once those trip counts have been resolved, but we weren't.

This causes no change in a -flto run of the test-suite for me, but the nature of full loop unrolling is that when it triggers, it causes massive changes in runtime. We observe this on third-party test suites.
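To illustrate the summary, here is a minimal sketch of a loop whose trip count only becomes a constant once the caller and callee are visible together, as they would be during LTO. The functions `sumFirst`, `sumQuad`, and `sumQuadUnrolled` are hypothetical names invented for this example and are not part of the patch. Both functions are shown in one file for brevity; imagine them in separate translation units.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library routine, imagined to live in its own translation unit.
// Compiled in isolation, `n` is unknown, so the loop trip count is not a
// compile-time constant and the loop cannot be fully unrolled.
int sumFirst(const int *data, std::size_t n) {
  assert(data != nullptr);
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// Hypothetical caller, imagined to live in a different translation unit.
// Once link-time inlining places the loop here, its trip count is exactly 4.
int sumQuad(const int *data) {
  return sumFirst(data, 4);
}

// Hand-written equivalent of what a full unroll of the inlined loop computes.
int sumQuadUnrolled(const int *data) {
  return data[0] + data[1] + data[2] + data[3];
}
```

Whether a given compiler actually inlines and unrolls this case depends on its cost model. The sketch only shows why the opportunity appears at link time and not earlier.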

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 44329. Jan 8 2016, 7:29 AM
jmolloy retitled this revision to [LTO] Add a run of LoopUnroll.
jmolloy updated this object.
jmolloy added a reviewer: mehdi_amini.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: llvm-commits.
mehdi_amini added inline comments. Jan 8 2016, 8:33 AM
lib/Transforms/IPO/PassManagerBuilder.cpp, line 576

Note: it is inserted before LoopInterchangePass in the regular pipeline?

Also there is a comment before another insertion: "// BBVectorize may have significantly shortened a loop body; unroll again." Could this be valid for the Loop and SLP vectorizers that are inserted below?

hfinkel added a subscriber: hfinkel. Jan 8 2016, 8:45 AM
hfinkel added inline comments.
lib/Transforms/IPO/PassManagerBuilder.cpp, line 576

> Also there is a comment before another insertion: "// BBVectorize may have significantly shortened a loop body; unroll again." Could this be valid for the Loop and SLP vectorizers that are inserted below?

Yes, although for the loop vectorizer there is a competing effect: the loop vectorizer does not vectorize loops with known constant trip counts below some threshold (TinyTripCountVectorThreshold, 16 by default).
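The threshold check described in the comment above can be sketched as a toy predicate. This is a simplified model and not the vectorizer's actual code: the function `passesTinyTripCountCheck` is a hypothetical name, the default of 16 is taken from the comment, and a trip count of 0 is assumed here to stand for "not a known constant".

```cpp
#include <cassert>

// Default value as quoted in the review comment; this constant is a local
// stand-in for illustration, not the real LLVM option.
constexpr unsigned TinyTripCountVectorThreshold = 16;

// Toy model: returns true if a loop would still be considered for
// vectorization. In this sketch, constTripCount == 0 means "unknown".
constexpr bool passesTinyTripCountCheck(
    unsigned constTripCount,
    unsigned threshold = TinyTripCountVectorThreshold) {
  // An unknown trip count cannot be rejected as "tiny".
  if (constTripCount == 0)
    return true;
  // Known constant trip counts below the threshold are not vectorized.
  return constTripCount >= threshold;
}
```

Under this model, a loop that LTO resolves to, say, 8 iterations is skipped by the vectorizer, which is the kind of loop a full unroll is most likely to pick up instead.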

Hi Mehdi,

> Note: it is inserted before LoopInterchangePass in the regular pipeline?

In the regular pipeline there are many runs of loop unrolling. This extra unroll will only unroll loops completely - it'll never do partial unrolling. So the ideal place to do this is when loops are most countable and when trivial loops have been deleted. That's why I put this after LoopDeletion.

I agree that a cleanup Unroll after Vectorize seems a good idea. I'll add that.

James
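The "complete unrolling only" behaviour described in the reply above can be modelled with a small decision function. This is a deliberately simplified sketch and not LLVM's cost model: `decideFullUnrollOnly`, `UnrollAction`, the notion of a body size in abstract units, and the size budget are all assumptions made for illustration.

```cpp
#include <cassert>

// The only two outcomes in this toy model: leave the loop alone or unroll it
// completely. There is intentionally no "Partial" alternative.
enum class UnrollAction { None, Full };

// Toy decision: fully unroll only when the trip count is a known constant
// (0 means unknown in this sketch) and the unrolled body fits a size budget.
UnrollAction decideFullUnrollOnly(unsigned constTripCount, unsigned bodySize,
                                  unsigned sizeBudget) {
  if (constTripCount == 0)
    return UnrollAction::None; // not countable: nothing to do here
  // Widen before multiplying so large inputs cannot overflow.
  unsigned long long unrolledSize =
      static_cast<unsigned long long>(constTripCount) * bodySize;
  return unrolledSize <= sizeBudget ? UnrollAction::Full : UnrollAction::None;
}
```

The model makes the ordering argument concrete: the later in the pipeline the trip count becomes a known constant, the more loops fall into the `Full` case, which is why running after trivial loops have been deleted is attractive.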

jmolloy updated this revision to Diff 44626. Jan 12 2016, 5:44 AM
jmolloy accepted this revision. Jan 14 2016, 6:58 AM
jmolloy added a reviewer: jmolloy.

Mehdi accepted this on IRC on the condition that the first unroll invocation be swapped with LoopInterchange so that the ordering is identical to the non-LTO pipeline.

This revision is now accepted and ready to land. Jan 14 2016, 6:58 AM
jmolloy closed this revision. Jan 14 2016, 7:04 AM

r257767