This is an archive of the discontinued LLVM Phabricator instance.

Do not disable completely loop unroll when optimizing for size.
ClosedPublic

Authored by mamai on May 2 2016, 1:15 PM.

Details

Summary

By disabling completely the loop unroll when optimizing for size (/Os), the #pragma unroll have no effect at that optimization level.

This contradicts the paragraph in an llvm blog post about the loop pragmas (http://blog.llvm.org/2014/11/loop-vectorization-diagnostics-and.html) saying the following:

For example, when compiling for size (-Os) it's a good idea to vectorize the hot loops of the application to improve performance. Vectorization, interleaving, and unrolling can be explicitly specified using the #pragma clang loop directive prior to any for, while, do-while, or c++11 range-based for loop.

Also, as explained in a previous commit, the loop unroll pass already have the logic to handle optimizing for size (http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20130805/085399.html).

Diff Detail

Repository
rL LLVM

Event Timeline

mamai updated this revision to Diff 55866.May 2 2016, 1:15 PM
mamai retitled this revision from to Do not disable completely loop unroll according to optimization level..
mamai updated this object.
mamai added a reviewer: chandlerc.
mamai set the repository for this revision to rL LLVM.
mamai added a subscriber: cfe-commits.

I believe the LLVM blog post is in error. Loop vectorization commonly generates two versions of the loop: vectorized and scalar. The scalar loop is necessary to handle the case where the trip count isn't evenly divisible by the vectorization factor. Therefore, it's reasonable to disable these type of optimizations (e.g., vectorization, unrolling) when we're optimizing for size. It's also reasonable to disable these optimizations at -O0.

However, I also understand the argument being made by Chandler. Can you please create an LLVM patch the shows the loop unroll pass respects the equivalent -Os/-Oz/-O0 in LLVM IR?

I believe the first two are handled by checking Function::optForSize() and for the latter you can check the function for the optnone attribute.

I think the blog comment is right. The pragma should make the loop unroll even in /Os. I think it is essential to allow the user to optimize some specific loops even if he generally wants to optimize for size the rest of the code. I will add tests that show the behavior of the loop unroll pass when optnone or optsize are specified.

I think the blog comment is right. The pragma should make the loop unroll even in /Os. I think it is essential to allow the user to optimize some specific loops even if he generally wants to optimize for size the rest of the code. I will add tests that show the behavior of the loop unroll pass when optnone or optsize are specified.

You're correct. No more reviewing patches after my bedtime. :)

mamai updated this revision to Diff 56133.May 4 2016, 6:09 AM
mamai retitled this revision from Do not disable completely loop unroll according to optimization level. to Do not disable completely loop unroll when optimizing for size..
mamai updated this object.

Modified the patch not to affect /O1 optimization level.

rnk accepted this revision.May 4 2016, 8:11 AM
rnk added a reviewer: rnk.
rnk added a subscriber: rnk.

lgtm

To be clear, loop unrolling just lowers its size threshold when -Os is on:

// Apply size attributes
if (L->getHeader()->getParent()->optForSize()) {
  UP.Threshold = UP.OptSizeThreshold;
  UP.PartialThreshold = UP.PartialOptSizeThreshold;
}

So this does more than enabling loop unrolling when pragmas are present. However, it that behavior is wrong then we should fix it in LLVM.

This revision is now accepted and ready to land.May 4 2016, 8:11 AM
This revision was automatically updated to reflect the committed changes.