This is an archive of the discontinued LLVM Phabricator instance.

LoopUnroll: respect pragma unroll when AllowRemainder is disabled
Closed, Public

Authored by yaxunl on Feb 27 2018, 9:38 AM.

Details

Summary

Currently, when AllowRemainder is disabled, the pragma unroll count is
not respected even when there is no remainder. This bug causes a loop to
be fully unrolled in many cases even though the user specifies an unroll
count. It especially affects OpenCL/CUDA, since in many cases a loop
contains convergent instructions, and AllowRemainder is currently
disabled for such loops.

This patch is related to https://reviews.llvm.org/D43594, but that one
deals with situations where a remainder exists.
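
The condition described above can be sketched as a small standalone predicate. This is an illustration only, not the code in LoopUnrollPass.cpp: the function name canHonorPragmaCount and its parameters are hypothetical, and it assumes TripMultiple is a value known to divide the loop's trip count (1 when nothing better is known).

```cpp
#include <cassert>

// Hypothetical helper, not taken from the LLVM sources.
// Returns true if a pragma-requested unroll count can be used as-is.
//   PragmaCount    - count requested by the user, e.g. via "#pragma unroll 4"
//   TripMultiple   - value assumed to evenly divide the trip count (1 if unknown)
//   AllowRemainder - whether a remainder loop may be generated
bool canHonorPragmaCount(unsigned PragmaCount, unsigned TripMultiple,
                         bool AllowRemainder) {
  assert(PragmaCount > 0 && TripMultiple > 0 && "counts must be positive");
  // If a remainder loop is permitted, any count can be honored.
  if (AllowRemainder)
    return true;
  // Without a remainder loop, the count is only safe when it evenly divides
  // the trip multiple, so that no leftover iterations remain.
  return TripMultiple % PragmaCount == 0;
}
```

Under these assumptions, a loop with a trip multiple of 8 and a requested count of 4 keeps the requested count even when remainders are disallowed, while a trip multiple of 6 with the same count does not.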

Diff Detail

Repository
rL LLVM

Event Timeline

yaxunl created this revision. Feb 27 2018, 9:38 AM
nhaehnle accepted this revision. Feb 27 2018, 7:39 PM

Thanks, this makes more sense than the other change. LGTM, although you may want to wait a bit for other people to chime in.

This revision is now accepted and ready to land. Feb 27 2018, 7:39 PM
efriedma added inline comments. Feb 28 2018, 12:21 PM
lib/Transforms/Scalar/LoopUnrollPass.cpp
733 ↗(On Diff #136090)

I don't think the TripCount check here is right; if the trip count isn't constant, TripCount will be zero.

The check is unnecessary, anyway: if the trip count is constant, TripCount is the same as TripMultiple.

yaxunl updated this revision to Diff 136532. Mar 1 2018, 8:30 AM

Revised per Eli's comments and added more tests for runtime trip counts and indivisible trip counts.

This revision was automatically updated to reflect the committed changes.