Currently, when AllowRemainder is disabled, the pragma unroll count is
not respected even when there is no remainder. This bug causes a loop
to be fully unrolled in many cases even though the user specified an
unroll count. It especially affects OpenCL/CUDA, since loops there
often contain convergent instructions and AllowRemainder is currently
disabled for such loops.
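
To make the affected case concrete, here is a minimal sketch of a loop whose
trip count is evenly divisible by the requested unroll count. The function
name and the data are hypothetical, and plain C++ has no convergent call, so
this only shows the shape of the loop, not the OpenCL/CUDA situation itself.

```cpp
// Hypothetical example: 16 iterations with a requested unroll count of 4.
// 16 % 4 == 0, so unrolling by 4 needs no remainder loop.
int sumSixteen(const int *Data) {
  int Sum = 0;
#pragma unroll 4
  for (int I = 0; I < 16; ++I)
    Sum += Data[I];
  return Sum;
}
```

With the bug described above, a loop of this shape that also contains a
convergent instruction could end up fully unrolled instead of unrolled by 4.
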
This patch is related to https://reviews.llvm.org/D43594, but that one
deals with situations where a remainder exists.
I don't think the TripCount check here is right; if the trip count isn't constant, TripCount will be zero.
The check is unnecessary, anyway: if the trip count is constant, TripCount is the same as TripMultiple.
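
A minimal sketch of the check this suggests, assuming the condition only needs
to establish that unrolling by the pragma count leaves no remainder. The
function name is hypothetical and this is a simplified model, not the LLVM
source. It relies only on TripMultiple, which per the comment above equals the
trip count when the trip count is constant.

```cpp
// Simplified, hypothetical model of the check; not the actual LLVM code.
// TripMultiple: a value the trip count is known to be a multiple of
//               (assumed to be 1 when nothing is known).
// Count:        the unroll count requested by the pragma.
bool unrollLeavesNoRemainder(unsigned TripMultiple, unsigned Count) {
  // Guard against a zero count so the modulo below is well defined.
  return Count != 0 && TripMultiple % Count == 0;
}
```

Under this model a constant trip count of 16 with a count of 4 passes, and so
does a non-constant trip count known to be a multiple of 8, without consulting
TripCount at all.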