The patch fixes bug 26957, enabling unroll by non-power-of-2 counts if they are explicitly set in "pragma unroll".
Details
Diff Detail
Event Timeline
lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
736–739 | Why do we need to move this? | |
lib/Transforms/Utils/LoopUnrollRuntime.cpp | ||
354–355 | This change is unnecessary. | |
365–367 | What is the final expression for ModVal? From the code it looks like ModVal = ((BECount % Count) + 1) % Count Do we really need double urem here? If so, could you please add a comment explaining why we need exactly this? | |
368 | Nitpick: dot missing at the end. |
lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
736–739 | This is better place for the check. Previously "nounroll" attribute was even if we failed to unroll loop. There are plenty of checks inside UnrollLoop that can stop unrolling. | |
lib/Transforms/Utils/LoopUnrollRuntime.cpp | ||
354–355 | As now the comment is inside "if" it is shifted on 2 spaces. That causes longest string in it to become 81 symbols. So I just make it shorter than 81. | |
365–367 | Potentially BECount + 1 can unsigned overflow, while (BECount % Count) +1 can not. That is why double URem is used. I can try to simplify this to something like this ModVal = ((Count - ModValAdd) >> (ModVal->getSclarType()->getPrimitiveSizeInBits() - 1) & ModValAdd ``` or to select here or somewhere in further combiners. However non-power-of-2 is not frequent case as for runtime unrolling it comes only from user specified pragma. |
Minor drive-by comment.
lib/Transforms/Utils/LoopUnrollRuntime.cpp | ||
---|---|---|
365–367 | Alternatively, you could have ModVal as BECount == -1 ? <constant expr> : ((BECount +nuw 1) % Count. |
lib/Transforms/Utils/LoopUnrollRuntime.cpp | ||
---|---|---|
365–367 | Yes. That is what I mean by select. I would prefer combiner to decide the correct replacement (maybe based on some architecture properties). The simplification of TripCount % Count to TripCount & (Count - 1) when Count is power-of-2 is pretty obvious and architecture independent. This one is more complicated: ((BECount % Count) + 1) % Count. |
Hi,
This patch LGTM, thanks for working on this!
Michael
lib/Transforms/Scalar/LoopUnrollPass.cpp | ||
---|---|---|
736–739 | But maybe it's not a bad idea to have a 'nounroll' attribute on a loop that we cannot unroll:) Otherwise, next time we'll just probably spend some time trying again. | |
lib/Transforms/Utils/LoopUnrollRuntime.cpp | ||
354–355 | Oh, makes sense. Phabricator doesn't show whitespace changes, so I missed the change in indentation. | |
365–367 | Thanks for the explanation, it's clear now. I think both BECount == -1 ? <constant expr> : ((BECount +nuw 1) % Count and ((BECount % Count) + 1) % Count are good options here. |
Why do we need to move this?