This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Only peel if a predicate becomes known in the loop body.
ClosedPublic

Authored by fhahn on Mar 28 2018, 9:22 AM.

Diff Detail

Event Timeline

fhahn created this revision.Mar 28 2018, 9:22 AM
junbuml added inline comments.Mar 28 2018, 10:19 AM
lib/Transforms/Utils/LoopUnrollPeel.cpp
218

if Pred or !Pred

test/Transforms/LoopUnroll/peel-loop-conditions.ll
341–342

I think you should add another test to show the case where MaxPeelCount is respected.

450

Better to add comment here.

fhahn updated this revision to Diff 140194.Mar 29 2018, 2:07 AM
fhahn marked an inline comment as done.

Thanks for having a look! I've tweaked the comments

lib/Transforms/Utils/LoopUnrollPeel.cpp
218

In terms of the original Pred, Pred or !Pred is precise, but at this stage Pred is set so that it is known in the peeled part (it is set to the inverse predicate earlier, if the original Pred is not known). Therefore I think it is slightly clearer to refer to just !Pred here. What do you think?

test/Transforms/LoopUnroll/peel-loop-conditions.ll
341–342

I think this test case still checks that MaxPeelCount is respected. Without checking MaxPeelCount, we would peel off 9999 iterations. I have updated the comment and tried to highlight that fact.

mkazantsev added inline comments.Mar 29 2018, 2:44 AM
lib/Transforms/Utils/LoopUnrollPeel.cpp
220

Check that DesiredPeelCount != NewPeelCount before asking SCEV, this will save us some compile time in most cases.

fhahn updated this revision to Diff 140201.Mar 29 2018, 2:56 AM

Add extra check

lib/Transforms/Utils/LoopUnrollPeel.cpp
220

Done, thanks!

I am not sure about the heuristic in general. This patch says that we peel out first K iterations if starting from K + 1th iteration the predicate becomes known false. But isn't in even more profitable to peel out K+1th iteration because the predicate that was trivially true on first iterations (and thus we could base optimizations on it) will be trivially false, and it will STILL allow you to base some optimizations on it.

I think a better approach would be to peel out another iteration if we know that Pred is true on it OR if we know that Pred is false. Either will allow us to simplify the peeled code. Please correct me if I'm wrong with my reasoning.

I am not sure what problem you are fighting in this patch.

mkazantsev added inline comments.Mar 29 2018, 3:21 AM
lib/Transforms/Utils/LoopUnrollPeel.cpp
200

BTW, I just realized that this entire optimization will be harmful if the predicate is provable for LeftAR, RightSCEV. In this case, something is true on EVERY iteration, and you peel out SOME iterations thinking that it would be profitable.

Isn't it the real problem you are trying to win here?

fhahn added a comment.Mar 29 2018, 3:37 AM

I am not sure about the heuristic in general. This patch says that we peel out first K iterations if starting from K + 1th iteration the predicate becomes known false. But isn't in even more profitable to peel out K+1th iteration because the predicate that was trivially true on first iterations (and thus we could base optimizations on it) will be trivially false, and it will STILL allow you to base some optimizations on it.

I think a better approach would be to peel out another iteration if we know that Pred is true on it OR if we know that Pred is false. Either will allow us to simplify the peeled code. Please correct me if I'm wrong with my reasoning.

I think I see what you mean. You are saying we would not have to stop peeling even once Pred changes from true to false, as the peeled code could be simplified still? We could, but I think the major benefit of this optimization would be to simplify the loop body and peeling more than necessary may prevent inlining due to the code size increase.

I am not sure what problem you are fighting in this patch.

Basically, I want to prevent peeling, if after peeling, we cannot simplify the loop body. See the @test4. Here we would peel of a few iterations and the peeled code could be simplified, but we cannot simplify the loop body after peeling. So we potentially add lots of additional instructions through peeling, with little benefit, if the trip count is large. And that may prevent inlining.

lib/Transforms/Utils/LoopUnrollPeel.cpp
200

Are you referring where either Pred or !Pred is known for LeftAr, RightSCEV independently of the iteration? This case should be guarded against in line 176

fhahn added a comment.Mar 29 2018, 4:00 AM

Thank you very much for having a look by the way! :) Please let me know if I missed something.

I see the point, but predicate being true or false on K+1th iteration also does not guarantee that we can simplify the loop. I can agree that it is better than what we have now, I just wonder if this solution is general enough to deal with problem you are dealing with.

lib/Transforms/Utils/LoopUnrollPeel.cpp
200

Right, I haven't noticed this. Thanks for pointing out!

fhahn added a comment.Mar 29 2018, 5:17 AM

I see the point, but predicate being true or false on K+1th iteration also does not guarantee that we can simplify the loop. I can agree that it is better than what we have now, I just wonder if this solution is general enough to deal with problem you are dealing with.

Ah yes, I see the case you are referring to now. Maybe it would be worth to add a check for monotonic predicates?

I see the point, but predicate being true or false on K+1th iteration also does not guarantee that we can simplify the loop. I can agree that it is better than what we have now, I just wonder if this solution is general enough to deal with problem you are dealing with.

Ah yes, I see the case you are referring to now. Maybe it would be worth to add a check for monotonic predicates?

As an option, yes. Alternatively, you know the number of iterations you are peeling out, meaning that you can use evaluateAtIteration to get what remains in loop after your manipulations. If the predicate or !predicate is known for this value, then it is what we want.

fhahn updated this revision to Diff 141316.Apr 6 2018, 4:49 AM

Update to peel only for monotonic predicates, that switch from true to false (or vice versa). In that case, we should be able to simplify the loop body.

For ICMP_EQ and ICMP_NE without wrapping, I think we could peel of an additional iteration and then the loop body could be simplified too. I've added that as a fixme and intend to add that as a follow up commit. I can also add it to this commit if you prefer.

mkazantsev added inline comments.Apr 9 2018, 2:26 AM
lib/Transforms/Utils/LoopUnrollPeel.cpp
227

How about checking it before you start calculating NewPeelCount?

fhahn updated this revision to Diff 141649.Apr 9 2018, 7:50 AM

Move monotonic check to the beginning, thanks Max

This revision is now accepted and ready to land.Apr 16 2018, 2:27 AM
This revision was automatically updated to reflect the committed changes.