This is an archive of the discontinued LLVM Phabricator instance.

[Hexagon] peel loops with runtime small trip counts
ClosedPublic

Authored by iajbar on Mar 25 2018, 11:58 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

iajbar created this revision.Mar 25 2018, 11:58 AM
mkazantsev added inline comments.Mar 25 2018, 10:07 PM
lib/Transforms/Scalar/LoopUnrollPass.cpp
803 ↗(On Diff #139743)

I don't think it is correct. computePeelCount will set PeelCount to 0 if the loop is unpeelable, and you are forcing it to some value without checking this fact.

mkazantsev added inline comments.Mar 25 2018, 10:08 PM
test/Transforms/LoopUnroll/Hexagon/peel-small-loop.ll
6 ↗(On Diff #139743)

Please add a test where you force peel count to 2 and the loop cannot be peeled, I believe it will expose a bug.

iajbar updated this revision to Diff 139883.Mar 26 2018, 8:59 PM
iajbar added inline comments.Mar 26 2018, 9:01 PM
lib/Transforms/Scalar/LoopUnrollPass.cpp
803 ↗(On Diff #139743)

There is another call to canPeel() in peelLoop(). In the case of a target sets the peel count to Non zero, then the loop won’t get peel if it’s invalid to be peeled.

mkazantsev added inline comments.Mar 26 2018, 10:24 PM
lib/Transforms/Scalar/LoopUnrollPass.cpp
803 ↗(On Diff #139743)
  1. This still need a test that ensures this behavior preserves.
  2. Actually, to me it looks quite strange, and I'd rather prefer having canPeel in peelLoop as assert in future. It makes no sense to make the same set of checks in two different places that go after one another.

I suggest you to move your logic inside computePeelCount and return the number of peeled iterations set by your platform-specific code *after* you check that the loop can be peeled. This will allow to replace the check in peelLoop with assertion later.

lib/Transforms/Utils/LoopUnrollPeel.cpp
480 ↗(On Diff #139883)
  1. Why it is not a part of canPeel?
  2. This is not connected to your logic about setting peel count in target. This should be a separate patch with justification and tests for this very change.
mkazantsev added inline comments.Mar 27 2018, 12:24 AM
lib/Transforms/Scalar/LoopUnrollPass.cpp
803 ↗(On Diff #139743)

I have submitted https://reviews.llvm.org/D44919 for review to get rid of this second check, I believe this is how it should work.

fhahn added a subscriber: fhahn.Mar 27 2018, 1:26 AM
fhahn added inline comments.
lib/Transforms/Scalar/LoopUnrollPass.cpp
803 ↗(On Diff #139743)

How about changing computePeelCount to take the maximum of the passed in peelCount and the compute peel count? That seems to be in line with how other unroll settings are handled.

For example, if computePeelCount finds a higher peel count, shouldn't that one be used?

Also note that even settings like UnrollForcePeelCount are only applied after the other calculations could not find a peel count.

iajbar updated this revision to Diff 139923.Mar 27 2018, 7:28 AM
iajbar edited the summary of this revision. (Show Details)

I addressed the comments from Max and Florian.

mkazantsev added inline comments.Mar 27 2018, 9:09 PM
lib/Transforms/Utils/LoopUnrollPeel.cpp
223 ↗(On Diff #139923)

You only need preset peel count in one place. How about remembering it into a variable in the beginning of the method, zeroing UP.PeelCount and using the variable where you need it? Then you will not need to zero UP.PeelCount before every return.

mkazantsev added inline comments.Mar 27 2018, 9:11 PM
test/Transforms/LoopUnroll/Hexagon/peel-small-loop.ll
6 ↗(On Diff #139743)

I still ask you to add such a test.

iajbar updated this revision to Diff 140096.Mar 28 2018, 10:06 AM

Thanks Max. Please note that this patch also fixes a bug as well. Without this patch, the value passed by the flag -unroll-peel-count was ignored.

test/Transforms/LoopUnroll/Hexagon/peel-small-loop.ll
6 ↗(On Diff #139743)

There is already a test that check this case: test/Transforms/LoopUnroll/peel-loop-irreducible.ll.

This revision is now accepted and ready to land.Mar 29 2018, 2:45 AM
This revision was automatically updated to reflect the committed changes.

I think this patch may cause some inefficiencies in terms of compile time and generated code in case if unroller decides to make full unrolling or partial unrolling with user-forced number of iterations, see discussion of D44919. However it is not a correctness issue, i.e. resulting code will still produce right results.

Currently it fails by assert after the logic has changed, to fix it I propose to add canPeel check before forcing peel count.

iajbar added a comment.Apr 2 2018, 8:07 PM

Thank you very much Max for your review. I tried to zero UP.PeelCount before every return from computeUnrollCount that stands before computePeelCount and it causes test/Transforms/LoopUnroll/pr33437.ll to fail. This test expects to peel a loop first and than unrolling it.

Ok, then it seems to be expected behavior in some cases (which is strange to me, but OK). Then we can just add canPeel check to this patch to avoid crash. As I said before, checking this before setting peel count is reasonable by itself.

iajbar added a comment.Apr 3 2018, 4:02 PM

I committed this patch with canPeel() check before setting PeelCount in Hexagon Target. Thank you very much Max for your Review.