This is an archive of the discontinued LLVM Phabricator instance.

[LV] Unroll factor is expected to be > 0
ClosedPublic

Authored by ebrevnov on Sep 15 2020, 3:16 AM.

Details

Summary

LV fails with assertion checking that UF > 0. We already set UF to 1 if it is 0 except the case when IC > MaxInterleaveCount. The fix is to set UF to 1 for that case as well.

Diff Detail

Event Timeline

ebrevnov created this revision.Sep 15 2020, 3:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2020, 3:16 AM
ebrevnov requested review of this revision.Sep 15 2020, 3:16 AM
fhahn requested changes to this revision.Sep 15 2020, 3:34 AM

Thanks for the patch! Some comments inline.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5629–5630

IIUC the problem here is in cases when MaxInterleaveCount == 0, right?

The code below expects MaxInterleaveCount > 0 and I think it would be slightly preferable to make sure MaxInterleaveCount is at least 1. With that, the code setting IC should do the right thing, and other (potential future) users of MaxInterleaveCount can expect a sane value.

llvm/test/Transforms/LoopVectorize/zero_unroll.ll
1 ↗(On Diff #291847)

I think the test needs moving to llvm/test/Transforms/LoopVectorize/SystemZ/, because it relies on SytemZ's cost model and might fail if LLVM is build without the SystemZ backend.

Could you also add a check line to make sure a vector body is generated?

5 ↗(On Diff #291847)

nit: dso_local local_unnamed_addr unneeded?

25 ↗(On Diff #291847)

nit: are those attributes needed?

This revision now requires changes to proceed.Sep 15 2020, 3:34 AM
dmgreen added inline comments.
llvm/test/Transforms/LoopVectorize/zero_unroll.ll
12 ↗(On Diff #291847)

This looks like an infinite outer loop? It maybe doesn't matter a large amount, but if it shows the same problem with a ret instead of looping back, that would make a conceptually simpler test.

ebrevnov marked 4 inline comments as done.Sep 15 2020, 5:58 AM
ebrevnov added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5629–5630

IIUC the problem here is in cases when MaxInterleaveCount == 0, right?

That's right.

The code below expects MaxInterleaveCount > 0 and I think it would be slightly preferable to make sure MaxInterleaveCount is at least 1.

I don't see any uses of MaxInterleaveCount bellow this point.

ebrevnov updated this revision to Diff 291881.Sep 15 2020, 6:00 AM

Addressed comments

ebrevnov updated this revision to Diff 291882.Sep 15 2020, 6:02 AM

Minor change to test case

fhahn accepted this revision.Sep 27 2020, 4:02 AM

LGTM, thanks. Suggestion inline, but quite subjective, so feel free to ignore.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5629–5630

The code below expects MaxInterleaveCount > 0 and I think it would be slightly preferable to make sure MaxInterleaveCount is at least 1.

I don't see any uses of MaxInterleaveCount bellow this point.

Right, personally I think it would still be slightly preferable to ensure the variables have sane values as early as possible, especially when a MaxInterleaveCount of 0 does not really make sense in this context and makes the existing code for IC handling work as expected. But it probably it doesn't make much of a difference in practice.

Also, using std::max would seem slightly more explicit to me, together with a comment that IC must be at least 1.

This revision is now accepted and ready to land.Sep 27 2020, 4:02 AM
Ayal added inline comments.Sep 27 2020, 6:00 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5605–5607

this is the culprit: *BestKnownTC < VF.getKnownMinValue() leads to zeroing MaxInterleaveCount, right?

Best check here if this is the case, and if so reset MaxInterleaveCount to 1 instead of 0. Also worth updating the above comment.

5629–5630

personally I think it would still be slightly preferable to ensure the variables have sane values as early as possible, especially when a MaxInterleaveCount of 0 does not really make sense in this context and makes the existing code for IC handling work as expected ...

Agreed. Adding an assert that MaxInterleaveCount is strictly positive is also recommended.

LGTM, thanks. Suggestion inline, but quite subjective, so feel free to ignore.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5605–5607

In this particular case yes, *BestKnownTC < VF.getKnownMinValue() is the problem. But MaxInterleaveCount can potentially be set to 0 in early assignments as well. Is there any guarantee that user won't set ForceTargetMaxScalarInterleaveFactor/ForceTargetMaxVectorInterleaveFactor to 0? Also I don't see anything preventing TTI::getMaxInterleaveFactor to return 0. In fact there is one target returning 0:

unsigned HexagonTTIImpl::getMaxInterleaveFactor(unsigned VF) {

return useHVX() ? 2 : 0;

}

5629–5630

The code below expects MaxInterleaveCount > 0 and I think it would be slightly preferable to make sure MaxInterleaveCount is at least 1.

I don't see any uses of MaxInterleaveCount bellow this point.

Right, personally I think it would still be slightly preferable to ensure the variables have sane values as early as possible, especially when a MaxInterleaveCount of 0 does not really make sense in this context and makes the existing code for IC handling work as expected. But it probably it doesn't make much of a difference in practice.

What I don't like in this approach is over complication. That doesn't help code readability. Instead of one line IC = std::max(1, IC) which
clearly dominates all future uses of IC we have to sanitize MaxIC first, then do conditional sanitizing of IC and finally assert IC value. If you worry about using note sanitized MaxIC value I would suggest putting it to a separate scope. Like this:

{

unsigned MaxIC=V1;
if (c1)
  MaxIC=V2;
...
if (IC > MaxIC)
  IC = MaxIC

}

IC = std::max(1, IC);

What do you think?

Also, using std::max would seem slightly more explicit to me, together with a comment that IC must be at least 1.

Agreed.

ebrevnov added inline comments.Sep 30 2020, 8:32 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5629–5630

@fhahn what do you think?

fhahn added inline comments.Oct 5 2020, 4:20 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5605–5607

In this particular case yes, *BestKnownTC < VF.getKnownMinValue() is the problem. But MaxInterleaveCount can potentially be set to 0 in early assignments as well. Is there any guarantee that user won't set ForceTargetMaxScalarInterleaveFactor/ForceTargetMaxVectorInterleaveFactor to 0? Also I don't see anything preventing TTI::getMaxInterleaveFactor to return 0. In fact there is one target returning 0:

I don't think there's anything currently preventing targets to return 0 for the max interleave factor, but I think it is reasonable to require them to return a MaxInterleaveFactor >= 1 and assert otherwise.

But I think it should be sufficient to make sure that max IC >= 1 here with a comment, as Ayal suggested

5629–5630

Given that @Ayal also seems to prefer sanitizing the value earlier I think it would be preferable to do so earlier, but don't have a strong opinion on that.

ebrevnov updated this revision to Diff 297202.Oct 9 2020, 5:20 AM

Sanitize MaxIC upfront.

Does this look fine?

fhahn accepted this revision.Oct 9 2020, 6:52 AM

LGTM, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5609–5622

nit: seems like something like 'count' is missing from the sentence below. Or say something like MaxInterleave count must be at least 1.

kparzysz added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5605–5607

I have changed the Hexagon code to return 1 (https://reviews.llvm.org/rG99cafe009477).

Ayal added inline comments.Oct 9 2020, 2:40 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5601–5602

My suggestion would be to (1) update the above comment: the interleave count is limited to be less than the trip count divided by VF, provided it is at-least 1.

(2) place the MaxInterleaveCount = std::max(1u, MaxInterleaveCount); fix immediately after setting

MaxInterleaveCount =
    std::min(*BestKnownTC / VF.getKnownMinValue(), MaxInterleaveCount);

and (3) introduce an assert(MaxInterleaveCount > 0 && "Max Interleave Count must be at-least 1"); later -- setting it to 0 by targets should be treated as an error (thanks @kparzysz for the fix!)

ebrevnov updated this revision to Diff 297508.Oct 12 2020, 1:04 AM
ebrevnov marked an inline comment as done.

Fixed according to Ayal's request.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5629–5630

Done

Ayal added inline comments.Oct 12 2020, 5:55 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5622

nit: suffice to only introduce the

assert(MaxInterleaveCount > 0 &&
       "Maximum interleave count must be greater than 0");

in the above change hunk, right?

Thanks!

ebrevnov added inline comments.Oct 12 2020, 9:48 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5622

That was done intentionally. By sanitizing IC value as early as possible we avoid potential issues with using incorrect value. This is similar in spirit to what is done for MaxInterliveCount.