Page MenuHomePhabricator

[LoopRotation] Allow loop header duplication if vectorization is forced
Needs ReviewPublic

Authored by anemet on Mar 26 2019, 11:47 AM.

Details

Summary

-Oz normally does not allow loop header duplication so this loop wouldn't be
vectorized. However the vectorization pragma should override this and allow
for loop rotation.

rdar://problem/49281061

Diff Detail

Event Timeline

anemet created this revision.Mar 26 2019, 11:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2019, 11:47 AM
ormris removed a subscriber: ormris.Mar 26 2019, 11:49 AM

This makes sense to me, given that loops with bottom checks are a precondition for LV. I think it would be good to also update the section about llvm.loop.vectorize.enable in LangRef to say it might also enable transformations like LoopRotate, in preparation for LV.

In case LV fails to vectorize the loop, it might be a bit surprising the loop has been rotated. But the way things are structured at the moment, there's nothing we can do about that (we cannot tell in LoopRotate if LV will be able to vectorize). By documenting that behavior, we can push the responsibility for that to the user. Maybe a remark in LoopRotate would be helpful to indicate that we only rotated because of the metadata.

Also adding Michael, as he worked on loop related metadata recently.

lib/Transforms/Scalar/LoopRotation.cpp
119

Might be worth a comment why isForcedVectorize overrides the behavior here.

test/Transforms/LoopVectorize/AArch64/Oz-and-forced-vectorize.ll
1

Could you limit the test case to running loop-rotate? The tests for LV should already be sufficient to check the transformed case. (same for second test)

59

All metadata except this one can be stripped I think. (same for second test)

anemet marked 2 inline comments as done.Mar 27 2019, 8:00 AM

This makes sense to me, given that loops with bottom checks are a precondition for LV. I think it would be good to also update the section about llvm.loop.vectorize.enable in LangRef to say it might also enable transformations like LoopRotate, in preparation for LV.

In case LV fails to vectorize the loop, it might be a bit surprising the loop has been rotated. But the way things are structured at the moment, there's nothing we can do about that (we cannot tell in LoopRotate if LV will be able to vectorize). By documenting that behavior, we can push the responsibility for that to the user. Maybe a remark in LoopRotate would be helpful to indicate that we only rotated because of the metadata.

I think the picture is even simpler in this case. The user has asked for an optimization. If we can't deliver we issue a warning (-Wpass-failed). So the user has sufficient information to see that something went wrong. But yes a loop rotation remark makes sense! especially mentioning that we did it in preparation for forced vectorization.

Also adding Michael, as he worked on loop related metadata recently.

test/Transforms/LoopVectorize/AArch64/Oz-and-forced-vectorize.ll
1

I don't think so. There is no other way to get sizelevel=2 activated for opt. You either run the entire -Oz pipeline or you run specific tests with sizelevel=0.

59

I kept it in to get the *warning* with proper line info for failed user-requested optimization. I can drop if you don't think that is useful. I liked it because it completely captured the user experience that this test came from.

I can check for the actual diagnostics to make this less subtle.

fhahn added inline comments.Mar 27 2019, 8:32 AM
test/Transforms/LoopVectorize/AArch64/Oz-and-forced-vectorize.ll
1

Argh, that's unfortunate :( . I thought it just respects the minsize attribute.......

This might be a more general problem, other passes might expect a normalized form as well, such as UnrollAndJam and LoopDistribute.

For LoopVectorizer, I am somewhat surprised. It calls simplifyLoop itself (instead of relying on LoopSimplifyPass). Could the same be done for LoopRotation?

lib/Transforms/Scalar/LoopRotation.cpp
60

Consider using hasVectorizeTransformation in LoopUtils.h. I added it to make determining whether a pragma is added consistent between passes.

test/Transforms/LoopVectorize/AArch64/Oz-and-unforced-vectorize.ll
3–4

There is no llvm.loop.vectorize.enable metadata here, so without -vectorize-loops option, it would not be vectorized anyway.

27

I don't think we need to regression-test that we do not vectorize. If LoopVectorize one day manages to vectorize unrotated loop, it's a good thing, not a regression.

Btw, since it's the same content, you could instead add another RUN: line to the other test.

fhahn added a comment.Mar 27 2019, 9:16 AM

This might be a more general problem, other passes might expect a normalized form as well, such as UnrollAndJam and LoopDistribute.

For LoopVectorizer, I am somewhat surprised. It calls simplifyLoop itself (instead of relying on LoopSimplifyPass). Could the same be done for LoopRotation?

I guess it could, but the underlying problem is still the same: as things are currently structured, we have to rotate before we know if vectorization is legal and cannot go back if it is not. Potentially, VPlan allows this sort of thing, by doing those transforms on a separate representation (VPlan IR)

anemet marked 2 inline comments as done.Mar 27 2019, 9:48 AM

This might be a more general problem, other passes might expect a normalized form as well, such as UnrollAndJam and LoopDistribute.

For LoopVectorizer, I am somewhat surprised. It calls simplifyLoop itself (instead of relying on LoopSimplifyPass). Could the same be done for LoopRotation?

It felt that the decision belonged to LoopRotate. We make a profitability decision there (for -Oz loop rotate is a bad trade-off) which is inaccurate.

If you feel strongly about this I can explore rotating the loop in the vectorizer when I have a bit more time.

lib/Transforms/Scalar/LoopRotation.cpp
60

Nice, I didn't know about this. BTW, LoopDistribute was not updated to use hasDistributeTransformation.

test/Transforms/LoopVectorize/AArch64/Oz-and-unforced-vectorize.ll
27

I don't think we need to regression-test that we do not vectorize. If LoopVectorize one day manages to vectorize unrotated loop, it's a good thing, not a regression.

I like to have the negative tests as well but if you want I'll remove it.

Btw, since it's the same content, you could instead add another RUN: line to the other test.

Well, it's not the same content since it's missing the the forced-vectorize metadata. But perhaps I am not understanding the point you're trying to make.

anemet marked an inline comment as done.Mar 27 2019, 10:05 AM
anemet added inline comments.
lib/Transforms/Scalar/LoopRotation.cpp
60

@Meinersbur I am actually confused whether I should be checking for Enable or ForcedByUser. We only set Enable and not Forced if the vectorization width is requested via a pragma but I think that is an even stronger enforcement of vectorization. What is the logic here?

If you feel strongly about this I can explore rotating the loop in the vectorizer when I have a bit more time.

No, this would go beyond fixing a bug.

lib/Transforms/Scalar/LoopRotation.cpp
60

I only changed UnrollAndJam to use it for now. How these decisions are made is surprisingly subtle and e.g. in case of LoopVectorizeHints, integrated within other decisions. I was not brave enough to put up reviews for this and risking changing behavior without a need. But I would be happy if new code uses it.

test/Transforms/LoopVectorize/AArch64/Oz-and-unforced-vectorize.ll
27

I like to have the negative tests as well but if you want I'll remove it.

I don't like them; they are fragile, require maintenance and check-llvm takes longer.

Well, it's not the same content since it's missing the the forced-vectorize metadata.

I was assuming the missing llvm.loop.vectorize.enable metadata was a mistake. Without it, LoopVectorize won't do anything, even if it could vectorize the loop (There is a difference in how clang and opt initialize the PassManagerBuilder).

An excellent example of how fragile this kind of test is.