-Oz normally does not allow loop header duplication so this loop wouldn't be
vectorized. However the vectorization pragma should override this and allow
for loop rotation.
rdar://problem/49281061
Differential D59832
[LoopRotation] Allow loop header duplication if vectorization is forced fhahn on Mar 26 2019, 11:47 AM. Authored by
Details -Oz normally does not allow loop header duplication so this loop wouldn't be rdar://problem/49281061
Diff Detail
Event TimelineComment Actions This makes sense to me, given that loops with bottom checks are a precondition for LV. I think it would be good to also update the section about llvm.loop.vectorize.enable in LangRef to say it might also enable transformations like LoopRotate, in preparation for LV. In case LV fails to vectorize the loop, it might be a bit surprising the loop has been rotated. But the way things are structured at the moment, there's nothing we can do about that (we cannot tell in LoopRotate if LV will be able to vectorize). By documenting that behavior, we can push the responsibility for that to the user. Maybe a remark in LoopRotate would be helpful to indicate that we only rotated because of the metadata. Also adding Michael, as he worked on loop related metadata recently.
Comment Actions I think the picture is even simpler in this case. The user has asked for an optimization. If we can't deliver we issue a warning (-Wpass-failed). So the user has sufficient information to see that something went wrong. But yes a loop rotation remark makes sense! especially mentioning that we did it in preparation for forced vectorization.
Comment Actions This might be a more general problem, other passes might expect a normalized form as well, such as UnrollAndJam and LoopDistribute. For LoopVectorizer, I am somewhat surprised. It calls simplifyLoop itself (instead of relying on LoopSimplifyPass). Could the same be done for LoopRotation?
Comment Actions I guess it could, but the underlying problem is still the same: as things are currently structured, we have to rotate before we know if vectorization is legal and cannot go back if it is not. Potentially, VPlan allows this sort of thing, by doing those transforms on a separate representation (VPlan IR) Comment Actions It felt that the decision belonged to LoopRotate. We make a profitability decision there (for -Oz loop rotate is a bad trade-off) which is inaccurate. If you feel strongly about this I can explore rotating the loop in the vectorizer when I have a bit more time.
Comment Actions No, this would go beyond fixing a bug.
Comment Actions Rebased, updated to use hasVectorizeTransformation as suggested, removed negative test (existing loop rotation tests should cover that already) and removed unnecessary dbg & tbaa metadata from test. Ping :) Comment Actions Looks reasonable to me, on nit. Someone else?
Comment Actions LGTM. There is an inline comment on the previous diff that I think was never submitted; It does not apply to the latest diff as it already uses TM_ForcedByUser. I don't delete is since is might still be interesting.
|
Consider using hasVectorizeTransformation in LoopUtils.h. I added it to make determining whether a pragma is added consistent between passes.