This is an archive of the discontinued LLVM Phabricator instance.

[LV] Avoid needless fold tail
ClosedPublic

Authored by gilr on Dec 20 2020, 11:29 PM.

Details

Summary

When the trip-count is provably divisible by the maximal/chosen VF, folding the loop's tail during vectorization is redundant.
This commit extends the existing test for constant trip-counts to any trip-count known to be divisible by maximal/selected VF by SCEV.

Diff Detail

Event Timeline

gilr created this revision.Dec 20 2020, 11:29 PM
gilr requested review of this revision.Dec 20 2020, 11:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2020, 11:29 PM
SjoerdMeijer added inline comments.Dec 21 2020, 12:43 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5505

Was surprised to see this change, because I thought we were handling it here already.
Is this check here still relevant? Or can we "merge" this with the one that you added below?

gilr added inline comments.Dec 21 2020, 5:33 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5505

Is this check here still relevant?

Yes, since IC may take non-power-of-2 values. Will add a test to cover that.

gilr updated this revision to Diff 313091.Dec 21 2020, 5:41 AM

Add a test for a constant TC with IC=3.

SjoerdMeijer added inline comments.Dec 21 2020, 5:42 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5505

Ah yeah, I do see that now.
Thanks

This revision is now accepted and ready to land.Dec 21 2020, 6:46 AM
fhahn accepted this revision.Dec 21 2020, 1:54 PM

LGTM, thanks

llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-const-TC.ll
10 ↗(On Diff #313091)

Is this enough? I think it might be better to check the whole vector body?

gilr added inline comments.Dec 22 2020, 12:02 AM
llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-const-TC.ll
10 ↗(On Diff #313091)

Tried to keep the checks to the minimum proving unmasked vectorization, but perhaps indeed better to check the whole context - will switch to update_test format for easy maintenance.
Thanks @SjoerdMeijer, @fhahn!

This revision was automatically updated to reflect the committed changes.
fhahn added inline comments.Dec 22 2020, 1:41 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5505

Thinking a bit more about this, I think we should be able to use ScalarEvolution::getURemExpr to check if the trip count is a multiple of any VF. That should work for both the constant and variable trip-count cases. As I missed commenting on that before the patch landed, I put up D93677