This is an archive of the discontinued LLVM Phabricator instance.

[LV] Always create VPWidenCanonicalIVRecipe, optimize away later.
ClosedPublic

Authored by fhahn on Jan 12 2022, 12:17 PM.

Details

Summary

This patch updates createBlockInMask to always generate
VPWidenCanonicalIVRecipe and adds a transform to optimize it away later,
if it is not needed.

This is a step towards breaking up VPWidenIntOrFpInductionRecipe and
explicitly distinguishing between vector phis and scalarizing.

Split off from D116123.

Diff Detail

Event Timeline

fhahn created this revision.Jan 12 2022, 12:17 PM
fhahn requested review of this revision.Jan 12 2022, 12:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2022, 12:17 PM
Herald added a subscriber: vkmr. · View Herald Transcript
Ayal added inline comments.Jan 13 2022, 12:13 AM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
332

Would be better to find VPWidenCanonicalIVRecipe by searching the users of getCanonicalIV(), after connecting the former to the latter, instead of looking for it in HeaderVPBB?

Would be better to search for an IV recipe (among HeaderVPBB->phis()) with an isCanonicalID and same type as widen canonical IV, thereby clearly proving their equivalence, instead of relying on OriginalCanonicalIV?

351
if (WidenOriginalIV && WidenNewIV) {
fhahn updated this revision to Diff 399616.Jan 13 2022, 3:16 AM

Restructure code as suggested, added VPWidenIntOrFpInductionRecipe::isCanonical & use it instead of IR reference, check users of VPCanonicalIVPHIRecipe to find VPWidenCanonicalIVRecipe. This required making the def-use relation between those 2 explicit. Done in 3f2fb767e33a.

Ayal added inline comments.Jan 17 2022, 1:45 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9210

Suffice to call it removeRedundantCanonicalIVs ?

llvm/lib/Transforms/Vectorize/VPlan.cpp
1256

Original thought was to leverage
static bool isCanonicalID(const InductionDescriptor &ID)

Can potentially specialize upon construction and have a recipe which needs no Start nor Step operands, as these are inherently known constants for such IV's?

llvm/lib/Transforms/Vectorize/VPlan.h
1071

It's also the scalar/element type of the new IV; how about getScalarType?

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
342

Suffice to traverse HeaderVPBB->phis()?

347

Comment should be updated.

349

Would slightly be better to compare with WidenNewIV->getOriginalType()/getScalarType()?

llvm/lib/Transforms/Vectorize/VPlanTransforms.h
50

Comment should be updated.

fhahn updated this revision to Diff 400781.Jan 18 2022, 2:41 AM

Address latest comments, thanks!

fhahn marked 9 inline comments as done.Jan 18 2022, 2:45 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9210

Updated, thanks!

llvm/lib/Transforms/Vectorize/VPlan.cpp
1256

Original thought was to leverage
static bool isCanonicalID(const InductionDescriptor &ID)

I think it would make sense to have VPWidenIntOrFpInductionRecipe::isCanonical() replace the existing use of isCanonicalID. I put up D117551.

Can potentially specialize upon construction and have a recipe which needs no Start nor Step operands, as these are inherently known constants for such IV's?

That would be an option, but I am not sure if the benefits are worth introducing another IV recipe, as we already have a number of them and checking whether it is a canonical induction is cheap & not needed in many places.

llvm/lib/Transforms/Vectorize/VPlan.h
1071

Adjusted the name, thanks!

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
342

Yep, updated!

347

Updated, thanks!

349

Updated, thanks!

llvm/lib/Transforms/Vectorize/VPlanTransforms.h
50

Updated to reference canonical IV.

Ayal added a comment.Jan 18 2022, 5:08 AM

This is fine, thanks for accommodating, one last thought: does Canonical mean starting at zero, or possibly at EPResumeVal?

fhahn marked 7 inline comments as done.Jan 18 2022, 5:41 AM

This is fine, thanks for accommodating, one last thought: does Canonical mean starting at zero, or possibly at EPResumeVal?

I think checking for zero a the moment should suffice, as EPResumeVal will only be set as start value just before epilogue VPlan execution. This should also match the current behavior before the change (and D117551 which now contains the addition if isCanonical). Once a single plan contains both vector loops we might need some changes in that direction though.

Ayal added a comment.Jan 18 2022, 8:48 AM

This is fine, thanks for accommodating, one last thought: does Canonical mean starting at zero, or possibly at EPResumeVal?

I think checking for zero a the moment should suffice, as EPResumeVal will only be set as start value just before epilogue VPlan execution. This should also match the current behavior before the change (and D117551 which now contains the addition if isCanonical). Once a single plan contains both vector loops we might need some changes in that direction though.

Resetting start of canonicalIV to EPResumeVal just before epiloque VPlan execution might invalidate passes that relied on this value being zero, including removeRedundantCanonicalIVs()? A test would be good, either confirming or reassuring :-)

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
351

Perhaps suffice to

if (WidenOriginalIV && WidenOriginalIV->isCanonical() &&
    WidenOriginalIV->getScalarType() == WidenNewIV->getScalarType()) {
  WidenNewIV->replaceAllUsesWith(WidenOriginalIV);
  WidenNewIV->eraseFromParent();
  return;
}
llvm/lib/Transforms/Vectorize/VPlanTransforms.h
50

typo: "it exists."

fhahn updated this revision to Diff 400944.Jan 18 2022, 12:08 PM
fhahn marked an inline comment as done.

Address latest comments, thanks!

Resetting start of canonicalIV to EPResumeVal just before epiloque VPlan execution might invalidate passes that relied on this value being zero, including removeRedundantCanonicalIVs()? A test would be good, either confirming or reassuring :-)

I am not sure if it is possible to add a test that has epilogue vectorization and a VPWidenCanonicalIVRecipe. We only add VPWidenCanonicalIVRecipe when folding the tail. In that case, epilogue vectorization is disabled as there's no epilogue. But we should be able to add a relevant test for the stepvector/build-scalar-steps patches I think.

fhahn updated this revision to Diff 400950.Jan 18 2022, 12:28 PM

Fix formatting.

Ayal added a comment.Jan 19 2022, 1:03 AM

Address latest comments, thanks!

Resetting start of canonicalIV to EPResumeVal just before epiloque VPlan execution might invalidate passes that relied on this value being zero, including removeRedundantCanonicalIVs()? A test would be good, either confirming or reassuring :-)

I am not sure if it is possible to add a test that has epilogue vectorization and a VPWidenCanonicalIVRecipe. We only add VPWidenCanonicalIVRecipe when folding the tail. In that case, epilogue vectorization is disabled as there's no epilogue. But we should be able to add a relevant test for the stepvector/build-scalar-steps patches I think.

OK, that's reassuring. Perhaps worth leaving behind a comment somewhere, or better - asserting when resetting start from zero to EPResumeVal that canonicalIV is used only by its Increment but not (potentially misused) by VPWidenCanonicalIVRecipe?

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
344

Above early-exit is now redundant.

fhahn updated this revision to Diff 401142.Jan 19 2022, 2:11 AM

Remove unnecessary early exit.

This is fine, thanks for accommodating, one last thought: does Canonical mean starting at zero, or possibly at EPResumeVal?

I think checking for zero a the moment should suffice, as EPResumeVal will only be set as start value just before epilogue VPlan execution. This should also match the current behavior before the change (and D117551 which now contains the addition if isCanonical). Once a single plan contains both vector loops we might need some changes in that direction though.

Resetting start of canonicalIV to EPResumeVal just before epiloque VPlan execution might invalidate passes that relied on this value being zero, including removeRedundantCanonicalIVs()? A test would be good, either confirming or reassuring :-)

Added an assert in 165e36bf180e.

Ayal added inline comments.Jan 19 2022, 6:48 AM
llvm/lib/Transforms/Vectorize/VPlan.h
1067

Ahh, if this VPWidenIntOrFpInductionRecipe also folds a Trunc, should getScalarType() return Trunc's type instead of IV's type?

fhahn updated this revision to Diff 401244.Jan 19 2022, 8:07 AM

Update getScalarType to consider truncated IVs.

fhahn marked an inline comment as done.Jan 19 2022, 8:09 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.h
1067

Good spot! I updated the code. I don't think it is an issue at the moment, as widened canonical IV should always use the widest type, so there's no way to add a test IIUC.

Ayal accepted this revision.Jan 19 2022, 10:13 AM

Remove unnecessary early exit.

This is fine, thanks for accommodating, one last thought: does Canonical mean starting at zero, or possibly at EPResumeVal?

I think checking for zero a the moment should suffice, as EPResumeVal will only be set as start value just before epilogue VPlan execution. This should also match the current behavior before the change (and D117551 which now contains the addition if isCanonical). Once a single plan contains both vector loops we might need some changes in that direction though.

Resetting start of canonicalIV to EPResumeVal just before epiloque VPlan execution might invalidate passes that relied on this value being zero, including removeRedundantCanonicalIVs()? A test would be good, either confirming or reassuring :-)

Added an assert in 165e36bf180e.

Thanks!
For epilog VPlan execution, the start of all IV's should be bumped according to EPResumeVal; perhaps better to model CanonicalIVStartValue as an abstract VPValue of VPlan similar to TripCount and BackedgeTakenCount, w/o claiming it is zero prematurely.

This revision is now accepted and ready to land.Jan 19 2022, 10:13 AM
This revision was automatically updated to reflect the committed changes.
fhahn marked an inline comment as done.