This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Support sinking VPScalarIVStepsRecipe.
ClosedPublic

Authored by fhahn on Sep 13 2022, 1:23 AM.

Details

Summary

This patch extends VP-based sinking to also sink VPScalarStepsRecipe.
This takes us a step closer towards retiring the IR based sinking.

The main change is extending VPScalarIVStepsRecipe::execute to support
executing in a replicate-region.

Depends on D133758.

Diff Detail

Event Timeline

fhahn created this revision.Sep 13 2022, 1:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2022, 1:23 AM
fhahn requested review of this revision.Sep 13 2022, 1:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2022, 1:23 AM
Ayal added inline comments.Sep 20 2022, 12:18 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9583

nit: should probably have a utility for all interested users to set up {StartPart, EndPart} from a given State.

May be worth trying to unify this handling of VF=1 within buildScalarSteps (TODO?).

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
126

replicate [or scalar IV steps] recipe

139

Worth having a "canSinkScalar()" in RecipeBase, or VPValue?

145–146

Can we check directly if onlyFirstLaneUsed()?

163

Could ScalarIVSteps also feed a user using only first lane?
Conservatively check and set NeedsDuplicating only for RecplicateRecipes?

fhahn updated this revision to Diff 469307.Oct 20 2022, 11:46 AM
fhahn marked 5 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9583

May be worth trying to unify this handling of VF=1 within buildScalarSteps (TODO?).

That was straight-forward, as buildScalarSteps already handles VF = 1 the same way as the code below: e25ed058bc23

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
126

Adjusted, thanks!

139

I think for now it would be better to keep it here as its only user.

145–146

I've put up D133760 to address that.

163

Yes it could after changing to use onlyFirstLaneUsed, I added a continue for now.

fhahn updated this revision to Diff 469537.Oct 21 2022, 3:45 AM

Remove unused variable.

Ayal added inline comments.Oct 23 2022, 2:07 PM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
128–129

Worth clarifying/renaming that C is a SinkCandidateValue produced by SinkCandidate[Recipe]. Or use only the latter: only single-value producing recipes are currently being sunk, so WorkList could hold such recipes (subclasses of VPValues) rather than VPValues, filtering live-in operands when pushing into it.

133

Simplify a bit and check if C is not live-in?

VPRecipeBase *SinkCandidate = cast_or_null<VPRecipeBase>(C->getDef());
if (!SinkCandidate || SinkCandidate->getParent() == SinkTo ||
    SinkCandidate->mayHaveSideEffects() || SinkCandidate->mayReadOrWriteMemory())
  continue;
if (auto *RepR = dyn_cast<VPReplicateRecipe>(SinkCandidate)) {
  if (RepR->isUniform())
    continue;
} else if (!isa<VPScalarIVStepsRecipe>(SinkCandidate))
  continue;
145–146

(This is D133760 (as in Miracle Max calling the brute squad ;-) You mean D136368)

llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
2955 ↗(On Diff #469537)

Are these renamings important and related to this patch or can these changes be dropped?

llvm/test/Transforms/LoopVectorize/float-induction.ll
1411–1412

Sinking order appears to have changed, but is (still) deterministic, right?

llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll
577

Is this change of having the BLEND use MASK3, PRED, MASK2 instead of vp<%14,%12,%9> related to this patch focused on sinking STEPS, or can it be dropped?

fhahn updated this revision to Diff 479877.Dec 3 2022, 4:18 PM
fhahn marked 5 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
128–129

Renamed, thanks!

133

Simplified, thanks!

llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
2955 ↗(On Diff #469537)

Not sure how this sneaked into the diff, it is not needed for the patch and I removed it.

llvm/test/Transforms/LoopVectorize/float-induction.ll
1411–1412

Yes I think the difference here is because we now sink parts using VPlan sinking earlier.

llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll
577

Directly matching the numbered VPValues broke due to different numbering after moving the SCALAR-STEPS recipe. I'll split this change off separately.

Ayal added inline comments.Dec 4 2022, 1:33 AM
llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
52 ↗(On Diff #479877)

nit: indicating that VPScalarIVStepsSC may not write or read from memory is independent of this patch but is exercised by it.

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
142

nit: checking if SinkCandidate already resides in SinkTo is independent of its recipe type and can be combined with above if (!SinkCandidate || ...) checks, as suggested above?

166

Still use (uncaptured) SinkCandidate in addition to SinkCandidateValue?

fhahn updated this revision to Diff 479915.Dec 4 2022, 5:09 AM
fhahn marked 3 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
52 ↗(On Diff #479877)

Submitted separately in 3c5f07349fd8 with unit test coverage added in cf28e6b2f117

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
142

Updated, thanks!

166

I might be missing something here, but this is use is outside the lambda so doesn't need capturing?

We could only use SinkCandidate and add VPRecipeBase* to to worklist instead of VPValue*, but then we would need to filter out live ins in multiple places where we add to the worklist.

Ayal accepted this revision.Dec 4 2022, 9:50 AM

Looks good to me, thnks!
Adding a minor optional nit.

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
154

It does look better to return false and claim !CanSinkWithUser when duplication is needed for a non replicate recipe, leading to capturing both SinkCandidate and SinkCandidateValue...

166

Ahh, right, that use was indeed outside the lambda.

We could only use SinkCandidate and add VPRecipeBase* to to worklist instead of VPValue*, but then we would need to filter out live ins in multiple places where we add to the worklist.

Yeah, the filtering of getDefiningRecipe() would then be required twice - both when initializing the worklist and when augmenting it inside the loop, but clearly only recipes/non-live-in-VPValues are candidates for sinking, and doing so would clarify the SinkCandidateValue versus SinkCandidate ambiguity.

This revision is now accepted and ready to land.Dec 4 2022, 9:50 AM
This revision was landed with ongoing or failed builds.Dec 4 2022, 2:59 PM
This revision was automatically updated to reflect the committed changes.
fhahn added inline comments.Dec 4 2022, 3:00 PM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
166

Yeah, the filtering of getDefiningRecipe() would then be required twice - both when initializing the worklist and when augmenting it inside the loop, but clearly only recipes/non-live-in-VPValues are candidates for sinking, and doing so would clarify the SinkCandidateValue versus SinkCandidate ambiguity.

I updated the code to use VPRecipeBase for the worklist in the committed version, thanks!

Ayal added inline comments.Dec 5 2022, 12:41 AM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
166

I updated the code to use VPRecipeBase for the worklist in the committed version, thanks!

Very well!

There's still some Recipe/VPValue ambiguity left in multiple getVPSingleValue()'s (deserves an explicit comment or assert that only single valued recipes are handled?). This stems from handling not only Replicate Recipes but also sinking VPScalarIVStepsRecipes. Which may suggest introducing a VPSingleValueRecipe base class... although conceptually multiple value recipes could also potentially be sunk in the future.