This is an archive of the discontinued LLVM Phabricator instance.

[LV] Patch up induction phis after VPlan execution.
AbandonedPublic

Authored by fhahn on Nov 4 2021, 5:31 AM.

Details

Reviewers
Ayal
gilr
rengolin
Summary

This patch moves fixing up the incoming value of phis for
VPWidenIntOrFpInductionRecipes to after the VPlan is executed.

At the moment, the code in widenIntOrFpInduction relies on the vector
loop latch being created *before* the VPlan is executed. This prevents
us from modeling the vector latch, preheader and exit blocks in VPlan.

By patching up the incoming values for induction PHIs after VPlan
execution, we remove the need to create the latch up front. This enables
generating the vector latch directly from VPlan, which in turn is
required to enable modeling prehader and exit blocks.

The patch uses a workaround to access both the induction increment and
the vector phi for the recipe. It adds 2 new VPValues which are set to
them. While this is a temporary workaround, this will be removed once
the induction code generation is moved to work on VPlan more directly.
See D111303 for the next step, making the induction increment explicit
as VPValue operand.

Diff Detail

Event Timeline

fhahn created this revision.Nov 4 2021, 5:31 AM
fhahn requested review of this revision.Nov 4 2021, 5:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2021, 5:31 AM
Herald added a subscriber: vkmr. · View Herald Transcript
Ayal added a comment.Nov 22 2021, 10:04 AM

Would it be easier to first keep LastInduction in its original place in the header and then sink it to the latch as a VPlan-to-VPlan transformation after all blocks have been formed? Such a patch should refer to the original D22416.

fhahn updated this revision to Diff 389514.Nov 24 2021, 8:52 AM

Rebased, all tests should be passing now!

Would it be easier to first keep LastInduction in its original place in the header and then sink it to the latch as a VPlan-to-VPlan transformation after all blocks have been formed? Such a patch should refer to the original D22416.

In this patch, there is still no recipe to represent LastInduction, so unfortuantely I don't think it could be sunk just yet. But D113223 (next patch in stack after the current one) introduces new VPInstruction opcodes InductionIncrementNUW/InductionIncrement and uses them to explicitly model LastInduction.

he patch then can simplify add it to the last block of the plan. The exit condition & branch get a similar treatment in D113224.

Ayal added a comment.Nov 25 2021, 1:04 AM

Rebased, all tests should be passing now!

Would it be easier to first keep LastInduction in its original place in the header and then sink it to the latch as a VPlan-to-VPlan transformation after all blocks have been formed? Such a patch should refer to the original D22416.

In this patch, there is still no recipe to represent LastInduction, so unfortuantely I don't think it could be sunk just yet. But D113223 (next patch in stack after the current one) introduces new VPInstruction opcodes InductionIncrementNUW/InductionIncrement and uses them to explicitly model LastInduction.

he patch then can simplify add it to the last block of the plan. The exit condition & branch get a similar treatment in D113224.

Yes, in order to sink LastInduction it should be wrapped in a recipe by itself, separating it from VPWidenIntOrFpInductionRecipe. Would introducing such a LastInduction recipe here, potentially created in FixHeaderPhis out of its feeding/consuming VPWidenIntOrFpInductionRecipe, be simpler for this patch? It needs only to take of

LastInduction = cast<Instruction>(
    Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add"));
LastInduction->setDebugLoc(EntryVal->getDebugLoc());

keeping VPWidenIntOrFpInductionRecipe with a single value Def.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1167

Is this addition of excluding VPWidenIntOrFpInductionRecipe independent of this patch?

2413

PhiDef is redundant? VecInd is already being State.set above as part 0 of Def.

8715

This is when an induction feeds a reduction/FOR; the induction became a multi-valued def so we want to select its first, original value?

Perhaps Inductions should be treated as Reductions, included in PhisToFix, and their increment recipe introduced and rewired here?

IV isn't used, suffice to ask isa<>.

9075

(Related to introducing additional values to a Def; having no underlying values?)

llvm/lib/Transforms/Vectorize/VPlan.h
1058

who needs the II?

fhahn updated this revision to Diff 389735.Nov 25 2021, 4:15 AM
fhahn marked 2 inline comments as done.

Addressed comments and added a few additional code comments.

Yes, in order to sink LastInduction it should be wrapped in a recipe by itself, separating it from VPWidenIntOrFpInductionRecipe. Would introducing such a LastInduction recipe here, potentially created in FixHeaderPhis out of its feeding/consuming VPWidenIntOrFpInductionRecipe, be simpler for this patch? It needs only to take of

I had a look, but unfortunately I don't think this is feasible additional work. The step increment value depends on the induction descriptor and may need to be expanded from a SCEV expression in the pre-header. Once we can model the preheader in VPlan, we can add a recipe in the preheader to create the step value.

Alternatively we could land D111303 first, which expands the step up front and maps it to a VPValue that can be used by the induction recipes. Please let me know if you prefer that direction.

A further complication is that LastInduction is also set for CastDef.

LastInduction = cast<Instruction>(
    Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add"));
LastInduction->setDebugLoc(EntryVal->getDebugLoc());

keeping VPWidenIntOrFpInductionRecipe with a single value Def.

Unfortunately VPWidenIntOrFpInductionRecipe is already a multi-def in some cases (due to handling casts as well).

fhahn marked 2 inline comments as done.Nov 25 2021, 4:20 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1167

I think it makes sense to exclude it in any case, but it is becoming problematic with the patch (due to adding additional defs, causing a crash in `getUndderlyingValue). I *think* it could also cause issues if the induction recipe has a CastDef.

2413

Unfortunately Def may be reset to something other than the phi, e.g. in CreateSplatIV .

8715

I changed it to isa<>.

Perhaps Inductions should be treated as Reductions, included in PhisToFix, and their increment recipe introduced and rewired here?

Agreed, but could/should be a separate change?

9075

Yes.

llvm/lib/Transforms/Vectorize/VPlan.h
1058

No, it is a leftover from an earlier version. Removed all II-related changes.

Ayal added a comment.Nov 28 2021, 5:44 AM

Addressed comments and added a few additional code comments.

Yes, in order to sink LastInduction it should be wrapped in a recipe by itself, separating it from VPWidenIntOrFpInductionRecipe. Would introducing such a LastInduction recipe here, potentially created in FixHeaderPhis out of its feeding/consuming VPWidenIntOrFpInductionRecipe, be simpler for this patch? It needs only to take of

I had a look, but unfortunately I don't think this is feasible additional work. The step increment value depends on the induction descriptor and may need to be expanded from a SCEV expression in the pre-header. Once we can model the preheader in VPlan, we can add a recipe in the preheader to create the step value.

OK.
There are actually two issues here: (1) sinking LastInduction to the latch, and (2) feeding LastInduction to the header phi as its operand associated with the latch.
Another temporary workaround may be have ILV::widenIntOrFpInduction() return the LastInduction it generates (in the header, w/o feeding it to the phi), and have VPWidenIntOrFpInductionRecipe record this LastInduction directly. Note that ILV::widenIntOrFpInduction() might not generate a LastInduction - when (unrolling w/o vectorizing) it does not call createVectorIntOrFpInductionPHI(). In such case VPWidenIntOrFpInductionRecipe should not be formed, right?

Alternatively we could land D111303 first, which expands the step up front and maps it to a VPValue that can be used by the induction recipes. Please let me know if you prefer that direction.

A further complication is that LastInduction is also set for CastDef.

LastInduction = cast<Instruction>(
    Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add"));
LastInduction->setDebugLoc(EntryVal->getDebugLoc());

keeping VPWidenIntOrFpInductionRecipe with a single value Def.

Unfortunately VPWidenIntOrFpInductionRecipe is already a multi-def in some cases (due to handling casts as well).

wonder how R->addOperand(IncR->getVPSingleValue()); currently works in fixHeaderPhis - is a testcase of a casted induction feeding a reduction is needed?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1167

Any other (header phi?) recipes worth excluding? (Independent of this patch)

2413

Unfortunately Def may be reset to something other than the phi, e.g. in CreateSplatIV .

This truly deserves simplification. Or at-least some explanation what Def refers to in contrast to PhiDef.

fhahn marked 2 inline comments as done.Nov 29 2021, 1:57 PM

Unfortunately VPWidenIntOrFpInductionRecipe is already a multi-def in some cases (due to handling casts as well).

wonder how R->addOperand(IncR->getVPSingleValue()); currently works in fixHeaderPhis - is a testcase of a casted induction feeding a reduction is needed?

It doesn't. I put up D114720 and D114739 to fix 2 crashes in this area.

fhahn updated this revision to Diff 391923.Dec 5 2021, 10:05 AM

Simplify patch by only recording the 'LastInduction' instruction explicitly in the recipe, instead of adding additional defined values. Depends on D115112 which turns VPWidenIntOrFpInductionRecipe into a single-def recipe.

fhahn updated this revision to Diff 393464.Dec 10 2021, 6:12 AM

rebased on top of 505ad03c7d29 ; ping :)

Ayal added inline comments.Dec 14 2021, 12:36 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
510–511

Document return value?

1167

Is there still a problem (due to additional def/cast)?
Prune/stop this backward slice at any header phi recipe?

2521–2523

{} can be dropped

8715

Can IncR->getVPSingleValue() be retained?

llvm/lib/Transforms/Vectorize/VPlan.cpp
849

Can VecInd be found instead by looking up State.get(IV..), as done for reductions and FOR rewired below?

llvm/lib/Transforms/Vectorize/VPlan.h
1015

Better rename LastInduction to "LastIncrementOfVectorIV", throughout?

Scalar IV's are generated w/o a cross-iteration increment, based off of Induction, which is pre-generated as part of the skeleton, along with its last increment in the latch.

An alternative of forming a Region with an Exit to represent the latch, is mentioned in D114586.

fhahn updated this revision to Diff 394276.Dec 14 2021, 8:42 AM
fhahn marked 6 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
510–511

Updated.

1167

Is there still a problem (due to additional def/cast)?

No, with the recent improvements to cast handling there are no changes needed here!

8715

yes!

llvm/lib/Transforms/Vectorize/VPlan.cpp
849

Unfortunately that isn't possible, because State.get(IV,...) may not refer to the phi directly but may be reset to something other than the phi, e.g. in CreateSplatIV

llvm/lib/Transforms/Vectorize/VPlan.h
1015

Better rename LastInduction to "LastIncrementOfVectorIV", throughout?

Done, thanks!

An alternative of forming a Region with an Exit to represent the latch, is mentioned in D114586.

I'll follow up on the suggestion, but I don't see how this is related directly to this patch. Future patches will remove the need to keeping the last induction increment here by modeling the increment explicitly in VPlan. But to support the general induction case, we need to support expand the step for the induction from a SCEV expression in the pre-header. I'll put patches up to do this once we can model the pre-header.

Ayal added inline comments.Dec 15 2021, 2:38 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
2413

Add a comment that the other incoming will be added later?

2652

This State.set() seems to be the culprit.

llvm/lib/Transforms/Vectorize/VPlan.cpp
833

Augment above comment to also include induction phis?

849

State.get(IV->getVPSingleValue(), 0) does seem to provide the desired header phi, for non-scalable vectors. However, this State may indeed be reset for scalable vectors. Better to try and fix problems caused by resetting State, than to bypass them.

(CreateSplatIV() sets the State for IV rather than resets it, and only for scalar IV's. Here we're dealing with vector IV's).
This resetting stems from buildScalarSteps() in D98715, which resets the State for both scalar and vector IV's. The latter case resets per-part vectors only if they are scalable (and phi is not UAV). But if only a vector IV is needed, it is built w/o calling buildScalarSteps(), i.e., w/o resetting the State for IV. Why does build*Scalar*Steps() need to reset per-part vectors, only when scalar users are present?

If still needed, the actual PhiOfVectorIV could also be recorded alongside LastIncrementOfVectorIV, instead of tracking it down (or up). This could be done using a separate (derived) induction recipe for scalable vectors. In general, if multiple IR values associated with the same recipe need to be retrieved later to hook-up users and/or definitions, they could be modelled using a multi-VPValued VPDef.

854

LoopVectorLatch aka VectorLatchBB?
Rewiring of all header phi's should ideally look similar, as done below for reductions and FORs. Inductions should use their LastIncrement as Val, and are also SinglePartNeeded.

864

Keep this early-exit if !PhiR || !(FOR || Reduction || [Induction]) as the first thing in the loop?

llvm/lib/Transforms/Vectorize/VPlan.h
1015

An alternative of forming a Region with an Exit to represent the latch, is mentioned in D114586.

... how this is related directly to this patch

This alternative may have two (alternative?) parts:
Having the Exit VPBasicBlock available when creating header phi recipes will enable placing last increments there to begin with, instead of moving them (or their generated IR Instruction). This requires introducing recipes/VPInstructions to model these increments, as proposed in follow-up D113223, so indeed more directly related to that patch.
Having the IR Latch BasicBlock available when header phi recipes execute will enable them to generate their last increments and rewire them there and then, w/o needing to record and rewire later. This requires generating the IR basic block of the latch along with that of the header, presumably by VPRegionBase::execute(). But instead of having induction header-phi recipes be responsible for generating their entire chain of increments, it may be better if they behave like reduction and FOR header-phi recipes, whose 'previous' or last reduction operations have independent recipes; this supports a simple linear code-generation of VPlan, instead of moving a builder's insert position around.

fhahn added inline comments.Dec 16 2021, 3:13 AM
llvm/lib/Transforms/Vectorize/VPlan.h
1015

Patch to introduce header & latch up front is available now: D115793

I'll see how this may shake out in terms of simplifications here.

fhahn abandoned this revision.Jul 1 2022, 7:37 AM

AFAICT this was superseded by an alternative patch.

Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2022, 7:37 AM