This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/Transforms/Vectorize/
-
lib/
-
Transforms/
-
Vectorize/
2/4
LoopVectorize.cpp
3/5
VPlan.cpp

Differential D121617

[LV] Move code to place induction increment to VPlan post-processing.
ClosedPublic

Authored by fhahn on Mar 14 2022, 9:54 AM.

Download Raw Diff

Details

Reviewers

Ayal
gilr
rengolin

Commits

rGe7bf2ea93472: [LV] Move code to place induction increment to VPlan post-processing.

Summary

This patch moves the code to set the correct incoming block for the
backedge value to VPlan::execute.

When generating the phi node, the backedge value is temporarily added
using the pre-header as incoming block. The invalid phi node will be
fixed up during VPlan::execute after main VPlan code generation.
At the same time, the backedge value is also moved to the latch.

This change removes the requirement to create the latch block up-front
for VPWidenIntOrFpInductionRecipe::execute, which in turn will enable
modeling the pre-header in VPlan.

As an alternative, the increment could be modeled as separate recipe,
but that would require more work and a bit of redundant code, as we need
to create the step-vector during VPWidenIntOrFpInductionRecipe::execute
anyways, to create the values for different parts.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision.Mar 14 2022, 9:54 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 14 2022, 9:54 AM

Herald added subscribers: tschuett, psnobl, rogfer01 and 2 others. · View Herald Transcript

fhahn requested review of this revision.Mar 14 2022, 9:54 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 14 2022, 9:54 AM

Herald added a subscriber: vkmr. · View Herald Transcript

Harbormaster completed remote builds in B154126: Diff 415129.Mar 14 2022, 9:54 AM

fhahn added a child revision: D121618: [LV] Move code to place pointer induction increment to VPlan post-processing..Mar 14 2022, 9:57 AM

ping :)

Ayal added inline comments.Mar 20 2022, 2:32 PM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9540	This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness...
llvm/lib/Transforms/Vectorize/VPlan.cpp
988	Could the incoming value to an induction be added similar to adding such values for reductions and FORs below, which have distinct values across the backedge? I.e., another case of SinglePartNeeded? Could PhiR->getBackedgeValue() work - if the induction PhiR is added as an (second) operand of itself? Moving the last induction step to the end of the latch could then be done separately.

Add note on why the latch cannot be used.

Harbormaster completed remote builds in B155401: Diff 416958.Mar 21 2022, 8:23 AM

fhahn marked 2 inline comments as done.Mar 21 2022, 8:36 AM

fhahn added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9540	This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) I think there are 2 distinct things going on. 1) moving the induction to the latch block and 2) adding the incoming value using the latch block. We could stop doing 1), without impacting correctness, but 2) is needed to generate valid IR in the presence of replicate regions. This should be handled by multiple existing tests. Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. Done! An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly Yes, but I think we would need to introduce a new VPValue for the increment, so we can map it properly. Adding the PHI as operand to itself wouldn't work I think, because in State we can only store values for each part, but we need the value of the last part + VF. Alternatively we could record the IR value in the recipe. But both of those seem like similar workarounds, but seems slightly more complicated. Another thing to note is that the CFG/IR LV generates during codegen is already temporarily invalid. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness... Agreed, but it would add extra clutter to the plan and also the code needed. Please let me know if any of the alternative is strictly preferable to the current approach.
llvm/lib/Transforms/Vectorize/VPlan.cpp
988	Could the incoming value to an induction be added similar to adding such values for reductions and FORs below, which have distinct values across the backedge? I.e., another case of SinglePartNeeded? Could PhiR->getBackedgeValue() work - if the induction PhiR is added as an (second) operand of itself? Unfortunately I don't think so, without further changes. The recipe for the phi is currently mapped to the vector IV values for each part. The increment is the last part + VF. So we either would have to turn VPWidenIntOrFpInductionRecipe into a multi-def or store the LLVM IR value in the recipe directly. Both seem like bigger workarounds in my opinion.

Ping :)

Ayal added inline comments.Mar 27 2022, 5:23 AM

llvm/lib/Transforms/Vectorize/VPlan.cpp
988	(That depends on one's definition of "workaround" ;-) It would be good if the modeling in VPlan of inductions and header phi's is more consistent. This could perhaps be done later, in subsequent patches. Trying to capture the issue and potential resolutions: VPHeaderPHIRecipe offers an interface for recording the backedge value as an operand, used by the Canonical IV and other header phi recipes which (must) have distinct backedge VPValues. Their modeling in VPlan thus matches the IR they generate. During VPlan execution, phi's are first generated with start values only, when traversing from header to latch, and are later revisited to add their backedge incoming values after the latch was generated. VPWidenIntOrFpInductionRecipe is currently not a VPHeaderPHIRecipe, it does not support an explicit backedge value. VPWidenIntOrFpInductionRecipe currently represents both the header phi - explicitly (with UF per-part bumps) and its bump - implicitly; it is conceptually a multi-def recipe, but uses its bump only to feed itself, w/o exposing it to any other user. This implicit bump must wait until the latch is generated in order to be fully rewired to the header phi, but can be partially/incorrectly rewired to the header phi temporarily. VPCanonicalIVPHIRecipe is currently a VPHeaderPHIRecipe; it represents only the header phi, having a separate CanonicalIVIncrement recipe designated to represent its bump, which feeds both VPCanonicalIVPHIRecipe and BranchOnCount; the canonical IV admittedly has an easier bump to generate than int/fp/pointer inductions. Both header phi and bump can alternatively be represented by a single multi-def recipe, as noted for VPWidenIntOrFpInductionRecipe. Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? Should VPWidenIntOrFpInductionRecipe become a VPHeaderPHIRecipe, by supporting a backedge VPValue? Should the rewiring of header phi's be delegated to VPHeaderPHIRecipe, say by adding a second 'execute' method to it, to be called after the latch is constructed, here or at the end of Region::execute?

fhahn added inline comments.Mar 27 2022, 11:51 AM

llvm/lib/Transforms/Vectorize/VPlan.cpp
988	It would be good if the modeling in VPlan of inductions and header phi's is more consistent. This could perhaps be done later, in subsequent patches. Trying to capture the issue and potential resolutions: Agreed, we should work towards making those recipes as consistent as we can! Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? IMO I think it would be better the other way around, fold the increment of `VPCanonicalIVPHIRecipe` back into the recipe. The PHI recipes could model the bump as additional VPValue (i.e. turn them into multi-defs). This would allow us to keep the `induction` concept encapsulated in single recipes and the plans themselves more concise. Should VPWidenIntOrFpInductionRecipe become a VPHeaderPHIRecipe, by supporting a backedge VPValue? Yes I think so. Should be easy now that` VPWidenIntOrFpInductionRecipe` always creates a phi. Should the rewiring of header phi's be delegated to VPHeaderPHIRecipe, say by adding a second 'execute' method to it, to be called after the latch is constructed, here or at the end of Region::execute? Yes that would be a good option to move that logic to something like `VPHeaderPHIRecipe::fixupPhi`. Updating this patch to turn `VPWidenIntOrFpInductionRecipe` into a multi-def should be relatively straight forward. Please let me know if you'd prefer to do this in the current patch or as follow up.

A good step forward in refactoring and simplifying induction variable recipes, raising thoughts for follow-up steps. Ship it!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9540	This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) I think there are 2 distinct things going on. 1) moving the induction to the latch block and 2) adding the incoming value using the latch block. We could stop doing 1), without impacting correctness, but 2) is needed to generate valid IR in the presence of replicate regions. This should be handled by multiple existing tests. Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. Done! An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly Yes, but I think we would need to introduce a new VPValue for the increment, so we can map it properly. Adding the PHI as operand to itself wouldn't work I think, because in State we can only store values for each part, but we need the value of the last part + VF. Alternatively we could record the IR value in the recipe. But both of those seem like similar workarounds, but seems slightly more complicated. Another thing to note is that the CFG/IR LV generates during codegen is already temporarily invalid. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness... Agreed, but it would add extra clutter to the plan and also the code needed. Please let me know if any of the alternative is strictly preferable to the current approach.
9542	VectorPreHeader is definitely an incorrect BB for the backedge incoming value, but it is the correct BB for the start value. Perhaps the current/header BB would be a safer temporary substitute (which must be non-null), to avoid confusion?
llvm/lib/Transforms/Vectorize/VPlan.cpp
988	Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? IMO I think it would be better the other way around, fold the increment of `VPCanonicalIVPHIRecipe` back into the recipe. The PHI recipes could model the bump as additional VPValue (i.e. turn them into multi-defs). This would allow us to keep the `induction` concept encapsulated in single recipes and the plans themselves more concise. Agreed. It's the same way - turning VPWidenIntOrFpInductionRecipe into a multi-def is one way of exposing its bump. Note that enabling other potential users of the exposed bump deserves attention when sinking it to the latch. Updating this patch to turn `VPWidenIntOrFpInductionRecipe` into a multi-def should be relatively straight forward. Please let me know if you'd prefer to do this in the current patch or as follow up. Follow up is fine, preferably with some TODO/FIXME left behind to indicate expected changes.

This revision is now accepted and ready to land.Mar 27 2022, 1:25 PM

Thanks Ayal! I rebased the patch and added the TODO. I'll land this one soon.

This revision was landed with ongoing or failed builds.Mar 28 2022, 8:22 AM

Closed by commit rGe7bf2ea93472: [LV] Move code to place induction increment to VPlan post-processing. (authored by fhahn). · Explain Why

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rGe7bf2ea93472: [LV] Move code to place induction increment to VPlan post-processing..

fhahn mentioned this in D121618: [LV] Move code to place pointer induction increment to VPlan post-processing..Mar 28 2022, 8:42 AM

Harbormaster completed remote builds in B156567: Diff 418593.Mar 29 2022, 12:23 PM

Revision Contents

Path

Size

llvm/

lib/

Transforms/

Vectorize/

LoopVectorize.cpp

14 lines

VPlan.cpp

16 lines

Diff 418594

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 9,528 Lines • ▼ Show 20 Lines	for (unsigned Part = 0; Part < State.UF; ++Part) {
if (isa<TruncInst>(EntryVal))		if (isa<TruncInst>(EntryVal))
State.ILV->addMetadata(LastInduction, EntryVal);		State.ILV->addMetadata(LastInduction, EntryVal);

LastInduction = cast<Instruction>(		LastInduction = cast<Instruction>(
Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add"));		Builder.CreateBinOp(AddOp, LastInduction, SplatVF, "step.add"));
LastInduction->setDebugLoc(EntryVal->getDebugLoc());		LastInduction->setDebugLoc(EntryVal->getDebugLoc());
}		}

// Move the last step to the end of the latch block. This ensures consistent
// placement of all induction updates.
auto *LoopVectorLatch =
State.LI->getLoopFor(State.CFG.PrevBB)->getLoopLatch();
auto *Br = cast<BranchInst>(LoopVectorLatch->getTerminator());
LastInduction->moveBefore(Br);
LastInduction->setName("vec.ind.next");		LastInduction->setName("vec.ind.next");

VecInd->addIncoming(SteppedStart, State.CFG.VectorPreHeader);		VecInd->addIncoming(SteppedStart, State.CFG.VectorPreHeader);
VecInd->addIncoming(LastInduction, LoopVectorLatch);		// Add induction update using an incorrect block temporarily. The phi node
		// will be fixed after VPlan execution. Note that at this point the latch
		AyalUnsubmitted Done Reply Inline Actions This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness... Ayal: This goes back to D22416. Simply adding this last step here with its current block, as done…
		fhahnAuthorUnsubmitted Done Reply Inline Actions This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) I think there are 2 distinct things going on. 1) moving the induction to the latch block and 2) adding the incoming value using the latch block. We could stop doing 1), without impacting correctness, but 2) is needed to generate valid IR in the presence of replicate regions. This should be handled by multiple existing tests. Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. Done! An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly Yes, but I think we would need to introduce a new VPValue for the increment, so we can map it properly. Adding the PHI as operand to itself wouldn't work I think, because in State we can only store values for each part, but we need the value of the last part + VF. Alternatively we could record the IR value in the recipe. But both of those seem like similar workarounds, but seems slightly more complicated. Another thing to note is that the CFG/IR LV generates during codegen is already temporarily invalid. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness... Agreed, but it would add extra clutter to the plan and also the code needed. Please let me know if any of the alternative is strictly preferable to the current approach. fhahn: > This goes back to D22416. > Simply adding this last step here with its current block, as done…
		AyalUnsubmitted Not Done Reply Inline Actions This goes back to D22416. Simply adding this last step here with its current block, as done prior to D22416, may be incorrect - when header and latch are distinct due to replicated regions? (May be good to (have a) test) I think there are 2 distinct things going on. 1) moving the induction to the latch block and 2) adding the incoming value using the latch block. We could stop doing 1), without impacting correctness, but 2) is needed to generate valid IR in the presence of replicate regions. This should be handled by multiple existing tests. Would be good to explain that the last induction step, which feeds PHI, cannot be added yet to PHI properly, because that requires the latch block - which may not have been formed yet. Done! An alternative to an incorrect addIncoming() may be to record LastInduction temporarily inside its VPWidenIntOrFpInductionRecipe (as it's operand(?) - see below), until it can be added to PHI properly Yes, but I think we would need to introduce a new VPValue for the increment, so we can map it properly. Adding the PHI as operand to itself wouldn't work I think, because in State we can only store values for each part, but we need the value of the last part + VF. Alternatively we could record the IR value in the recipe. But both of those seem like similar workarounds, but seems slightly more complicated. Another thing to note is that the CFG/IR LV generates during codegen is already temporarily invalid. Given that LastInduction is eventually placed in the latch, another alternative may be to designate a recipe for generating LastInduction, placing it in the latch, so that when the recipe executes it could add LastInduction properly to PHI. Listing these alternatives mainly for the sake of completeness... Agreed, but it would add extra clutter to the plan and also the code needed. Please let me know if any of the alternative is strictly preferable to the current approach. Ayal: > > This goes back to D22416. > > Simply adding this last step here with its current block, as…
		// block cannot be used, as it does not exist yet.
		// TODO: Model increment value in VPlan, by turning the recipe into a
		AyalUnsubmitted Not Done Reply Inline Actions VectorPreHeader is definitely an incorrect BB for the backedge incoming value, but it is the correct BB for the start value. Perhaps the current/header BB would be a safer temporary substitute (which must be non-null), to avoid confusion? Ayal: VectorPreHeader is definitely an incorrect BB for the backedge incoming value, but it is the…
		// multi-def and a subclass of VPHeaderPHIRecipe.
		VecInd->addIncoming(LastInduction, State.CFG.VectorPreHeader);
}		}

void VPWidenPointerInductionRecipe::execute(VPTransformState &State) {		void VPWidenPointerInductionRecipe::execute(VPTransformState &State) {
assert(IndDesc.getKind() == InductionDescriptor::IK_PtrInduction &&		assert(IndDesc.getKind() == InductionDescriptor::IK_PtrInduction &&
"Not a pointer induction according to InductionDescriptor!");		"Not a pointer induction according to InductionDescriptor!");
assert(cast<PHINode>(getUnderlyingInstr())->getType()->isPointerTy() &&		assert(cast<PHINode>(getUnderlyingInstr())->getType()->isPointerTy() &&
"Unexpected type.");		"Unexpected type.");

▲ Show 20 Lines • Show All 1,217 Lines • Show Last 20 Lines

llvm/lib/Transforms/Vectorize/VPlan.cpp

Show First 20 Lines • Show All 976 Lines • ▼ Show 20 Lines	void VPlan::execute(VPTransformState *State) {
VPBasicBlock *Header = getVectorLoopRegion()->getEntryBasicBlock();		VPBasicBlock *Header = getVectorLoopRegion()->getEntryBasicBlock();
if (Header->empty()) {		if (Header->empty()) {
assert(EnableVPlanNativePath);		assert(EnableVPlanNativePath);
Header = cast<VPBasicBlock>(Header->getSingleSuccessor());		Header = cast<VPBasicBlock>(Header->getSingleSuccessor());
}		}
for (VPRecipeBase &R : Header->phis()) {		for (VPRecipeBase &R : Header->phis()) {
// Skip phi-like recipes that generate their backedege values themselves.		// Skip phi-like recipes that generate their backedege values themselves.
// TODO: Model their backedge values explicitly.		// TODO: Model their backedge values explicitly.
if (isa<VPWidenIntOrFpInductionRecipe>(&R) \|\| isa<VPWidenPHIRecipe>(&R) \|\|		if (isa<VPWidenPHIRecipe>(&R) \|\| isa<VPWidenPointerInductionRecipe>(&R))
isa<VPWidenPointerInductionRecipe>(&R))
continue;		continue;

		// Set the correct incoming block for backedge values and move induction to
		AyalUnsubmitted Done Reply Inline Actions Could the incoming value to an induction be added similar to adding such values for reductions and FORs below, which have distinct values across the backedge? I.e., another case of SinglePartNeeded? Could PhiR->getBackedgeValue() work - if the induction PhiR is added as an (second) operand of itself? Moving the last induction step to the end of the latch could then be done separately. Ayal: Could the incoming value to an induction be added similar to adding such values for reductions…
		fhahnAuthorUnsubmitted Done Reply Inline Actions Could the incoming value to an induction be added similar to adding such values for reductions and FORs below, which have distinct values across the backedge? I.e., another case of SinglePartNeeded? Could PhiR->getBackedgeValue() work - if the induction PhiR is added as an (second) operand of itself? Unfortunately I don't think so, without further changes. The recipe for the phi is currently mapped to the vector IV values for each part. The increment is the last part + VF. So we either would have to turn VPWidenIntOrFpInductionRecipe into a multi-def or store the LLVM IR value in the recipe directly. Both seem like bigger workarounds in my opinion. fhahn: > Could the incoming value to an induction be added similar to adding such values for…
		AyalUnsubmitted Not Done Reply Inline Actions (That depends on one's definition of "workaround" ;-) It would be good if the modeling in VPlan of inductions and header phi's is more consistent. This could perhaps be done later, in subsequent patches. Trying to capture the issue and potential resolutions: VPHeaderPHIRecipe offers an interface for recording the backedge value as an operand, used by the Canonical IV and other header phi recipes which (must) have distinct backedge VPValues. Their modeling in VPlan thus matches the IR they generate. During VPlan execution, phi's are first generated with start values only, when traversing from header to latch, and are later revisited to add their backedge incoming values after the latch was generated. VPWidenIntOrFpInductionRecipe is currently not a VPHeaderPHIRecipe, it does not support an explicit backedge value. VPWidenIntOrFpInductionRecipe currently represents both the header phi - explicitly (with UF per-part bumps) and its bump - implicitly; it is conceptually a multi-def recipe, but uses its bump only to feed itself, w/o exposing it to any other user. This implicit bump must wait until the latch is generated in order to be fully rewired to the header phi, but can be partially/incorrectly rewired to the header phi temporarily. VPCanonicalIVPHIRecipe is currently a VPHeaderPHIRecipe; it represents only the header phi, having a separate CanonicalIVIncrement recipe designated to represent its bump, which feeds both VPCanonicalIVPHIRecipe and BranchOnCount; the canonical IV admittedly has an easier bump to generate than int/fp/pointer inductions. Both header phi and bump can alternatively be represented by a single multi-def recipe, as noted for VPWidenIntOrFpInductionRecipe. Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? Should VPWidenIntOrFpInductionRecipe become a VPHeaderPHIRecipe, by supporting a backedge VPValue? Should the rewiring of header phi's be delegated to VPHeaderPHIRecipe, say by adding a second 'execute' method to it, to be called after the latch is constructed, here or at the end of Region::execute? Ayal: (That depends on one's definition of "workaround" ;-) It would be good if the modeling in…
		fhahnAuthorUnsubmitted Done Reply Inline Actions It would be good if the modeling in VPlan of inductions and header phi's is more consistent. This could perhaps be done later, in subsequent patches. Trying to capture the issue and potential resolutions: Agreed, we should work towards making those recipes as consistent as we can! Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? IMO I think it would be better the other way around, fold the increment of `VPCanonicalIVPHIRecipe` back into the recipe. The PHI recipes could model the bump as additional VPValue (i.e. turn them into multi-defs). This would allow us to keep the `induction` concept encapsulated in single recipes and the plans themselves more concise. Should VPWidenIntOrFpInductionRecipe become a VPHeaderPHIRecipe, by supporting a backedge VPValue? Yes I think so. Should be easy now that` VPWidenIntOrFpInductionRecipe` always creates a phi. Should the rewiring of header phi's be delegated to VPHeaderPHIRecipe, say by adding a second 'execute' method to it, to be called after the latch is constructed, here or at the end of Region::execute? Yes that would be a good option to move that logic to something like `VPHeaderPHIRecipe::fixupPhi`. Updating this patch to turn `VPWidenIntOrFpInductionRecipe` into a multi-def should be relatively straight forward. Please let me know if you'd prefer to do this in the current patch or as follow up. fhahn: > It would be good if the modeling in VPlan of inductions and header phi's is more consistent.
		AyalUnsubmitted Not Done Reply Inline Actions Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for other potential users, and for transparency/consistency? IMO I think it would be better the other way around, fold the increment of `VPCanonicalIVPHIRecipe` back into the recipe. The PHI recipes could model the bump as additional VPValue (i.e. turn them into multi-defs). This would allow us to keep the `induction` concept encapsulated in single recipes and the plans themselves more concise. Agreed. It's the same way - turning VPWidenIntOrFpInductionRecipe into a multi-def is one way of exposing its bump. Note that enabling other potential users of the exposed bump deserves attention when sinking it to the latch. Updating this patch to turn `VPWidenIntOrFpInductionRecipe` into a multi-def should be relatively straight forward. Please let me know if you'd prefer to do this in the current patch or as follow up. Follow up is fine, preferably with some TODO/FIXME left behind to indicate expected changes. Ayal: > > Should VPWidenIntOrFpInductionRecipe expose the bump that feeds it across the backedge, for…
		// latch.
		if (auto *IndR = dyn_cast<VPWidenIntOrFpInductionRecipe>(&R)) {
		auto *Phi = cast<PHINode>(State->get(IndR, 0));
		Phi->setIncomingBlock(1, VectorLatchBB);

		// Move the last step to the end of the latch block. This ensures
		// consistent placement of all induction updates.
		Instruction *Inc = cast<Instruction>(Phi->getIncomingValue(1));
		Inc->moveBefore(VectorLatchBB->getTerminator()->getPrevNode());
		continue;
		}

auto *PhiR = cast<VPHeaderPHIRecipe>(&R);		auto *PhiR = cast<VPHeaderPHIRecipe>(&R);
// For canonical IV, first-order recurrences and in-order reduction phis,		// For canonical IV, first-order recurrences and in-order reduction phis,
// only a single part is generated, which provides the last part from the		// only a single part is generated, which provides the last part from the
// previous iteration. For non-ordered reductions all UF parts are		// previous iteration. For non-ordered reductions all UF parts are
// generated.		// generated.
bool SinglePartNeeded = isa<VPCanonicalIVPHIRecipe>(PhiR) \|\|		bool SinglePartNeeded = isa<VPCanonicalIVPHIRecipe>(PhiR) \|\|
isa<VPFirstOrderRecurrencePHIRecipe>(PhiR) \|\|		isa<VPFirstOrderRecurrencePHIRecipe>(PhiR) \|\|
cast<VPReductionPHIRecipe>(PhiR)->isOrdered();		cast<VPReductionPHIRecipe>(PhiR)->isOrdered();
▲ Show 20 Lines • Show All 729 Lines • Show Last 20 Lines