All information to fix-up the reduction phi nodes in the vectorized loop
is available in VPlan now. This patch moves the code to do so, to make
this clearer. Fixing up the loop exit value still relies on other
information and remains outside of VPlan for now.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
9688 | better turn on an IsOrdered bit inside VPReductionRecipe during its creation or planning, than during its execute? Admittedly goes back to D98435. There are 3 types of reductions: (1) parallel lanes: each lane computes an independent partial-sum; (2) parallel parts: each vector computes an independent partial sum, in-loop within vectors but not across parts, and (3) serial: in-loop across vectors and parts, the header phi operates on a single scalar. Would be good to explicitly indicate which type of reduction is chosen, in both ReductionRecipe(s) and the header phi; it effects their code-gen. | |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
63 | Defining a method of LVP inside VPlan.cpp seems odd, at best. The use of a gating EnableStrictReductions knob (in contrast to a CM cost-based decision), belongs to Legal rather than CM or Planner? I.e., Legal should bail out from vectorizing a loop having an ordered reduction, when !EnableStrictReductions? Admittedly goes back to D98435. | |
892 | Moving this to VPlan::execute is great! |
Rebased and updated the code to check whether to use ordered reductions during codegen to get the VPReductionRecipe and check isOrdered on the reduction descriptr. This builds on the code to determine whether a VPWidenPHI is part of an in-loop reduction.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
9688 |
I think we already have all the required information available with the existing pieces. I updated the patch to check isOrdered on the recurrence descriptor of the VPReductionRecipe and a helper to get the VPReductionRecipe for a VPWidenPHIRecipe, if there's any. | |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
63 | Agreed! I updated the code to check isOrdered on the reduction recurrence descriptor directly during code generation. To do so, I evolved the isInLoopReduction helper to an getInLoopReduction member that returns the VPReductionRecipe, if there's one. | |
892 | I rebased the code, and it now uses getVPSingleValue/getBackedgeValue. I can add the additional comments separately or in this change. |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
---|---|---|
880 | Place this next to hooking up incoming of FOR header phi's above, or even fuse both into a common loop over header phi's? The isOrdered case does behave quite similar to FOR phi's. |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
---|---|---|
880 | done, thanks! |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
---|---|---|
823 |
How about having a common treatment, e.g., for (VPRecipeBase &R : Header->phis()) { auto *PhiR = dyn_cast<VPWidenPHIRecipe>(&R); if (!PhiR) continue; // For first-order recurrences and in-order reduction phis, only a single part // is generated, which provides the last part from the previous iteration. // Otherwise all UF parts are generated. bool IsOrdered = (isa<VPFirstOrderRecurrencePHIRecipe>(&R) || cast<VPReductionPHIRecipe>(&R)->isOrdered()); unsigned LastPartForNewPhi = IsOrdered ? 1 : State->UF; for (unsigned Part = 0; Part < LastPartForNewPhi; ++Part) { Value *VecPhi = State->get(PhiR->getVPSingleValue(), Part); // Or suffice State->get(PhiR, Part)? Value *Val = State->get(PhiR->getBackedgeValue(), IsOrdered ? State->UF - 1 : Part); cast<PHINode>(VecPhi)->addIncoming(Val, VectorLatchBB); } ? |
llvm/lib/Transforms/Vectorize/VPlan.cpp | ||
---|---|---|
823 | Looks good, updated with a small re-naming. of IsOrdered -> SinglePartNeeded and a check to limit to FOR and reduction PHIs only. |
Thanks! This looks good to me ;-)
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
4309 | nit: place above comment about start value where it belongs? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
4309 | I'll move it to VPReductionPHIRecipe::execute, which creates & sets the start value. |
nit: place above comment about start value where it belongs?