This patch adds the predicate as additional operand to VPReplicateRecipe
during initial construction. The predicated recipes are later moved into
replicate regions. This simplifies constructions and some VPlan
transformations, like fixed-order recurrence handling.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Nice clean-up!
(Inspired by "All You Need Is Superword-Level Parallelism: Systematic Control-Flow Vectorization with SLP", Yishen Chen
, Charith Mendis and Saman Amarasinghe, PLDI'22.)
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
2920 | Could the last redundant operand of RepRecipe be removed once it is no longer needed (after wrapping it inside a replicate region) instead of silently ignoring it here? | |
8672 | Worth clarifying that these instructions are first represented as a replicate recipe with a mask (facilitating their motion etc.) and are wrapping within the if-then masked region later? | |
9134 | comment? Is this the best position - must be placed before sinkScalarOperands()/mergeReplicateRegionsIntoSuccessors(), possibly right before them? | |
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp | ||
621 | Is this comment and code dealing with PredInstPHI still needed here? | |
688 | Future thought: would it be better to handle intervals of RepR's that share the same mask together here, placing them in a common region, rather than fusing regions later? | |
llvm/lib/Transforms/Vectorize/VPlanTransforms.h | ||
83 | Documentation? |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
2920 | Updated the code to remove the mask, by creating a new VPReplicateRecipe without the mask. | |
8672 | Adjusted, thanks! | |
9134 | Added a comment and moved just before sinkScalarOperands/region merging. | |
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp | ||
621 | Not any longer, removed! | |
688 | that would be a good follow-up change; general merging & sinking will probably still be useful, as sinking can enable new merging opportunities. |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
8675 | nit: worth recording as Mask to be returned by getMask(), initially set to null, rather than having the latter return the last operand blindly below? | |
8689 | nit: worth adding a comment that the masked replicate recipe is replaced by a non-masked one? | |
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll | ||
94–95 | nit: name of label fixed? Worth fixing them PRED_LOAD* names as well for consistency? | |
llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll | ||
369 | Second operand vp<[[STEPS]]> dropped intentionally? | |
371 | ditto | |
llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll | ||
411 | Two independent replicating regions with a reduce.add between them that were kept separate are now being merged? This patch seems conceptually an NFC, according to its current commit message. |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
8675 | I updated the code to add the mask in the constructer, iff IsPredicated is set. | |
8689 | Added a comment, thanks! | |
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll | ||
94–95 | Fixed, thanks! | |
llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll | ||
411 | In this case, all recipes are in the same VPBasicBlock when adjustRecipesForReductions is called, which will add the reduction recipes at the end of the block containing the original widen recipes, hence the change. |
Address latest comments, thanks! This now depends on D145322. Also add getMask helper.
llvm/lib/Transforms/Vectorize/VPlan.h | ||
---|---|---|
1513 | Passing IsPredicated is now redundant - can be inferred by checking if given Mask is null? |
Looks good to me, thanks!
Adding a couple of final nits.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
8701 | nit: now that this is a VPlan2VPlan transformation, check if PredRecipe has users instead of Instr's type? While we're here, could perhaps be simplified into (independent of this patch): VPRecipeBase *PHIRecipe = nullptr; if (PredRecipe has users outside its block) { PHIRecipe = new VPPredInstPHIRecipe(RecipeWithoutMask); RecipeWithoutMask->replaceAllUsesWith(PHIRecipe); PHIRecipe->setOperand(0, RecipeWithoutMask); } | |
9186 | nit: the above creation and iterative optimization of replicate regions could be folded into some VPlanTransforms::createAndOptimizeReplicateRegions(*Plan, RecipeBuilder); | |
llvm/lib/Transforms/Vectorize/VPlanTransforms.h | ||
83 | nit: perhaps better place addReplicateRegions() above right before sinkScalarOperand() - merge - merge? | |
llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll | ||
411 | Wonderful! |
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
8701 | Done in 2db71c9851e5, thanks! | |
9186 | Done in eca14a810e59 |
Could the last redundant operand of RepRecipe be removed once it is no longer needed (after wrapping it inside a replicate region) instead of silently ignoring it here?