If there is just one non-undef scalar in a buildvector/gather node, we try
to make it the very first element, which is profitable in most
cases. Do a preliminary estimation of whether this is more profitable during
graph rotation, and do the same for all elements, including extractelements.
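
The decision described in the summary can be sketched as a small standalone function. This is only an illustration under stated assumptions: `Lane`, `ToyCosts`, and `singleScalarToMove` are hypothetical names, and the fixed integer costs stand in for the target cost queries that the real patch makes. It is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for one scalar lane: std::nullopt models an undef lane.
using Lane = std::optional<int>;

// Hypothetical toy costs; the patch itself asks the target cost model instead.
struct ToyCosts {
  int InsertAtZero;  // cost of inserting the scalar into lane 0
  int InsertAtOther; // cost of inserting the scalar into its original lane
  int Permute;       // cost of the shuffle that restores the original order
};

// Returns the index of the single non-undef lane if moving it to lane 0 is
// estimated to be cheaper, and std::nullopt otherwise.
std::optional<std::size_t>
singleScalarToMove(const std::vector<Lane> &Scalars, const ToyCosts &C) {
  assert(!Scalars.empty() && "expected at least one lane");
  const std::size_t NotFound = Scalars.size();
  std::size_t Idx = NotFound;
  for (std::size_t I = 0; I < Scalars.size(); ++I) {
    if (!Scalars[I])
      continue; // undef lane, ignore
    if (Idx != NotFound)
      return std::nullopt; // more than one non-undef scalar
    Idx = I;
  }
  // Nothing to do if all lanes are undef or the scalar is already first.
  if (Idx == NotFound || Idx == 0)
    return std::nullopt;
  // Move only when insert-at-zero plus the permute beats insert-in-place.
  if (C.InsertAtZero + C.Permute < C.InsertAtOther)
    return Idx;
  return std::nullopt;
}
```

With toy costs such as `{1, 3, 1}` the sketch reports the scalar's index, and with `{1, 2, 1}` it reports nothing, because the permute cancels the saving.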
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1449 | This needs a bit of explanation (a comment). | |
4188 | Could you please clarify the difference between returning an empty container vs std::nullopt? | |
4192–4206 | Just for the sake of better readability, can you please rearrange the code to add a few variables and break that jumbo if condition into pieces? For example: `unsigned Idx = std::distance(TE.Scalars.begin(), It); Order[Idx] = 0; InstructionCost PermuteCost = TopToBottom ? 0 : TTI->getShuffleCost(TTI::SK_PermuteSingleSrc, Ty, Mask); InstructionCost InsertFirstCost = TTI->getVectorInstrCost(Instruction::InsertElement, Ty, TTI::TCK_RecipThroughput, 0, PoisonValue::get(Ty), *It); InstructionCost InsertIdxCost = TTI->getVectorInstrCost(Instruction::InsertElement, Ty, TTI::TCK_RecipThroughput, Idx, PoisonValue::get(Ty), *It); if (InsertFirstCost + PermuteCost < InsertIdxCost) return Order;` |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1449 | We can easily combine poison with extractelement <non-poison>, or undef with extractelement <poison>. But combining undef with extractelement <non-poison-but-may-produce-poison> requires some extra operations (to preserve the difference between undef and poison), so combining such elements is less effective than using extractelements from the same EV1, even in reversed order. | |
4188 | std::nullopt means that the ordering is not important for the node; an empty order means identity order is preferred. I'll add it to the description of the function. |
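
The distinction drawn in the reply for line 4188 can be shown with a small standalone sketch. The names `OrdersType` (here a plain `std::vector` alias), `OrderKind`, `classifyOrder`, and `materializeOrder` are assumptions made for illustration and are not taken from the patch. Treating the "not important" case as identity in `materializeOrder` is also just one possible choice.

```cpp
#include <cassert>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical alias for an ordering of lanes.
using OrdersType = std::vector<unsigned>;

enum class OrderKind {
  DontCare, // std::nullopt: ordering is not important for the node
  Identity, // empty container: identity order is preferred
  Explicit  // non-empty container: a concrete order is requested
};

// Maps the three possible return shapes onto their meaning.
OrderKind classifyOrder(const std::optional<OrdersType> &Order) {
  if (!Order)
    return OrderKind::DontCare;
  if (Order->empty())
    return OrderKind::Identity;
  return OrderKind::Explicit;
}

// Expands a result into an explicit order with NumLanes entries.
// Assumption of this sketch: "don't care" is materialized as identity too.
OrdersType materializeOrder(const std::optional<OrdersType> &Order,
                            unsigned NumLanes) {
  if (!Order || Order->empty()) {
    OrdersType Identity(NumLanes);
    std::iota(Identity.begin(), Identity.end(), 0u);
    return Identity;
  }
  assert(Order->size() == NumLanes && "explicit order must cover every lane");
  return *Order;
}
```

A caller that merges orders from several nodes could skip `DontCare` results entirely while still counting `Identity` results as a vote for leaving the lanes in place.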
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1449 | Ok, will do |