This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Improve handling gathers/buildvectors with undefs.
ClosedPublic

Authored by ABataev on Feb 23 2023, 5:45 PM.

Details

Summary

If have just one non-undef scalar in the buildvector/gather node, we try
to put it to be the very first element, which is profitable in most
cases. Do the preliminary estimation, if this more profitable during
graph rotation and do same for all elements, including extractelements.

Diff Detail

Event Timeline

ABataev created this revision.Feb 23 2023, 5:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2023, 5:45 PM
ABataev requested review of this revision.Feb 23 2023, 5:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2023, 5:45 PM
vdmitrie added inline comments.Feb 23 2023, 6:48 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
1456

This needs a bit of explanation (a comment).

4195

Could you please clarify the difference between returning empty container vs std::nullopt?
The description comment for getReorderingData method does not mention this distinction.

4199–4213

Just for the sake of better readability can you rearrange the code to add few variables and break down into pieces that jumbo if condition, please?

Like for example here:

` unsigned Idx = std::distance(TE.Scalars.begin(), It);

Order[Idx] = 0;

`
...
` InstructionCost PermuteCost =

    TopToBottom
        ? 0
        : TTI->getShuffleCost(TTI::SK_PermuteSingleSrc, Ty, Mask);
InstructionCost InsertFirstCost = TTI->getVectorInstrCost(
    Instruction::InsertElement, Ty, TTI::TCK_RecipThroughput, 0,
    PoisonValue::get(Ty), *It);
InstructionCost InsertIdxCost = TTI->getVectorInstrCost(
    Instruction::InsertElement, Ty, TTI::TCK_RecipThroughput, Idx,
    PoisonValue::get(Ty), *It);
if (InsertFirstCost + PermuteCost < InsertIdxCost)
  return Order;

`

ABataev added inline comments.Feb 24 2023, 8:27 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
1456

We can easily combine poison and extractelement <non-poison> or undef and extractelement <poison>. But combining undef + extractelement <non-poison-but-my-produce-poison> requires some extra operations and it is not very effective to combine such elements (to preserve the difference between undefs and poison), rather than extractelement from the same EV1, even in reversed order.

4195

std::nullopt means that the ordering is not important for the node, empty - prefer identity order. I'll add it to the description of the function.

ABataev updated this revision to Diff 500219.Feb 24 2023, 8:28 AM

Address comments

vdmitrie accepted this revision.Feb 24 2023, 10:27 AM
vdmitrie added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
1456

Thanks for explanation. I just meant to put it into the code as a comment.

4195

Thanks. That is what I thought but wasn't absolutely sure.

This revision is now accepted and ready to land.Feb 24 2023, 10:27 AM
ABataev added inline comments.Feb 24 2023, 10:28 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
1456

Ok, will do

This revision was landed with ongoing or failed builds.Feb 24 2023, 1:21 PM
This revision was automatically updated to reflect the committed changes.