This is an archive of the discontinued LLVM Phabricator instance.

DAGCombine: Combine shuffles of splat-shuffles
ClosedPublic

Authored by zvi on Apr 11 2017, 2:40 PM.

Details

Summary

Reapply r299047, but this time handle correctly splat-masks with undef elements.

Diff Detail

Repository
rL LLVM

Event Timeline

zvi created this revision.Apr 11 2017, 2:40 PM
zvi added inline comments.Apr 11 2017, 2:43 PM
test/CodeGen/X86/shuffle-of-splat-multiuses.ll
77 ↗(On Diff #94895)

These two moves seem like a result of suboptimal scheduling + register allocation choices.

spatel added inline comments.Apr 13 2017, 7:17 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14622–14624 ↗(On Diff #94895)

Do you think we'll need to distinguish between the "strict" splat and splat-with-undef cases anywhere else? I'm wondering if it would be better to add a bool "HasAnyUndefs" param to ShuffleVectorSDNode::isSplat(). Even if we don't have another use case at the moment, that would at least make the API more like BuildVectorSDNode::isConstantSplat().

zvi added inline comments.Apr 20 2017, 6:27 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14622–14624 ↗(On Diff #94895)

Not sure about that, i can try to survey the code for such case.
I could find only one real use of "HasAnyUndefs" for the build_vector API in PPCISelLowering.cpp.

test/CodeGen/X86/shuffle-of-splat-multiuses.ll
77 ↗(On Diff #94895)

Opened Bug 32719 to discuss the extra copy issue

zvi updated this revision to Diff 95946.Apr 20 2017, 6:36 AM
spatel added inline comments.May 1 2017, 2:20 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14635–14642 ↗(On Diff #95946)

I'm scared that I'm not seeing all of the undef possibilities here (since I definitely missed them the first time!).

How about writing this to make the options more visible (not sure if it's more or less efficient), but I'd find this easier to read (warning: untested code):

// If the splat mask has no undefs, then return it as-is. 
// It doesn't matter if the user mask has undefs. We choose the splat mask element because it's efficient to return an existing splat node.
ArrayRef<int> SplatMask = Splat->getMask();
if (find(SplatMask, -1) == SplatMask.end())
  return SDValue(Splat, 0);

// The splat mask has undefs. If the user shuffle does not choose undef elements where the splat does not have undef elements, return the existing splat node.
if (all_of(UserMask, [](int Idx) { return Idx == -1 || Splat->getMaskElt(Idx) == Splat->getSplatIndex(); }))
  return SDValue(Splat, 0); 
  
// The splat mask has undefs, and the user shuffle chooses undef elements where the splat has defined elements. Create a more liberal splat mask (one with more undefs).
 for (unsigned i = 0; i < NumElts; ++i)
  // No need for "IsSameMask" here because we already handled any case where we wanted to return the existing node.
zvi added inline comments.May 8 2017, 5:13 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14635–14642 ↗(On Diff #95946)
if (all_of(UserMask, [](int Idx) { return Idx == -1 || Splat->getMaskElt(Idx) == Splat->getSplatIndex(); }))
  return SDValue(Splat, 0);

This is not correct. For example UserMask = [0,2,u,u], SplatMask=[2,u,2,u]. It would be incorrect to return 'Splat' because it would result in an undef at element index=1 which the composition of masks does not yield.
But i will try to rewrite this code to follow the spirit of your suggestion: first try to identify cases where we can return the splat-shuffle itself, if not, build a new shuffle. Thanks!

zvi updated this revision to Diff 98157.May 8 2017, 5:42 AM

Restructure in spirit of Sanjay's proposal: first try to identify cases where we can return the splat-shuffle itself, if not, build a new shuffle mask and create a new shuffle node.

spatel added inline comments.May 9 2017, 8:18 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14880 ↗(On Diff #98157)

typo: mismatxh

14886–14892 ↗(On Diff #98157)

I'm a coding caveman, so I had not seen this syntax before. I know it takes a line or 2 more, but I think it's easier to read when you give the lambda a name, so something like:

auto canSimplifyToExistingSplat = [UserMask, SplatMask]() { ... }
if (canSimplifyToExistingSplat())
  return SDValue(Splat, 0);
14887–14890 ↗(On Diff #98157)

Given that I got this wrong twice, I'd like to see some mask examples for various undef cases in the code comments. I think the tests cover these, so it should just be a matter of listing those cases here similar to the explanatory note in this review:

UserMask = [0,2,u,u], SplatMask=[2,u,2,u] --> ?

Bonus points for adding similar comments in the test file to describe the undef handling there too.

zvi added inline comments.May 9 2017, 8:48 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14880 ↗(On Diff #98157)

Thanks

14886–14892 ↗(On Diff #98157)

Ok

14887–14890 ↗(On Diff #98157)

I agree this will improve the readability of the code and point-out this tricky subtlety.

zvi updated this revision to Diff 98307.May 9 2017, 9:39 AM

Update according to Sanjay's comments. I updated the comments with some representative examples, but gave up on annotating the tests since after Isel it's hard to see from the text which elements were undef as we choose arbitrary values for undef elements.

zvi marked 3 inline comments as done.May 9 2017, 9:40 AM
zvi marked an inline comment as done.
zvi marked 2 inline comments as done.May 9 2017, 9:43 AM
spatel accepted this revision.May 9 2017, 10:25 AM

LGTM. Thanks for adding the examples in the code comments.

This revision is now accepted and ready to land.May 9 2017, 10:25 AM
This revision was automatically updated to reflect the committed changes.