This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] A shuffle of a splat is always the splat itself
ClosedPublic

Authored by zvi on Mar 28 2017, 10:19 AM.

Details

Summary

Add a simplification:
shuffle (splat-shuffle), undef, M --> splat-shuffle

Fixes pr32449

Patch by Sanjay Patel

Diff Detail

Repository
rL LLVM

Event Timeline

zvi created this revision.Mar 28 2017, 10:19 AM
spatel edited edge metadata.Mar 28 2017, 4:14 PM

This seems unnecessarily complicated. Why can't we just match any shuffle of a splat before we get here and simplify it?

Also, would you be interested in fixing this in IR? :)

$ ./opt -instsimplify shufsplat.ll -S

define <4 x i32> @foo(<4 x i32> %x) {
  %splat = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> zeroinitializer
  %shuf = shufflevector <4 x i32> %splat, <4 x i32> undef, <4 x i32> <i32 0, i32 3, i32 2, i32 1>
  ret <4 x i32> %shuf
}

This example is handled by instcombine...but in a very inefficient way.

zvi added a comment.Mar 28 2017, 9:29 PM

I initially tried handling all cases of shuffle of shuffle-splat but the improvements from D27793 regressed, so this is what i came up with.
Should i change this to an X86-specific combine to follow the spirit of D27793?

About doing this in instcombine, yes, i plan to look into that too, but the pattern optimized by this patch is born out of thin air in another patch I'm working on that improves the combine of BUILD_VECTOR into composed shuffles.

In D31426#712864, @zvi wrote:

I initially tried handling all cases of shuffle of shuffle-splat but the improvements from D27793 regressed, so this is what i came up with.
Should i change this to an X86-specific combine to follow the spirit of D27793?

I think D27793 overstepped what it was trying to protect against. In the case where N1 is undef, we probably still want to do more combining? But that doesn't solve the tests here. I haven't looked at what is happening below that check.

But the transform you want is a "simplify" not a "combine" (we're returning an existing node), so we shouldn't need to check uses, legality, or anything else. This diff solves the tests in this patch with no regressions on other tests:

Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp	(revision 298954)
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp	(working copy)
@@ -14646,6 +14646,12 @@
       return DAG.getVectorShuffle(VT, SDLoc(N), N0, N1, NewMask);
   }
 
+  // A shuffle of a splat is always the splat itself:
+  // shuffle (splat-shuffle), undef, M --> splat-shuffle
+  if (auto *N0Shuf = dyn_cast<ShuffleVectorSDNode>(N0))
+    if (N1.isUndef() && N0Shuf->isSplat())
+      return N0;
+
   // If it is a splat, check if the argument vector is another splat or a
   // build_vector.
   if (SVN->isSplat() && SVN->getSplatIndex() < (int)NumElts) {
zvi added a comment.Mar 29 2017, 11:30 AM

Thanks, Sanjay, for suggesting this superior patch. I will use it for the next revision.

zvi updated this revision to Diff 93393.Mar 29 2017, 11:35 AM
zvi retitled this revision from [DAGCombine] Combine shuffle of splat with multiple uses to [DAGCombine] A shuffle of a splat is always the splat itself.
zvi edited the summary of this revision. (Show Details)
spatel accepted this revision.Mar 29 2017, 1:13 PM

LGTM of course. :)
If anyone who knows the shuffle combiner better wants to confirm that I haven't misdiagnosed the problem, that would be good too.

This revision is now accepted and ready to land.Mar 29 2017, 1:13 PM
This revision was automatically updated to reflect the committed changes.