This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] shrink truncated splat shuffle
ClosedPublic

Authored by spatel on Feb 17 2017, 3:50 PM.

Details

Summary

This could be one part of solving a recent bug report:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

This keeps with our general approach: changing arbitrary shuffles is off-limts, but changing splat is ok. The transform is very similar to the existing shrinkBitwiseLogic() canonicalization.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Feb 17 2017, 3:50 PM
efriedma edited edge metadata.Feb 17 2017, 4:22 PM

"the mask doesn't change"

That's technically true... but doesn't really match the spirit of the rule that we don't change shuffle masks. Many targets have shuffles that only work at specific widths.

"the mask doesn't change"

That's technically true... but doesn't really match the spirit of the rule that we don't change shuffle masks. Many targets have shuffles that only work at specific widths.

I knew I might be pushing the limits here by generalizing too much. :)
In the motivating example, we have a splat mask, so we could still do a limited form of this transform and catch that case. Can we assume that any splat shuffle is safe to transform in target-independent IR?

instcombine already assumes that targets can handle a splat of any type; that's fine.

spatel updated this revision to Diff 89025.Feb 18 2017, 8:06 AM
spatel retitled this revision from [InstCombine] shrink truncated shuffle with constant operand to [InstCombine] shrink truncated splat shuffle.
spatel edited the summary of this revision. (Show Details)

Patch upated:

  1. Limit the transform to splat shuffles. In this case, we canonicalize the vector to operand 0 (and operand 1 should be undef), so that eliminates the commuted pattern match.
  2. Added test for splat with strange types for more coverage of the getSplatValue() implementation.
  3. Added test with undef element in mask. Unlike the DAG's ShuffleVectorSDNode::isSplatMask(), Constant::getSplatValue() does not check for undef elements, so we miss this case currently. I'm not sure if this matters in practice currently because we're conservative about changing shuffle masks.
zvi added a subscriber: zvi.Feb 21 2017, 6:45 AM
majnemer edited edge metadata.

@efriedma, @mkuper I think you two are well positioned to review this change. Please take a look when you get a chance. Thanks!

efriedma accepted this revision.Mar 6 2017, 2:30 PM

LGTM, but you might want to look into the following...

define <8 x i16> @wide_splat1(<8 x i32> %x) {
  %shuf = trunc <8 x i32> %x to <8 x i16>
  %trunc = shufflevector <8 x i16> %shuf, <8 x i16> undef, <8 x i32> <i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
  ret <8 x i16> %trunc
}

define <8 x i16> @wide_splat2(<8 x i32> %x) {
  %trunc = shufflevector <8 x i32> %x, <8 x i32> undef, <8 x i32> <i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
  %shuf = trunc <8 x i32> %trunc to <8 x i16>
  ret <8 x i16> %shuf
}

llc output (x86-64, no SSSE3):

        .text
        .file   "<stdin>"
        .globl  wide_splat1
        .p2align        4, 0x90
        .type   wide_splat1,@function
wide_splat1:                            # @wide_splat1
        .cfi_startproc
# BB#0:
        pslld   $16, %xmm1
        psrad   $16, %xmm1
        pslld   $16, %xmm0
        psrad   $16, %xmm0
        packssdw        %xmm1, %xmm0
        pshuflw $170, %xmm0, %xmm0      # xmm0 = xmm0[2,2,2,2,4,5,6,7]
        pshufd  $80, %xmm0, %xmm0       # xmm0 = xmm0[0,0,1,1]
        retq
.Lfunc_end0:
        .size   wide_splat1, .Lfunc_end0-wide_splat1
        .cfi_endproc

        .globl  wide_splat2
        .p2align        4, 0x90
        .type   wide_splat2,@function
wide_splat2:                            # @wide_splat2
        .cfi_startproc
# BB#0:
        pshufd  $170, %xmm0, %xmm0      # xmm0 = xmm0[2,2,2,2]
        pslld   $16, %xmm0
        psrad   $16, %xmm0
        packssdw        %xmm0, %xmm0
        retq
.Lfunc_end1:
        .size   wide_splat2, .Lfunc_end1-wide_splat2
        .cfi_endproc
This revision is now accepted and ready to land.Mar 6 2017, 2:30 PM
spatel added a comment.Mar 6 2017, 4:22 PM

LGTM, but you might want to look into the following...

define <8 x i16> @wide_splat1(<8 x i32> %x) {
  %shuf = trunc <8 x i32> %x to <8 x i16>
  %trunc = shufflevector <8 x i16> %shuf, <8 x i16> undef, <8 x i32> <i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
  ret <8 x i16> %trunc
}

Thanks for pointing it out. Filed as:
https://bugs.llvm.org/show_bug.cgi?id=32160

This revision was automatically updated to reflect the committed changes.