[DAGCombiner] Extending pattern detection for vector shuffle.
ClosedPublic

Authored by jbhateja on Jul 24 2017, 1:44 AM.

Details

Summary

If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.

This will also fix PR 33784

jbhateja created this revision.Jul 24 2017, 1:44 AM
jbhateja added a subscriber: llvm-commits.
jbhateja retitled this revision from [DAGCombiner] If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. to [DAGCombiner] Extending pattern detection for vector shuffle..Jul 24 2017, 1:50 AM
jbhateja edited the summary of this revision. (Show Details)
zvi added inline comments.Jul 24 2017, 3:54 AM
test/CodeGen/X86/pr33784-vector-shuffle.ll
1 ↗(On Diff #107862)

Can you please commit the test and re-apply the patch so that we can see the change in the generated code?

RKSimon added inline comments.Jul 24 2017, 4:21 AM
test/CodeGen/X86/pr33784-vector-shuffle.ll
1 ↗(On Diff #107862)

Please can you add this test to shuffle-vs-trunc-512.ll instead of creating a new test file?

zvi added a comment.Mon, Jul 24, 1:11 PM

See D35700 where pr33784 is handled a bit differently.

test/CodeGen/X86/pr33784-vector-shuffle.ll
1 ↗(On Diff #107862)

This case already exists in test/CodeGen/X86/shuffle-strided-with-offset-512.ll?
Can you please verify you are rebased on top-of-trunk?

Hi Zvi , I'm in process of uploading patch post rebase which should show the changes in asm with the diff. Thanks

jbhateja updated this revision to Diff 108125.Tue, Jul 25, 11:28 AM
jbhateja edited the summary of this revision. (Show Details)

Patch updation post rebase.

jbhateja marked 3 inline comments as done.Tue, Jul 25, 11:31 AM

Hi Simon, Zvi since D35700 also taking care of shuffle_vectors extraction, and this patch is also extending the pattern should it be ok to merge this patch with it.

zvi added a comment.EditedWed, Jul 26, 3:49 AM

IIUC, this patch affect reduceBuildVecToShuffle(). D35700 affect the reduceBuildVecToTruncate() which has precedence to fire before reduceBuildVecToShuffle(), so would it make sense that you rebase this patch on top of D35700, assuming D35700 will be approved.

In D35788#821310, @zvi wrote:

IIUC, this patch affect reduceBuildVecToShuffle(). D35700 affect the reduceBuildVecToTruncate() which has precedence to fire before reduceBuildVecToShuffle(), so would it make sense that you rebase this patch on top of D35700, assuming D35700 will be approved.

+1

jbhateja updated this revision to Diff 108409.Wed, Jul 26, 8:06 PM

Updating post rebase.

Splitting of a vector argument of shuffle vector based on extracted shuffle_mask from the build_vector
is pattern independent w.r.t shuffle mask and looks at the maximal index of vector access.

zvi added inline comments.Thu, Jul 27, 3:19 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14377

Please fix indentation to units of two spaces. The clang-format tool can help.

RKSimon added inline comments.Thu, Jul 27, 3:58 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14322

Very minor - but please keep these newlines - it makes everything easier to understand!

jbhateja updated this revision to Diff 108580.Thu, Jul 27, 9:52 PM

Resolving formatting comments.

A few minors but it looks like its almost there

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14197

SDValue has a bool operator for this - you should be able to just test with:

if (!(VecIn1 && VecIn2 &&
14376

Add a TODO to support UNDEF and zero entries?

14391

Don't calculate NearestPow2 inside the if, pull it out and just test for non-zero NearestPow2 in the if:

NearestPow2 = PowerOf2Ceil(MaxIndex);
if (NearestPow2 && ((NumElems * 2) < NearestPow2)) {
14393

Isn't this just SplitSize > 1 ?

jbhateja updated this revision to Diff 108795.Sat, Jul 29, 9:41 AM

Review comment resolution.

@Reviewers : All the comments have been resolved. Thanks

Ping @ reviewers.

RKSimon added inline comments.Tue, Aug 1, 11:46 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14197

Is there any case where VecIn1 can be null? InVT1 above assumes not

14390

You can use std::max here

jbhateja updated this revision to Diff 109266.Tue, Aug 1, 8:27 PM

Review comments resolution.

RKSimon accepted this revision.Thu, Aug 3, 5:31 AM

A few final minor comments, but otherwise LGTM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
14199

Add an explanation comment. It might be more obvious if you 'de-morgan'-ising the if() as well (but that's up to you):

if (!VecIn2 || VecIn1.getOpcode() != ISD::EXTRACT_SUBVECTOR ||
    VecIn2.getOpcode() !== ISD::EXTRACT_SUBVECTOR ||
    VecIn1.getOperand(0) != VecIn2.getOperand(0))
14396

You can probably do the test as NearestPow2 > 2, and merge it with the outer if and pull SplitSize inside:

if (NearestPow2 > 2 && ((NumElems * 2) < NearestPow2)) {
  unsigned SplitSize = NearestPow2 / 2;
  EVT SplitVT = ......
This revision is now accepted and ready to land.Thu, Aug 3, 5:31 AM
jbhateja updated this revision to Diff 109680.Thu, Aug 3, 8:41 PM
  • Updations to tests post commit rL309963.
  • Synthetic changes to cover review comments.
  • Merge branch 'master' of git://github.com/llvm-mirror/llvm
RKSimon accepted this revision.Fri, Aug 4, 3:30 AM

LGTM (again)

Please land this in trunk if its ok. I do not have commit rights yet.
Thanks

This revision was automatically updated to reflect the committed changes.
RKSimon reopened this revision.Sat, Aug 5, 7:57 AM

Reopening as it was reverted at rL310118 due to PR34073

This revision is now accepted and ready to land.Sat, Aug 5, 7:57 AM
RKSimon requested changes to this revision.Sat, Aug 5, 7:58 AM

PR34073

This revision now requires changes to proceed.Sat, Aug 5, 7:58 AM
jbhateja updated this revision to Diff 110164.Tue, Aug 8, 3:21 AM
  • Merge branch 'master' of git://github.com/llvm-mirror/llvm
  • D35788 : Limiting this patch to simple value types.

Is there a suitable repro test from PR34073 that you can add?

Is there a suitable repro test from PR34073 that you can add?

PR34073 is a huge .cpp test which is running fine now.
There already exist a test case "test/CodeGen/X86/oddshuffles.ll" where shuffle
vector with non-simple vector type was getting split earlier is now not splitting the vector.

Hi Simon, Can you please update this patch in trunk. Thanks

@ Simon, patch has been updated and good to land, please let me know if anything more is needed here.
Thanks

Hi Simon, patch has been updated and good to land, please let me know if anything more is needed here.
Thanks

This comment was removed by jbhateja.
RKSimon accepted this revision.Sat, Aug 12, 6:08 AM

LGTM

@jbhateja Sorry for the delay - I highly recommend that you request commit rights yourself

This revision is now accepted and ready to land.Sat, Aug 12, 6:08 AM

Thanks Simon

This revision was automatically updated to reflect the committed changes.
eladcohen reopened this revision.Mon, Aug 14, 2:14 AM
eladcohen added a subscriber: eladcohen.

Reopening as it was reverted at rL310822 due to PR34175.

This revision is now accepted and ready to land.Mon, Aug 14, 2:14 AM
jbhateja updated this revision to Diff 111009.Mon, Aug 14, 9:35 AM
  • Handling for undef in vector shuffle mask.
  • Fix for PR34175 caused due to l310782

Folks, please consider using utils/shuffle_fuzz.py or some other fuzz testing device. The number of times this lands and gets reverted is becoming fairly disruptive.

jbhateja updated this revision to Diff 111709.Fri, Aug 18, 11:13 AM
  • Rebasing patch to latest trunk.
  • Merge branch 'master' of git://github.com/llvm-mirror/llvm
jbhateja updated this revision to Diff 111830.Sat, Aug 19, 8:55 AM
  • Updating test case ref post rebase
jbhateja closed this revision.Sat, Aug 19, 11:27 AM

Closed by commit rL311255: [DAGCombiner] Extending pattern detection for vector shuffle.