This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fix for PR29124: reduce insertelements to shufflevector
ClosedPublic

Authored by ABataev on Sep 2 2016, 3:08 AM.

Details

Summary

If inserting more than one constant into a vector:

define <4 x float> @foo(<4 x float> %x) {
  %ins1 = insertelement <4 x float> %x, float 1.0, i32 1
  %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
  ret <4 x float> %ins2
}

InstCombine could reduce that to a shufflevector:

define <4 x float> @goo(<4 x float> %x) {
 %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3>
 ret <4 x float> %shuf
}

Diff Detail

Event Timeline

ABataev updated this revision to Diff 70142.Sep 2 2016, 3:08 AM
ABataev retitled this revision from to [InstCombine] Fix for PR29124: reduce insertelements to shufflevector.
ABataev updated this object.
ABataev added reviewers: spatel, simon.f.whittaker.
ABataev added a subscriber: avt77.

Adding reviewers/subscribers from D23886.

If D23886 goes in, I think this patch can be simplified to just work on a pair of insertelements with an undef vector operand for the first insert. Ie, this transform provides the 'seed' shufflevector, and the transform in D23886 pulls subsequent inserts into that shuffle.

test/Transforms/InstCombine/insert-const-shuf.ll
17–22 ↗(On Diff #70142)

This will be affected by D23886. Ideally, this should be only one shuffle, so this means we are missing a fold that combines shufflevectors.

spatel added a comment.Sep 2 2016, 3:00 PM

If D23886 goes in, I think this patch can be simplified to just work on a pair of insertelements with an undef vector operand for the first insert. Ie, this transform provides the 'seed' shufflevector, and the transform in D23886 pulls subsequent inserts into that shuffle.

Hi Alexey -

Does your implementation catch cases that the above idea would not? Note that D23866 was committed at rL280504, and I updated the checks in vec_demanded_elts.ll at rL280531 because it was hard to tell what was changing. So you will need to merge this patch after those changes.

Also, I filed https://llvm.org/bugs/show_bug.cgi?id=30264 for the missing shuffle/shuffle fold (although it may not be important if we have this combine + D23866).

Finally, note Hal's comment ( https://llvm.org/bugs/show_bug.cgi?id=30264#c1 ) about not creating risky shuffles; this is something that Eli mentioned in D23866 as well. I think the case of combining two insertelt constants would be safe for any target (AFAICT, the legalization fallback is to just create inserts again), but someone can tell me if I'm wrong about that. :)

RKSimon added inline comments.Sep 4 2016, 7:11 AM
lib/Transforms/InstCombine/InstCombineVectorOps.cpp
670

Is this actually firing? Might it be better to just bail out if this happens?

test/Transforms/InstCombine/insert-const-shuf.ll
13 ↗(On Diff #70142)

Shouldn't D23886 have dealt with this?

test/Transforms/InstCombine/vec_demanded_elts.ll
1

Possibly regenerate this test first so that the diff is more obvious?

ABataev updated this revision to Diff 70310.Sep 5 2016, 2:40 AM
ABataev edited edge metadata.

Updated to the latest version + small fixes

ABataev updated this revision to Diff 70359.Sep 6 2016, 12:54 AM

Full diff

ABataev updated this revision to Diff 70670.Sep 8 2016, 2:59 AM

Fixed regression in the optimization

ABataev updated this revision to Diff 71902.Sep 20 2016, 12:37 AM

Updated to latest version, renamed UndefElts3/4 to (L|R)HSUndefElts, joined with D24381 to avoid regressions

ABataev updated this revision to Diff 71904.Sep 20 2016, 12:47 AM

Added a test

ABataev updated this object.Sep 22 2016, 6:25 AM
ABataev updated this object.
ABataev updated this revision to Diff 72163.Sep 22 2016, 6:51 AM

Added tests with constants + small improvements

RKSimon accepted this revision.Sep 22 2016, 7:27 AM
RKSimon edited edge metadata.

LGTM with 2 minor corrections.

lib/Transforms/InstCombine/InstCombineVectorOps.cpp
658

Please add a comment describing what this loop is doing.

667

Please add a comment describing what this loop is doing.

This revision is now accepted and ready to land.Sep 22 2016, 7:27 AM
This revision was automatically updated to reflect the committed changes.