This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
ClosedPublic

Authored by RKSimon on Jan 16 2017, 10:00 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 84576.Jan 16 2017, 10:00 AM
RKSimon retitled this revision from to [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions.
RKSimon updated this object.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
mkuper added inline comments.Jan 17 2017, 7:14 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
1831 ↗(On Diff #84576)

This isn't a question about this patch (and maybe I'm not the person who should be reviewing it :-) ), but I'm now confused about how this works at all.

What's the benefit of calling SimplifyDemandedVectorElts() on an instruction when you're demanding all elements, the instruction you're calling this on will also demand all the elements of its operands, and throwing UndefElts away? Is this just so that things have a chance to happen in the middle of the tree?

lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1475 ↗(On Diff #84576)

Add a TODO for the AVX512 versions?

1495 ↗(On Diff #84576)

Can you make this more verbose? It's a bit opaque unless you already know what the layout looks like.

1497 ↗(On Diff #84576)

i -> Lane, j -> Elt (or something similar)?

1527 ↗(On Diff #84576)

Maybe make UndefElts an array of 2 APInts? This will make this code slightly prettier.

craig.topper added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
1831 ↗(On Diff #84576)

Yeah this probably has no value for packs. Its useful for things like scalar intrinsics because they demand less elements from their inputs no matter what future users demand from the intrinsic.

I think this means PSHUFB doesn't need code in InstCombineCalls either.

I've removed the demanded elts handling of pshufb/vpermil from InstCombiner::visitCallInst

lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1527 ↗(On Diff #84576)

UndefElts is a pass by reference argument that is used by the caller so I can't do much to change it. I agree that the code is messy but it is probably as good as it can be.

RKSimon updated this revision to Diff 84834.Jan 18 2017, 7:59 AM

Removed unnecessary handling in InstCombiner::visitCallInst, cleaned up code as requested.

mkuper added inline comments.Jan 19 2017, 11:27 AM
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1527 ↗(On Diff #84576)

Sorry, I meant "make UndefElts2 and UndefElts3 a 2-element array".
I think having this code with a nested loop would be nicer than duplicating each line twice.

RKSimon updated this revision to Diff 85040.Jan 19 2017, 2:38 PM

Removed duplication of each operand by putting the demanded elts code in a loop.

mkuper accepted this revision.Jan 19 2017, 2:56 PM

One more minor naming nit, other than that, LGTM.

lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1496 ↗(On Diff #85040)

i -> OpNum?

This revision is now accepted and ready to land.Jan 19 2017, 2:56 PM
This revision was automatically updated to reflect the committed changes.