This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes
ClosedPublic

Authored by RKSimon on Jun 21 2015, 7:31 AM.

Details

Summary

The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.

As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 28089.Jun 21 2015, 7:31 AM
RKSimon retitled this revision from to [X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes.
RKSimon updated this object.
RKSimon edited the test plan for this revision. (Show Details)
RKSimon added reviewers: chandlerc, spatel.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: Unknown Object (MLST).
spatel added inline comments.Jul 6 2015, 8:46 AM
lib/Target/X86/X86ISelLowering.cpp
4455 ↗(On Diff #28089)

-ve ?

4456 ↗(On Diff #28089)

Should this check be:

M == SM_SentinelZero

Thanks Sanjay.

lib/Target/X86/X86ISelLowering.cpp
4455 ↗(On Diff #28089)

I'll replace it with 'negative'

4456 ↗(On Diff #28089)

At the moment none of the functions that call getTargetShuffleMask support SM_SentinelZero, they treat any negative shuffle index value as UNDEF (hence the FIXME comment I added above) so we can't support this aspect of VPERM2X128 yet.

spatel added inline comments.Jul 6 2015, 9:44 AM
lib/Target/X86/X86ISelLowering.cpp
4456 ↗(On Diff #28089)

But in this function, we know that the mask values can only take on valid or SM_SentinelZero as values, right? So we can make this more specific:

if (std::any_of(Mask.begin(), Mask.end(), [](int M){ return M == SM_SentinelZero; }))
  return false;

It won't change the result for the callers (they still just see the 'false' return value), but it makes the code explicitly match the preceding comment.

RKSimon updated this revision to Diff 29128.Jul 6 2015, 2:50 PM

Updated

spatel accepted this revision.Jul 6 2015, 2:57 PM
spatel edited edge metadata.

LGTM...but one more nit: the std::any_of line has to be split for 80-col? :(

This revision is now accepted and ready to land.Jul 6 2015, 2:57 PM
This revision was automatically updated to reflect the committed changes.

Thanks Sanjay - I've submitted after running clang-format over the patch