This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Lower shuffles as repeated lane shuffles then lane-crossing shuffles
ClosedPublic

Authored by RKSimon on Jan 25 2016, 8:41 AM.

Details

Summary

This patch attempts to represent a shuffle as a repeating shuffle (recognisable by is128BitLaneRepeatedShuffleMask) with the source input(s) in their original lanes, followed by a single permutation of the 128-bit lanes to their final destinations.

On AVX2 we can additionally attempt to match using 64-bit sub-lane permutation. AVX2 can also now match a similar 'broadcasted' repeating shuffle.
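As an illustration of the decomposition described above, here is a minimal standalone sketch; it is not the code from this patch. It assumes a unary shuffle (one input) with -1 marking an undefined mask element, and the names LaneDecomposition and decomposeUnaryMask are hypothetical. It checks whether a mask can be split into one in-lane mask repeated across every lane plus a single permutation of whole lanes.

```cpp
#include <optional>
#include <vector>

// Hypothetical result type: a repeated in-lane mask plus a lane permutation.
struct LaneDecomposition {
  std::vector<int> RepeatMask; // in-lane source offset per position, -1 = undef
  std::vector<int> LanePerm;   // source lane per destination lane, -1 = undef
};

// Tries to express Mask as: result[i] = input[LanePerm[i / LaneSize] * LaneSize
//                                             + RepeatMask[i % LaneSize]].
// LaneSize is the number of elements per 128-bit lane (e.g. 4 for v8f32).
std::optional<LaneDecomposition>
decomposeUnaryMask(const std::vector<int> &Mask, int LaneSize) {
  int NumElts = static_cast<int>(Mask.size());
  if (LaneSize <= 0 || NumElts % LaneSize != 0)
    return std::nullopt;
  int NumLanes = NumElts / LaneSize;

  LaneDecomposition D{std::vector<int>(LaneSize, -1),
                      std::vector<int>(NumLanes, -1)};
  for (int i = 0; i < NumElts; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue; // undef element places no constraint
    if (M >= NumElts)
      return std::nullopt; // this sketch handles unary shuffles only

    int DstLane = i / LaneSize, Pos = i % LaneSize;
    int SrcLane = M / LaneSize, Off = M % LaneSize;

    // Every element of a destination lane must come from the same source lane.
    if (D.LanePerm[DstLane] >= 0 && D.LanePerm[DstLane] != SrcLane)
      return std::nullopt;
    D.LanePerm[DstLane] = SrcLane;

    // The in-lane offset at each position must agree across all lanes.
    if (D.RepeatMask[Pos] >= 0 && D.RepeatMask[Pos] != Off)
      return std::nullopt;
    D.RepeatMask[Pos] = Off;
  }
  return D;
}
```

For example, with LaneSize 4 the 8-element mask {6,5,4,7,2,1,0,3} decomposes into RepeatMask {2,1,0,3} and LanePerm {1,0}, whereas {0,5,2,3,4,1,6,7} is rejected because its first destination lane draws from two different source lanes.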

This patch has several benefits:

  • Avoids prematurely matching with lowerVectorShuffleByMerging128BitLanes which can require both inputs to have their input lanes permuted before shuffling.
  • Can replace PERMPS/PERMD instructions - although these are useful for cross-lane unary shuffling, they require their shuffle mask to be pre-loaded (and increase register pressure).
  • Matching the repeating shuffle makes use of a lot of existing shuffle lowering.

There is an outstanding minor AVX1 regression (combine_unneeded_subvector1 in vector-shuffle-combining.ll): what was previously a 128-bit shuffle + subvector splat is now converted to a subvector splat + (2 instruction) 256-bit shuffle. I intend to fix this in a follow-up patch for review.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 45870. Jan 25 2016, 8:41 AM
RKSimon retitled this revision from to [X86][AVX] Lower shuffles as repeated lane shuffles then lane-crossing shuffles.
RKSimon updated this object.
RKSimon added reviewers: qcolombet, delena, andreadb.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
spatel added a subscriber: spatel. Jan 25 2016, 8:58 AM
ab added a subscriber: ab. Feb 11 2016, 12:35 PM
ab added inline comments.
lib/Target/X86/X86ISelLowering.cpp
10645 (On Diff #45870)

Lower -> lower?

10732 (On Diff #45870)

I found this non-obvious; it might help to split it in two?

IIUC the first part picks the lane and the second part picks the sublane, right?

RKSimon updated this revision to Diff 47734. Feb 11 2016, 3:05 PM

Updated based on Ahmed's comments.

ab accepted this revision. Feb 12 2016, 10:25 AM
ab added a reviewer: ab.

LGTM

This revision is now accepted and ready to land. Feb 12 2016, 10:25 AM
This revision was automatically updated to reflect the committed changes.