This is an archive of the discontinued LLVM Phabricator instance.

[X86] Determine if target shuffle contains zero elements
ClosedPublic

Authored by RKSimon on Dec 9 2015, 6:43 AM.

Details

Summary

getTargetShuffleMask may return shuffle masks containing SM_SentinelZero (-2) values (currently only for PSHUFB, and with this patch for VPERM2X128 as well). Although some calling functions can make use of these (mainly for shuffle combining), others cannot, and their presence makes shuffle mask comparisons more difficult.

This patch adds a flag to getTargetShuffleMask that indicates when the calling function can't handle SM_SentinelZero; in that case getTargetShuffleMask returns false if such a value occurs, which makes handling much easier.
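As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It is not the code from the patch: the helper name getMaskSketch, the parameter name AllowSentinelZero, and the use of std::vector are assumptions made for illustration only. The only detail taken from the summary is that SM_SentinelZero has the value -2.

```cpp
#include <vector>

// Value taken from the summary above: SM_SentinelZero is -2.
constexpr int SM_SentinelZero = -2;

// Hypothetical stand-in for the flag behaviour (NOT the real
// getTargetShuffleMask signature). RawMask plays the role of an
// already-decoded target shuffle mask.
// Returns false if the mask contains a zero sentinel and the caller
// has said it cannot handle one; otherwise copies the mask to Mask.
bool getMaskSketch(const std::vector<int> &RawMask, bool AllowSentinelZero,
                   std::vector<int> &Mask) {
  Mask.clear();
  for (int M : RawMask) {
    if (M == SM_SentinelZero && !AllowSentinelZero) {
      Mask.clear(); // Leave no partial mask behind on failure.
      return false;
    }
    Mask.push_back(M);
  }
  return true;
}
```

With a helper shaped like this, a caller that only compares masks could pass false and treat a false return as "not a shuffle I can reason about", rather than checking every element for the sentinel itself.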

I've tidied up some uses of getTargetShuffleMask to better indicate what is going on - more could be done but at present I don't have test cases to demonstrate it.

Upcoming patches will use this both to support more callers where SM_SentinelZero is not permitted (e.g. combineShuffleToAddSub) and to add INSERTPS support to getTargetShuffleMask, as part of the better zero handling discussed in D14261.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 42294. Dec 9 2015, 6:43 AM
RKSimon retitled this revision to [X86] Determine if target shuffle contains zero elements.
RKSimon updated this object.
RKSimon added reviewers: spatel, andreadb, chandlerc.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
spatel edited edge metadata. Jan 5 2016, 11:59 AM

Is there any way to expose this change in a regression test?

lib/Target/X86/X86ISelLowering.cpp
4932–4934 (On Diff #42294)

Is sinking the empty mask checks independent / NFC?
If yes, please check that in first.

RKSimon updated this revision to Diff 44106. Jan 6 2016, 4:12 AM
RKSimon edited edge metadata.

Updated now that the mask.empty() diffs have been committed in rL256921.

For tests, the PerformShuffleCombine tests continue to work as-is (they already supported SentinelZero as they deal with PSHUFB combines) - the other candidate is XFormVExtractWithShuffleIntoLoad and that will be able to be tested once I've added INSERTPS as a target shuffle in a future commit and enabled zero support.

Overall, this patch is more about preparing the ground than changing much functionality.

RKSimon marked an inline comment as done. Jan 6 2016, 4:13 AM
spatel accepted this revision. Jan 6 2016, 8:40 AM
spatel edited edge metadata.

LGTM.

This revision is now accepted and ready to land. Jan 6 2016, 8:40 AM
This revision was automatically updated to reflect the committed changes.