Page MenuHomePhabricator

[DAG] Canonicalize non-inlane shuffle -> AND if all non-inlane referenced elements are known zero
ClosedPublic

Authored by RKSimon on Jul 5 2022, 9:50 AM.

Details

Summary

As mentioned on D127115, this patch that attempts to recognise shuffle masks that could be simplified to a AND mask - we already have a similar transform that will fold AND -> 'clear mask' shuffle, but this patch handles cases where the referenced elements are not from the same lane indices but are known to be zero.

Diff Detail

Event Timeline

RKSimon created this revision.Jul 5 2022, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2022, 9:50 AM
RKSimon requested review of this revision.Jul 5 2022, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2022, 9:50 AM
RKSimon updated this revision to Diff 442492.Jul 6 2022, 3:13 AM

Fold directly to a clear mask shuffle if isVectorClearMaskLegal is true

RKSimon updated this revision to Diff 442628.Jul 6 2022, 10:20 AM
RKSimon edited the summary of this revision. (Show Details)

rebase and add type legality checks

deadalnix added inline comments.Jul 8 2022, 5:43 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22618–22619

Not very important, but this doesn't match the LLVM style.

deadalnix added inline comments.Jul 8 2022, 7:38 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22625

signed/unsigned comparison.

22649

dito

RKSimon planned changes to this revision.Jul 8 2022, 7:46 AM

waiting for dependencies before I revisit this

RKSimon updated this revision to Diff 443259.Jul 8 2022, 8:32 AM

rebase (still WIP)

RKSimon planned changes to this revision.Jul 8 2022, 8:32 AM
deadalnix added inline comments.Jul 11 2022, 5:30 PM
llvm/test/CodeGen/X86/vselect-constants.ll
304 ↗(On Diff #443649)

Do you know what's up with this guy? This seems objectively worse.

RKSimon added inline comments.Jul 12 2022, 2:45 AM
llvm/test/CodeGen/X86/vselect-constants.ll
304 ↗(On Diff #443649)

We currently reuse the <1,0,0,0> vector constant, purely by chance not design. Anything that attempts to exploit the zero bits tends to break that lucky pattern.

Whats the real annoyance is that the <1,0,0,0> was originally <i1 true, i1 false>, but we zero-extended it to <i64 1, i64 0> during promotion instead of sign-extending it which made it a lot harder to fold with the 'all sign bits' elements from the compare - with a little luck this would have folded away entirely as part of shuffle combining :-(

RKSimon retitled this revision from [DAG] Canonicalize non-inlane shuffle -> AND if all non-inlane referenced elements are known zero (WIP) to [DAG] Canonicalize non-inlane shuffle -> AND if all non-inlane referenced elements are known zero.Jul 13 2022, 9:51 AM
RKSimon edited the summary of this revision. (Show Details)
dmgreen added inline comments.Jul 13 2022, 12:23 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22679–22680

Why is a zero mask better than an undef mask for undef shuffle mask elements?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11814

Is this saying that MOV #0 + LegalShuffle is always better than create mask + and? I think that sounds OK, so long as it doesn't destroy any BIC patterns.

RKSimon added inline comments.Jul 14 2022, 3:22 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22679–22680

This was from before I added the isVectorClearMaskLegal handling, I'll see if I can relax it again, but XformToShuffleWithZero always forces undef mask elements to zero as "X & undef --> 0 (not undef)" and IIRC I was trying to keep the behaviours as similar as possible.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11814

There's nothing enforcing it, but M should always be a 'select/blend' style mask (+ undefs) - afaict it will only ever match in isShuffleMaskLegal against 2-element zip style patterns? I think those were the regressions I saw.

RKSimon updated this revision to Diff 444582.Jul 14 2022, 3:46 AM

use undef in and-mask elements if the shuffle mask was undef

ping - any more comments?

dmgreen accepted this revision.Jul 16 2022, 3:17 AM

LGTM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11814

Yeah I think this sounds OK to enable, from what I can see. The testcase it changes is a little odd as it has quite a few undefs, but in general I've not seen any issues from it from experimenting.

This revision is now accepted and ready to land.Jul 16 2022, 3:17 AM
This revision was landed with ongoing or failed builds.Jul 16 2022, 3:38 AM
This revision was automatically updated to reflect the committed changes.