As mentioned on D127115, this patch attempts to recognise shuffle masks that could be simplified to an AND mask. We already have a similar transform that folds AND -> 'clear mask' shuffle, but this patch also handles cases where the referenced elements are not from the same lane indices but are known to be zero.
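To make the idea concrete, here is a minimal standalone sketch of the check, not the code from the patch. The function name `shuffleToAndMask`, the `KnownZero` vector and the use of 64-bit elements are assumptions made for illustration. The mask follows the usual shuffle convention: -1 is undef, 0..N-1 selects from the first operand, N..2N-1 from the second.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (illustration only, not the LLVM implementation).
// Returns the per-lane AND constant such that shuffle(A, B, Mask) == A & C,
// or std::nullopt if the shuffle cannot be expressed that way.
// KnownZero[J] is true when element J of the concatenation (A then B) is
// known to be zero.
std::optional<std::vector<uint64_t>>
shuffleToAndMask(const std::vector<int> &Mask,
                 const std::vector<bool> &KnownZero) {
  assert(KnownZero.size() == 2 * Mask.size() && "one flag per source element");
  const int N = static_cast<int>(Mask.size());
  std::vector<uint64_t> AndMask(N);
  for (int I = 0; I < N; ++I) {
    const int M = Mask[I];
    if (M >= 2 * N)
      return std::nullopt; // out-of-range index, reject
    if (M == I) {
      AndMask[I] = ~0ULL; // lane keeps A[I]
    } else if (M < 0 || KnownZero[M]) {
      // Undef lanes get a zero AND element here (one valid choice); lanes
      // that read a known-zero element from any index can also be cleared.
      AndMask[I] = 0;
    } else {
      return std::nullopt; // reads a possibly non-zero element from another lane
    }
  }
  return AndMask;
}
```

For example, with N = 4 the mask <0,5,2,7> becomes an AND with <-1,0,-1,0> as long as elements 1 and 3 of the second operand are known to be zero, even though the second operand need not be an all-zero vector.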
Inline comment at line 304 (on Diff #443649):
We currently reuse the <1,0,0,0> vector constant purely by chance, not by design. Anything that attempts to exploit the zero bits tends to break that lucky pattern.
What's really annoying is that the <1,0,0,0> was originally <i1 true, i1 false>, but during promotion we zero-extended it to <i64 1, i64 0> instead of sign-extending it, which made it a lot harder to fold with the 'all sign bits' elements from the compare. With a little luck this would have folded away entirely as part of shuffle combining :-(
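A small scalar illustration of why the extension choice matters, assuming 64-bit lanes. The helper names `zextI1` and `sextI1` are made up for this sketch. A compare result lane is either all-ones or zero, so ANDing it with a sign-extended `true` is a no-op that can fold away, while ANDing with a zero-extended `true` keeps only bit 0.

```cpp
#include <cstdint>

// Hypothetical helpers modelling promotion of an i1 to an i64 lane.
constexpr uint64_t zextI1(bool B) { return B ? 1ULL : 0ULL; }  // true -> 1
constexpr uint64_t sextI1(bool B) { return B ? ~0ULL : 0ULL; } // true -> all-ones

// A vector compare produces 'all sign bits' lanes: all-ones or zero.
constexpr uint64_t compareLane(bool Result) { return Result ? ~0ULL : 0ULL; }

// With sign extension the AND leaves the compare lane unchanged...
static_assert((compareLane(true) & sextI1(true)) == compareLane(true),
              "sext(true) is an identity mask for an all-ones lane");
// ...with zero extension only the low bit survives, so the AND is still needed.
static_assert((compareLane(true) & zextI1(true)) == 1ULL,
              "zext(true) keeps only bit 0");
```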
Why is a zero mask better than an undef mask for undef shuffle mask elements?
Is this saying that MOV #0 + LegalShuffle is always better than create mask + and? I think that sounds OK, so long as it doesn't destroy any BIC patterns.
This was from before I added the isVectorClearMaskLegal handling; I'll see if I can relax it again. However, XformToShuffleWithZero always forces undef mask elements to zero, since "X & undef --> 0 (not undef)", and IIRC I was trying to keep the behaviours as similar as possible.
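For reference, the undef handling in the AND -> 'clear mask' direction can be modelled roughly as follows. This is a simplified whole-element sketch, not the actual XformToShuffleWithZero code. The `MaskElt` struct and the function name `andMaskToClearShuffle` are assumptions for illustration. Shuffle indices I select lane I of X, and indices I + N select lane I of an all-zero second operand.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one element of the AND constant.
struct MaskElt {
  bool IsUndef;   // element is undef
  uint64_t Value; // ignored when IsUndef is true
};

// Turn (X & C) into shuffle(X, zero, Mask), or return std::nullopt if some
// element of C is neither all-ones, zero, nor undef.
std::optional<std::vector<int>>
andMaskToClearShuffle(const std::vector<MaskElt> &AndMask) {
  const int N = static_cast<int>(AndMask.size());
  std::vector<int> Shuffle(N);
  for (int I = 0; I < N; ++I) {
    const MaskElt &E = AndMask[I];
    if (E.IsUndef || E.Value == 0)
      Shuffle[I] = I + N; // X & undef --> 0, so take the lane from the zero vector
    else if (E.Value == ~0ULL)
      Shuffle[I] = I;     // lane passes through unchanged
    else
      return std::nullopt; // partial bit mask, not a whole-element clear
  }
  return Shuffle;
}
```

An undef element in the AND constant therefore ends up as a zero lane in the shuffle rather than as an undef shuffle index, which is the behaviour being kept consistent here.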
There's nothing enforcing it, but M should always be a 'select/blend' style mask (+ undefs) - afaict it will only ever match in isShuffleMaskLegal against 2-element zip style patterns? I think those were the regressions I saw.