This folds a select_cc and (select(set_cc) of a max or min vector reduction with a scalar value into a VMAXV or VMINV.
This could do with a few extra tests to make sure it's a min/max. LHS==TrueVal and RHS==FalseVal for example.
Plus it's worth checking if commuted min/max work too. min(vecreduce.min, x) or min(x, vecreduce.min)
It's might be worth pulling this out into a separate function, similar to isLowerSaturatingConditional or something like PerformSplittingToNarrowingStores where we return the new SDValue.
This could use std::swap?
I think we might need to check these things, otherwise we might be matching things incorrectly.
I believe the exact semantics of a vminv.s8 are that it reads the bottom 8 bits of Rn to do the final scalar min. So a 32bit min could actually produce a different value. We should make sure we are getting this correct too.
Fold using the LHS and check for LHS and RHS opcodes explicitly.
That's much cleaner! Done.
Ah I didn't know that existed. Done.
Thanks. I've added explicit checks for the LHS opcode and have made the lowering use the LHS instead of the TrueVal as the scalar so that we know the top 24 bits aren't set.
else return false;
This looks like it need to be a bit stricter still. The AND needs to go with a umin/umax, and the sign extend with smin/smax (they shouldn't be the other way around). The AND needs a mask of the right size (it'll be 255 for i8 for example, which is the same as a "zero extend inreg"). The sign extend will have a type as the second argument, which should be the same as the vecreduce's scalar type.
Some of this might be difficult to get to come up in practice, but we should make sure it won't be subtly wrong and cause bugs.
Please clang-format, and possible just make it a static function?
Hmm. This is getting complicated.
I would suggest changing this to something more like:
if (Type == i32) return Selected == Compared; else // Check the opcodes are and/signexted/assertzext/assertsext.
But I don't think that will work very well. For i8/i16 I think we might actually need to be matching:
t7: i32 = AssertZext t5, ValueType:ch:i8 t17: i32 = vecreduce_umin t3 t25: i32 = and t17, Constant:i32<255> t27: i32 = select_cc t25, t7, t17, t7, setult:ch t21: i32 = and t27, Constant:i32<255>
Including that root "and" (which is the same as the two TrueVal/FalseVal sides of the select_cc being extended). But one loop I tried had no visible way I could see to prove the value needed to be a i16. It might be easier to do this from a combine, not during lowering. So before we have legalized the types.
What do you think about getting i32 working first with this patch, and doing i8/i16 as a followup? That way we can get all the boilerplate out of the way, and it's just this code that produces the ARMISD::VMINVs that we need to work further on.
Yeah, nice. This is a lot easier to follow.
I think it needs to check that operand is a Setcc too, to be sure it's not something strange.
The top bits of this are not read by the instruction. Can we change it to an ANY_EXTEND?
I think the last operand isn't needed here.
Move the hasMVEIntegerOps check into the function? So it can look like the rest.
One last thing I thought of. Because this is pre-lowering we have to be careful with illegal types.
Can you check the types are the ones we expect them to be (v16i8/v8i16/v4i32). And add some tests for v2i64 (which should be scalarized).
This is no longer used?