This commit implements the following patterns:
  fmul (ptrue sv_all) (dup 1.0) V => V
  fmul (ptrue sv_all) V (dup 1.0) => V
  mul  (ptrue sv_all) (dup 1) V   => V
  mul  (ptrue sv_all) V (dup 1)   => V
That is: using the SVE mul/fmul intrinsic with an all-true predicate to
multiply a vector V by a vector of all ones is redundant.
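As an illustration of the shape of this fold, here is a minimal sketch in plain C over a toy expression tree. It is not the code in this commit and does not use LLVM's data structures; the node kinds and the name fold_mul_by_one are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR for illustration only; not LLVM's representation. */
enum kind { K_VALUE, K_PTRUE_ALL, K_DUP_INT, K_DUP_FP, K_MUL, K_FMUL };

struct node {
    enum kind kind;
    long ival;                /* splat value when kind == K_DUP_INT */
    double fval;              /* splat value when kind == K_DUP_FP */
    const struct node *pred;  /* governing predicate of K_MUL / K_FMUL */
    const struct node *lhs;   /* first multiplicand */
    const struct node *rhs;   /* second multiplicand */
};

/* True if n is a splat of 1 (integer) or 1.0 (floating point). */
static bool is_splat_one(const struct node *n, bool fp) {
    if (fp)
        return n->kind == K_DUP_FP && n->fval == 1.0;
    return n->kind == K_DUP_INT && n->ival == 1;
}

/* Returns the surviving operand when the multiply is redundant,
   otherwise returns n unchanged. */
const struct node *fold_mul_by_one(const struct node *n) {
    assert(n != NULL);
    if (n->kind != K_MUL && n->kind != K_FMUL)
        return n;
    /* The fold only applies under an all-true predicate. */
    if (n->pred == NULL || n->pred->kind != K_PTRUE_ALL)
        return n;
    bool fp = (n->kind == K_FMUL);
    if (is_splat_one(n->lhs, fp))
        return n->rhs;
    if (is_splat_one(n->rhs, fp))
        return n->lhs;
    return n;
}
```

The sketch checks both operand positions because each of mul and fmul is matched with the splat on either side.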
The result of this commit is that code such as:
  #include <arm_sve.h>

  svfloat64_t foo(svfloat64_t a) {
    svbool_t t = svptrue_b64();
    svfloat64_t b = svdup_f64(1.0);
    return svmul_m(t, a, b);
  }
will compile to a nop.
This commit does not capture all possibilities. It handles only the simple
cases described above, so there is still room for further optimisation.
Is it worth naming this something like combineSVEIntrinsicBinOp, with a matching combineSVEIntrinsicFPBinOp for the FP case? A bit like SelectionDAG::simplifyFPBinop. I mention this because I can imagine you wanting similar folds for divides and adds at some point too, e.g. fdiv X, 1.0 -> X or fadd X, 0.0 -> X
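To make the suggestion concrete, one possible shape for such a shared helper is a table of identity elements, sketched below in plain C. Every name here (fold_identity, identity_rule, the enums) is hypothetical, operands are reduced to "is it a splat, and of what value", and integer splats are modelled as doubles purely to keep the example short. It is not a proposal for the actual LLVM interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum binop { OP_MUL, OP_FMUL, OP_FDIV, OP_FADD };
enum fold { NO_FOLD, KEEP_LHS, KEEP_RHS };

/* Toy operand: either an arbitrary vector or a splat of one value. */
struct operand { bool is_splat; double splat; };

struct identity_rule {
    enum binop op;
    double identity;   /* value that leaves the other operand unchanged */
    bool commutative;  /* may the identity appear on the left as well? */
};

static const struct identity_rule rules[] = {
    { OP_MUL,  1.0, true  },
    { OP_FMUL, 1.0, true  },
    { OP_FDIV, 1.0, false },  /* 1.0 / X is not X */
    /* Caveat: (-0.0) + (+0.0) is +0.0 under round-to-nearest, so this
       rule is only sound when signed zeros may be ignored. */
    { OP_FADD, 0.0, true  },
};

static const struct identity_rule *find_rule(enum binop op) {
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; ++i)
        if (rules[i].op == op)
            return &rules[i];
    return NULL;
}

/* Reports which operand survives, assuming the fold is only attempted
   when the governing predicate is all-true. */
enum fold fold_identity(enum binop op, bool pred_all_true,
                        struct operand lhs, struct operand rhs) {
    const struct identity_rule *rule = find_rule(op);
    assert(rule != NULL);  /* every enum binop value has a table entry */
    if (!pred_all_true)
        return NO_FOLD;
    if (rhs.is_splat && rhs.splat == rule->identity)
        return KEEP_LHS;
    if (rule->commutative && lhs.is_splat && lhs.splat == rule->identity)
        return KEEP_RHS;
    return NO_FOLD;
}
```

With a table like this, adding a new fold is a one-line change, and the non-commutative case (fdiv) is handled by the same code path as mul and fmul.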