An initial backend patch towards fixing the various poor HADD combines (PR34724, PR41813, PR45747 etc.).
This extends isHorizontalBinOp to detect per-element horizontal ops (odd+even element pairs) that are not in the expected serial order. In that case we build a "post shuffle mask" that we can apply to the HOP result, assuming we have fast-hops/optsize etc.
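To illustrate the idea, here is a minimal standalone sketch of how a post shuffle mask could be derived. It is not the code in this patch: `computePostShuffleMask` and its mask representation are hypothetical, and it assumes a single 128-bit lane, no undef mask elements, and no commuted (odd, even) pairs. Each result element i is taken to be `op(src[LHSMask[i]], src[RHSMask[i]])`, where `src` is the concatenation of X and Y.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper for illustration only (not the real isHorizontalBinOp).
// HOP(X, Y) produces element k = src[2k] op src[2k+1], with src = concat(X, Y).
// If every requested pair is such an (even, even+1) pair, return the mask that
// permutes the HOP result into the requested order; otherwise return nullopt.
std::optional<std::vector<int>>
computePostShuffleMask(const std::vector<int> &LHSMask,
                       const std::vector<int> &RHSMask) {
  assert(LHSMask.size() == RHSMask.size() && "mask size mismatch");
  const int NumElts = static_cast<int>(LHSMask.size());
  std::vector<int> PostMask(NumElts);
  for (int i = 0; i < NumElts; ++i) {
    const int L = LHSMask[i];
    const int R = RHSMask[i];
    // Require an in-range even index paired with its odd neighbour.
    if (L < 0 || L >= 2 * NumElts || L % 2 != 0 || R != L + 1)
      return std::nullopt;
    PostMask[i] = L / 2; // Index of this pair within the HOP result.
  }
  return PostMask;
}

// True if the mask leaves the HOP result untouched, so no shuffle is needed.
bool isIdentityMask(const std::vector<int> &Mask) {
  for (int i = 0, e = static_cast<int>(Mask.size()); i < e; ++i)
    if (Mask[i] != i)
      return false;
  return true;
}
```

For a 4-element op whose pairs arrive as (2,3), (0,1), (6,7), (4,5), this sketch yields the post shuffle mask {1, 0, 3, 2}; pairs already in serial order yield the identity mask, in which case no extra shuffle is required.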
The next step will be to extend the SHUFFLE(HOP(X,Y)) combines as suggested on PR41813, accepting more post-shuffle masks even on slow-hop targets if we can fold them into another shuffle.
Reduce negative logic?
// Avoid 128-bit lane crossing if this is pre-AVX2 and FP (integer will be split).
if (!Subtarget.hasAVX2() && VT.isFloatingPoint() && ...)