This changes the lowering of saddsat and ssubsat. Instead of using a compare
and select to pick the saturation constant:
  r,o = saddo x, y
  c = setcc r < 0
  s = c ? INTMAX : INTMIN
  ret o ? s : r
it now uses ashr and xor to materialize the INTMAX/INTMIN constants:
  r,o = saddo x, y
  s = ashr r, BW-1
  x = xor s, INTMIN
  ret o ? x : r
https://alive2.llvm.org/ce/z/TYufgD
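For readers who want to check the equivalence outside of Alive2, here is a minimal C++ sketch of both sequences for 32-bit integers. It is an illustration only, not the SelectionDAG code from the patch: the function names are hypothetical, `__builtin_add_overflow` is assumed to be available (GCC/Clang), and `>>` on a negative signed value is assumed to be an arithmetic shift (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Old sequence: compare the wrapped result against zero, then select
// between the two saturation constants.
int32_t sadd_sat_select(int32_t x, int32_t y) {
    int32_t r;
    bool o = __builtin_add_overflow(x, y, &r);  // r holds the wrapped sum
    bool c = r < 0;
    int32_t s = c ? INT32_MAX : INT32_MIN;
    return o ? s : r;
}

// New sequence: smear the sign bit of the wrapped result, then xor with
// INTMIN. A negative r gives -1 ^ INTMIN == INTMAX; a non-negative r
// gives 0 ^ INTMIN == INTMIN.
int32_t sadd_sat_shift_xor(int32_t x, int32_t y) {
    int32_t r;
    bool o = __builtin_add_overflow(x, y, &r);
    int32_t s = r >> 31;          // assumed arithmetic shift: 0 or -1
    int32_t sat = s ^ INT32_MIN;  // INTMAX or INTMIN
    return o ? sat : r;
}
```

On overflow the wrapped result has the opposite sign of the true result, so its sign bit alone is enough to choose the constant; the same reasoning applies to the ssubsat case with a wrapped difference.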
This seems to reduce the instruction count in most testcases across most architectures.
Please can you move the SSE41 v2i64 lines here.