This also makes sense if A is a vector type with i64 elements and
the target lacks avx512 but has avx2/sse4.1 (for ymm/xmm respectively).
In that case ABS expands to 3 instructions, `blendv(A, sub(set0,
A))`, so it's better to just transform to the version with fewer/faster
instructions.
SSE41 implies SSE2
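
For illustration, the three-instruction ABS expansion mentioned above can be modelled one lane at a time in portable C. This is only a sketch of the per-lane semantics, not the backend's lowering code: the helper names `blendv_lane` and `abs_i64_lane` are hypothetical, and it assumes blendv selects on the sign bit of its mask operand, with A itself used as the mask.

```c
#include <stdint.h>

/* Hypothetical per-lane model of blendv: pick b when the mask's
   sign bit is set, otherwise pick a. */
static int64_t blendv_lane(int64_t a, int64_t b, int64_t mask) {
    return (mask < 0) ? b : a;
}

/* Hypothetical per-lane model of the ABS expansion:
   set0, then sub, then blendv. */
static int64_t abs_i64_lane(int64_t a) {
    uint64_t zero = 0;                      /* set0 */
    /* sub: unsigned arithmetic so the subtraction wraps the way a
       vector integer subtract would; the conversion back to int64_t
       assumes the usual two's-complement behaviour. */
    int64_t neg = (int64_t)(zero - (uint64_t)a);
    return blendv_lane(a, neg, a);          /* blendv keyed on the sign of a */
}
```

Each of the three steps stands for one instruction per vector, which is the cost that makes the alternative form with fewer/faster instructions preferable. As with any wrapping two's-complement ABS, the most negative i64 value maps to itself.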