PR34474 reports poor code generation for multiplications by a constant factor of the form 2^C +/- 1. This patch lowers such a multiplication to a left shift by C followed by an addition (for 2^C + 1) or a subtraction (for 2^C - 1) of the original operand.
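The rewrite relies on the identities x * (2^C + 1) = (x << C) + x and x * (2^C - 1) = (x << C) - x, which also hold under wrap-around arithmetic. The following is a minimal scalar sketch of that idea, not the code from the patch: `ShiftForm`, `classifyMultiplier`, and `mulByShift` are hypothetical names introduced only for illustration, and the `m >= 3` guard is an assumption made here so that plain powers of two and the trivial factor 1 are left alone.

```cpp
#include <cstdint>

// Hypothetical illustration type: describes a multiplier of the form 2^C +/- 1.
struct ShiftForm {
  bool valid;      // true if the multiplier has the form 2^C + 1 or 2^C - 1
  unsigned shift;  // the exponent C
  bool isAdd;      // true for 2^C + 1, false for 2^C - 1
};

inline bool isPowerOf2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

inline unsigned log2Floor(uint64_t v) {
  unsigned n = 0;
  while (v >>= 1) ++n;
  return n;
}

// Classify a constant multiplier. Values below 3 are rejected (assumption of
// this sketch) because they need no add/sub after the shift.
inline ShiftForm classifyMultiplier(uint64_t m) {
  if (m >= 3 && isPowerOf2(m - 1)) return {true, log2Floor(m - 1), true};
  if (m >= 3 && isPowerOf2(m + 1)) return {true, log2Floor(m + 1), false};
  return {false, 0, false};
}

// Multiply x by m using shift + add/sub when possible, else a plain multiply.
inline uint64_t mulByShift(uint64_t x, uint64_t m) {
  ShiftForm f = classifyMultiplier(m);
  if (!f.valid) return x * m;
  uint64_t shifted = x << f.shift;
  return f.isAdd ? shifted + x : shifted - x;
}
```

With this sketch, a factor of 7 is classified as 2^3 - 1 (shift by 3, then subtract) and 17 as 2^4 + 1 (shift by 4, then add), which matches the shift amounts in the `mul7` and `mul17` output in this description.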
The following test cases previously produced poor machine code:
define <2 x i64> @mul7(<2 x i64>) {
  %2 = mul <2 x i64> %0, <i64 7, i64 7>
  ret <2 x i64> %2
}

define <2 x i64> @mul17(<2 x i64>) {
  %2 = mul <2 x i64> %0, <i64 17, i64 17>
  ret <2 x i64> %2
}

define <16 x i8> @mul31(<16 x i8>) {
  %2 = mul <16 x i8> %0, <i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31, i8 31>
  ret <16 x i8> %2
}

and are now lowered as follows:
mul7:
# BB#0:
        vpsllq  $3, %xmm0, %xmm1
        vpsubq  %xmm0, %xmm1, %xmm0
        retq

mul17:
# BB#0:
        vpsllq  $4, %xmm0, %xmm1
        vpaddq  %xmm0, %xmm1, %xmm0
        retq

mul31:
# BB#0: # %entry
        vpsllw  $5, %xmm0, %xmm1
        vpand   .LCPI4_0(%rip), %xmm1, %xmm1
        vpsubb  %xmm0, %xmm1, %xmm0
        retq
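
The `mul31` sequence needs one more step than the other two because the shift is done on 16-bit lanes (`vpsllw`), so bits from the low byte of each lane spill into the high byte and must be masked off (`vpand`) before the per-byte subtraction (`vpsubb`). The scalar sketch below models a single 16-bit lane holding two bytes. It assumes the constant at `.LCPI4_0`, whose contents are not shown here, is 0xE0 in every byte (0xFF << 5 truncated to 8 bits); the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>

// Per-byte left shift of two bytes packed in one 16-bit lane, emulated with a
// 16-bit shift followed by a mask (models vpsllw + vpand). Expects c in 0..7.
inline uint16_t shlBytesVia16(uint16_t lane, unsigned c) {
  uint16_t shifted = static_cast<uint16_t>(lane << c);
  // Assumed mask: the bits that survive an 8-bit shift by c (0xE0 for c == 5).
  uint8_t byteMask = static_cast<uint8_t>(0xFFu << c);
  uint16_t mask = static_cast<uint16_t>(byteMask | (byteMask << 8));
  return shifted & mask;
}

// Multiply both bytes of the lane by 31 as (b << 5) - b (models vpsubb).
inline uint16_t mul31Bytes(uint16_t lane) {
  uint16_t s = shlBytesVia16(lane, 5);
  uint8_t lo = static_cast<uint8_t>((s & 0xFF) - (lane & 0xFF));
  uint8_t hi = static_cast<uint8_t>((s >> 8) - (lane >> 8));
  return static_cast<uint16_t>(lo | (hi << 8));
}

// Exhaustively compare against a plain per-byte multiply by 31 (mod 256).
inline bool matchesScalarMul31() {
  for (unsigned v = 0; v <= 0xFFFF; ++v) {
    uint16_t lane = static_cast<uint16_t>(v);
    uint8_t lo = static_cast<uint8_t>((lane & 0xFF) * 31);
    uint8_t hi = static_cast<uint8_t>((lane >> 8) * 31);
    if (mul31Bytes(lane) != static_cast<uint16_t>(lo | (hi << 8))) return false;
  }
  return true;
}
```

Only the high byte of each lane is actually affected by the mask: after the 16-bit shift, the low byte already has its bottom five bits clear, while the high byte has picked up the top three bits of the low byte in those positions.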
This should be an early-out