The diff is large. The entire check-llvm-codegen suite passes,
so only X86 had a conflicting transform (D62327).
We want this transform because currently every single DAGCombine pattern
for a vector add %x, C needs to be written twice: once for add and once for sub.
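As a rough illustration of why a single canonical form is enough, here is a minimal sketch, assuming the transform in question rewrites a vector sub %x, C as add %x, (0 - C). The V4 type and both function names are hypothetical and are not LLVM APIs; the point is only that the two forms agree lane by lane under wrapping arithmetic, so patterns written against add cover both.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-lane vector of 32-bit integers, for illustration only.
using V4 = std::array<uint32_t, 4>;

// Models "sub %x, C": lane-wise subtraction of a constant vector.
V4 subConst(const V4 &x, const V4 &c) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = x[i] - c[i];
  return r;
}

// Models the assumed canonical form "add %x, (0 - C)".
// Unsigned arithmetic wraps modulo 2^32, so this matches subConst exactly.
V4 addNegConst(const V4 &x, const V4 &c) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = x[i] + (0u - c[i]);
  return r;
}
```

The negated constant is itself a constant, so it can be computed once at compile time; whether it is as cheap to materialize as the original is a per-target question.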
- AArch64 changes look neutral to positive. I'm not good with that asm, but I think movi v1.2d encodes the entire all-ones constant as an imm0_255:$imm8, so there should be no code-size penalty?
- AMDGPU seems to be missing a fold: "if this is an addition by an immediate, and immediate needs a load, and negated immediate won't need load if used in add, then transform add to sub". Looks neutral otherwise.
- MIPS - bad, many regressions; the same fold as for AMDGPU seems to be missing.
- PowerPC - not great, some regressions; the same fold as for AMDGPU seems to be missing.
- X86 - on average looks like an improvement :) There are more deletions than additions. We delete 137 unfolded constant-pool loads but add 56, and delete 233 folded constant-pool loads but add 350. Can't tell yet whether some combines are missing.
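The fold that appears to be missing on AMDGPU, MIPS and PowerPC can be sketched roughly as follows. This is not code from the patch or from any backend: Opcode, BinOp, isCheapImmediate and foldAddToSub are all hypothetical names, and the -16..64 range is only an illustrative stand-in for whatever a given target can encode without a load.

```cpp
#include <cstdint>

enum class Opcode { Add, Sub };

// Hypothetical model of "op %x, imm"; the %x operand is left implicit.
struct BinOp {
  Opcode op;
  int64_t imm;
};

// Hypothetical target hook: true if the immediate can be encoded directly
// in the instruction. The range here is an assumption for illustration.
bool isCheapImmediate(int64_t imm) { return imm >= -16 && imm <= 64; }

// If an add uses an immediate that needs a load, but the negated immediate
// does not, rewrite the add as a sub of the negated immediate.
BinOp foldAddToSub(BinOp in) {
  if (in.op != Opcode::Add)
    return in;
  if (isCheapImmediate(in.imm))
    return in; // Already cheap, nothing to gain.
  if (in.imm == INT64_MIN)
    return in; // Negation would overflow.
  int64_t negated = -in.imm;
  if (!isCheapImmediate(negated))
    return in; // Both forms need a load, keep the canonical add.
  return {Opcode::Sub, negated};
}
```

With these assumed ranges, add %x, -64 would become sub %x, 64, while add %x, 100 stays as it is because neither 100 nor -100 is cheap.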