[ARM] This patch addresses the following issue.
long long foo(int a, int b, int c, int d) {
  long long acc = (long long)a * (long long)b;
  acc += (long long)c * (long long)d;
  return acc;
}
This should compile to SMLAL (Signed Multiply Accumulate Long), which multiplies
two signed 32-bit values to produce a 64-bit value and accumulates the result
into a 64-bit value.
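As a rough illustration of the semantics described above, here is a minimal C model of the SMULL followed by SMLAL sequence. The helper names (smull_model, smlal_model, foo_via_smlal) are hypothetical and exist only for this sketch; they are not part of the patch or of LLVM.

```c
#include <stdint.h>

/* Hypothetical model of SMULL: signed 32x32 -> 64-bit multiply. */
static int64_t smull_model(int32_t rn, int32_t rm) {
    return (int64_t)rn * (int64_t)rm;
}

/* Hypothetical model of SMLAL: acc += (int64_t)rn * (int64_t)rm.
 * The addition is done in unsigned arithmetic so that it wraps modulo 2^64,
 * as the hardware does. The conversion back to int64_t is
 * implementation-defined in C for out-of-range values; two's complement
 * behaviour is assumed here. */
static int64_t smlal_model(int64_t acc, int32_t rn, int32_t rm) {
    uint64_t product = (uint64_t)((int64_t)rn * (int64_t)rm);
    return (int64_t)((uint64_t)acc + product);
}

/* foo() expressed as one SMULL followed by one SMLAL. */
static int64_t foo_via_smlal(int32_t a, int32_t b, int32_t c, int32_t d) {
    int64_t acc = smull_model(a, b); /* acc  = a * b */
    return smlal_model(acc, c, d);   /* acc += c * d */
}
```

For inputs that do not overflow 64 bits, foo_via_smlal is intended to return the same value as foo.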
We currently get this for v7:
_foo:
        smull   r0, r1, r1, r0
        smull   r2, r3, r3, r2
        adds    r0, r2, r0
        adc     r1, r3, r1
        bx      lr
With this patch, the above is reduced to the following:
_foo:
        smull   r0, r1, r1, r0
        smlal   r0, r1, r3, r2
        bx      lr
I think we actually want this check to be
The existing weird opcode logic looks like it was a buggy attempt to work around the problem that ADDE and ADDC use different halves of *MUL_LOHI. I suspect you could get wrong codegen if you could arrange that the low half of SMUL_LOHI gets fed into the ADDE and the high half into the ADDC.
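To make the concern concrete, here is a small C model of the ADDC/ADDE pairing. It assumes ADDC adds the low 32-bit halves and produces a carry, and ADDE adds the high halves plus that carry. All names here (lohi_t, smul_lohi, addc, adde, accumulate, accumulate_swapped) are hypothetical stand-ins for the DAG nodes, not LLVM code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two results of SMUL_LOHI. */
typedef struct {
    uint32_t lo; /* result 0: low 32 bits of the product  */
    uint32_t hi; /* result 1: high 32 bits of the product */
} lohi_t;

static lohi_t smul_lohi(int32_t a, int32_t b) {
    uint64_t p = (uint64_t)((int64_t)a * (int64_t)b);
    lohi_t r = { (uint32_t)p, (uint32_t)(p >> 32) };
    return r;
}

/* ADDC model: 32-bit add that also reports the carry-out. */
static uint32_t addc(uint32_t x, uint32_t y, uint32_t *carry) {
    uint32_t sum = x + y;
    *carry = sum < x; /* unsigned wraparound means a carry occurred */
    return sum;
}

/* ADDE model: 32-bit add that consumes a carry-in. */
static uint32_t adde(uint32_t x, uint32_t y, uint32_t carry) {
    return x + y + carry;
}

/* Correct pairing: low half feeds ADDC, high half feeds ADDE. */
static uint64_t accumulate(lohi_t mul, uint32_t acc_lo, uint32_t acc_hi) {
    uint32_t carry;
    uint32_t lo = addc(mul.lo, acc_lo, &carry);
    uint32_t hi = adde(mul.hi, acc_hi, carry);
    return ((uint64_t)hi << 32) | lo;
}

/* Swapped pairing: high half feeds ADDC, low half feeds ADDE.
 * Folding this pattern into a single SMLAL would be a miscompile. */
static uint64_t accumulate_swapped(lohi_t mul, uint32_t acc_lo, uint32_t acc_hi) {
    uint32_t carry;
    uint32_t lo = addc(mul.hi, acc_lo, &carry);
    uint32_t hi = adde(mul.lo, acc_hi, carry);
    return ((uint64_t)hi << 32) | lo;
}
```

In this model, only accumulate computes the 64-bit sum that SMLAL would produce; accumulate_swapped computes something different, which is why the combine would need to verify which half feeds which node.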
If that's right, we also want to check that the AddeOp is using the MULOp->getValue(1) part; and while we're at it, this check just below has become redundant: