(This is my first patch to LLVM. Let me know if there's anything I could improve.)
For the following function:
double fn1(double d0, double d1, double d2) {
  double a = -d0 - d1 * d2;
  return a;
}
on ARM, LLVM generates code along the lines of
vneg.f64 d0, d0
vmls.f64 d0, d1, d2
i.e., a negate and a multiply-subtract. The attached patch adds instruction selection patterns to allow it to generate the single instruction
vnmla.f64 d0, d1, d2
(multiply-add with negation) instead, like GCC does.
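
To make the equivalence concrete, here is a minimal sketch in C that models the arithmetic of both sequences. The helper names (vneg, vmls, vnmla, two_insn, one_insn, check_equivalence) are hypothetical and simply mirror the mnemonics; they are not part of the patch. It assumes the non-fused VFP semantics, i.e. the product is rounded before the accumulate step, and uses exactly representable inputs so the comparison does not depend on rounding.

```c
#include <assert.h>

/* Hypothetical models of the VFP instructions, named after their mnemonics. */
static double vneg(double d) { return -d; }

/* vmls: acc = acc - (n * m) */
static double vmls(double acc, double n, double m) { return acc - n * m; }

/* vnmla: acc = -acc - (n * m) */
static double vnmla(double acc, double n, double m) { return -acc - n * m; }

/* Sequence currently generated: vneg followed by vmls. */
static double two_insn(double d0, double d1, double d2) {
  return vmls(vneg(d0), d1, d2);
}

/* Sequence generated with the patch: a single vnmla. */
static double one_insn(double d0, double d1, double d2) {
  return vnmla(d0, d1, d2);
}

/* Checks both sequences against the source expression -d0 - d1 * d2. */
static void check_equivalence(void) {
  double d0 = 1.5, d1 = 2.0, d2 = 3.0;
  double expected = -d0 - d1 * d2; /* -1.5 - 6.0 */
  assert(two_insn(d0, d1, d2) == expected);
  assert(one_insn(d0, d1, d2) == expected);
}
```

Calling check_equivalence() from a test driver exercises both paths. This only illustrates the arithmetic identity behind the new patterns; it says nothing about how the patterns themselves are written.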