If the DAG looks like a*b+c*d, it can be folded into either fma(a, b, c*d) or fma(c, d, a*b). https://reviews.llvm.org/D11855 was posted to improve this by respecting the uses of a*b and c*d when making the choice.

Likewise, a*b-c*d can be folded into either fma(a, b, -(c*d)) or fma(-c, d, a*b). This patch respects the uses of a*b and c*d to make the best choice in this case as well.
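To illustrate why the uses matter, here is a minimal C++ sketch of the two folds, assuming a*b has a second use after the subtraction. The function names fold_first and fold_second are hypothetical and only label the two options; this is not the DAG combiner code itself. The two forms may round differently for general inputs, which is why the fold is only done under fast-math style flags.

```cpp
#include <cmath>

// Option 1: fold a*b into the FMA.
// a*b is still needed afterwards, so both products are computed
// separately: 3 multiplies + 1 fma in total.
double fold_first(double a, double b, double c, double d) {
    double ab = a * b;                 // kept alive by its second use
    double cd = c * d;
    double sub = std::fma(a, b, -cd);  // a*b - c*d
    return ab * sub;
}

// Option 2: fold c*d into the FMA.
// a*b is computed once and reused as the addend, and c*d never
// needs its own multiply: 2 multiplies + 1 fma in total.
double fold_second(double a, double b, double c, double d) {
    double ab = a * b;
    double sub = std::fma(-c, d, ab);  // a*b - c*d
    return ab * sub;
}
```

When a*b has other users and c*d does not, the second form saves one multiply, so the better fold depends on which product is used elsewhere.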

This is the motivating case:

define double @fsub1(double %a, double %b, double %c, double %d) {
entry:
  %mul = fmul fast double %b, %a
  %mul1 = fmul fast double %d, %c
  %sub = fsub fast double %mul, %mul1
  %mul3 = fmul fast double %mul, %sub
  ret double %mul3
}

 define double @fsub1(double %a, double %b, double %c, double %d) {
 ; CHECK-LABEL: fsub1:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    xsmuldp 3, 4, 3
 ; CHECK-NEXT:    xsmuldp 0, 2, 1
-; CHECK-NEXT:    xsmsubadp 3, 2, 1
-; CHECK-NEXT:    xsmuldp 1, 0, 3
+; CHECK-NEXT:    fmr 1, 0
+; CHECK-NEXT:    xsnmsubadp 1, 4, 3
+; CHECK-NEXT:    xsmuldp 1, 0, 1