This patch adds a transformation of fmul+fadd/fsub chains into fused multiply
instructions:
- fmul+fadd->fmadd
- fmul+fsub->fmsub/fnmsub
We also try to combine these instructions when the fmul has more than one use
and therefore cannot be deleted. Removing the dependence between the fmul and
the fadd can still be profitable in that case, and we rely on the machine
combiner's scheduling approximations to decide whether it is.
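
For illustration only, the arithmetic behind the mappings listed above can be
sketched in plain C++ with std::fma. The helper names below are hypothetical and
only model the usual AArch64 semantics of the fused instructions; they are not
code from this patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers modelling the fused instructions (single rounding).
// fmul + fadd:  a * b + c   ->  fmadd
double fmadd(double a, double b, double c) { return std::fma(a, b, c); }

// fmul + fsub with the product as the subtrahend:  c - a * b  ->  fmsub
double fmsub(double a, double b, double c) { return std::fma(-a, b, c); }

// fmul + fsub with the product as the minuend:  a * b - c  ->  fnmsub
double fnmsub(double a, double b, double c) { return std::fma(a, b, -c); }

// Small self-check using exactly representable values, where the fused and
// unfused results coincide.
void self_check() {
  assert(fmadd(2.0, 3.0, 4.0) == 2.0 * 3.0 + 4.0);
  assert(fmsub(2.0, 3.0, 4.0) == 4.0 - 2.0 * 3.0);
  assert(fnmsub(2.0, 3.0, 4.0) == 2.0 * 3.0 - 4.0);
}
```

In general the fused form rounds once rather than twice, so results can differ
in the last bit from a separate fmul followed by fadd/fsub.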
Cache the result of the FADD check from when you called it the first time?