This extends the patterns added in D130564 for fma to also handle negative 0.0. Because -0.0 is the identity element for fadd, it comes up in vectorized loops.
The same basic idea as in D130564 applies, but nsz should no longer be needed for the fadd case. It is still needed for fsub (which is really only added for completeness).
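
As a rough illustration of why nsz matters for fsub but not for fadd, the sketch below checks the IEEE 754 behaviour in plain C++, assuming default round-to-nearest and no fast-math flags. The helper names are hypothetical and are not part of the patch: x + (-0.0) returns x with its sign bit intact for every x, whereas x - (-0.0) turns x = -0.0 into +0.0.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only; they are not part of the patch.

// Returns true if x + (-0.0) gives back x, including the sign of zero.
bool faddNegZeroIsIdentity(double x) {
  volatile double negZero = -0.0; // volatile discourages constant folding
  double r = x + negZero;
  return r == x && std::signbit(r) == std::signbit(x);
}

// Returns true if x - (-0.0) gives back x, including the sign of zero.
bool fsubNegZeroIsIdentity(double x) {
  volatile double negZero = -0.0;
  double r = x - negZero; // same as x + (+0.0)
  return r == x && std::signbit(r) == std::signbit(x);
}
```

The only input for which the fsub form changes the result is x = -0.0, which is exactly the case nsz allows the compiler to ignore.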