[x86] inline calls to fmaxf / llvm.maxnum.f32 using maxss (PR24475)
This patch improves on the suggested codegen from PR24475:
but only for the fmaxf() case to start, so we can sort out any bugs before
extending to fmin, f64, and vectors.
The fmax / maxnum definitions provide us flexibility for signed zeros, so the
only thing we have to worry about in this replacement sequence is NaN handling.
Note 1: It may be better to implement this as lowerFMAXNUM(), but that exposes
a problem: SelectionDAGBuilder::visitSelect() transforms compare/select
instructions into FMAXNUM nodes if we declare FMAXNUM legal or custom. Perhaps
that should be checking for NaN inputs or global unsafe-math before transforming?
As it stands, that bypasses a big set of optimizations that the x86 backend
already has in PerformSELECTCombine().
Note 2: The v2f32 test reveals another bug; the vector is extended to v4f32, so
we have completely unnecessary operations happening on undef elements of the
Differential Revision: http://reviews.llvm.org/D15294