This patch originated from D46562 and is a proper subset, with some issues addressed.
See comments in D47911 - we're adding FMF constraints where there were none before?
I don't know enough about AMDGPU to understand what this diff means.
I'd prefer that we add a test for some target where this kind of test currently produces fdiv asm, and this patch will allow it to become an fmul.
Be advised that I am splitting the remaining fmul, fdiv and fadd reviews out for negation context, it will have its own review shortly. The old reviews will remain but only under visit wrt context.
So I walked through the APFloat interface, it does return the correct 16bit value, 0x2E66, which is then converted to F32, and is 0x3dccc000. The old value bypassed conversion in visitFDIV and we given the decimal value of 1/10 which is in 32bit is 0x3dcccccd. The new behavior is more correct than what used to happen since we are being asked to rcp divide approximate a 16bit value.
Tighter binding of the folded constant to the multiply produced in the check patterns. Please note the long constant 1036828672 is the hex value 0x3DCCC000 and is represented as float 0.0999755859, this shows the same propagation as the AMDGPU test, first folding the constant 1/10 to F16 and then converting it to F32.
This is an interesting difference. Before this patch, we wouldn't recognize that we could use an estimate from the fdiv node, but we did recognize that we could use an estimate on the square root itself. But at that point, we don't realize that we're actually calculating a reciprocal square root (as opposed to a plain square root estimate)...so we generate the safety check for a 0.0 input.