Hi,
The attached patch changes Clang so that, for types wider than float (double, for example), we convert to fp16 in a single step instead of via the sequence "InTy -> float -> fp16". This avoids double rounding, which can produce a result that differs from what a single correctly rounded IEEE-754 conversion would give.
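To make the double-rounding problem concrete, here is a small sketch. It is not part of the patch and it uses Python rather than C purely for illustration. It assumes CPython's struct module, whose "e" and "f" formats pack a double to IEEE-754 half and single precision with round-to-nearest-even; the helper names and the input value are mine, picked so that the two conversion paths disagree.

```python
import struct

def to_half(x):
    # Round a double to half precision (binary16) and widen it back.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x):
    # Round a double to single precision (binary32) and widen it back.
    return struct.unpack("<f", struct.pack("<f", x))[0]

# The neighbouring half values here are 1.0 and 1 + 2**-10.
# x sits just above their midpoint, 1 + 2**-11.
x = 1.0 + 2.0**-11 + 2.0**-30

# One step: x is above the midpoint, so it rounds up.
direct = to_half(x)

# Two steps: rounding to float drops the 2**-30 term and lands exactly
# on the midpoint; the tie then rounds to even, i.e. down to 1.0.
two_step = to_half(to_single(x))

print(direct, two_step)  # 1.0009765625 1.0
```

The one-step result is the nearest half to x; the two-step result is off by a full half-precision ulp.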
There are potential problems, but I believe the benefits outweigh them:
- It's a change in semantics. I believe it's compatible with the major standards though (OpenCL requires that accesses go via a builtin; n1833 for C would *demand* this change, from my reading of it).
- It means double -> __fp16 conversion will fail on x86 and v7 ARM CPUs for now. Specifically, we will generate a libcall that isn't widely available (and probably isn't implemented anywhere). I think this is preferable to the status quo of producing a possibly incorrect result though.
Longer term, I'd like to improve the codegen here to use real fpext/fptrunc operations and remove many of the special cases for half. Unfortunately the LLVM CodeGen isn't up to this change yet, so I've just extended the use of the @llvm.convert.to.fp16 intrinsic.
So, is it OK to change this?
Cheers.
Tim.