Hi,
I'm in the process of slightly reworking how we handle the __fp16 type. I have larger goals, but the most important immediate one is to perform extensions and truncations in a single step (rather than going via float, which can round twice) so that this C code has IEEE-sensible semantics:
void my_round(double in, __fp16 *out) { *out = in; }
Now, I *think* this is fairly academic as far as OpenCL is concerned (there, you always have to use the vload_half/vstore_half functions to access __fp16), but I'd like to minimise breakage as far as possible anyway.
As part of this I've made the @llvm.convert.from.fp16 and @llvm.convert.to.fp16 intrinsics polymorphic, and would like to add support for f64 variants in as many places as possible.
NVPTX already seemed to have the instructions there, waiting to be used, so I added a couple of patterns and a test.
Are you happy for me to commit the change?
Cheers.
Tim.