Hi,
I'm in the process of slightly reworking how we handle the __fp16 type. I have larger goals, but the most important immediate one is to perform extensions and truncations in one step (rather than going via float, which rounds twice on truncation) so that this C code has IEEE-sensible semantics:
void my_round(double in, __fp16 *out) { *out = in; }
Now, I *think* this is fairly academic as far as OpenCL is concerned (you have to use the vload_half/vstore_half functions to access __fp16 at all times), but I'd like to minimise breakage as far as possible anyway.
As part of this I've made the @llvm.convert.from.fp16 and @llvm.convert.to.fp16 intrinsics polymorphic, and would like to add support for f64 variants in as many places as possible.
For R600, it looks like there is no single-step truncation, nor support for the intrinsics. This means the truncation will always fail to compile, but the attached patch implements extension correctly by splitting it into two operations: f16 -> f32 -> f64.
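The reason splitting is safe for extension but not for truncation is that every half value is exactly representable as a float, and every float as a double, so neither step rounds. A small sketch of that in portable C, decoding raw half bits by hand (the function names are hypothetical and this is not the code in the patch):

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: decode IEEE 754 binary16 bits into a float. */
static float half_bits_to_float(uint16_t h) {
    int sign = (h >> 15) & 1;
    int exp  = (h >> 10) & 0x1f;
    int frac = h & 0x3ff;
    float mag;
    if (exp == 0)            /* zero or subnormal: frac * 2^-24 */
        mag = ldexpf((float)frac, -24);
    else if (exp == 0x1f)    /* infinity or NaN */
        mag = frac ? NAN : INFINITY;
    else                     /* normal: (1024 + frac) * 2^(exp - 25) */
        mag = ldexpf((float)(frac + 1024), exp - 25);
    return sign ? -mag : mag;
}

/* Extension in two steps; both widenings are exact. */
static double half_bits_to_double(uint16_t h) {
    float f = half_bits_to_float(h);   /* f16 -> f32 */
    return (double)f;                  /* f32 -> f64 */
}
```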
Are you happy for me to commit the change?
Cheers.
Tim.