This enables loads and stores of the half type without half being a legal type.
- Repository
- rL LLVM
Event Timeline
include/clang/Basic/Builtins.def | ||
---|---|---|
1427 | I think this should be a language builtin (see above), but we might need to extend the language version here, because I believe we only have OpenCL v2.0 currently. Also, should this only be available if cl_khr_fp16 is supported and enabled? I think we do something similar with some subgroup functions (e.g. get_kernel_sub_group_count_for_ndrange) that are only supported with cl_khr_subgroups, although those have a custom diagnostic. Maybe we could leave this check out, since half is not available if cl_khr_fp16 is not enabled anyway. | |
test/CodeGenOpenCL/no-half.cl | ||
4 | It seems strange that cl_khr_fp16 is not enabled too. |
include/clang/Basic/Builtins.def | ||
---|---|---|
1427 | This is specifically meant to be used when cl_khr_fp16 is not available. These builtins are not necessary if cl_khr_fp16 is available (we can use regular loads/stores). I'll take a stab at making these CLC only, but, as with device-specific builtins, they looked useful beyond that, since they provide access to half type storage. |
include/clang/Basic/Builtins.def | ||
---|---|---|
1427 | Strange. This is not how I would interpret the extension spec, though: https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/cl_khr_fp16.html But I think this change is probably fine, because it doesn't affect the half type itself. |
include/clang/Basic/Builtins.def | ||
---|---|---|
1427 | I'm not sure I see the conflict here. cl_khr_fp16 adds support for the half scalar and halfn vector types.
vload_half and vstore_half are used to access buffers of half values without needing the half type (or the cl_khr_fp16 extension).
Exactly. This is needed outside of cl_khr_fp16, i.e. without the half type. | |
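To make the point about half storage concrete, here is a minimal host-side C sketch of what a half load and store amount to: the value sits in memory as a 16-bit IEEE 754 binary16 pattern and is widened to float on load and narrowed on store, so no half arithmetic type is needed. The names load_half and store_half are hypothetical and are not the builtins added by this patch. Round-to-nearest-even on store is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen the binary16 bit pattern at p[i] to float. */
static float load_half(const uint16_t *p, size_t i) {
    uint16_t h = p[i];
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0x1Fu) {            /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {         /* normal: rebias exponent from 15 to 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {         /* signed zero */
        bits = sign;
    } else {                       /* half subnormal: renormalize for float */
        int e = -1;
        do { man <<= 1; e++; } while ((man & 0x400u) == 0);
        bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow f to binary16 (assumed round-to-nearest-even)
   and write the 16-bit pattern to p[i]. */
static void store_half(float f, uint16_t *p, size_t i) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t exp = (bits >> 23) & 0xFFu;
    uint32_t man = bits & 0x7FFFFFu;
    uint32_t h;
    if (exp == 0xFFu) {            /* infinity or NaN (keep NaN quiet) */
        h = 0x7C00u | (man ? 0x200u : 0u);
    } else {
        int e = (int)exp - 112;    /* rebias exponent from 127 to 15 */
        if (e >= 0x1F) {           /* too large: overflow to infinity */
            h = 0x7C00u;
        } else if (e > 0) {        /* normal half */
            uint32_t rem = man & 0x1FFFu;
            h = ((uint32_t)e << 10) | (man >> 13);
            /* a carry out of the mantissa correctly bumps the exponent */
            if (rem > 0x1000u || (rem == 0x1000u && (h & 1u))) h++;
        } else if (e < -10) {      /* too small: underflow to zero */
            h = 0;
        } else {                   /* half subnormal */
            uint32_t shift = (uint32_t)(14 - e);
            uint32_t full = man | 0x800000u;   /* restore implicit bit */
            uint32_t rem = full & ((1u << shift) - 1u);
            uint32_t halfway = 1u << (shift - 1u);
            h = full >> shift;
            if (rem > halfway || (rem == halfway && (h & 1u))) h++;
        }
    }
    p[i] = (uint16_t)(sign | h);
}
```

Calling store_half(1.0f, buf, 0) on a uint16_t buffer leaves the pattern 0x3C00 in buf[0], and load_half(buf, 0) returns 1.0f again. The buffer element type is a plain 16-bit integer throughout, which mirrors the argument in the thread that access to half storage does not require the half type to be legal.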
test/CodeGenOpenCL/no-half.cl | ||
20 | There is no load. fptrunc double %foo to half uses the function parameter directly. | |
28 | Just my laziness; I've added the full check. |
test/CodeGenOpenCL/no-half.cl | ||
---|---|---|
28 | Could we do the same for the above examples too? |
test/CodeGenOpenCL/no-half.cl | ||
---|---|---|
28 | I don't understand. entry: %0 = fptrunc double %foo to half store half %0, half addrspace(1)* %bar, align 2 ret void |