This is an archive of the discontinued LLVM Phabricator instance.

Add half load and store builtins
ClosedPublic

Authored by jvesely on Aug 28 2017, 2:11 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

jvesely created this revision.Aug 28 2017, 2:11 PM
Anastasia added inline comments.
include/clang/Basic/Builtins.def
1427 ↗(On Diff #112965)

I think this should be a language builtin (see above) but perhaps we might need to extend the language version here. Because I believe we only have OpenCL v2.0 currently.

Also this should only be available if cl_khr_fp16 is supported and enabled? I think we are doing similar with some subgroups functions (e.g. get_kernel_sub_group_count_for_ndrange) that are only supported by cl_khr_subgroup but those have custom diagnostic though. May be we could leave this check out since half is not available if cl_khr_fp16 is not enabled anyways.

test/CodeGenOpenCL/no-half.cl
3 ↗(On Diff #112965)

It seems strange that cl_khr_fp16 is not enabled too.

jvesely added inline comments.Aug 29 2017, 6:16 AM
include/clang/Basic/Builtins.def
1427 ↗(On Diff #112965)

This is specifically meant to be used when cl_khr_fp16 is not available.
CLC allows using half as storage format and half pointers without the extension,
vstore_half/vload_half are used to load/store half values. (CL1.2 CH 6.1.1.1)

These builtins are not necessary if cl_khr_fp16 is available (we can use regular loads/stores).

I'll take stab at making these CLC only, but similarly to device specific builtins it looked useful beyond that, since these builtins provide access to half type storage.

jvesely updated this revision to Diff 113190.Aug 29 2017, 7:36 PM

restrict builtins to OCLC langauges

Anastasia accepted this revision.Sep 1 2017, 9:34 AM
Anastasia added inline comments.
include/clang/Basic/Builtins.def
1427 ↗(On Diff #112965)

Strange. This is not how I would interpret from the extension spec though: https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/cl_khr_fp16.html

But I think for this change is probably fine indeed because this doesn't affect half type itself.

This revision is now accepted and ready to land.Sep 1 2017, 9:34 AM

LGTM! Thanks!

Anastasia added inline comments.Sep 1 2017, 9:41 AM
test/CodeGenOpenCL/no-half.cl
19 ↗(On Diff #113190)

Would it make sense to add a check for load similarly to store in the test_load_float/test_load_double tests?

27 ↗(On Diff #113190)

Minor thing: any reason you are not checking the load fully?

jvesely updated this revision to Diff 113588.Sep 1 2017, 1:11 PM
jvesely marked 6 inline comments as done.
jvesely edited the summary of this revision. (Show Details)

fully check loads in tests

jvesely added inline comments.Sep 1 2017, 1:13 PM
include/clang/Basic/Builtins.def
1427 ↗(On Diff #112965)

I'm not sure I see the conflict here. cl_khr_fp16 adds support for half scalar and halfn vector types.
without the extension the specs say (OCL 1.2 Ch. 6.1.1.1):

The half data type can only be used to declare a pointer to a buffer that contains half values.

vload_half and vstore_half used to access those buffers without needing half type (or the cl_khr_fp16 extension).

But I think for this change is probably fine indeed because this doesn't affect half type itself.

exactly. this is needed outside of cl_khr_fp16, or the half type

test/CodeGenOpenCL/no-half.cl
19 ↗(On Diff #113190)

there is no load. fptrunc double %foo to half uses the function parameter directly

27 ↗(On Diff #113190)

just my laziness, I've added full check.

jvesely updated this revision to Diff 113624.Sep 1 2017, 4:30 PM

mark load pointers const

jvesely requested review of this revision.Sep 4 2017, 12:30 AM
jvesely edited edge metadata.

please let me know if your accept still stands for the modified version.

Anastasia added inline comments.Sep 6 2017, 10:43 AM
test/CodeGenOpenCL/no-half.cl
27 ↗(On Diff #113190)

Could we do the same for the above examples too?

jvesely marked 2 inline comments as done.Sep 6 2017, 11:39 AM
jvesely added inline comments.
test/CodeGenOpenCL/no-half.cl
27 ↗(On Diff #113190)

I don't understand.
if you mean test_store_*, those functions do not generate any load instructions. the full generated code is:
test_store_double:

entry:
  %0 = fptrunc double %foo to half
  store half %0, half addrspace(1)* %bar, align 2
  ret void
Anastasia accepted this revision.Sep 7 2017, 7:54 AM

Yes, sorry overlooked that. :) LGTM! Thanks!

This revision is now accepted and ready to land.Sep 7 2017, 7:54 AM
This revision was automatically updated to reflect the committed changes.