This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Support cl_ext_float_atomics
ClosedPublic

Authored by haonanya on Jul 19 2021, 11:01 PM.

Details

Summary

This aims to support cl_ext_float_atomics on CFE, see https://github.com/KhronosGroup/OpenCL-Docs/pull/552 for details.

Diff Detail

Event Timeline

haonanya created this revision.Jul 19 2021, 11:01 PM
haonanya requested review of this revision.Jul 19 2021, 11:01 PM

The extension spec seems to also mention atomic_half. Are planning to add it too?

clang/include/clang/Basic/OpenCLExtensions.def
85 ↗(On Diff #360018)

If the only purpose of adding this extension here is to define the macros you should just use the internal header for it. See guidelines: https://clang.llvm.org/docs/OpenCLSupport.html#implementation-guidelines

The same applies to all the feature macros below.

You could check how new subgroup extensions are implemented: https://clang.llvm.org/doxygen/opencl-c-base_8h_source.html

// For SPIR all extensions are supported.
#if defined(__SPIR__)
#define cl_khr_subgroup_extended_types 1
#define cl_khr_subgroup_non_uniform_vote 1
#define cl_khr_subgroup_ballot 1
#define cl_khr_subgroup_non_uniform_arithmetic 1
#define cl_khr_subgroup_shuffle 1
#define cl_khr_subgroup_shuffle_relative 1
#define cl_khr_subgroup_clustered_reduce 1
#endif // defined(__SPIR__)
clang/lib/Headers/opencl-c-base.h
40

From the spec it feels like those macros should be defined conditionally?

clang/lib/Headers/opencl-c.h
13419

should this not be conditioned on __opencl_c_ext_fp32_global_atomic? Otherwise, I am missing what is the intent of those macros...

FYI this revision has not been added to cfe-commits, is this intensional?

haonanya updated this revision to Diff 360712.Jul 22 2021, 12:13 AM

Add extension and all the feature macros to internal header.

FYI this revision has not been added to cfe-commits, is this intensional?

Hi, Anastasia. I am not very familiar with the process, could you please help to add to cfe-commits if possible? Thanks very much.

Anastasia added a subscriber: cfe-commits.

Just to make sure you are aware Clang doesn't use this header by default, so the upstream users won't be able to call those functions unless you add them into OpenCLBuiltins.td:
https://clang.llvm.org/docs/OpenCLSupport.html#opencl-builtins

This header is only accessible via the frontend options: https://clang.llvm.org/docs/OpenCLSupport.html#cmdoption-finclude-default-header

clang/lib/Headers/opencl-c.h
13657

Can you annotate the #endifs with a comment describing what they correspond to. i.e. something like:

#endif //defined(__opencl_c_ext_fp32_global_atomic_min_max)
svenvh added a subscriber: svenvh.Aug 3 2021, 7:33 AM
svenvh added inline comments.
clang/lib/Headers/opencl-c-base.h
24

Should this be defined as 1?

Should this define be tested in clang/test/Headers/opencl-c-header.cl too?

Anastasia added inline comments.Aug 5 2021, 4:34 AM
clang/lib/Headers/opencl-c-base.h
24

Actually, now that I think more about this, it seems incorrect to add this here without adding the functions to OpenCLBuiltins.td because then the feature macro will be present without the feature when the default header is used?

So I guess we either need to extend OpenCLBuiltins.td with new functions or move the new macros into opencl-c.h?

svenvh requested changes to this revision.Aug 5 2021, 7:01 AM
svenvh added inline comments.
clang/lib/Headers/opencl-c-base.h
24

Good catch! Indeed, adding the define in the shared opencl-c-base.h without also providing the builtins through OpenCLBuiltins.td is incorrect.

The preferred solution would be to add the new builtins to clang/lib/Sema/OpenCLBuiltins.td too, to avoid diverging the header and tablegen-driven code paths.

This revision now requires changes to proceed.Aug 5 2021, 7:01 AM
haonanya updated this revision to Diff 366829.Aug 17 2021, 1:36 AM

Add the new builtins to clang/lib/Sema/OpenCLBuiltins.td.

Hi, Anastasia and svenvh.
Sorry for late reply. I have updated patch per your comments.
Many thanks for your comments.

Thanks for the update! I have a few points to improve the patch.

clang/lib/Sema/OpenCLBuiltins.td
1117

Do we really need to guard these additions behind OpenCL 3.0? The spec mentions

The functionality added by this extension uses the OpenCL C 2.0 atomic syntax and hence requires OpenCL 2.0 or newer.

(same applies to the opencl-h.c changes of course)

1122

The feature macros seem to be missing. See FuncExtOpenCLCPipes for an example how to do that.

1142–1145

This can be merged into the preceeding foreach parts I think?

clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
137

As mentioned in the comment on lines 13-17, this test is not meant to be exhaustive. So you don't have to test every overload, checking one or two builtins should suffice.

haonanya updated this revision to Diff 367178.Aug 18 2021, 4:36 AM

Remove OpenCL3.0 macro guards on opencl-c.h and OpenCLBuiltins.td.
Add missing feature macro.
simplify test.

haonanya marked 8 inline comments as done.Aug 18 2021, 4:46 AM
haonanya marked an inline comment as done.
haonanya marked an inline comment as done.Aug 18 2021, 4:48 AM
svenvh added inline comments.Aug 18 2021, 6:22 AM
clang/lib/Sema/OpenCLBuiltins.td
1118

So now all of those builtins are guarded by cl_ext_float_atomics, which is good, but not by any of the __opencl_c_ext_... macros yet.

To guard by multiple macros, we'd need to do something like:

def FuncExtFloatAtomicsFp32GlobalMinMax  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max">;
def FuncExtFloatAtomicsFp32LocalMinMax   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max">;

And then use let Extension = FuncExtFloatAtomics... around the corresponding builtins. You shouldn't have to change the loop structure much for this, as you can hopefully use # concatenation to construct the appropriate FuncExt name (and then !cast it to a record).

However, I do see some problematic cases: the generic address space builtins are enabled by one of multiple feature macros, which is something that is currently not supported by the OpenCLBuiltins.td handling. If it's not too late, could we ask the extension spec editors to provide a dedicated feature macro for generic perhaps?

haonanya updated this revision to Diff 367980.Aug 22 2021, 3:01 AM

Let builtins are guarded by related macro

svenvh added inline comments.Aug 23 2021, 1:53 AM
clang/lib/Sema/OpenCLBuiltins.td
1120

Please try to follow the formatting used in the rest of this file:

def : Builtin<...

So a space after def, then no newline after the :.

This applies to all the new defs below too.

1121

The paste operator # is a binary operator, so it makes more sense to put a space on both sides.

1197

Wrong extension guard.

1293

Wrong extension guard.

haonanya updated this revision to Diff 368119.Aug 23 2021, 8:43 AM

Unify formatting and fix some errors on OpenCLBuiltins.td

haonanya marked 5 inline comments as done.Aug 23 2021, 8:51 AM

Thanks for the update! I have a comment about indentation, other than that this is looking good to me.

clang/lib/Sema/OpenCLBuiltins.td
1128

Please indent the content inside all let blocks.

haonanya updated this revision to Diff 368256.Aug 23 2021, 7:10 PM

Fix formatting issues

svenvh accepted this revision.Aug 24 2021, 1:01 AM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 24 2021, 1:01 AM

The extension spec seems to also mention atomic_half. Are planning to add it too?

Hi, Anastasia. I also have some work to translate these atomic builtins on SPIRV currently, so I'd like to have another patch to add atomic_half later.
Do you have any comments?
Thanks very much.

haonanya marked an inline comment as done.Aug 25 2021, 4:48 AM

Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.

#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

Hi, svenvh and Anastasia. If you approve the patch, could you please submit it?
I don't have permission to do it.

Hi, svenvh and Anastasia. If you approve the patch, could you please submit it?
I don't have permission to do it.

Sure, I can commit it on your behalf, atomic_half can be added separately. Thanks!

Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.

#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

Hi, svenvh and Anastasia. Do you have any comments for adding cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard atomic_double type? I'd appreciate it if you have time to answer it.
And if there is no any comment, please commit the patch.
Thanks very much.

Kindly ping

This revision was landed with ongoing or failed builds.Sep 13 2021, 4:13 AM
This revision was automatically updated to reflect the committed changes.

Apologies for the delayed response.

Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.

#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

This is perhaps something to raise at the specification level?

We can adjust the guards after any followup discussion if needed. To progress the support of this extension, I've just committed your patch (with some minor whitespace fixes).

Hi, svenvh.
I am ok with the patch. Thanks very much.
I'd appreciate it if you help review https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/1116 as well.