This is an archive of the discontinued LLVM Phabricator instance.

clang/include/clang/Basic/OpenCLExtensions.def
85 ↗	(On Diff #360018)	If the only purpose of adding this extension here is to define the macros you should just use the internal header for it. See guidelines: https://clang.llvm.org/docs/OpenCLSupport.html#implementation-guidelines
The same applies to all the feature macros below. You could check how new subgroup extensions are implemented: https://clang.llvm.org/doxygen/opencl-c-base_8h_source.html
// For SPIR all extensions are supported.
#if defined(__SPIR__)
#define cl_khr_subgroup_extended_types 1
#define cl_khr_subgroup_non_uniform_vote 1
#define cl_khr_subgroup_ballot 1
#define cl_khr_subgroup_non_uniform_arithmetic 1
#define cl_khr_subgroup_shuffle 1
#define cl_khr_subgroup_shuffle_relative 1
#define cl_khr_subgroup_clustered_reduce 1
#endif // defined(__SPIR__)
clang/lib/Headers/opencl-c-base.h
40	From the spec it feels like those macros should be defined conditionally?
clang/lib/Headers/opencl-c.h
13419	should this not be conditioned on `__opencl_c_ext_fp32_global_atomic`? Otherwise, I am missing what is the intent of those macros...
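A minimal sketch of what these first inline comments ask for, written as plain C so it compiles on a host: feature macros are defined only when their prerequisite is present, code is wrapped in the matching macro, and each `#endif` is annotated. `DEMO_TARGET_SPIR` is a hypothetical stand-in for `__SPIR__`, and the `has_*` and `check_demo_configuration` helpers are illustrative names only; this is not the patch's actual code.

```c
#include <assert.h>

/* Hypothetical stand-in for __SPIR__; a real build gets that macro from Clang. */
#define DEMO_TARGET_SPIR 1
/* cl_khr_fp64 is deliberately left undefined in this sketch. */

/* Note: names starting with "__" are reserved in ISO C; they are defined here
   only to mirror the macro names discussed in the review. */
#if defined(DEMO_TARGET_SPIR)
#define cl_ext_float_atomics 1
#if defined(cl_khr_fp64)
#define __opencl_c_ext_fp64_global_atomic_add 1
#endif /* defined(cl_khr_fp64) */
#define __opencl_c_ext_fp32_global_atomic_add 1
#endif /* defined(DEMO_TARGET_SPIR) */

/* Report whether the fp32 global add feature is visible to user code. */
int has_fp32_global_add(void) {
#if defined(cl_ext_float_atomics) && defined(__opencl_c_ext_fp32_global_atomic_add)
  return 1;
#else
  return 0;
#endif /* cl_ext_float_atomics && __opencl_c_ext_fp32_global_atomic_add */
}

/* Report whether the fp64 global add feature is visible to user code. */
int has_fp64_global_add(void) {
#if defined(cl_ext_float_atomics) && defined(__opencl_c_ext_fp64_global_atomic_add)
  return 1;
#else
  return 0;
#endif /* cl_ext_float_atomics && __opencl_c_ext_fp64_global_atomic_add */
}

/* With cl_khr_fp64 undefined, only the fp32 feature should be reported. */
void check_demo_configuration(void) {
  assert(has_fp32_global_add() == 1);
  assert(has_fp64_global_add() == 0);
}
```

Defining `cl_khr_fp64` before the first `#if` in this sketch would flip the fp64 helper to 1, which is the conditional behaviour the comment on `opencl-c-base.h` is asking about.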

Please consider uploading full diff: https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

FYI this revision has not been added to cfe-commits, is this intentional?

Add extension and all the feature macros to internal header.

Harbormaster completed remote builds in B115498: Diff 360712. Jul 22 2021, 12:14 AM

In D106343#2892960, @Anastasia wrote:

FYI this revision has not been added to cfe-commits, is this intentional?

Hi, Anastasia. I am not very familiar with the process, could you please help to add to cfe-commits if possible? Thanks very much.

Anastasia added a reviewer: Anastasia. Jul 23 2021, 1:46 AM

Anastasia added a subscriber: cfe-commits.

Just to make sure you are aware Clang doesn't use this header by default, so the upstream users won't be able to call those functions unless you add them into OpenCLBuiltins.td:
https://clang.llvm.org/docs/OpenCLSupport.html#opencl-builtins

This header is only accessible via the frontend options: https://clang.llvm.org/docs/OpenCLSupport.html#cmdoption-finclude-default-header

clang/lib/Headers/opencl-c.h
13657	Can you annotate the `#endif`s with a comment describing what they correspond to? I.e., something like:
#endif //defined(__opencl_c_ext_fp32_global_atomic_min_max)

svenvh added a subscriber: svenvh. Aug 3 2021, 7:33 AM

svenvh added inline comments.

clang/lib/Headers/opencl-c-base.h
24	Should this be defined as `1`? Should this define be tested in `clang/test/Headers/opencl-c-header.cl` too?

Anastasia added inline comments. Aug 5 2021, 4:34 AM

clang/lib/Headers/opencl-c-base.h
24	Actually, now that I think more about this, it seems incorrect to add this here without adding the functions to `OpenCLBuiltins.td` because then the feature macro will be present without the feature when the default header is used? So I guess we either need to extend `OpenCLBuiltins.td` with new functions or move the new macros into `opencl-c.h`?

svenvh requested changes to this revision. Aug 5 2021, 7:01 AM

svenvh added inline comments.

clang/lib/Headers/opencl-c-base.h
24	Good catch! Indeed, adding the define in the shared `opencl-c-base.h` without also providing the builtins through `OpenCLBuiltins.td` is incorrect. The preferred solution would be to add the new builtins to `clang/lib/Sema/OpenCLBuiltins.td` too, to avoid diverging the header and tablegen-driven code paths.

This revision now requires changes to proceed. Aug 5 2021, 7:01 AM

Add the new builtins to clang/lib/Sema/OpenCLBuiltins.td.

Hi, Anastasia and svenvh.
Sorry for the late reply. I have updated the patch per your comments.
Many thanks for your comments.

Harbormaster completed remote builds in B119852: Diff 366829. Aug 17 2021, 2:05 AM

Thanks for the update! I have a few points to improve the patch.

clang/lib/Sema/OpenCLBuiltins.td
1117	Do we really need to guard these additions behind OpenCL 3.0? The spec mentions: "The functionality added by this extension uses the OpenCL C 2.0 atomic syntax and hence requires OpenCL 2.0 or newer." (The same applies to the opencl-c.h changes, of course.)
1122	The feature macros seem to be missing. See `FuncExtOpenCLCPipes` for an example how to do that.
1142–1145	This can be merged into the preceding `foreach` parts I think?
clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
137	As mentioned in the comment on lines 13-17, this test is not meant to be exhaustive. So you don't have to test every overload, checking one or two builtins should suffice.

Remove OpenCL 3.0 macro guards on opencl-c.h and OpenCLBuiltins.td.
Add missing feature macro.
Simplify test.

haonanya marked 8 inline comments as done. Aug 18 2021, 4:46 AM

haonanya marked an inline comment as done.

haonanya marked an inline comment as done. Aug 18 2021, 4:48 AM

Harbormaster completed remote builds in B120102: Diff 367178. Aug 18 2021, 5:26 AM

svenvh added inline comments. Aug 18 2021, 6:22 AM

clang/lib/Sema/OpenCLBuiltins.td
1118	So now all of those builtins are guarded by `cl_ext_float_atomics`, which is good, but not by any of the `__opencl_c_ext_...` macros yet. To guard by multiple macros, we'd need to do something like:
def FuncExtFloatAtomicsFp32GlobalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max">;
def FuncExtFloatAtomicsFp32LocalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max">;
And then use `let Extension = FuncExtFloatAtomics...` around the corresponding builtins. You shouldn't have to change the loop structure much for this, as you can hopefully use `#` concatenation to construct the appropriate FuncExt name (and then `!cast` it to a record).
However, I do see some problematic cases: the generic address space builtins are enabled by one of multiple feature macros, which is something that is currently not supported by the OpenCLBuiltins.td handling. If it's not too late, could we ask the extension spec editors to provide a dedicated feature macro for generic perhaps?
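The space-separated string passed to `FunctionExtension` reads as a list of macros that must all be defined, which is why an "either of these macros" condition for the generic address space overloads cannot be expressed this way. A rough C sketch of such an all-of check follows. `is_defined`, `all_macros_defined`, and the array of defined macro names are hypothetical and only model the idea; this is not Clang's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the name [name, name + len) appears in the defined[] array. */
static int is_defined(const char *name, size_t len,
                      const char *const *defined, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (strlen(defined[i]) == len && strncmp(defined[i], name, len) == 0)
      return 1;
  }
  return 0;
}

/* Return 1 only if every space-separated macro in ext_list is defined.
   An empty list imposes no requirement and therefore yields 1. */
int all_macros_defined(const char *ext_list, const char *const *defined,
                       size_t count) {
  assert(ext_list != NULL);
  const char *p = ext_list;
  while (*p != '\0') {
    while (*p == ' ')
      ++p; /* skip separators */
    if (*p == '\0')
      break;
    size_t len = strcspn(p, " "); /* length of the current macro name */
    if (!is_defined(p, len, defined, count))
      return 0; /* one missing macro disables the builtin */
    p += len;
  }
  return 1;
}
```

Under this model a builtin guarded by `"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max"` is visible only when both names are defined; there is no way to say "global or local" in the same string.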

Guard the builtins with the related feature macros.

Harbormaster completed remote builds in B120692: Diff 367980. Aug 22 2021, 3:50 AM

svenvh added inline comments. Aug 23 2021, 1:53 AM

clang/lib/Sema/OpenCLBuiltins.td
1120	Please try to follow the formatting used in the rest of this file:
def : Builtin<...
So a space after `def`, then no newline after the `:`. This applies to all the new `def`s below too.
1121	The paste operator `#` is a binary operator, so it makes more sense to put a space on both sides.
1197	Wrong extension guard.
1293	Wrong extension guard.

Unify formatting and fix some errors in OpenCLBuiltins.td.

haonanya marked 5 inline comments as done. Aug 23 2021, 8:51 AM

Harbormaster completed remote builds in B120802: Diff 368119. Aug 23 2021, 9:12 AM

Thanks for the update! I have a comment about indentation, other than that this is looking good to me.

clang/lib/Sema/OpenCLBuiltins.td
1128	Please indent the content inside all `let` blocks.

Fix formatting issues

Harbormaster completed remote builds in B120899: Diff 368256. Aug 23 2021, 7:43 PM

LGTM, thanks!

This revision is now accepted and ready to land. Aug 24 2021, 1:01 AM

In D106343#2892946, @Anastasia wrote:

The extension spec seems to also mention atomic_half. Are you planning to add it too?

Hi, Anastasia. I am also currently working on translating these atomic builtins to SPIR-V, so I'd like to add atomic_half in a separate patch later.
Do you have any comments?
Thanks very much.

haonanya marked an inline comment as done. Aug 25 2021, 4:48 AM

Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.

#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

Hi, svenvh and Anastasia. If you approve the patch, could you please submit it?
I don't have permission to do it.

In D106343#2974055, @haonanya wrote:

Hi, svenvh and Anastasia. If you approve the patch, could you please submit it?
I don't have permission to do it.

Sure, I can commit it on your behalf, atomic_half can be added separately. Thanks!

In D106343#2967089, @haonanya wrote:
Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.
#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

Hi, svenvh and Anastasia. Do you have any comments on using cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions that use the atomic_double type? I'd appreciate it if you have time to answer.
If there are no further comments, please commit the patch.
Thanks very much.

Kindly ping

This revision was landed with ongoing or failed builds. Sep 13 2021, 4:13 AM

Closed by commit rGd353d1c50112: [OpenCL] Support cl_ext_float_atomics (authored by svenvh).

This revision was automatically updated to reflect the committed changes.

svenvh added a commit: rGd353d1c50112: [OpenCL] Support cl_ext_float_atomics.

Herald added a subscriber: ldrumm. Sep 13 2021, 4:13 AM

Apologies for the delayed response.

In D106343#2967089, @haonanya wrote:
Hi, svenvh.
Should we use cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics to guard the functions using atomic_double type?
Thanks very much.
#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
double __ovld atomic_fetch_min(volatile __local atomic_double *object,
                               double operand);
#endif

This is perhaps something to raise at the specification level?

We can adjust the guards after any followup discussion if needed. To progress the support of this extension, I've just committed your patch (with some minor whitespace fixes).

Hi, svenvh.
I am ok with the patch. Thanks very much.
I'd appreciate it if you help review https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/1116 as well.

Revision Contents

Path	Size
clang/lib/Headers/opencl-c-base.h	19 lines
clang/lib/Headers/opencl-c.h	209 lines
clang/lib/Sema/OpenCLBuiltins.td	116 lines
clang/test/Headers/opencl-c-header.cl	90 lines
clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl	21 lines

Diff 372212

clang/lib/Headers/opencl-c-base.h

	#if defined(__SPIR__)
	#define cl_khr_subgroup_extended_types 1
	#define cl_khr_subgroup_non_uniform_vote 1
	#define cl_khr_subgroup_ballot 1
	#define cl_khr_subgroup_non_uniform_arithmetic 1
	#define cl_khr_subgroup_shuffle 1
	#define cl_khr_subgroup_shuffle_relative 1
	#define cl_khr_subgroup_clustered_reduce 1
	#define cl_khr_extended_bit_ops 1
	#define cl_khr_integer_dot_product 1
	#define __opencl_c_integer_dot_product_input_4x8bit 1
	#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
				#define cl_ext_float_atomics 1
				#ifdef cl_khr_fp16
				#define __opencl_c_ext_fp16_global_atomic_load_store 1
				#define __opencl_c_ext_fp16_local_atomic_load_store 1
				#define __opencl_c_ext_fp16_global_atomic_add 1
				#define __opencl_c_ext_fp16_local_atomic_add 1
				#define __opencl_c_ext_fp16_global_atomic_min_max 1
				#define __opencl_c_ext_fp16_local_atomic_min_max 1
				#endif
				#ifdef cl_khr_fp64
				#define __opencl_c_ext_fp64_global_atomic_add 1
				#define __opencl_c_ext_fp64_local_atomic_add 1
				#define __opencl_c_ext_fp64_global_atomic_min_max 1
				#define __opencl_c_ext_fp64_local_atomic_min_max 1
				#endif
				#define __opencl_c_ext_fp32_global_atomic_add 1
				#define __opencl_c_ext_fp32_local_atomic_add 1
				#define __opencl_c_ext_fp32_global_atomic_min_max 1
				#define __opencl_c_ext_fp32_local_atomic_min_max 1

	#endif // defined(__SPIR__)
	#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)

	// Define feature macros for OpenCL C 2.0
	#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
	#define __opencl_c_pipes 1
	#define __opencl_c_generic_address_space 1

clang/lib/Headers/opencl-c.h

	uint __ovld atomic_fetch_or_explicit(volatile atomic_uint *object, uint operand, memory_order order);
	int __ovld atomic_fetch_xor_explicit(volatile atomic_int *object, int operand, memory_order order);
	uint __ovld atomic_fetch_xor_explicit(volatile atomic_uint *object, uint operand, memory_order order);
	int __ovld atomic_fetch_and_explicit(volatile atomic_int *object, int operand, memory_order order);
	uint __ovld atomic_fetch_and_explicit(volatile atomic_uint *object, uint operand, memory_order order);
	int __ovld atomic_fetch_min_explicit(volatile atomic_int *object, int operand, memory_order order);
	uint __ovld atomic_fetch_min_explicit(volatile atomic_uint *object, uint operand, memory_order order);
	int __ovld atomic_fetch_max_explicit(volatile atomic_int *object, int operand, memory_order order);
	uint __ovld atomic_fetch_max_explicit(volatile atomic_uint *object, uint operand, memory_order order);
	#if defined(cl_khr_int64_base_atomics) && defined(cl_khr_int64_extended_atomics)
	long __ovld atomic_fetch_add_explicit(volatile atomic_long *object, long operand, memory_order order);
	ulong __ovld atomic_fetch_add_explicit(volatile atomic_ulong *object, ulong operand, memory_order order);
	long __ovld atomic_fetch_sub_explicit(volatile atomic_long *object, long operand, memory_order order);
	ulong __ovld atomic_fetch_sub_explicit(volatile atomic_ulong *object, ulong operand, memory_order order);
	long __ovld atomic_fetch_or_explicit(volatile atomic_long *object, long operand, memory_order order);
	ulong __ovld atomic_fetch_or_explicit(volatile atomic_ulong *object, ulong operand, memory_order order);
	long __ovld atomic_fetch_xor_explicit(volatile atomic_long *object, long operand, memory_order order);
	[...]
	long __ovld atomic_fetch_max_explicit(volatile __local atomic_long *object, long operand, memory_order order, memory_scope scope);
	ulong __ovld atomic_fetch_max_explicit(volatile __global atomic_ulong *object, ulong operand, memory_order order, memory_scope scope);
	ulong __ovld atomic_fetch_max_explicit(volatile __local atomic_ulong *object, ulong operand, memory_order order, memory_scope scope);
	uintptr_t __ovld atomic_fetch_add_explicit(volatile __global atomic_uintptr_t *object, ptrdiff_t operand, memory_order order, memory_scope scope);
	uintptr_t __ovld atomic_fetch_sub_explicit(volatile __local atomic_uintptr_t *object, ptrdiff_t operand, memory_order order, memory_scope scope);
	#endif //defined(cl_khr_int64_base_atomics) && defined(cl_khr_int64_extended_atomics)
	#endif //__OPENCL_C_VERSION__ >= CL_VERSION_3_0

				// The functionality added by cl_ext_float_atomics extension
				#if defined(cl_ext_float_atomics)

				#if defined(__opencl_c_ext_fp32_global_atomic_min_max)
				float __ovld atomic_fetch_min(volatile __global atomic_float *object,
				float operand);
				float __ovld atomic_fetch_max(volatile __global atomic_float *object,
				float operand);
				float __ovld atomic_fetch_min_explicit(volatile __global atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_max_explicit(volatile __global atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_min_explicit(volatile __global atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_max_explicit(volatile __global atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_global_atomic_min_max)

				#if defined(__opencl_c_ext_fp32_local_atomic_min_max)
				float __ovld atomic_fetch_min(volatile __local atomic_float *object,
				float operand);
				float __ovld atomic_fetch_max(volatile __local atomic_float *object,
				float operand);
				float __ovld atomic_fetch_min_explicit(volatile __local atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_max_explicit(volatile __local atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_min_explicit(volatile __local atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_max_explicit(volatile __local atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_local_atomic_min_max)

				#if defined(__opencl_c_ext_fp32_global_atomic_min_max) && \
				defined(__opencl_c_ext_fp32_local_atomic_min_max)
				float __ovld atomic_fetch_min(volatile atomic_float *object, float operand);
				float __ovld atomic_fetch_max(volatile atomic_float *object, float operand);
				float __ovld atomic_fetch_min_explicit(volatile atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_max_explicit(volatile atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_min_explicit(volatile atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_max_explicit(volatile atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_global_atomic_min_max) && \
				defined(__opencl_c_ext_fp32_local_atomic_min_max)

				#if defined(__opencl_c_ext_fp64_global_atomic_min_max)
				double __ovld atomic_fetch_min(volatile __global atomic_double *object,
				double operand);
				double __ovld atomic_fetch_max(volatile __global atomic_double *object,
				double operand);
				double __ovld atomic_fetch_min_explicit(volatile __global atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_max_explicit(volatile __global atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_min_explicit(volatile __global atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_max_explicit(volatile __global atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_global_atomic_min_max)

				#if defined(__opencl_c_ext_fp64_local_atomic_min_max)
				double __ovld atomic_fetch_min(volatile __local atomic_double *object,
				double operand);
				double __ovld atomic_fetch_max(volatile __local atomic_double *object,
				double operand);
				double __ovld atomic_fetch_min_explicit(volatile __local atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_max_explicit(volatile __local atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_min_explicit(volatile __local atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_max_explicit(volatile __local atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_local_atomic_min_max)

				#if defined(__opencl_c_ext_fp64_global_atomic_min_max) && \
				defined(__opencl_c_ext_fp64_local_atomic_min_max)
				double __ovld atomic_fetch_min(volatile atomic_double *object, double operand);
				double __ovld atomic_fetch_max(volatile atomic_double *object, double operand);
				double __ovld atomic_fetch_min_explicit(volatile atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_max_explicit(volatile atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_min_explicit(volatile atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_max_explicit(volatile atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_global_atomic_min_max) && \
				defined(__opencl_c_ext_fp64_local_atomic_min_max)

				#if defined(__opencl_c_ext_fp32_global_atomic_add)
				float __ovld atomic_fetch_add(volatile __global atomic_float *object,
				float operand);
				float __ovld atomic_fetch_sub(volatile __global atomic_float *object,
				float operand);
				float __ovld atomic_fetch_add_explicit(volatile __global atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_sub_explicit(volatile __global atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_add_explicit(volatile __global atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_sub_explicit(volatile __global atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_global_atomic_add)

				#if defined(__opencl_c_ext_fp32_local_atomic_add)
				float __ovld atomic_fetch_add(volatile __local atomic_float *object,
				float operand);
				float __ovld atomic_fetch_sub(volatile __local atomic_float *object,
				float operand);
				float __ovld atomic_fetch_add_explicit(volatile __local atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_sub_explicit(volatile __local atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_add_explicit(volatile __local atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_sub_explicit(volatile __local atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_local_atomic_add)

				#if defined(__opencl_c_ext_fp32_global_atomic_add) && \
				defined(__opencl_c_ext_fp32_local_atomic_add)
				float __ovld atomic_fetch_add(volatile atomic_float *object, float operand);
				float __ovld atomic_fetch_sub(volatile atomic_float *object, float operand);
				float __ovld atomic_fetch_add_explicit(volatile atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_sub_explicit(volatile atomic_float *object,
				float operand, memory_order order);
				float __ovld atomic_fetch_add_explicit(volatile atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				float __ovld atomic_fetch_sub_explicit(volatile atomic_float *object,
				float operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp32_global_atomic_add) && \
				defined(__opencl_c_ext_fp32_local_atomic_add)

				#if defined(__opencl_c_ext_fp64_global_atomic_add)
				double __ovld atomic_fetch_add(volatile __global atomic_double *object,
				double operand);
				double __ovld atomic_fetch_sub(volatile __global atomic_double *object,
				double operand);
				double __ovld atomic_fetch_add_explicit(volatile __global atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_sub_explicit(volatile __global atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_add_explicit(volatile __global atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_sub_explicit(volatile __global atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_global_atomic_add)

				#if defined(__opencl_c_ext_fp64_local_atomic_add)
				double __ovld atomic_fetch_add(volatile __local atomic_double *object,
				double operand);
				double __ovld atomic_fetch_sub(volatile __local atomic_double *object,
				double operand);
				double __ovld atomic_fetch_add_explicit(volatile __local atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_sub_explicit(volatile __local atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_add_explicit(volatile __local atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_sub_explicit(volatile __local atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_local_atomic_add)

				#if defined(__opencl_c_ext_fp64_global_atomic_add) && \
				defined(__opencl_c_ext_fp64_local_atomic_add)
				double __ovld atomic_fetch_add(volatile atomic_double *object, double operand);
				double __ovld atomic_fetch_sub(volatile atomic_double *object, double operand);
				double __ovld atomic_fetch_add_explicit(volatile atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_sub_explicit(volatile atomic_double *object,
				double operand, memory_order order);
				double __ovld atomic_fetch_add_explicit(volatile atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				double __ovld atomic_fetch_sub_explicit(volatile atomic_double *object,
				double operand, memory_order order,
				memory_scope scope);
				#endif // defined(__opencl_c_ext_fp64_global_atomic_add) && \
				defined(__opencl_c_ext_fp64_local_atomic_add)

				#endif // cl_ext_float_atomics

	// atomic_store()

	#if defined(__opencl_c_atomic_order_seq_cst) && defined(__opencl_c_atomic_scope_device)
	#if defined(__opencl_c_generic_address_space)
	void __ovld atomic_store(volatile atomic_int *object, int desired);
	void __ovld atomic_store(volatile atomic_uint *object, uint desired);
	void __ovld atomic_store(volatile atomic_float *object, float desired);
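For readers unfamiliar with the operation, the float `atomic_fetch_min` overloads declared in the opencl-c.h hunk above can be modeled on a host with C11 atomics. The sketch below assumes the operation stores the smaller of the current value and the operand and returns the previous value. `demo_atomic_fetch_min_float` is a hypothetical name, the compare-exchange loop is a host-side emulation rather than how any OpenCL device implements the builtin, and NaN handling, address spaces, and memory scopes are not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Host-side model of a float fetch-min: store min(old, operand), return old.
   Uses the default sequentially consistent ordering of the C11 generic
   atomic functions. */
float demo_atomic_fetch_min_float(_Atomic float *object, float operand) {
  assert(object != NULL);
  float old = atomic_load(object);
  /* On failure the compare-exchange refreshes `old` with the stored value,
     so the minimum is recomputed on every retry. */
  while (!atomic_compare_exchange_weak(object, &old,
                                       operand < old ? operand : old)) {
    /* retry with the freshly observed value */
  }
  return old;
}
```

Calling the helper on a value of 3.0f with operand 1.5f returns 3.0f and leaves 1.5f stored; a later call with operand 2.0f returns 1.5f and leaves the stored value unchanged.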

clang/lib/Sema/OpenCLBuiltins.td

(79 lines hidden)

def FuncExtKhrInt64BaseAtomics : FunctionExtension<"cl_khr_int64_base_atomics">;

def FuncExtKhrInt64ExtendedAtomics : FunctionExtension<"cl_khr_int64_extended_atomics">;

def FuncExtKhrMipmapImage : FunctionExtension<"cl_khr_mipmap_image">;

def FuncExtKhrMipmapImageWrites : FunctionExtension<"cl_khr_mipmap_image_writes">;

def FuncExtKhrGlMsaaSharing : FunctionExtension<"cl_khr_gl_msaa_sharing">;

def FuncExtOpenCLCPipes : FunctionExtension<"__opencl_c_pipes">;

def FuncExtOpenCLCWGCollectiveFunctions : FunctionExtension<"__opencl_c_work_group_collective_functions">;

def FuncExtFloatAtomicsFp32GlobalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_add">;

def FuncExtFloatAtomicsFp64GlobalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_global_atomic_add">;

def FuncExtFloatAtomicsFp32LocalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_add">;

def FuncExtFloatAtomicsFp64LocalAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_add">;

def FuncExtFloatAtomicsFp32GenericAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_add __opencl_c_ext_fp32_global_atomic_add">;

def FuncExtFloatAtomicsFp64GenericAdd : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_add __opencl_c_ext_fp64_global_atomic_add">;

def FuncExtFloatAtomicsFp32GlobalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max">;

def FuncExtFloatAtomicsFp64GlobalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_global_atomic_min_max">;

def FuncExtFloatAtomicsFp32LocalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max">;

def FuncExtFloatAtomicsFp64LocalMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_min_max">;

def FuncExtFloatAtomicsFp32GenericMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max __opencl_c_ext_fp32_global_atomic_min_max">;

def FuncExtFloatAtomicsFp64GenericMinMax : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp64_local_atomic_min_max __opencl_c_ext_fp64_global_atomic_min_max">;

// Not a real extension, but a workaround to add C++ for OpenCL specific builtins.

def FuncExtOpenCLCxx : FunctionExtension<"__cplusplus">;

// Multiple extensions

def FuncExtKhrMipmapWritesAndWrite3d : FunctionExtension<"cl_khr_mipmap_image_writes cl_khr_3d_image_writes">;

// Arm extensions.

(1,000 lines hidden; context: let MinVersion = CL20 in {)

def : Builtin<"atomic_flag_test_and_set",

[Bool, PointerType<VolatileType<AtomicFlag>, GenericAS>]>;

def : Builtin<"atomic_flag_test_and_set_explicit",

[Bool, PointerType<VolatileType<AtomicFlag>, GenericAS>, MemoryOrder]>;

def : Builtin<"atomic_flag_test_and_set_explicit",

[Bool, PointerType<VolatileType<AtomicFlag>, GenericAS>, MemoryOrder, MemoryScope]>;

}

// The functionality added by cl_ext_float_atomics extension

let MinVersion = CL20 in {

svenvh (Done):

Do we really need to guard these additions behind OpenCL 3.0? The spec says:

> The functionality added by this extension uses the OpenCL C 2.0 atomic syntax and hence requires OpenCL 2.0 or newer.

(The same applies to the opencl-c.h changes, of course.)
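A minimal sketch of the difference between the two guards, written in plain C rather than TableGen or OpenCL C. The function names are hypothetical, `version` stands in for `__OPENCL_C_VERSION__` (200 for OpenCL C 2.0, 300 for 3.0), and the exact form of the original 3.0 guard is an assumption:

```c
#include <stdbool.h>

/* Hypothetical model of a guard that only admits OpenCL C 3.0 or newer.
   Assumption: the guard questioned in the comment behaves like this. */
static bool guard_cl30_or_newer(int version) { return version >= 300; }

/* Hypothetical model of the guard the spec wording suggests:
   the builtins are available from OpenCL C 2.0 onwards. */
static bool guard_cl20_or_newer(int version) { return version >= 200; }
```

For `version == 200` only the second guard admits the builtins, which is the behavior the quoted spec sentence asks for.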

foreach ModOp = ["add", "sub"] in {

svenvh (Done):

So now all of those builtins are guarded by `cl_ext_float_atomics`, which is good, but not by any of the `__opencl_c_ext_...` macros yet.

To guard by multiple macros, we'd need to do something like:

def FuncExtFloatAtomicsFp32GlobalMinMax  : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_global_atomic_min_max">;
def FuncExtFloatAtomicsFp32LocalMinMax   : FunctionExtension<"cl_ext_float_atomics __opencl_c_ext_fp32_local_atomic_min_max">;

And then use `let Extension = FuncExtFloatAtomics...` around the corresponding builtins. You shouldn't have to change the loop structure much for this, as you can hopefully use `#` concatenation to construct the appropriate `FuncExt` name (and then `!cast` it to a record).

However, I do see some problematic cases: the generic address space builtins are enabled by one of multiple feature macros, which is something that the `OpenCLBuiltins.td` handling does not currently support. If it's not too late, could we perhaps ask the extension spec editors to provide a dedicated feature macro for generic?
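The "guard by multiple macros" idea can be modeled in plain C. This is only a sketch, not Clang's implementation: it assumes, as the comment implies, that a space-separated `FunctionExtension` string means every listed macro must be defined. The names `is_defined` and `all_extensions_defined` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the name of length `len` starting at `name` is one of
   the `n` macro names in `defined`. */
static bool is_defined(const char *name, size_t len,
                       const char *const *defined, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    if (strlen(defined[i]) == len && strncmp(defined[i], name, len) == 0)
      return true;
  }
  return false;
}

/* Hypothetical model of an all-of guard: `ext_list` is a space-separated
   list of macro names, and the builtin is visible only if every one of
   them is defined. An empty list means the builtin is unguarded. */
static bool all_extensions_defined(const char *ext_list,
                                   const char *const *defined, size_t n) {
  const char *p = ext_list;
  while (*p != '\0') {
    while (*p == ' ')
      ++p;                          /* skip separators */
    if (*p == '\0')
      break;
    size_t len = strcspn(p, " ");   /* length of the next macro name */
    if (!is_defined(p, len, defined, n))
      return false;
    p += len;
  }
  return true;
}
```

Under this model a list can only express "all of these macros". A builtin that should be enabled by any one of several macros has no representation, which is the limitation raised for the generic address space overloads.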

let Extension = FuncExtFloatAtomicsFp32GlobalAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

svenvh (Done):

Please try to follow the formatting used in the rest of this file:

def : Builtin<...

So a space after `def`, then no newline after the `:`.

This applies to all the new `def`s below too.

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float]>;

svenvh (Done):

def:
- Builtin<"atomic_fetch_" #ModOp,
+ Builtin<"atomic_fetch_" # ModOp,
[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float]>;

The paste operator `#` is a binary operator, so it makes more sense to put a space on both sides.

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

svenvh (Done):

The feature macros seem to be missing. See `FuncExtOpenCLCPipes` for an example of how to do that.

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64GlobalAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

svenvh (Done):

Please indent the content inside all `let` blocks.

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp32LocalAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64LocalAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double]>;

svenvh (Done):

This can be merged into the preceding `foreach` parts, I think?

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp32GenericAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64GenericAdd in {

def : Builtin<"atomic_fetch_" # ModOp,

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double, MemoryOrder, MemoryScope]>;

}

foreach ModOp = ["min", "max"] in {

let Extension = FuncExtFloatAtomicsFp32GlobalMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GlobalAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64GlobalMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GlobalAS>, Double, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp32LocalMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, LocalAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64LocalMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double, MemoryOrder]>;

svenvh (Done):

MemoryOrder, MemoryScope
]>;
}
- let Extension = FuncExtFloatAtomicsFp64GlobalAdd in {
+ let Extension = FuncExtFloatAtomicsFp64GenericAdd in {
def:

Wrong extension guard.

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, LocalAS>, Double, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp32GenericMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Float, PointerType<VolatileType<AtomicFloat>, GenericAS>, Float, MemoryOrder, MemoryScope]>;

}

let Extension = FuncExtFloatAtomicsFp64GenericMinMax in {

def : Builtin<"atomic_fetch_" # ModOp,

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double, MemoryOrder]>;

def : Builtin<"atomic_fetch_" # ModOp # "_explicit",

[Double, PointerType<VolatileType<AtomicDouble>, GenericAS>, Double, MemoryOrder, MemoryScope]>;

}

//--------------------------------------------------------------------

// OpenCL v1.1 s6.11.12, v1.2 s6.12.12, v2.0 s6.13.12 - Miscellaneous Vector Functions

// --- Table 19 ---

foreach VSize1 = [2, 4, 8, 16] in {

foreach VSize2 = [2, 4, 8, 16] in {

foreach VecAndMaskType = [[Char, UChar], [UChar, UChar],

[Short, UShort], [UShort, UShort],

[Int, UInt], [UInt, UInt],

(57 lines hidden; context: let MinVersion = CL12 in {)

foreach aQual = ["RO", "RW"] in {

foreach imgTy = [Image2d, Image1dArray] in {

def : Builtin<"read_imagef", [VectorType<Float, 4>, ImageType<imgTy, aQual>, VectorType<Int, 2>], Attr.Pure>;

def : Builtin<"read_imagei", [VectorType<Int, 4>, ImageType<imgTy, aQual>, VectorType<Int, 2>], Attr.Pure>;

def : Builtin<"read_imageui", [VectorType<UInt, 4>, ImageType<imgTy, aQual>, VectorType<Int, 2>], Attr.Pure>;

}

foreach imgTy = [Image3d, Image2dArray] in {

def : Builtin<"read_imagef", [VectorType<Float, 4>, ImageType<imgTy, aQual>, VectorType<Int, 4>], Attr.Pure>;

def : Builtin<"read_imagei", [VectorType<Int, 4>, ImageType<imgTy, aQual>, VectorType<Int, 4>], Attr.Pure>;

svenvh (Done):

MemoryOrder, MemoryScope
]>;
}
- let Extension = FuncExtFloatAtomicsFp64GlobalMinMax in {
+ let Extension = FuncExtFloatAtomicsFp64GenericMinMax in {
def:

Wrong extension guard.

def : Builtin<"read_imageui", [VectorType<UInt, 4>, ImageType<imgTy, aQual>, VectorType<Int, 4>], Attr.Pure>;

}

foreach imgTy = [Image1d, Image1dBuffer] in {

def : Builtin<"read_imagef", [VectorType<Float, 4>, ImageType<imgTy, aQual>, Int], Attr.Pure>;

def : Builtin<"read_imagei", [VectorType<Int, 4>, ImageType<imgTy, aQual>, Int], Attr.Pure>;

def : Builtin<"read_imageui", [VectorType<UInt, 4>, ImageType<imgTy, aQual>, Int], Attr.Pure>;

}

def : Builtin<"read_imagef", [Float, ImageType<Image2dDepth, aQual>, VectorType<Int, 2>], Attr.Pure>;

(543 lines hidden)

clang/test/Headers/opencl-c-header.cl

	(129 lines hidden)
	#error "Incorrectly defined cl_khr_integer_dot_product"
	#endif
	#if __opencl_c_integer_dot_product_input_4x8bit != 1
	#error "Incorrectly defined __opencl_c_integer_dot_product_input_4x8bit"
	#endif
	#if __opencl_c_integer_dot_product_input_4x8bit_packed != 1
	#error "Incorrectly defined __opencl_c_integer_dot_product_input_4x8bit_packed"
	#endif
				#if cl_ext_float_atomics != 1
				#error "Incorrectly defined cl_ext_float_atomics"
				#endif
				#if __opencl_c_ext_fp16_global_atomic_load_store != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_global_atomic_load_store"
				#endif
				#if __opencl_c_ext_fp16_local_atomic_load_store != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_local_atomic_load_store"
				#endif
				#if __opencl_c_ext_fp16_global_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_global_atomic_add"
				#endif
				#if __opencl_c_ext_fp32_global_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp32_global_atomic_add"
				#endif
				#if __opencl_c_ext_fp64_global_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp64_global_atomic_add"
				#endif
				#if __opencl_c_ext_fp16_local_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_local_atomic_add"
				#endif
				#if __opencl_c_ext_fp32_local_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp32_local_atomic_add"
				#endif
				#if __opencl_c_ext_fp64_local_atomic_add != 1
				#error "Incorrectly defined __opencl_c_ext_fp64_local_atomic_add"
				#endif
				#if __opencl_c_ext_fp16_global_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_global_atomic_min_max"
				#endif
				#if __opencl_c_ext_fp32_global_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp32_global_atomic_min_max"
				#endif
				#if __opencl_c_ext_fp64_global_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp64_global_atomic_min_max"
				#endif
				#if __opencl_c_ext_fp16_local_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp16_local_atomic_min_max"
				#endif
				#if __opencl_c_ext_fp32_local_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp32_local_atomic_min_max"
				#endif
				#if __opencl_c_ext_fp64_local_atomic_min_max != 1
				#error "Incorrectly defined __opencl_c_ext_fp64_local_atomic_min_max"
				#endif

	#else

	#ifdef cl_khr_subgroup_extended_types
	#error "Incorrect cl_khr_subgroup_extended_types define"
	#endif
	#ifdef cl_khr_subgroup_non_uniform_vote
	#error "Incorrect cl_khr_subgroup_non_uniform_vote define"
	(20 lines hidden)
	#error "Incorrect cl_khr_integer_dot_product define"
	#endif
	#ifdef __opencl_c_integer_dot_product_input_4x8bit
	#error "Incorrect __opencl_c_integer_dot_product_input_4x8bit define"
	#endif
	#ifdef __opencl_c_integer_dot_product_input_4x8bit_packed
	#error "Incorrect __opencl_c_integer_dot_product_input_4x8bit_packed define"
	#endif
				#ifdef cl_ext_float_atomics
				#error "Incorrect cl_ext_float_atomics define"
				#endif
				#ifdef __opencl_c_ext_fp16_global_atomic_load_store
				#error "Incorrectly __opencl_c_ext_fp16_global_atomic_load_store defined"
				#endif
				#ifdef __opencl_c_ext_fp16_local_atomic_load_store
				#error "Incorrectly __opencl_c_ext_fp16_local_atomic_load_store defined"
				#endif
				#ifdef __opencl_c_ext_fp16_global_atomic_add
				#error "Incorrectly __opencl_c_ext_fp16_global_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp32_global_atomic_add
				#error "Incorrectly __opencl_c_ext_fp32_global_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp64_global_atomic_add
				#error "Incorrectly __opencl_c_ext_fp64_global_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp16_local_atomic_add
				#error "Incorrectly __opencl_c_ext_fp16_local_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp32_local_atomic_add
				#error "Incorrectly __opencl_c_ext_fp32_local_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp64_local_atomic_add
				#error "Incorrectly __opencl_c_ext_fp64_local_atomic_add defined"
				#endif
				#ifdef __opencl_c_ext_fp16_global_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp16_global_atomic_min_max defined"
				#endif
				#ifdef __opencl_c_ext_fp32_global_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp32_global_atomic_min_max defined"
				#endif
				#ifdef __opencl_c_ext_fp64_global_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp64_global_atomic_min_max defined"
				#endif
				#ifdef __opencl_c_ext_fp16_local_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp16_local_atomic_min_max defined"
				#endif
				#ifdef __opencl_c_ext_fp32_local_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp32_local_atomic_min_max defined"
				#endif
				#ifdef __opencl_c_ext_fp64_local_atomic_min_max
				#error "Incorrectly __opencl_c_ext_fp64_local_atomic_min_max defined"
				#endif

	#endif //(defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)

	// OpenCL C features.
	#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)

	#if __opencl_c_atomic_scope_all_devices != 1
	#error "Incorrectly defined feature macro __opencl_c_atomic_scope_all_devices"
	(86 lines hidden)

clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl

(116 lines hidden; context: void test_atomic_fetch(volatile __generic atomic_int *a_int,)
ip = atomic_fetch_add(a_intptr, ptrdiff);
uip = atomic_fetch_add(a_uintptr, ptrdiff);

ip = atomic_fetch_or(a_intptr, ip);
uip = atomic_fetch_or(a_uintptr, uip);
}
#endif

		#if !defined(NO_HEADER) && !defined(NO_FP64) && __OPENCL_C_VERSION__ >= 200
		// Check added atomic_fetch_ functions by cl_ext_float_atomics
		// extension can be called
		void test_atomic_fetch_with_address_space(volatile __generic atomic_float *a_float,
		volatile __generic atomic_double *a_double,
		volatile __local atomic_float *a_float_local,
		volatile __local atomic_double *a_double_local,
		volatile __global atomic_float *a_float_global,
		volatile __global atomic_double *a_double_global) {
		float f1, resf1;
		double d1, resd1;
		resf1 = atomic_fetch_min(a_float, f1);
		resf1 = atomic_fetch_max_explicit(a_float_local, f1, memory_order_seq_cst);
		svenvh (Done): As mentioned in the comment on lines 13-17, this test is not meant to be exhaustive, so you don't have to test every overload; checking one or two builtins should suffice.
		resf1 = atomic_fetch_add_explicit(a_float_global, f1, memory_order_seq_cst, memory_scope_work_group);

		resd1 = atomic_fetch_min(a_double, d1);
		resd1 = atomic_fetch_max_explicit(a_double_local, d1, memory_order_seq_cst);
		resd1 = atomic_fetch_add_explicit(a_double_global, d1, memory_order_seq_cst, memory_scope_work_group);
		}
		#endif // !defined(NO_HEADER) && __OPENCL_C_VERSION__ >= 200

// Test old atomic overloaded with generic address space in C++ for OpenCL.
#if __OPENCL_C_VERSION__ >= 200
void test_legacy_atomics_cpp(__generic volatile unsigned int *a) {
atomic_add(a, 1);
#if !defined(__cplusplus)
// expected-error@-2{{no matching function for call to 'atomic_add'}}
// expected-note@-3 4 {{candidate function not viable}}
#endif
(167 lines hidden)

[OpenCL] Support cl_ext_float_atomics
Closed, Public

Diff 372212

clang/lib/Headers/opencl-c-base.h

clang/lib/Headers/opencl-c.h

clang/lib/Sema/OpenCLBuiltins.td

clang/test/Headers/opencl-c-header.cl

clang/test/SemaOpenCL/fdeclare-opencl-builtins.cl
