Page MenuHomePhabricator

[OpenCL] Upstreaming khronos OpenCL header file.
ClosedPublic

Authored by yaxunl on Mar 22 2016, 12:13 PM.

Details

Summary

OpenCL has large number of "builtin" functions ("builtin" in the sense of OpenCL spec) which are defined in header files. To compile OpenCL kernels using these builtin functions, the header files are needed. There are different implementations of OpenCL header files. The Khronos implementation can be found at this link:

https://github.com/KhronosGroup/SPIR/blob/spirv-1.0/lib/Headers/opencl.h

This patch is an effort to elicit further discussions about upstreaming OpenCL header file to clang.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
yaxunl added inline comments.Apr 25 2016, 12:03 PM
lib/Headers/opencl-12.h
585 ↗(On Diff #51317)

Fixed.

lib/Headers/opencl-20.h
4150 ↗(On Diff #51317)

as_type is defined as operators in v1.2 s6.2.4.2 instead of as builtin functions in 6.12.

9866 ↗(On Diff #51317)

Fixed.

11213 ↗(On Diff #51317)

s6.13.16 requires macro CLK_NULL_RESERVE_ID to be defined.

11222 ↗(On Diff #51317)

We need another patch to address this. For now I suggest keep the clang definition of ndrange_t and remove this from header.

11251 ↗(On Diff #51317)

need another patch for this work.

11263 ↗(On Diff #51317)

need another patch

11572 ↗(On Diff #51317)

We need a uniform representation for this macro in SPIR. As far as I know this works fine on different platforms. If a target needs special handling of this macro condition macro maybe added.

11605 ↗(On Diff #51317)

Fixed.

The newly added builtin functions of opencl 2.0 are using macros.

11886 ↗(On Diff #51317)

We need another patch to handle this. Since the argument type allows user defined type, it cannot be declared in header file.

yaxunl updated this revision to Diff 54883.Apr 25 2016, 12:06 PM
yaxunl marked 11 inline comments as done.

update the file.

yaxunl updated this revision to Diff 54910.Apr 25 2016, 2:30 PM

Remove some invalid atomic_fetch_ functions.

Anastasia added inline comments.Apr 29 2016, 12:53 PM
lib/Headers/opencl-c.h
14 ↗(On Diff #54910)

Is there any reason to define this extra macro?

Could we not just check OPENCL_C_VERSION >= CL_VERSION_2_0

137 ↗(On Diff #54910)

cl_khr_fp64 is not enabled!

144 ↗(On Diff #54910)

I think it should be (void*)0.

Btw, I think we should make it available in all OpenCL versions (at least as a part of C99 itself).

4872 ↗(On Diff #54910)

Interesting, macro has the same name as an extension?

6581 ↗(On Diff #54910)

I would like to avoid mapping to Clang builtin in macro because it breaks error reporting (i.e. the error is reported about the macro/Clang builtin and not a proper OpenCL function)

Ex:

error: too many arguments provided to function-like macro invocation
as_char(1,1);
                 ^
note: macro 'as_char' defined here

This also reduces generality in case some implementation need to do something different here.

Such mapping can be simply done in library implementation and should be easily inlined during optimizations.

13318 ↗(On Diff #54910)

Could we remove this functions from here please as they are not properly declared anyways?

I think they should be added as Clang builtins directly.

14056 ↗(On Diff #54910)

I am not sure this implementation of macro is generic enough!

15312 ↗(On Diff #54910)

Should we just check the CL version instead? The same for other occurrences!

yaxunl updated this revision to Diff 55879.May 2 2016, 1:09 PM
yaxunl marked 7 inline comments as done.

Revised as Anastasia suggested.

Also absorbed attribute((overloadable)) into macro __const_func, which saves 300MB space.

lib/Headers/opencl-c.h
4872 ↗(On Diff #54910)

The spec requires a macro to be defined with the same name as the supported extension, so it is natural to use it here.

14056 ↗(On Diff #54910)

This macro is for initializing global variables with atomic type. In SPIR it is represented as usual initializer the same way as a non-atomic type. OpenCL runtime initializes it the same way as a non-atomic type before kernel execution. I don't think there is need to define this macro another way.

yaxunl added a comment.May 2 2016, 1:14 PM

typo. saved 300KB space.

If we want to save some space, could we use some macro to expand the gentype or some script to expand the gentype into each types when the clang is build?

yaxunl added a comment.May 5 2016, 7:47 AM

If we want to save some space, could we use some macro to expand the gentype or some script to expand the gentype into each types when the clang is build?

We need to balance between space and readability. When I absorbed attribute((overloadable)) into __const_func to save space, it did not sacrifice readability. However using macro with gentype will cause the header file more difficult to read.

yaxunl added inline comments.May 6 2016, 2:43 PM
lib/Headers/opencl-c.h
14057 ↗(On Diff #55879)

If this representation is not generic enough. Any suggestion for an alternative? Thanks.

If we want to save some space, could we use some macro to expand the gentype or some script to expand the gentype into each types when the clang is build?

We need to balance between space and readability. When I absorbed attribute((overloadable)) into __const_func to save space, it did not sacrifice readability. However using macro with gentype will cause the header file more difficult to read.

But I don't think this kind of so long header is easy to read, it contains full pages of the same function with different types and it is hard to see if anyone is missing or find where these functions end. In spec builtin functions can be represented by

gentype ctz (gentype x)
gentype max (gentype x, gentype y)
gentype max (gentype x, sgentype y)

If we could use some macro or script to generate the actual builtin functions, it seems more likely spec and clear to read, also will avoid missing or typo.

yaxunl added a comment.May 9 2016, 7:08 AM

If we want to save some space, could we use some macro to expand the gentype or some script to expand the gentype into each types when the clang is build?

We need to balance between space and readability. When I absorbed attribute((overloadable)) into __const_func to save space, it did not sacrifice readability. However using macro with gentype will cause the header file more difficult to read.

But I don't think this kind of so long header is easy to read, it contains full pages of the same function with different types and it is hard to see if anyone is missing or find where these functions end. In spec builtin functions can be represented by

gentype ctz (gentype x)
gentype max (gentype x, gentype y)
gentype max (gentype x, sgentype y)

If we could use some macro or script to generate the actual builtin functions, it seems more likely spec and clear to read, also will avoid missing or typo.

For simple cases, we could define something like this:

#define MATH_FUN(F) \
float F(float x); \
double F(double x); \
half F(half x);

MATH_FUN(sin)
MATH_FUN(cos)

It could be concise without sacrificing much clarity. However, for more complicated patterns, like

uchar8 __const_func convert_uchar8_rtn(int8);

It seems hard to balance between readability and brevity. Mostly it will ends up with multiple layers of macros calling each other, and the original function declarations obscured. This also opens Pandora's box of subjectivity of readability.

I think using macro and not using macro both have advantages and disadvantages. That's why I did not take efforts to change one representation to another.

It will take considerable efforts to define all builtin functions as macros, whereas this review already dragged too long.

Anastasia, what's your suggestion? Thanks.

If we want to save some space, could we use some macro to expand the gentype or some script to expand the gentype into each types when the clang is build?

We need to balance between space and readability. When I absorbed attribute((overloadable)) into __const_func to save space, it did not sacrifice readability. However using macro with gentype will cause the header file more difficult to read.

But I don't think this kind of so long header is easy to read, it contains full pages of the same function with different types and it is hard to see if anyone is missing or find where these functions end. In spec builtin functions can be represented by

gentype ctz (gentype x)
gentype max (gentype x, gentype y)
gentype max (gentype x, sgentype y)

If we could use some macro or script to generate the actual builtin functions, it seems more likely spec and clear to read, also will avoid missing or typo.

For simple cases, we could define something like this:

#define MATH_FUN(F) \
float F(float x); \
double F(double x); \
half F(half x);

MATH_FUN(sin)
MATH_FUN(cos)

It could be concise without sacrificing much clarity. However, for more complicated patterns, like

uchar8 __const_func convert_uchar8_rtn(int8);

It seems hard to balance between readability and brevity. Mostly it will ends up with multiple layers of macros calling each other, and the original function declarations obscured. This also opens Pandora's box of subjectivity of readability.

I think using macro and not using macro both have advantages and disadvantages. That's why I did not take efforts to change one representation to another.

It will take considerable efforts to define all builtin functions as macros, whereas this review already dragged too long.

Anastasia, what's your suggestion? Thanks.

Let's don't exaggerate with macros for this. Using them has negative effect of diagnostics being reported about a macro and not the original function, which won't conform the Spec any longer.

See my earlier comment to line 6582:

error: too many arguments provided to function-like macro invocation
lib/Headers/opencl-c.h
14057 ↗(On Diff #55879)

I don't think Spec imposes any specific implementation of this macro.

I am thinking we might better leave it out to allow adding in a way suitable for other implementations.

yaxunl marked 3 inline comments as done.May 13 2016, 7:45 AM
yaxunl added inline comments.
lib/Headers/opencl-c.h
14057 ↗(On Diff #55879)

How about this?

#ifndef ATOMIC_VAR_INIT
#define ATOMIC_VAR_INIT(x) (x)
#endif

This way we have a default declaration and also allows user to override it.

Another way is to remove it from header and define it in Clang on target by target basis.

Anastasia added inline comments.May 13 2016, 10:59 AM
lib/Headers/opencl-c.h
7452 ↗(On Diff #55879)

Could you put OpenCL v1.1 section too?

7767 ↗(On Diff #55879)

What does +ve mean?

7795 ↗(On Diff #55879)

Could y be merged with the previous line?

7952 ↗(On Diff #55879)

Can't read this comment.

Perhaps: "Compute base e exponent"?

8306 ↗(On Diff #55879)

Does this mean that all non-generic versions are not available in CL2.0 forcing the dynamic AS conversion for all BIFs with a pointer type?

Wondering if adding those unconditionally would be better thing to do...

8457 ↗(On Diff #55879)

Could you add space please: x^2 + y^2

9110 ↗(On Diff #55879)

Overloadable here and in other places too?

12772 ↗(On Diff #55879)

Should it be vload_half instead?

14484 ↗(On Diff #55879)

btw, perhaps it's a good idea to have a macro for attribute((overloadable)). This should be relatively simple to replace everywhere.

14766 ↗(On Diff #55879)

Let's not have macros for function declarations. As commented earlier they can break proper diagnostics reporting (ie. diagnostics will display macro instead of actual function).

15674 ↗(On Diff #55879)

I feel that some bits of this description are being repeated multiple times. Could we have a common description at the beginning instead that would cover all the different cases.

16220 ↗(On Diff #55879)

The same here. Could we unify the description for all write_image functions somehow?

16960 ↗(On Diff #55879)

Let's remove macros from here too.

17051 ↗(On Diff #55879)

Are those arbitrary taken values I am guessing?

17099 ↗(On Diff #55879)

Could we remove the enqueue_kernel as it's being added as Clang builtin? It's prototype is incorrect here anyways.

Btw, I have started the review for it: http://reviews.llvm.org/D20249

17252 ↗(On Diff #55879)

I think this should be better grouped with other image functions above.

Anastasia added inline comments.May 13 2016, 11:00 AM
lib/Headers/opencl-c.h
14057 ↗(On Diff #55879)

Not sure. I guess this way would work too.

lib/Headers/opencl-c.h
9110 ↗(On Diff #55879)

It seems macro __const_func has overloadable attribute.

#define const_func attribute__((const, overloadable))

13534 ↗(On Diff #55879)

I have seen a lot of comments with such few words in one line. It seems very strange here.

14484 ↗(On Diff #55879)

Agree, it seems a lot of functions need overloable attribute.

14766 ↗(On Diff #55879)

So maybe we could use see script to generate the header when build? It will save space and avoid macro break readability and diagnostics reporting. The clang builtin functions are defined in some .def files, or maybe we could try something like that?
This can be done after we have a stable version of what is need for header, we may just decide what should be in this header and don't focus on the performance.

yaxunl added inline comments.May 16 2016, 9:28 AM
lib/Headers/opencl-c.h
17051 ↗(On Diff #55879)

How about this?

// The value are defined according to their order in the spec.
#define CLK_SUCCESS                                 0
#define CLK_ENQUEUE_FAILURE                         -1
#define CLK_INVALID_QUEUE                           -2
#define CLK_INVALID_NDRANGE                         -3
#define CLK_INVALID_EVENT_WAIT_LIST                 -4
#define CLK_DEVICE_QUEUE_FULL                       -5
#define CLK_INVALID_ARG_SIZE                        -6
#define CLK_EVENT_ALLOCATION_FAILURE                -8
#define CLK_OUT_OF_RESOURCES                        -9
jprice added inline comments.May 16 2016, 9:43 AM
lib/Headers/opencl-c.h
17051 ↗(On Diff #55879)

These values were taken from the table in section 2.10.6.1 in the SPIR 2.0 provisional spec. Some of the choices seem arbitrary, while others were chosen to match the value of the equivalent runtime API error.

I imagine changing these values now might cause incompatibilities with existing SPIR consumers?

yaxunl added inline comments.May 16 2016, 10:14 AM
lib/Headers/opencl-c.h
17051 ↗(On Diff #55879)

Then I guess we'd better keep it as is.

yaxunl updated this revision to Diff 57382.May 16 2016, 12:21 PM

Revised according to Anastasia and Xiuli's comments.

yaxunl updated this revision to Diff 57383.May 16 2016, 12:29 PM

Revert enqueue_kernel error code value changed by accident.

Looking good already! I just have one general concern about committing 16K lines of code with no tests. But perhaps testing all the declarations isn't an option actually.

Could we at least have a test that includes the header and makes sure it's compiled successfully. Any thought's on this?

Also I can see that other headers are added to lib/Headers/CMakeLists.txt.

yaxunl updated this revision to Diff 57488.May 17 2016, 9:22 AM

Add opencl-c.h to lib/Headers/CMakeLists.txt. Add a simple test for the header file.

pxli168 accepted this revision.May 19 2016, 9:48 PM
pxli168 edited edge metadata.

LGTM!
I think we may add optimization on this base later.

Sam, there are a few small comments, otherwise it seems to be in a good shape.

@rsmith, do you think you could take a look here? The OpenCL side is fine, but I was just wondering if you see any issue with us adding a header of ~17K lines. It is all part of OpenCL standard libraries though and would be very helpful to have it upstream for all of us. Any suggestion on testing would be useful too as it's a bit simplified for now due to its size.

lib/Headers/opencl-c.h
19 ↗(On Diff #57488)

Could we change to ovld and cnfn with double underscores please?

143 ↗(On Diff #57488)

indentation seems wrong!

175 ↗(On Diff #57488)

Is this a part of spec? I don't see it in s6.13.2.1.

7953 ↗(On Diff #57488)

Ping!

8307 ↗(On Diff #57488)

Any problem with doing this unconditionally?

8458 ↗(On Diff #57488)

Ping!

12646 ↗(On Diff #57488)

This description is missing explaining what gentype is! The same happens in other places too.

12773 ↗(On Diff #57488)

Ping!

14004 ↗(On Diff #57488)

Formatting is wrong!

yaxunl updated this revision to Diff 58340.May 24 2016, 3:39 PM
yaxunl edited edge metadata.
yaxunl marked 35 inline comments as done.

Revised by Anastasia's comments.

yaxunl added inline comments.May 24 2016, 4:26 PM
lib/Headers/opencl-c.h
143 ↗(On Diff #57488)

fixed.

175 ↗(On Diff #57488)

see v1.1 s6.11.2.

7953 ↗(On Diff #57488)

I think "base e exponential" is the correct name. It could be renamed as "base e exponential function" to be more clear.

"exponent" is the 'superscript' part of an exponential. e.g. in e^n, n is the exponent.

8307 ↗(On Diff #57488)

This requires vendors to implement functions not required by OpenCL 2.0 spec. For example, in our 2.0 builtin library we don't have such functions.

8458 ↗(On Diff #57488)

Fixed.

12773 ↗(On Diff #57488)

no. they are different. see OpenCL extensions v1.1 s9.6.6, v1.2 s9.5.6, v2.0 s9.4.6

Anastasia accepted this revision.May 25 2016, 8:58 AM
Anastasia edited edge metadata.

LGTM! Thanks!

Since it has been in review for quite a while, should we try to commit it ASAP? I think Richard can give us his feedback later as well.

This revision was automatically updated to reflect the committed changes.