This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Remove unused extensions
ClosedPublic

Authored by mantognini on Oct 14 2020, 1:41 AM.

Details

Summary

Many non-language extensions are defined but also unused. This patch
removes them with their tests as they do not require compiler support.

The cl_khr_select_fprounding_mode extension is also removed because it
has been deprecated since OpenCL 1.1 and Clang doesn't have any specific
support for it.

The cl_khr_context_abort extension is only referred to in "The OpenCL
Specification", version 1.2 and 2.0, in Table 4.3, but no specification
is provided in "The OpenCL Extension Specification" for these versions.
Because it is both unused in Clang and lacks specification, this
extension is removed.

The following extensions are platform extensions that bring new OpenCL
APIs but do not impact the kernel language nor require compiler support.
They are therefore removed.

  • cl_khr_gl_sharing, introduced in OpenCL 1.0
  • cl_khr_icd, introduced in OpenCL 1.2
  • cl_khr_gl_event, introduced in OpenCL 1.1

Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquireGLObjects.
Hence, they cannot be used on the device side and the extension does
not impact the kernel language.

  • cl_khr_d3d10_sharing, introduced in OpenCL 1.1
  • cl_khr_d3d11_sharing, introduced in OpenCL 1.2
  • cl_khr_dx9_media_sharing, introduced in OpenCL 1.2
  • cl_khr_image2d_from_buffer, introduced in OpenCL 1.2
  • cl_khr_initialize_memory, introduced in OpenCL 1.2
  • cl_khr_gl_depth_images, introduced in OpenCL 1.2

Note: this extension is related to cl_khr_depth_images but only the
latter adds new features to the kernel language.

  • cl_khr_spir, introduced in OpenCL 1.2
  • cl_khr_egl_event, introduced in OpenCL 1.2

Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquire* API
functions. Hence, they cannot be used on the device side and the
extension does not impact the kernel language.

  • cl_khr_egl_image, introduced in OpenCL 1.2
  • cl_khr_terminate_context, introduced in OpenCL 1.2

The minimum required OpenCL version used in OpenCLExtensions.def for
these extensions is not always correct. Removing these address that
issue.

Diff Detail

Event Timeline

mantognini created this revision.Oct 14 2020, 1:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2020, 1:41 AM

Addressed buildbot issues.

mantognini published this revision for review.Oct 14 2020, 4:41 AM

what if users rely on the predefined macros associated with the extension e.g. cl_khr_srgb_image_writes to enable/disable certain code?

What's the issue with these extensions not removed?

what if users rely on the predefined macros associated with the extension e.g. cl_khr_srgb_image_writes to enable/disable certain code?

What's the issue with these extensions not removed?

I meant to add a link to the RFC that highlighted the issue: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066911.html

In a nutshell, those extensions I'm removing are not language extensions but api extensions. I can't think of the usefulness of these macros -- if the host doesn't support the extensions, the kernels relying on those cannot be executed (i.e. it makes no sense). @Anastasia also highlights in her comment that having these increases the complexity and maintenance burden in the ecosystem.

Yes, this is a group of extensions that doesn't seem to change anything in the kernel language. So if the macro is available I don't understand how it can be used to do anything different in the kernel code because it just doesn't modify anything in the kernel code.

FYI I have attempted to clarify the extensions with Khronos https://github.com/KhronosGroup/OpenCL-Docs/issues/82 but it didn't go very far unfortunately. I presume this is not important enough and some extensions are not even described at all or deprecated. If this is the case they should not hold us back.

azabaznov requested changes to this revision.Oct 14 2020, 7:48 AM
azabaznov added inline comments.
clang/include/clang/Basic/OpenCLExtensions.def
74

cl_khr_srgb_image_writes - Extension allowing writes to sRGB images from a kernel. This extension enables write_imagef built-in function as it described here. So I think we shouldn't remove it. Do I miss something?

This revision now requires changes to proceed.Oct 14 2020, 7:48 AM

With this change, clang basically will have no knowledge about the removed extensions, i.e., it will not know which extension is supported in which version of OpenCL and have no way to enable/disable those extensions. There will be no way to define corresponding macros in clang.

Basically the responsibility of defining those macros will be shifted to OpenCL runtime for JIT and shifted to users for offline compilation. They need to have knowledge about which extensions are supported in which version of OpenCL and which cpu/platform supports them. I am not sure whether this is the direction we want to move to.

Anastasia added a comment.EditedOct 14 2020, 9:26 AM

With this change, clang basically will have no knowledge about the removed extensions, i.e., it will not know which extension is supported in which version of OpenCL and have no way to enable/disable those extensions. There will be no way to define corresponding macros in clang.

Basically the responsibility of defining those macros will be shifted to OpenCL runtime for JIT and shifted to users for offline compilation. They need to have knowledge about which extensions are supported in which version of OpenCL and which cpu/platform supports them. I am not sure whether this is the direction we want to move to.

But why do you think anyone would need to use those macros in OpenCL C?

Let's take cl_khr_icd as an exmaple: https://www.khronos.org/registry/OpenCL//sdk/2.2/docs/man/html/cl_khr_icd.html

cl_khr_icd - Extension through which the Khronos OpenCL installable client driver loader (ICD Loader) may expose multiple separate vendor installable client drivers (Vendor ICDs) for OpenCL.

Why would anyone need any macro while compiling OpenCL C code for this functionality? It is dialing with the driver loader that runs on the host before compiling anything.

I believe that the addition of such extensions into the kernel language is accidental and we should try to improve this. There is no need to have something that isn't needed. We have enough code and complexity to maintain that is useful. Let's try to simplify by at least removing what is not needed.

On a separate note, the extensions that need macro definition and don't require the functionality in the clang parsing also doesn't have to be in the clang source code. I have mentioned it in my RFC as well: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066911.html

mantognini added inline comments.Oct 14 2020, 9:48 AM
clang/include/clang/Basic/OpenCLExtensions.def
74

On second reading, this one is slightly ambiguous. On the language side, the extension doesn't add an overload; it only specifies that existing overload can be used with a different kind of image. On the API side, it does extend the set of features. But at the same time if the extended API is not supported, it's not possible to create an image in the first place and therefore impossible to call write_imagef. So I question the usefulness of such macro on the device side. Does this argument convince you it should be removed?

azabaznov added inline comments.Oct 14 2020, 11:18 AM
clang/include/clang/Basic/OpenCLExtensions.def
74

it's not possible to create an image in the first place and therefore impossible

Not quite that right. Referring to this:

read_imagef built-in functions for OpenCL C 2.0 devices do implicit conversions to RGB for sRGB images. However, implicit conversion from sRGB to RGB is done on write_imagef only if corresponding extension is supported. Otherwise, explicit conversions inside a kernel may be required.

I'm agree that this one is kind of confusing and requires some extra investigation. But I think now we should remove only those extensions which are only API related for sure.

With this change, clang basically will have no knowledge about the removed extensions, i.e., it will not know which extension is supported in which version of OpenCL and have no way to enable/disable those extensions. There will be no way to define corresponding macros in clang.

Basically the responsibility of defining those macros will be shifted to OpenCL runtime for JIT and shifted to users for offline compilation. They need to have knowledge about which extensions are supported in which version of OpenCL and which cpu/platform supports them. I am not sure whether this is the direction we want to move to.

But why do you think anyone would need to use those macros in OpenCL C?

Let's take cl_khr_icd as an exmaple: https://www.khronos.org/registry/OpenCL//sdk/2.2/docs/man/html/cl_khr_icd.html

cl_khr_icd - Extension through which the Khronos OpenCL installable client driver loader (ICD Loader) may expose multiple separate vendor installable client drivers (Vendor ICDs) for OpenCL.

Why would anyone need any macro while compiling OpenCL C code for this functionality? It is dialing with the driver loader that runs on the host before compiling anything.

I believe that the addition of such extensions into the kernel language is accidental and we should try to improve this. There is no need to have something that isn't needed. We have enough code and complexity to maintain that is useful. Let's try to simplify by at least removing what is not needed.

On a separate note, the extensions that need macro definition and don't require the functionality in the clang parsing also doesn't have to be in the clang source code. I have mentioned it in my RFC as well: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066911.html

Does the spec requires cl_* macro to be defined if an extension is enabled? If it is not useful, shouldn't the spec be fixed first? Otherwise, how do we make sure users are not using those macros?

Anastasia added inline comments.Oct 14 2020, 12:22 PM
clang/include/clang/Basic/OpenCLExtensions.def
74

Ok, thanks for providing extra information. I agree this deserves extra clarifications.

I am not sure whether the frontend will be able to perform such conversions since it doesn't have information regarding the channels. The image types are completely opaque. That means that potentially the BIFs can handle the conversion internally. However, I am unclear whether the macro can be useful i.e. whether there is anything that can be done differently at the source level based on its availability in OpenCL kernel code.

It should be ok to leave this out for now from the clean up unless someone else can help quickly with some more insights.

With this change, clang basically will have no knowledge about the removed extensions, i.e., it will not know which extension is supported in which version of OpenCL and have no way to enable/disable those extensions. There will be no way to define corresponding macros in clang.

Basically the responsibility of defining those macros will be shifted to OpenCL runtime for JIT and shifted to users for offline compilation. They need to have knowledge about which extensions are supported in which version of OpenCL and which cpu/platform supports them. I am not sure whether this is the direction we want to move to.

But why do you think anyone would need to use those macros in OpenCL C?

Let's take cl_khr_icd as an exmaple: https://www.khronos.org/registry/OpenCL//sdk/2.2/docs/man/html/cl_khr_icd.html

cl_khr_icd - Extension through which the Khronos OpenCL installable client driver loader (ICD Loader) may expose multiple separate vendor installable client drivers (Vendor ICDs) for OpenCL.

Why would anyone need any macro while compiling OpenCL C code for this functionality? It is dialing with the driver loader that runs on the host before compiling anything.

I believe that the addition of such extensions into the kernel language is accidental and we should try to improve this. There is no need to have something that isn't needed. We have enough code and complexity to maintain that is useful. Let's try to simplify by at least removing what is not needed.

On a separate note, the extensions that need macro definition and don't require the functionality in the clang parsing also doesn't have to be in the clang source code. I have mentioned it in my RFC as well: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066911.html

Does the spec requires cl_* macro to be defined if an extension is enabled?

The extension spec currently has:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string. This #define would be available in the language if and only if the extension is supported on a given implementation.

Those extensions don't affect the language or add any BIFs so my reading from this the macro shouldn't be available.

If it is not useful, shouldn't the spec be fixed first? Otherwise, how do we make sure users are not using those macros?

I have first attemted to resolve this with Khonos but there was slow progress so it got stalled several times https://github.com/KhronosGroup/OpenCL-Docs/issues/82 (see also PR). I gather we are the main affected stakeholders so perhaps it is up to us to improve this.

As for the spec I would like to point out that clang doesn't implement spec presicely from very early standards. The issue is that it has never been documented but we plan to improve at least from the documentation side in the course of OpenCL 3.0 development.

http://lists.llvm.org/pipermail/cfe-dev/2020-September/066912.html

Does the spec requires cl_* macro to be defined if an extension is enabled?

The extension spec currently has:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string. This #define would be available in the language if and only if the extension is supported on a given implementation.

Those extensions don't affect the language or add any BIFs so my reading from this the macro shouldn't be available.

Thanks. That sounds reasonable.

yaxunl added inline comments.Oct 14 2020, 1:10 PM
clang/include/clang/Basic/OpenCLExtensions.def
16

Can you add a comment here referring the spec about "Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string." and therefore only extensions "which affects the OpenCL language semantics, syntax or adds built-in functions to the language" are defined in this file. Thanks.

Anastasia added inline comments.Oct 14 2020, 2:45 PM
clang/include/clang/Basic/OpenCLExtensions.def
16

Good idea! I also suggest we add clarifications regarding pragmas, that those are to be added only when the compiler needs to change the compilation flow i.e. if there is a difference in the language semantic from what is defined in a standard.

Addressed comments.

clang/include/clang/Basic/OpenCLExtensions.def
16

I've added a comment. Let me know if it suits you.

74

Thanks for the clarification; this indeed requires further investigations so I've removed this part from my patch for now.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.
Even if all clang targets support this the macro should still be defined for backward compatibility.

it would be useful if the commit message or the description of this revision included a justification for each removed extension why it doesn't impact language semantics with spec references.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.

Ok, does it mean that we are missing to diagnose this in the frontend? Btw I do acknowledge that what you say makes sense but I don't think the documentation support that:
https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/cl_khr_byte_addressable_store.html

Am I just looking in the wrong place?

Even if all clang targets support this the macro should still be defined for backward compatibility.

Ok, are you saying that all targets currently support this and we sould always define it? In this case I would be more happy if we move them into the internal header that we add implicitly anyway...

it would be useful if the commit message or the description of this revision included a justification for each removed extension why it doesn't impact language semantics with spec references.

Yes, this is a good suggestion in principle and we should try our best. However I am not sure it is feasible for all of those, in particular this documentation doesn't contain anything:
https://man.opencl.org/cl_khr_context_abort.html

Are you suggesting to leave this out? However I don't see any evidence that this require either macro or pragma. I feel this is in rather incomplete state. So I don't feel we can accomodate for all of these.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.

Ok, does it mean that we are missing to diagnose this in the frontend? Btw I do acknowledge that what you say makes sense but I don't think the documentation support that:
https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/cl_khr_byte_addressable_store.html

Am I just looking in the wrong place?

Since it's only an extension in ocl 1.0 that spec is a better place to look:

9.9 Byte Addressable Stores 
Section 6.8.m describes restrictions on built-in types char, uchar, char2, uchar2, short,
and half. The OpenCL extension cl_khr_byte_addressable_store removes these
restrictions.  An application that wants to be able to write to elements of a pointer (or struct) that
are of type char, uchar, char2, uchar2, short, ushort and half will need to include
the #pragma OPENCL EXTENSION cl_khr_byte_addressable_store :
enable directive before any code that performs writes that may not be supported as per section
6.8.m.

6.8 Restrictions
...
m. Built-in types that are less than 32-bits in size i.e. char, uchar, char2, uchar2,
short, ushort, and half have the following restriction:
Writes to a pointer (or arrays) of type char, uchar, char2, uchar2, short,
ushort, and half or to elements of a struct that are of type char, uchar,
char2, uchar2, short and ushort are not supported. Refer to section 9.9
for additional information.

Even if all clang targets support this the macro should still be defined for backward compatibility.

Ok, are you saying that all targets currently support this and we sould always define it? In this case I would be more happy if we move them into the internal header that we add implicitly anyway...

r600/r700/EG/NI targets have a bit wonky support byte-addressable stores (basically using 32bit atomic MSKOR), but I don't expect those targets to survive for long, and for now, they advertise support.
some out of tree backends might also benefit from the extension (like legup -- verilog backend), but I'm not sure if out of tree targets are worth considering, or if they indeed make use of this extension.
on a high-level, I could see a restriction to 32bit data paths be useful for FPGA targets.

moving the define it to header breaks OCL1.0 programs. Based on the specs above. those programs are required to include the following:

#ifdef cl_khr_byte_addressable_store
#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable
#endif

Before they can dereference pointers to char/char2/short/... types (including array indexing and struct members).
so the pragma part needs to work, and the language level checks need to work if the extension is not enabled.

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

clang can always decide that OCL1.0 programs (even when compiled in cl-1.x mode) and embedded profile is unsupported.
I have no clue if there are many OCL programs that rely on those modes out there.
However, removing support for them is a significantly bigger change than cleaning up a few language-irrelevant extensions (actually, I'm not sure if clang ever supported OCL embedded mode).

it would be useful if the commit message or the description of this revision included a justification for each removed extension why it doesn't impact language semantics with spec references.

Yes, this is a good suggestion in principle and we should try our best. However I am not sure it is feasible for all of those, in particular this documentation doesn't contain anything:
https://man.opencl.org/cl_khr_context_abort.html

Are you suggesting to leave this out? However I don't see any evidence that this require either macro or pragma. I feel this is in rather incomplete state. So I don't feel we can accomodate for all of these.

"incomplete specification" is imo a good reason to drop support for an extension. My argument is that explanation of legacy extensions should rely on the spec that introduced them, rather than the current 2.x/3.x which does an arguably poor job on backward compatibility.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.

Ok, does it mean that we are missing to diagnose this in the frontend? Btw I do acknowledge that what you say makes sense but I don't think the documentation support that:
https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/cl_khr_byte_addressable_store.html

Am I just looking in the wrong place?

Since it's only an extension in ocl 1.0 that spec is a better place to look:

9.9 Byte Addressable Stores 
Section 6.8.m describes restrictions on built-in types char, uchar, char2, uchar2, short,
and half. The OpenCL extension cl_khr_byte_addressable_store removes these
restrictions.  An application that wants to be able to write to elements of a pointer (or struct) that
are of type char, uchar, char2, uchar2, short, ushort and half will need to include
the #pragma OPENCL EXTENSION cl_khr_byte_addressable_store :
enable directive before any code that performs writes that may not be supported as per section
6.8.m.

6.8 Restrictions
...
m. Built-in types that are less than 32-bits in size i.e. char, uchar, char2, uchar2,
short, ushort, and half have the following restriction:
Writes to a pointer (or arrays) of type char, uchar, char2, uchar2, short,
ushort, and half or to elements of a struct that are of type char, uchar,
char2, uchar2, short and ushort are not supported. Refer to section 9.9
for additional information.

Ok, thanks! I guess the documentation should be fixed in this URL to point to the place where this behavior is described, like for example in cl_khr_subgroups:
https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/cl_khr_subgroups.html

Even if all clang targets support this the macro should still be defined for backward compatibility.

Ok, are you saying that all targets currently support this and we sould always define it? In this case I would be more happy if we move them into the internal header that we add implicitly anyway...

r600/r700/EG/NI targets have a bit wonky support byte-addressable stores (basically using 32bit atomic MSKOR), but I don't expect those targets to survive for long, and for now, they advertise support.
some out of tree backends might also benefit from the extension (like legup -- verilog backend), but I'm not sure if out of tree targets are worth considering, or if they indeed make use of this extension.
on a high-level, I could see a restriction to 32bit data paths be useful for FPGA targets.

moving the define it to header breaks OCL1.0 programs. Based on the specs above. those programs are required to include the following:

#ifdef cl_khr_byte_addressable_store
#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable
#endif

Before they can dereference pointers to char/char2/short/... types (including array indexing and struct members).
so the pragma part needs to work, and the language level checks need to work if the extension is not enabled.

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

I don't think clang currently support embedded profile. Adding extension into the OpenCLOptions is pointless if it's not used. Do you plan to add any functionality for it? Defining macros can be done easily outside of clang source code or if it is preferable to do inside there is MacroBuilder::defineMacro available in the target hooks. Currently for every extension added into OpenCLExtensions.def there is a macro, a pragma and a field in OpenCLOptions added. This is often more than what is necessary. Plus Khronos has very many extensions if we assume that all of them are added in clang it will become scalability and maintanance nightmare. Most of the extensions only add functions so if we provide a way to add macros for those outside of clang code base it will keep the common code clean and vendors can be more flexible in adding the extensions without the need to modify upstream code if they need to. I see it as an opportunity to improve common code and out of tree implementations too. It just need a little bit of restructuring.

clang can always decide that OCL1.0 programs (even when compiled in cl-1.x mode) and embedded profile is unsupported.
I have no clue if there are many OCL programs that rely on those modes out there.
However, removing support for them is a significantly bigger change than cleaning up a few language-irrelevant extensions (actually, I'm not sure if clang ever supported OCL embedded mode).

I don't think so. There is only one extensions added but that hasn't been implemented fully. As a matter of fact I have created this review earlier to make it clear in the User Manual: https://reviews.llvm.org/D89143. Feel free to provide your feedback.

it would be useful if the commit message or the description of this revision included a justification for each removed extension why it doesn't impact language semantics with spec references.

Yes, this is a good suggestion in principle and we should try our best. However I am not sure it is feasible for all of those, in particular this documentation doesn't contain anything:
https://man.opencl.org/cl_khr_context_abort.html

Are you suggesting to leave this out? However I don't see any evidence that this require either macro or pragma. I feel this is in rather incomplete state. So I don't feel we can accomodate for all of these.

"incomplete specification" is imo a good reason to drop support for an extension. My argument is that explanation of legacy extensions should rely on the spec that introduced them, rather than the current 2.x/3.x which does an arguably poor job on backward compatibility.

Ok, the idea is not to break backwards compatibility of course. For the extensions that intended to modify language (but never did) if we look from upstream clang user's perspective those extensions couldn't possibly be used to do anything useful. It is of course possible that in some forks the functionality has been completed but then they can easily update the implementation to include the extension definition back. This is very small change compared to the actual extension functionality.

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

I though the current behaviour of 'you can use #pragma, but the dereferences are always legal' was intentional for backward compatibility.
I don't think there are programs that would set it to 'disabled' and expect compilation failure. My concern is that legacy apps would set '#pragma enabled' before using char/short. such behavior would produce warning/error if with this change applied.

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

I don't think clang currently support embedded profile. Adding extension into the OpenCLOptions is pointless if it's not used. Do you plan to add any functionality for it? Defining macros can be done easily outside of clang source code or if it is preferable to do inside there is MacroBuilder::defineMacro available in the target hooks. Currently for every extension added into OpenCLExtensions.def there is a macro, a pragma and a field in OpenCLOptions added. This is often more than what is necessary. Plus Khronos has very many extensions if we assume that all of them are added in clang it will become scalability and maintanance nightmare. Most of the extensions only add functions so if we provide a way to add macros for those outside of clang code base it will keep the common code clean and vendors can be more flexible in adding the extensions without the need to modify upstream code if they need to. I see it as an opportunity to improve common code and out of tree implementations too. It just need a little bit of restructuring.

My understanding is that both the macro and working #pragma directive is necessary. The configuration bit is only needed if clang changes behaviour, which is a separate question.
I'd also argue that new third party extensions need an API call to register new extensions in order to get a working pragma mechanism.

Even if an extension only adds access new functions, pragma should control if user functions with conflicting names are legal or not.

for example, a program can implement function new_func, which gets later added to an extension. since the program doesn't know about the new extension it doesn't use #pragma new_extension:enable and continues to work correctly.
If the new extension exposes new_func unconditionally, the program breaks, because it doesn't check for a macro that didn't exist at the time it was written.
more recent programs can use ifdef to either use new_func provided by the extension, or implement a custom version.

I didn't find much about embedded program behavior in full profile implementation in the specs.
It only says that "embedded profile is a subset" which imo implies that legal embedded profile programs should work correctly in a full profile implementation.
This implies that cles_* pragmas should continue to work even if the behavior is always supported.

Are you suggesting to leave this out? However I don't see any evidence that this require either macro or pragma. I feel this is in rather incomplete state. So I don't feel we can accomodate for all of these.

"incomplete specification" is imo a good reason to drop support for an extension. My argument is that explanation of legacy extensions should rely on the spec that introduced them, rather than the current 2.x/3.x which does an arguably poor job on backward compatibility.

Ok, the idea is not to break backwards compatibility of course. For the extensions that intended to modify language (but never did) if we look from upstream clang user's perspective those extensions couldn't possibly be used to do anything useful. It is of course possible that in some forks the functionality has been completed but then they can easily update the implementation to include the extension definition back. This is very small change compared to the actual extension functionality.

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

I'd phrase it differently. extensions that modify the language should be included even if clang decides not to implement given language modification.

I think it's perfectly fine to always allow dereferencing of char/short types, as long as correctly written CL1.0 programs that use #pragma OPENCL cl_khr_byte_addressable_store: enable don't produce extra errors/warnings.
Full profile is a superset of embedded profile so the cles_* behaviours should be available and #pragma should work as well.

This would suggest that there should be a list of no-op extensions that are only present for backward/embedded compatibility and don't change behaviour since the extended version of language behaviour is always supported.

note that my concerns are purely wrt. spec wording and intentions. Given that ocl didn't see wider use until later versions, such legacy applications might not exist.
feel free to reject them as purely academic, but the point is that changes in this revision break backward compatibility (which is always a maintenance burden).

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

I though the current behaviour of 'you can use #pragma, but the dereferences are always legal' was intentional for backward compatibility.
I don't think there are programs that would set it to 'disabled' and expect compilation failure.

The initial state of the pragma is disabled, so if disabling the extension isn't supposed to do anything then I don't know why would anyone ever enable it?

My concern is that legacy apps would set '#pragma enabled' before using char/short. such behavior would produce warning/error if with this change applied.

Correct, it will compile with a warning but not fail to compile. I am willing to introduce a -W option (if not present already ) to suppress that warning if it helps with the clean up and backward compatibility. It might also open up opportunities for a wider clean up - removing pragma in extensions that currently require pragma for no good reason. I have written more details on this in https://github.com/KhronosGroup/OpenCL-Docs/pull/355#issuecomment-662679499

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

I don't think clang currently support embedded profile. Adding extension into the OpenCLOptions is pointless if it's not used. Do you plan to add any functionality for it? Defining macros can be done easily outside of clang source code or if it is preferable to do inside there is MacroBuilder::defineMacro available in the target hooks. Currently for every extension added into OpenCLExtensions.def there is a macro, a pragma and a field in OpenCLOptions added. This is often more than what is necessary. Plus Khronos has very many extensions if we assume that all of them are added in clang it will become scalability and maintanance nightmare. Most of the extensions only add functions so if we provide a way to add macros for those outside of clang code base it will keep the common code clean and vendors can be more flexible in adding the extensions without the need to modify upstream code if they need to. I see it as an opportunity to improve common code and out of tree implementations too. It just need a little bit of restructuring.

My understanding is that both the macro and working #pragma directive is necessary.

I don't believe this is the correct interpretation. If you look at the extension spec s9.1 it says:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string. This #define would be available in the language if and only if the extension is supported on a given implementation.

It does not say that the pragma is needed, it only says that macro is needed. That perfectly makes sense because the macro allows to check that the extension is present to implement certain functionality conditionally.

OpenCL spec however never clarified the use of pragma that technically makes sense because the pragmas in C languages are used for altering the standard behavior that can't be otherwise achieved using standard parsing i.e. C99 section 6.10.1 says about non-STDC pragmas:

The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any such pragma that is not recognized by the implementation is ignored.

So C99 only regulates behavior of STDC pragmas and for those it explicitly says how the behavior of the parsed program is altered.

Technically OpenCL pragmas are not covered neither in OpenCL C and not C99 and therefore it is unclear what the implementation should do. However, with time due to the absence of the clarification mutiple interpretations appeared. Sadly some of them ended up in a very suboptimal state both for the tooling and the application developers because the pragma started to be added for no reason or for controlling redundant diagnostics of use of types or functions that extensions were introducing. If you are interested in more details I suggest to check this issue: https://github.com/KhronosGroup/OpenCL-Docs/issues/82

The configuration bit is only needed if clang changes behaviour, which is a separate question.
I'd also argue that new third party extensions need an API call to register new extensions in order to get a working pragma mechanism.

Can you explain this please?

Even if an extension only adds access new functions, pragma should control if user functions with conflicting names are legal or not.

for example, a program can implement function new_func, which gets later added to an extension. since the program doesn't know about the new extension it doesn't use #pragma new_extension:enable and continues to work correctly.

This is absolutely not how it works or what the pragma currently does or even can do. If you want such a thing to work you need to introduce a concept of a namespace into OpenCL C, so every extension is implemented as a separate namespace to avoid name clushing. I do acknowledge however that it would be extremly useful and convenient but it is not how C parsers work. And I imagine it is not trivial to extend this but probably not impossible.

https://godbolt.org/z/d8c5fb

If you look at this example the function from the extension have already been added into the list the symbal tables. The pragma doesn't undo or remove them or make them invisible. There is no machnism to do this easily and the proper solution would be to use namespaces but it is only available in C++.

If the new extension exposes new_func unconditionally, the program breaks, because it doesn't check for a macro that didn't exist at the time it was written.
more recent programs can use ifdef to either use new_func provided by the extension, or implement a custom version.

Yes, this is the intent of defining the extension macros.

I didn't find much about embedded program behavior in full profile implementation in the specs.
It only says that "embedded profile is a subset" which imo implies that legal embedded profile programs should work correctly in a full profile implementation.
This implies that cles_* pragmas should continue to work even if the behavior is always supported.

If there is no proper specification for this I think adding it to clang contradict the development policy http://clang.llvm.org/get_involved.html

A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.

Are you suggesting to leave this out? However I don't see any evidence that this require either macro or pragma. I feel this is in rather incomplete state. So I don't feel we can accomodate for all of these.

"incomplete specification" is imo a good reason to drop support for an extension. My argument is that explanation of legacy extensions should rely on the spec that introduced them, rather than the current 2.x/3.x which does an arguably poor job on backward compatibility.

Ok, the idea is not to break backwards compatibility of course. For the extensions that intended to modify language (but never did) if we look from upstream clang user's perspective those extensions couldn't possibly be used to do anything useful. It is of course possible that in some forks the functionality has been completed but then they can easily update the implementation to include the extension definition back. This is very small change compared to the actual extension functionality.

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

I'd phrase it differently. extensions that modify the language should be included even if clang decides not to implement given language modification.

Why do we need to add what we don't implement? I don't think this complies to the clang development policy either as we don't support incomplete implementations.

I think it's perfectly fine to always allow dereferencing of char/short types, as long as correctly written CL1.0 programs that use #pragma OPENCL cl_khr_byte_addressable_store: enable don't produce extra errors/warnings.
Full profile is a superset of embedded profile so the cles_* behaviours should be available and #pragma should work as well.

This would suggest that there should be a list of no-op extensions that are only present for backward/embedded compatibility and don't change behaviour since the extended version of language behaviour is always supported.

What do the extensions do? If they do nothing why do they need to be in clang? Once again if anyone needs to define a macro, this can be done outside of clang code base. OpenCLExtensions.def is not meant to be used for adding the macros as it does a lot more than this. Using OpenCLExtensions.def for only for adding a macro is a hack to me.

note that my concerns are purely wrt. spec wording and intentions. Given that ocl didn't see wider use until later versions, such legacy applications might not exist.

I disagree on that there are more OpenCL 1.x conformant devices than OpenCL 2.x. You can see by the adoption list.

feel free to reject them as purely academic, but the point is that changes in this revision break backward compatibility (which is always a maintenance burden).

I think we already agreed earlier to leave certain extensions in for now but add a comment to address this later. See:

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

Keep cl_khr_byte_addressable_store and cles_khr_int64 out of this PR. Add more details on the extensions being removed in the commit message.

mantognini edited the summary of this revision. (Show Details)Oct 16 2020, 10:22 AM

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

I though the current behaviour of 'you can use #pragma, but the dereferences are always legal' was intentional for backward compatibility.
I don't think there are programs that would set it to 'disabled' and expect compilation failure.

The initial state of the pragma is disabled, so if disabling the extension isn't supposed to do anything then I don't know why would anyone ever enable it?

My concern is that legacy apps would set '#pragma enabled' before using char/short. such behavior would produce warning/error if with this change applied.

Correct, it will compile with a warning but not fail to compile. I am willing to introduce a -W option (if not present already ) to suppress that warning if it helps with the clean up and backward compatibility. It might also open up opportunities for a wider clean up - removing pragma in extensions that currently require pragma for no good reason. I have written more details on this in https://github.com/KhronosGroup/OpenCL-Docs/pull/355#issuecomment-662679499

Adding a new option cannot be a solution as legacy applications were written before its introduction.
CL1.x applications need to continue to work without changes, anything else breaks backward compatibility.

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

I don't think clang currently support embedded profile. Adding extension into the OpenCLOptions is pointless if it's not used. Do you plan to add any functionality for it? Defining macros can be done easily outside of clang source code or if it is preferable to do inside there is MacroBuilder::defineMacro available in the target hooks. Currently for every extension added into OpenCLExtensions.def there is a macro, a pragma and a field in OpenCLOptions added. This is often more than what is necessary. Plus Khronos has very many extensions if we assume that all of them are added in clang it will become scalability and maintanance nightmare. Most of the extensions only add functions so if we provide a way to add macros for those outside of clang code base it will keep the common code clean and vendors can be more flexible in adding the extensions without the need to modify upstream code if they need to. I see it as an opportunity to improve common code and out of tree implementations too. It just need a little bit of restructuring.

My understanding is that both the macro and working #pragma directive is necessary.

I don't believe this is the correct interpretation. If you look at the extension spec s9.1 it says:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string. This #define would be available in the language if and only if the extension is supported on a given implementation.

This is only one part indicating the availability of extension. It's not describing what's necessary to observe extended behaviour. The specs state that the initial state of the compiler is as if #pragma OPENCL EXTENSION all : disable was issued.
This means that functions introduced by extensions cannot be present, and the symbol names are available for application use unless such symbols were previously reserved.

It does not say that the pragma is needed, it only says that macro is needed. That perfectly makes sense because the macro allows to check that the extension is present to implement certain functionality conditionally.

This cited line only talks about extensions availability, not extension use. One example of newly introduced functions is in cl_khr_in64_base_atomics. The specs say:

The optional extensions cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics
implement atomic operations on 64-bit signed and unsigned integers to locations in __global
and __local memory. .
An application that wants to use any of these extensions will **need** to include the #pragma
OPENCL EXTENSION cl_khr_int64_base_atomics : enable or #pragma
OPENCL EXTENSION cl_khr_int64_extended_atomics : enable directive in
the OpenCL program source

(emphasis mine) The use of pragma is not voluntary, but necessary for the functions to become available.

OpenCL spec however never clarified the use of pragma that technically makes sense because the pragmas in C languages are used for altering the standard behavior that can't be otherwise achieved using standard parsing i.e. C99 section 6.10.1 says about non-STDC pragmas:

The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any such pragma that is not recognized by the implementation is ignored.

So C99 only regulates behavior of STDC pragmas and for those it explicitly says how the behavior of the parsed program is altered.

Technically OpenCL pragmas are not covered neither in OpenCL C and not C99 and therefore it is unclear what the implementation should do. However, with time due to the absence of the clarification mutiple interpretations appeared. Sadly some of them ended up in a very suboptimal state both for the tooling and the application developers because the pragma started to be added for no reason or for controlling redundant diagnostics of use of types or functions that extensions were introducing. If you are interested in more details I suggest to check this issue: https://github.com/KhronosGroup/OpenCL-Docs/issues/82

The configuration bit is only needed if clang changes behaviour, which is a separate question.
I'd also argue that new third party extensions need an API call to register new extensions in order to get a working pragma mechanism.

Can you explain this please?

explained above, extensions shouldn't introduce new symbols without being enabled. applications are free to use any symbol names that are not reserved by the specs.
If my application decides to implement swap_atomically it's perfectly free to do so and should continue to work even if someone later decides to add swap_atomically as extension function.
Requiring such an application to be updated to check for new_extension macro breaks backward compatibility.

Even if an extension only adds access new functions, pragma should control if user functions with conflicting names are legal or not.

for example, a program can implement function new_func, which gets later added to an extension. since the program doesn't know about the new extension it doesn't use #pragma new_extension:enable and continues to work correctly.

This is absolutely not how it works or what the pragma currently does or even can do. If you want such a thing to work you need to introduce a concept of a namespace into OpenCL C, so every extension is implemented as a separate namespace to avoid name clushing. I do acknowledge however that it would be extremly useful and convenient but it is not how C parsers work. And I imagine it is not trivial to extend this but probably not impossible.

https://godbolt.org/z/d8c5fb

If you look at this example the function from the extension have already been added into the list the symbal tables. The pragma doesn't undo or remove them or make them invisible. There is no machnism to do this easily and the proper solution would be to use namespaces but it is only available in C++.

Namespaces are not necessary. the behaviour is easily implementable using defines:
#pragma abcdef: enable introduces #define exported_function __reserved_name_function
#pragma abcdef: disable removes such definitions.

applications cannot legally use names that begin with two underscores, so there's no risk of name clash

all identifiers that being with an underscore and a capital letter or another underscore are always reserved for any use (C99 7.1.3)

The following names are reserved for use as keywords in OpenCL C and shall not be used
otherwise.
 * Names reserved as keywords by C99.

Opencl-1.1 6.1.9

If the new extension exposes new_func unconditionally, the program breaks, because it doesn't check for a macro that didn't exist at the time it was written.
more recent programs can use ifdef to either use new_func provided by the extension, or implement a custom version.

Yes, this is the intent of defining the extension macros.

And it misses the point. legacy application cannot check for macros that didn't exist at the time of writing.
requiring applications to be continually adjusted as new extensions become available breaks backward compatibility.

I didn't find much about embedded program behavior in full profile implementation in the specs.
It only says that "embedded profile is a subset" which imo implies that legal embedded profile programs should work correctly in a full profile implementation.
This implies that cles_* pragmas should continue to work even if the behavior is always supported.

If there is no proper specification for this I think adding it to clang contradict the development policy http://clang.llvm.org/get_involved.html

A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.

Both cles_khr_int64 and cles_khr_2d_image_array_writes are described in opencl 1.1 and 1.2 specs.
Applications that use opencl 1.2 follow those specs, and clang's std=cl1.2 should follow cl 1.2 specs.
Requiring applications to be continually rewritten as new cl versions become available breaks backward compatibility.

I'd phrase it differently. extensions that modify the language should be included even if clang decides not to implement given language modification.

Why do we need to add what we don't implement? I don't think this complies to the clang development policy either as we don't support incomplete implementations.

The extension is already implemented (char/short dereferences work). it just needs to be exposed to applications performing correct steps to enable extended behaviour.
The same reason why cl_khr_fp64 is still listed in the extensions list even though it stopped being an extension in cl1.2.

I think it's perfectly fine to always allow dereferencing of char/short types, as long as correctly written CL1.0 programs that use #pragma OPENCL cl_khr_byte_addressable_store: enable don't produce extra errors/warnings.
Full profile is a superset of embedded profile so the cles_* behaviours should be available and #pragma should work as well.

This would suggest that there should be a list of no-op extensions that are only present for backward/embedded compatibility and don't change behaviour since the extended version of language behaviour is always supported.

What do the extensions do? If they do nothing why do they need to be in clang? Once again if anyone needs to define a macro, this can be done outside of clang code base. OpenCLExtensions.def is not meant to be used for adding the macros as it does a lot more than this. Using OpenCLExtensions.def for only for adding a macro is a hack to me.

The presence of these extensions enables correct functionality of procedures that enable extended behaviour (use of pragmas). Irrespective of whether the same procedure is required in later versions or not.
Correctly written OCL 1.0 applications are required to use legacy pragmas, removing them breaks backward compatibility.

Whether it's implemented as a no-op extension, or just a list of pragma names that shouldn't issue warning or error is an implementation detail.

note that my concerns are purely wrt. spec wording and intentions. Given that ocl didn't see wider use until later versions, such legacy applications might not exist.

I disagree on that there are more OpenCL 1.x conformant devices than OpenCL 2.x. You can see by the adoption list.

sure. which makes it even more puzzling that you don't want to consider cl1.x specs.

feel free to reject them as purely academic, but the point is that changes in this revision break backward compatibility (which is always a maintenance burden).

I think we already agreed earlier to leave certain extensions in for now but add a comment to address this later. See:

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

This is IMO completely misguided. The policy wrt to applications that follow older spec should be clearly stated and adhered to, giving enough time for any affected users to adapt their workloads in case there are changes.
An ill-defined 'benefit of the community' just means that old applications will get broken unless they actively watch for clang development. at which point it's easier to update legacy apps.

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

I though the current behaviour of 'you can use #pragma, but the dereferences are always legal' was intentional for backward compatibility.
I don't think there are programs that would set it to 'disabled' and expect compilation failure.

The initial state of the pragma is disabled, so if disabling the extension isn't supposed to do anything then I don't know why would anyone ever enable it?

My concern is that legacy apps would set '#pragma enabled' before using char/short. such behavior would produce warning/error if with this change applied.

Correct, it will compile with a warning but not fail to compile. I am willing to introduce a -W option (if not present already ) to suppress that warning if it helps with the clean up and backward compatibility. It might also open up opportunities for a wider clean up - removing pragma in extensions that currently require pragma for no good reason. I have written more details on this in https://github.com/KhronosGroup/OpenCL-Docs/pull/355#issuecomment-662679499

Adding a new option cannot be a solution as legacy applications were written before its introduction.
CL1.x applications need to continue to work without changes, anything else breaks backward compatibility.

Yes, we don't break backward compatibility of the specified behavior. This is never the intent.

The same arguments also apply to cles_khr_in64. It's illegal to use int64 types in embedded profile unless you first enable the extensions. Rather than removing it, cles_khr_2d_image_array_writes should be added to the extension list.

I don't think clang currently support embedded profile. Adding extension into the OpenCLOptions is pointless if it's not used. Do you plan to add any functionality for it? Defining macros can be done easily outside of clang source code or if it is preferable to do inside there is MacroBuilder::defineMacro available in the target hooks. Currently for every extension added into OpenCLExtensions.def there is a macro, a pragma and a field in OpenCLOptions added. This is often more than what is necessary. Plus Khronos has very many extensions if we assume that all of them are added in clang it will become scalability and maintanance nightmare. Most of the extensions only add functions so if we provide a way to add macros for those outside of clang code base it will keep the common code clean and vendors can be more flexible in adding the extensions without the need to modify upstream code if they need to. I see it as an opportunity to improve common code and out of tree implementations too. It just need a little bit of restructuring.

My understanding is that both the macro and working #pragma directive is necessary.

I don't believe this is the correct interpretation. If you look at the extension spec s9.1 it says:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions to the language must create a preprocessor #define that matches the extension name string. This #define would be available in the language if and only if the extension is supported on a given implementation.

This is only one part indicating the availability of extension. It's not describing what's necessary to observe extended behaviour. The specs state that the initial state of the compiler is as if #pragma OPENCL EXTENSION all : disable was issued.
This means that functions introduced by extensions cannot be present, and the symbol names are available for application use unless such symbols were previously reserved.

This is not written in the spec. The statement you quote only says that there is implicit #pragma OPENCL EXTENSION all : disable and this is all. It says nothing about functions!

It does not say that the pragma is needed, it only says that macro is needed. That perfectly makes sense because the macro allows to check that the extension is present to implement certain functionality conditionally.

This cited line only talks about extensions availability, not extension use. One example of newly introduced functions is in cl_khr_in64_base_atomics. The specs say:

The optional extensions cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics
implement atomic operations on 64-bit signed and unsigned integers to locations in __global
and __local memory. .
An application that wants to use any of these extensions will **need** to include the #pragma
OPENCL EXTENSION cl_khr_int64_base_atomics : enable or #pragma
OPENCL EXTENSION cl_khr_int64_extended_atomics : enable directive in
the OpenCL program source

(emphasis mine) The use of pragma is not voluntary, but necessary for the functions to become available.

No, once again is doesn't say this. It doesn't say anything of what happens. And btw we are diverging from the topic I feel. In this patch we are not covering those extensions. This will be more useful as a genral RFC or an issue in github.

OpenCL spec however never clarified the use of pragma that technically makes sense because the pragmas in C languages are used for altering the standard behavior that can't be otherwise achieved using standard parsing i.e. C99 section 6.10.1 says about non-STDC pragmas:

The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any such pragma that is not recognized by the implementation is ignored.

So C99 only regulates behavior of STDC pragmas and for those it explicitly says how the behavior of the parsed program is altered.

Technically OpenCL pragmas are not covered neither in OpenCL C and not C99 and therefore it is unclear what the implementation should do. However, with time due to the absence of the clarification mutiple interpretations appeared. Sadly some of them ended up in a very suboptimal state both for the tooling and the application developers because the pragma started to be added for no reason or for controlling redundant diagnostics of use of types or functions that extensions were introducing. If you are interested in more details I suggest to check this issue: https://github.com/KhronosGroup/OpenCL-Docs/issues/82

The configuration bit is only needed if clang changes behaviour, which is a separate question.
I'd also argue that new third party extensions need an API call to register new extensions in order to get a working pragma mechanism.

Can you explain this please?

explained above, extensions shouldn't introduce new symbols without being enabled. applications are free to use any symbol names that are not reserved by the specs.

There is nothing of this in the spec.

If my application decides to implement swap_atomically it's perfectly free to do so and should continue to work even if someone later decides to add swap_atomically as extension function.
Requiring such an application to be updated to check for new_extension macro breaks backward compatibility.

This is how it is and has always been. For that Khronos recommends not to use cl_-prefixed identifiers because they are likely to be used by extensions or future standards.

Even if an extension only adds access new functions, pragma should control if user functions with conflicting names are legal or not.

for example, a program can implement function new_func, which gets later added to an extension. since the program doesn't know about the new extension it doesn't use #pragma new_extension:enable and continues to work correctly.

This is absolutely not how it works or what the pragma currently does or even can do. If you want such a thing to work you need to introduce a concept of a namespace into OpenCL C, so every extension is implemented as a separate namespace to avoid name clushing. I do acknowledge however that it would be extremly useful and convenient but it is not how C parsers work. And I imagine it is not trivial to extend this but probably not impossible.

https://godbolt.org/z/d8c5fb

If you look at this example the function from the extension have already been added into the list the symbal tables. The pragma doesn't undo or remove them or make them invisible. There is no machnism to do this easily and the proper solution would be to use namespaces but it is only available in C++.

Namespaces are not necessary. the behaviour is easily implementable using defines:
#pragma abcdef: enable introduces #define exported_function __reserved_name_function
#pragma abcdef: disable removes such definitions.

applications cannot legally use names that begin with two underscores, so there's no risk of name clash

I don't understand your proposal but more importantly this is not the right place to discuss it. You are more than welcome to send it to Clang community and to Khronos for the future standards. Right now we talk about what exists in the specification.

all identifiers that being with an underscore and a capital letter or another underscore are always reserved for any use (C99 7.1.3)

The following names are reserved for use as keywords in OpenCL C and shall not be used
otherwise.
 * Names reserved as keywords by C99.

Opencl-1.1 6.1.9

If the new extension exposes new_func unconditionally, the program breaks, because it doesn't check for a macro that didn't exist at the time it was written.
more recent programs can use ifdef to either use new_func provided by the extension, or implement a custom version.

Yes, this is the intent of defining the extension macros.

And it misses the point. legacy application cannot check for macros that didn't exist at the time of writing.
requiring applications to be continually adjusted as new extensions become available breaks backward compatibility.

I didn't find much about embedded program behavior in full profile implementation in the specs.
It only says that "embedded profile is a subset" which imo implies that legal embedded profile programs should work correctly in a full profile implementation.
This implies that cles_* pragmas should continue to work even if the behavior is always supported.

If there is no proper specification for this I think adding it to clang contradict the development policy http://clang.llvm.org/get_involved.html

A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.

Both cles_khr_int64 and cles_khr_2d_image_array_writes are described in opencl 1.1 and 1.2 specs.
Applications that use opencl 1.2 follow those specs, and clang's std=cl1.2 should follow cl 1.2 specs.
Requiring applications to be continually rewritten as new cl versions become available breaks backward compatibility.

I'd phrase it differently. extensions that modify the language should be included even if clang decides not to implement given language modification.

Why do we need to add what we don't implement? I don't think this complies to the clang development policy either as we don't support incomplete implementations.

The extension is already implemented (char/short dereferences work). it just needs to be exposed to applications performing correct steps to enable extended behaviour.
The same reason why cl_khr_fp64 is still listed in the extensions list even though it stopped being an extension in cl1.2.

cl_khr_fp64 is properly and fully implemented in Clang. It is not in the category being discussed here. You are conflating very different situations.

I think it's perfectly fine to always allow dereferencing of char/short types, as long as correctly written CL1.0 programs that use #pragma OPENCL cl_khr_byte_addressable_store: enable don't produce extra errors/warnings.
Full profile is a superset of embedded profile so the cles_* behaviours should be available and #pragma should work as well.

This would suggest that there should be a list of no-op extensions that are only present for backward/embedded compatibility and don't change behaviour since the extended version of language behaviour is always supported.

What do the extensions do? If they do nothing why do they need to be in clang? Once again if anyone needs to define a macro, this can be done outside of clang code base. OpenCLExtensions.def is not meant to be used for adding the macros as it does a lot more than this. Using OpenCLExtensions.def for only for adding a macro is a hack to me.

The presence of these extensions enables correct functionality of procedures that enable extended behaviour (use of pragmas). Irrespective of whether the same procedure is required in later versions or not.
Correctly written OCL 1.0 applications are required to use legacy pragmas, removing them breaks backward compatibility.

Whether it's implemented as a no-op extension, or just a list of pragma names that shouldn't issue warning or error is an implementation detail.

note that my concerns are purely wrt. spec wording and intentions. Given that ocl didn't see wider use until later versions, such legacy applications might not exist.

I disagree on that there are more OpenCL 1.x conformant devices than OpenCL 2.x. You can see by the adoption list.

sure. which makes it even more puzzling that you don't want to consider cl1.x specs.

Have I ever written that? Right now we are fixing incorrect behavior in the compiler that is not supported by anything in the spec.

feel free to reject them as purely academic, but the point is that changes in this revision break backward compatibility (which is always a maintenance burden).

I think we already agreed earlier to leave certain extensions in for now but add a comment to address this later. See:

I am ok to leave the extensions that could hypotetically modify the language out of this patch for now. Perhaps we could add a comment explaining that they are unused and only left for backwards compatibility. In a long term we need to find some ways to remove the legacy that doesn't bring any benefit to the community. Maybe I will write another RFC and ask vendors to reply in the mainling list if they use those and plan to either complete the functionality upstream or remove them.

This is IMO completely misguided. The policy wrt to applications that follow older spec should be clearly stated and adhered to, giving enough time for any affected users to adapt their workloads in case there are changes.
An ill-defined 'benefit of the community' just means that old applications will get broken unless they actively watch for clang development. at which point it's easier to update legacy apps.

We do and will always provide backwards compatibiliy in accordance with the standard. If developers however implemented the applications based on the behavior that the spec doesn't document there is no guarantee provided for such cases. If developers need such a guarantee they have to submit bugs to implementation or/and ask for the spec clarifications to make sure they only rely on the behavior that is written in the spec explicitly

I don't want to stop the wider discussion, that being said I think I've addressed the comment regarding the content of this PR. Let me know if the latest version is fine or needs further addressing. Thanks.

I don't want to stop the wider discussion, that being said I think I've addressed the comment regarding the content of this PR. Let me know if the latest version is fine or needs further addressing. Thanks.

Thank you for providing comprehensive information regarding the extensions being removed. Since there are no kernel language changes in those I still see no reason to keep them in clang!

However, regarding the discussion about the general extension mechanisms for the language changes, @jvesely would you be willing to describe your vision regarding extensions pragma on this issue https://github.com/KhronosGroup/OpenCL-Docs/issues/82? Or otherwise, would you be ok if I summarise it there? I think it would be useful to aggregate all the information before discussing future directions with Khronos.

Anastasia accepted this revision.Oct 20 2020, 9:21 AM

LGTM! Thanks!

This revision was not accepted when it landed; it landed in state Needs Review.Oct 22 2020, 9:01 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.