Page MenuHomePhabricator

OpenCL: Include builtin header by default
AbandonedPublic

Authored by arsenm on Apr 27 2020, 6:25 PM.

Details

Summary

Nonsensically, the default header is not included by default. Users
are using the cc1 flag to compile OpenCL programs, which should not be
required. Switch the default, and allow -nostdinc to disable the
include.

I'm somewhat confused by the way this is supposed to work, or why a
separate option exists for OpenCL. The other language flags seem to be
implemented in the driver, not cc1 like OpenCL. I'm not sure this is
the right set of flag naming decisions, or if we really need the
fno-include-default-header cc1 flag.

Diff Detail

Event Timeline

arsenm created this revision.Apr 27 2020, 6:25 PM

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

-nogpulib is fine since opencl compiler is in parallel with the device compiler of CUDA/HIP. The library it uses is the device library.

clang/lib/Frontend/CompilerInvocation.cpp
2597

should use hasFlag in case there are multiple occurrences of both options.

I'm somewhat confused by the way this is supposed to work, or why a
separate option exists for OpenCL. The other language flags seem to be
implemented in the driver, not cc1 like OpenCL. I'm not sure this is
the right set of flag naming decisions, or if we really need the
fno-include-default-header cc1 flag.

Originally we didn't want to include the header by default for 2 reasons:

  1. Many vendors used their own header.
  2. It takes long time to parse the header.

However, we might be in a different position now because the header exists for long time and I think we have fixed all the issues so it should be recommended to use it instead of vendor's header. It is generally good to include it by default for the upstream user. However, I am still concerned with the parsing time. To mitigate this we have developed an alternative way to include the functions utilizing TableGen:
https://llvm.org/devmtg/2019-10/talk-abstracts.html#lit5
Unfortunately, we are not that far with testing yet to be able to use it by default. But hopefully we should be able to switch to is as a default option at some point soon.

To summarize I am generally not against the inclusion of the header by default and I think it's best if we just reuse the standard non-OpenCL specific flags. We would just need to understand how it will work with the fast TableGen-based solution that is enabled using -fdeclare-opencl-builtins. I am looping in @svenvh to discuss this further. Sven, do you think we should rename this flag due to current proposed change? Should it be moved out of -cc1 too?

Originally we didn't want to include the header by default for 2 reasons:

  1. Many vendors used their own header.
  2. It takes long time to parse the header.

However, we might be in a different position now because the header exists for long time and I think we have fixed all the issues so it should be recommended to use it instead of vendor's header. It is generally good to include it by default for the upstream user. However, I am still concerned with the parsing time. To mitigate this we have developed an alternative way to include the functions utilizing TableGen:
https://llvm.org/devmtg/2019-10/talk-abstracts.html#lit5
Unfortunately, we are not that far with testing yet to be able to use it by default. But hopefully we should be able to switch to is as a default option at some point soon.

The main thing that's missing from the TableGen approach is support for builtins taking enum arguments, and then there is completeness testing as you mentioned. So I think it is not ready yet to become the default way of handling OpenCL builtins.

To summarize I am generally not against the inclusion of the header by default and I think it's best if we just reuse the standard non-OpenCL specific flags.

I agree.

We would just need to understand how it will work with the fast TableGen-based solution that is enabled using -fdeclare-opencl-builtins.

The combination of -fdeclare-opencl-builtins and finclude-default-header currently causes a smaller include file (that is fast to parse) to be included. So we should include the small header by default for -fdeclare-opencl-builtins.

I am looping in @svenvh to discuss this further. Sven, do you think we should rename this flag due to current proposed change? Should it be moved out of -cc1 too?

I would leave -fdeclare-opencl-builtins as it is, until it is complete and taken out of its "experimental" status. Then we can discuss if we want to use TableGen instead of the header as a default.

Anastasia added a comment.EditedApr 28 2020, 6:07 AM

I am looping in @svenvh to discuss this further. Sven, do you think we should rename this flag due to current proposed change? Should it be moved out of -cc1 too?

I would leave -fdeclare-opencl-builtins as it is, until it is complete and taken out of its "experimental" status. Then we can discuss if we want to use TableGen instead of the header as a default.

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

I am looping in @svenvh to discuss this further. Sven, do you think we should rename this flag due to current proposed change? Should it be moved out of -cc1 too?

I would leave -fdeclare-opencl-builtins as it is, until it is complete and taken out of its "experimental" status. Then we can discuss if we want to use TableGen instead of the header as a default.

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

Is my understanding correct? We should reflect this is the User Manual.

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

Is my understanding correct? We should reflect this is the User Manual.

I think -finclude-default-header -fno-include-default-header better becomes both driver and -cc1 option since users are likely to use it with driver. It may also be used with other languages e.g. HIP.

I'm somewhat confused by the way this is supposed to work, or why a
separate option exists for OpenCL. The other language flags seem to be
implemented in the driver, not cc1 like OpenCL. I'm not sure this is
the right set of flag naming decisions, or if we really need the
fno-include-default-header cc1 flag.

Originally we didn't want to include the header by default for 2 reasons:

  1. Many vendors used their own header.
  2. It takes long time to parse the header.

I think this is backwards reasoning. Upstream clang should compile OpenCL as expected by default. If vendors want to use some other header, it should be a downstream responsibility to turn off the default header and include something else

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

The name of -fdeclare-opencl-builtins becomes a bit non-intuitive, a more intuitive name would be something like -fopencl-tablegen-builtins perhaps or -fopencl-fast-builtins if we don't want to mention TableGen. But since it is still an experimental option, I have no objections against postponing the renaming until the option has reached maturity.

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

The name of -fdeclare-opencl-builtins becomes a bit non-intuitive, a more intuitive name would be something like -fopencl-tablegen-builtins perhaps or -fopencl-fast-builtins if we don't want to mention TableGen. But since it is still an experimental option, I have no objections against postponing the renaming until the option has reached maturity.

I vote for -fopencl-fast-builtins.

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

Is my understanding correct? We should reflect this is the User Manual.

I think -finclude-default-header -fno-include-default-header better becomes both driver and -cc1 option since users are likely to use it with driver. It may also be used with other languages e.g. HIP.

If we change -finclude-default-header to driver option, what would it do since the header is added by default?

Unless as @arsenm pointed out we want to get rid of custom OpenCL options and just use standard ones for C/C++...

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

-nogpulib is fine since opencl compiler is in parallel with the device compiler of CUDA/HIP. The library it uses is the device library.

OpenCL can target other devices than GPUs, including CPUs and FPGAs, referring to gpulibs wrt opencl is a misnomer.

It would be nice to have some clarity as to how OpenCL is handled wrt clang frontend vs. clang driver.
OpenCL options are currently split between the two (e.g. cl-denorms-are-zero is only available in the driver and not the frontend)
There are 3 implementations of CL headers, two in clang which might or might not be included by default, and the 3rd one in libclc.

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

-nogpulib is fine since opencl compiler is in parallel with the device compiler of CUDA/HIP. The library it uses is the device library.

OpenCL can target other devices than GPUs, including CPUs and FPGAs, referring to gpulibs wrt opencl is a misnomer.

It would be nice to have some clarity as to how OpenCL is handled wrt clang frontend vs. clang driver.
OpenCL options are currently split between the two (e.g. cl-denorms-are-zero is only available in the driver and not the frontend)
There are 3 implementations of CL headers, two in clang which might or might not be included by default, and the 3rd one in libclc.

Thanks for the feedback. That's very useful. Some of those I believe are just bugs and we should probably report them through bugzilla and fix them. But others I believe are due to the lack of documentation. indeed. That we should definitely keep improving.

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

-nogpulib is fine since opencl compiler is in parallel with the device compiler of CUDA/HIP. The library it uses is the device library.

OpenCL can target other devices than GPUs, including CPUs and FPGAs, referring to gpulibs wrt opencl is a misnomer.

It makes sense to have a separate flag from -nostdiinc for CUDA/HIP/OpenMP since you have to consider both the host and offload target code.

It would be nice to have some clarity as to how OpenCL is handled wrt clang frontend vs. clang driver.
OpenCL options are currently split between the two (e.g. cl-denorms-are-zero is only available in the driver and not the frontend)

Users shouldn't ever need to concern themselves with cc1 flags. That's part of the issue with not including the header by default today. Only the user facing driver should be guaranteed to have all the spec flags

There are 3 implementations of CL headers, two in clang which might or might not be included by default, and the 3rd one in libclc.

I think the libclc header should just be removed

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

The name of -fdeclare-opencl-builtins becomes a bit non-intuitive, a more intuitive name would be something like -fopencl-tablegen-builtins perhaps or -fopencl-fast-builtins if we don't want to mention TableGen. But since it is still an experimental option, I have no objections against postponing the renaming until the option has reached maturity.

I vote for -fopencl-fast-builtins.

The main problem I'm trying to solve is that users shouldn't need to concern themselves with including any header, much less a choice between two internal implementation details. Switching the two internal header implementations seems OK to leave as a cc1 flag. The main question I think is what the spelling of how to disable whatever default header was chosen, and what the internal flags to switch the implementation look like.

So what's the agreement on spelling? Use -nostdinc/-nostdlib? Have 2 different cc1 flags for the header implementation switch? Surface/invert -fno-include-default-header?

Would it not become confusing since the builtins are going to be included by default? Should we rename the flag at least? Also ideally it should be documented in https://clang.llvm.org/docs/UsersManual.html#opencl-header

ah I guess if we leave it under -cc1 we will have the command line interface as follows:

  • Driver (without -cc1) adds OpenCL header by default that can be overridden by the flag added in this patch.
  • Frontend (with -cc1) doesn't add the header by default but there are two flags -fdeclare-opencl-builtins or -finclude-default-header that allow to include the header.

The name of -fdeclare-opencl-builtins becomes a bit non-intuitive, a more intuitive name would be something like -fopencl-tablegen-builtins perhaps or -fopencl-fast-builtins if we don't want to mention TableGen. But since it is still an experimental option, I have no objections against postponing the renaming until the option has reached maturity.

I vote for -fopencl-fast-builtins.

The main problem I'm trying to solve is that users shouldn't need to concern themselves with including any header, much less a choice between two internal implementation details. Switching the two internal header implementations seems OK to leave as a cc1 flag. The main question I think is what the spelling of how to disable whatever default header was chosen, and what the internal flags to switch the implementation look like.

So what's the agreement on spelling? Use -nostdinc/-nostdlib? Have 2 different cc1 flags for the header implementation switch? Surface/invert -fno-include-default-header?

I like the idea of reusing -nostdinc/-nostdlib, although we don't link to libs by default so I guess -nostdlib is useless? Then do we want to keep -finclude-default-header to be used for the frontend? As for the TableGen header I guess we can leave it under frontend only flag for now but I would suggest to rename to something more distinct i.e. -fopencl-fast-builtins?

@arsenm I would like to see if we can finalize this patch soon. Do you think you will have a capacity for this in the next weeks? Otherwise I would be happy to help too.

It looks in a good shape, we just need to decide if we go ahead with the new flag you are adding or reuse -nostdinc. Do you have any preference yourself?

arsenm added a comment.Feb 2 2021, 7:42 AM

@arsenm I would like to see if we can finalize this patch soon. Do you think you will have a capacity for this in the next weeks? Otherwise I would be happy to help too.

I don't think I'll have time soon, you can take over

It looks in a good shape, we just need to decide if we go ahead with the new flag you are adding or reuse -nostdinc. Do you have any preference yourself?

Not really sure. No real preference

@arsenm I would like to see if we can finalize this patch soon. Do you think you will have a capacity for this in the next weeks? Otherwise I would be happy to help too.

I don't think I'll have time soon, you can take over

It looks in a good shape, we just need to decide if we go ahead with the new flag you are adding or reuse -nostdinc. Do you have any preference yourself?

Not really sure. No real preference

Thanks, after brainstorming more with other developers I have decided to change the direction slightly that I have explained in the RFC:
https://lists.llvm.org/pipermail/cfe-dev/2021-February/067610.html

I believe this achieves similar goal but it also addresses the long parsing speed that has always been a concern for the default header within the OpenCL community.

Let me know if you have any feedback and either way we decide to go your patch will be a good base point.

jansvoboda11 resigned from this revision.Mar 5 2021, 1:29 AM
arsenm abandoned this revision.May 11 2021, 5:19 PM