This is an archive of the discontinued LLVM Phabricator instance.

[SYCL] Implement SYCL address space attributes handling
ClosedPublic

Authored by bader on Oct 21 2020, 1:34 PM.

Details

Summary

Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCL named address spaces sycl_global, sycl_local and
sycl_private defined as sub-sets of the default address space.

Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
bader added inline comments.Feb 1 2021, 7:50 AM
clang/lib/CodeGen/TargetInfo.cpp
9996 ↗(On Diff #319566)

@aaron.ballman, suggests we call out auto here.

I took this code from AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace.

I also found a few other places where getConstantAddressSpace is called and authors use auto: CodeGenModule::getStringLiteralAddressSpace castStringLiteralToDefaultAddressSpace and createReferenceTemporary.

@rjmccall, should I update all these instances? I can create an NFC patch for changes outside of this patch and push it before this change to the style consistent across the code base.

If you prefer to continue this route then you should just document your dialect sematic somewhere publicly accessible and avoid aliasing to OpenCL, or target address spaces. So it would become a proper language dialect implementation.

I've added a section to our internal design document about address space handling in SYCL mode: pull request (more readable version). Please, take a look and let me know if it can be improved.

The documentation looks like a good starting point but it would be better to add it to clang docs, then we can also refine the details during the review. For now I can tell that the following points are too high-level and unclear:

To utilize existing clang's functionality, we re-use following OpenCL address space attributes in SYCL mode with modified semantic rules and unmodified parser and IR generation components.

You should provide more details on the difference to OpenCL address spaces - in particular looking at your patch I don't think this statement is correct "unmodified <...> IR generation components."

SYCL mode allows conversion from annotated pointer to unannotated pointer to enable integration of accelerated code with standard C++ code.

I don't understand what integration you mean. Also C++ has many different kinds of conversions, which conversions do you mean exactly?

Otherwise, the overall design of address spaces seems a lot more clear now.

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

  1. How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.
  2. Why is changing of `the address space map necessary for SPIR but not the other targets?

I also think it would be good if @rjmccall could look at the Codegen changes.

clang/include/clang/AST/Type.h
498

This should be done in the method above.

clang/include/clang/Sema/ParsedAttr.h
624

Do you still plan to resolve this?

clang/lib/Basic/Targets/SPIR.h
73–77

I am talking about the llvm-project btw. You can't just customize it to any arbitrary GitHub project as there has to be some core design flow to be respected.

bader added a subscriber: svenvh.Feb 1 2021, 1:37 PM

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

We use alternative map to not impact OpenCL language modes.

I provided clang lit tests results for changing the default map below. Most of the failures are caused by valid changes in LLVM IR and can be fixed by updating FileCheck comments, but there are a few crashes in CodeGen for "C++ for OpenCL" mode (with the root cause similar t o https://bugs.llvm.org/show_bug.cgi?id=45472). AFAIK, it's the only blocker for using unified map for any language mode. @svenvh, do you think we can address https://bugs.llvm.org/show_bug.cgi?id=45472 and update SPIRAddrSpaceMap (i.e. SPIRAddrSpaceMap[0] <- 4)?

Failed Tests (22):

Clang :: CodeGen/ext-int-cc.c
Clang :: CodeGen/spir-half-type.cpp
Clang :: CodeGenCXX/builtin-calling-conv.cpp
Clang :: CodeGenOpenCL/addr-space-struct-arg.cl
Clang :: CodeGenOpenCL/atomic-ops-libcall.cl
Clang :: CodeGenOpenCL/blocks.cl
Clang :: CodeGenOpenCL/intel-subgroups-avc-ext-types.cl
Clang :: CodeGenOpenCL/opencl_types.cl
Clang :: CodeGenOpenCL/partial_initializer.cl
Clang :: CodeGenOpenCL/private-array-initialization.cl
Clang :: CodeGenOpenCL/sampler.cl
Clang :: CodeGenOpenCL/vectorLoadStore.cl
Clang :: CodeGenOpenCLCXX/address-space-castoperators.cpp
Clang :: CodeGenOpenCLCXX/address-space-deduction.cl
Clang :: CodeGenOpenCLCXX/addrspace-derived-base.cl
Clang :: CodeGenOpenCLCXX/addrspace-of-this.cl
Clang :: CodeGenOpenCLCXX/addrspace-operators.cl
Clang :: CodeGenOpenCLCXX/addrspace-references.cl
Clang :: CodeGenOpenCLCXX/addrspace-with-class.cl
Clang :: CodeGenOpenCLCXX/atexit.cl
Clang :: CodeGenOpenCLCXX/template-address-spaces.cl
Clang :: Index/pipe-size.cl
clang/include/clang/AST/Type.h
498

Will do.

clang/include/clang/Sema/ParsedAttr.h
624

Do you still plan to resolve this?

If re-using OpenCL ParsedAttr is okay, I can just remove this comment. Do you think we should add new ParsedAttr for SYCL?

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"? If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that? Certainly this is not how LLVM was designed. LLVM expects the same address spaces to be used with one target for all languages.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references. Default address space is where objects are put by default - for example in C++ everything is mapped to this address space. I don't see why you just replace Default by OpenCL generic. It has been added for a very different purpose.

bader updated this revision to Diff 320741.Feb 2 2021, 4:05 AM

Applied code review suggestions from Anastasia.

bader marked an inline comment as done.Feb 2 2021, 5:16 AM

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

The mapping has been discussed in this comment: https://github.com/intel/llvm/pull/1039/#discussion_r369667791.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"?

LLVM has it's own sematic which toolchains rely on. Toolchains consuming LLVM IR doesn't know if this LLVM IR was emitted by C++ compiler or written manually - they can rely only on LLVM IR semantic defined by https://llvm.org/docs/LangRef.html.

If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that?

Technically clang changes SPIR target mapping for Default address space only for sycldevice environment component of the target triple, so it doesn't depend on the input language i.e. Default is mapped to the same LLVM address space for C++ and SYCL.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

This is unfortunate. Do you know if there are any plans to provide up-to-date documentation? It's critical for non-clang front-ends targeting SPIR-V format.

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references.

I agree with that, but I don't get how it's related to the mapping language AS to LLVM AS.

Default address space is where objects are put by default - for example in C++ everything is mapped to this address space. I don't see why you just replace Default by OpenCL generic. It has been added for a very different purpose.

SPIR-V Generic is the only address space allowing us to preserve correct semantic of the program (possibly global can be used, but we didn't investigate this option due to obvious performance implications). The objects from Default AS might be allocated in any physical AS, so LLVM IR AS 4 allows LLVM compiler to infer it using InferAddressSpace pass.

bader updated this revision to Diff 320761.Feb 2 2021, 5:46 AM

Fixed a couple of typos in the comments; NFC.

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

The mapping has been discussed in this comment: https://github.com/intel/llvm/pull/1039/#discussion_r369667791.

The discussion you cite is in the separate Github project, but you should have a discussion using LLVM channels about the design you propose for llvm-project repo. Also I don't think citing resources with many comments and no conclusions is making the review process very productive.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"?

LLVM has it's own sematic which toolchains rely on. Toolchains consuming LLVM IR doesn't know if this LLVM IR was emitted by C++ compiler or written manually - they can rely only on LLVM IR semantic defined by https://llvm.org/docs/LangRef.html.

If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that?

Technically clang changes SPIR target mapping for Default address space only for sycldevice environment component of the target triple, so it doesn't depend on the input language i.e. Default is mapped to the same LLVM address space for C++ and SYCL.

If you are suggesting to add SYCL as a component for other languages supported by clang this deserves a separate review probably with different reviewers and a wider discussion with the community.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

This is unfortunate. Do you know if there are any plans to provide up-to-date documentation? It's critical for non-clang front-ends targeting SPIR-V format.

I am not aware of any plan. FYI SPIR has not been added or designed for SPIR-V. It is only used for SPIR-V as a workaround by some external projects but it is not the only use case even if perhaps an important one for the community.

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references.

I agree with that, but I don't get how it's related to the mapping language AS to LLVM AS.

Every language address space has its own semantic that should be mapped appropriately to LLVM IR to make sure the semantic is preserved.

OpenCL generic is a concept governed by OpenCL C spec cited above

It can be used with pointer types and it represents a placeholder for any of the named address spaces - global, local or private. It signals that a pointer points to an object in one of these concrete named address spaces. The exact address space resolution can occur dynamically during the kernel execution.

and it is completely different to the concept of Default that is a flat address space of C and C++ as per Embedded C s5.1.1

For the purpose of this Technical Report, the C language is extended to support additional address spaces.  When not specified otherwise, objects are allocated by default in a generic address space, which corresponds to the single address space of ISO/IEC 9899:1999.  In addition to the generic address space, an implementation may support other, named address spaces.

I hope you see the difference between the two.

As SPIR is a virtual target, Default and OpenCL generic have distinct values because you need to preserve their semantic until the mapping to the physical device. I don't see why these two concepts should have the same logical mappping? For the generality tools consuming SPIR LLVM IR should have information for both address spaces and then decide how to represent or map those to a physical device. Then on a physical device target it makes sense that some or even all address spaces are mapped to the same value representing physical segments.

Default address space is where objects are put by default - for example in C++ everything is mapped to this address space. I don't see why you just replace Default by OpenCL generic. It has been added for a very different purpose.

SPIR-V Generic is the only address space allowing us to preserve correct semantic of the program (possibly global can be used, but we didn't investigate this option due to obvious performance implications).

FYI LLVM doesn't support SPIR-V yet, but nevertheless SPIR-V generic has the same semantic as OpenCL generic but it is not the same as Default in clang.

The objects from Default AS might be allocated in any physical AS, so LLVM IR AS 4 allows LLVM compiler to infer it using InferAddressSpace pass.

As far as I am aware you can also use any value with InferAddressSpace pass not only 4.

bader added a comment.Feb 3 2021, 8:08 AM

Anastasia added a comment.
In D89909#2536331 https://reviews.llvm.org/D89909#2536331, @bader wrote:

In D89909#2536157 https://reviews.llvm.org/D89909#2536157, @Anastasia wrote:

In D89909#2534790 https://reviews.llvm.org/D89909#2534790, @bader wrote:

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

The mapping has been discussed in this comment: https://github.com/intel/llvm/pull/1039/#discussion_r369667791.

The discussion you cite is in the separate Github project, but you should have a discussion using LLVM channels about the design you propose for llvm-project repo. Also I don't think citing resources with many comments and no conclusions is making the review process very productive.

Sorry, I forgot to mentioned that this change was also discussed in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html.
There are no objections from the community.

What is not covered by this thread and probably worth to discuss separately is changing OpenCL private mapping to non-zero LLVM AS to avoid the mangler component change. I expect significant impact on SPIR LLVM IR consumers and long transition period if the change is accepted.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"?

LLVM has it's own sematic which toolchains rely on. Toolchains consuming LLVM IR doesn't know if this LLVM IR was emitted by C++ compiler or written manually - they can rely only on LLVM IR semantic defined by https://llvm.org/docs/LangRef.html.

If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that?

Technically clang changes SPIR target mapping for Default address space only for sycldevice environment component of the target triple, so it doesn't depend on the input language i.e. Default is mapped to the same LLVM address space for C++ and SYCL.

If you are suggesting to add SYCL as a component for other languages supported by clang this deserves a separate review probably with different reviewers and a wider discussion with the community.

No, but clang doesn't couple input language with environment component of the target triple. As I mentioned above, community doesn't have any objections to this approach.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

This is unfortunate. Do you know if there are any plans to provide up-to-date documentation? It's critical for non-clang front-ends targeting SPIR-V format.

I am not aware of any plan. FYI SPIR has not been added or designed for SPIR-V. It is only used for SPIR-V as a workaround by some external projects but it is not the only use case even if perhaps an important one for the community.

I'm aware of SPIR target history and design - I was on the team contributed SPIR-1.2 support to clang, although I was involved into compiler back-end development.
We designed and contributed SPIR-1.2 support to clang aiming to provide a tool producing stable portable binary format for OpenCL programs.
SPIR format evolved from SPIR-1.2 to SPIR-V, which is not based on LLVM-based, but AFAIK the goal of SPIR target in clang tool has not changed and it is still to produce binary format for Khronos APIs (even though part of the toolchain is hosted outside of LLVM project).
It's great if there are other uses of SPIR target in LLVM project. Do you know where I can learn more about this?

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references.

I agree with that, but I don't get how it's related to the mapping language AS to LLVM AS.

Every language address space has its own semantic that should be mapped appropriately to LLVM IR to make sure the semantic is preserved.

OpenCL generic is a concept governed by OpenCL C spec cited above

It can be used with pointer types and it represents a placeholder for any of the named address spaces - global, local or private. It signals that a pointer points to an object in one of these concrete named address spaces. The exact address space resolution can occur dynamically during the kernel execution.

and it is completely different to the concept of Default that is a flat address space of C and C++ as per Embedded C s5.1.1

For the purpose of this Technical Report, the C language is extended to support additional address spaces.  When not specified otherwise, objects are allocated by default in a generic address space, which corresponds to the single address space of ISO/IEC 9899:1999.  In addition to the generic address space, an implementation may support other, named address spaces.

I hope you see the difference between the two.

As SPIR is a virtual target, Default and OpenCL generic have distinct values because you need to preserve their semantic until the mapping to the physical device. I don't see why these two concepts should have the same logical mappping? For the generality tools consuming SPIR LLVM IR should have information for both address spaces and then decide how to represent or map those to a physical device. Then on a physical device target it makes sense that some or even all address spaces are mapped to the same value representing physical segments.

SPIR-V Generic is the only address space allowing us to preserve correct semantic of the program (possibly global can be used, but we didn't investigate this option due to obvious performance implications).

FYI LLVM doesn't support SPIR-V yet, but nevertheless SPIR-V generic has the same semantic as OpenCL generic but it is not the same as Default in clang.

I puzzled by existing mapping: Aren't Default and SPIR-V Function (or OpenCL private) semantically different address spaces?
I don't see what difference between Default and SPIR-V generic (or OpenCL generic) plays significant role at LLVM IR level.
Note that LLVM IR objects are not allocated neither in Default nor in Generic address spaces as described by John in this message.
This seems to be only property important for separate mapping of these two address spaces to LLVM IR.
Could you provide an example where mapping Default to SPIR-V generic might cause the problem, please? I have a problem with imagining such case, but I can be biased.

FWIW, real GPU targets map Default to Generic and as far as I concerned it doesn't cause problems.

The objects from Default AS might be allocated in any physical AS, so LLVM IR AS 4 allows LLVM compiler to infer it using InferAddressSpace pass.

As far as I am aware you can also use any value with InferAddressSpace pass not only 4.

Right, but we use SPIR target and it uses 4.

Sorry, I forgot to mentioned that this change was also discussed in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html.
There are no objections from the community.

I remember asking similar questions regarding the SYCL address space map on that thread, and I don't think I could get more than what we are discussing here. I generally don't think that RFC ended with any conclusions as a result this patch contained different approach to what I have suggested at the end.

SPIR format evolved from SPIR-1.2 to SPIR-V, which is not based on LLVM-based, but AFAIK the goal of SPIR target in clang tool has not changed and it is still to produce binary format for Khronos APIs (even though part of the toolchain is hosted outside of LLVM project).

I would like to highlight that SPIR was completely replaced by SPIR-V. I don't really see it as evolution. There were several attempts to add SPIR-V into LLVM and Clang - both of us were involved in some of those.

As of now the comments in the source code still says:

llvm/include/llvm/ADT/Triple.h:    spir,           // SPIR: standard portable IR for OpenCL 32-bit version
llvm/include/llvm/ADT/Triple.h:    spir64,         // SPIR: standard portable IR for OpenCL 64-bit version

It's great if there are other uses of SPIR target in LLVM project. Do you know where I can learn more about this?

I presume it is used for what SPIR was originally intended - a de-facto portable format? As a matter of fact not all OpenCL vendors support SPIR-V and we as a community won't and can't force this. If you want to learn more about use cases I would ask the community, cfe-dev could be a good start.

I puzzled by existing mapping: Aren't Default and SPIR-V Function (or OpenCL private) semantically different address spaces?

Yes they are. OpenCL private is just the most common case this is why mapping them to the same value often works.

Note that LLVM IR objects are not allocated neither in Default nor in Generic address spaces as described by John in this message.

Default can be used to allocate objects even if address spaces for alloca can be customized. There are other objects that are used with Default - arguments or return types. It is just so inherent in the parser. Most of objects use Default at the AST level even if not all of this ends up in the IR.

Could you provide an example where mapping Default to SPIR-V generic might cause the problem, please? I have a problem with imagining such case, but I can be biased.

Sorry I don't think I have enough time. But in general I think you should not base the design decision solely on the examples as both C and C++ are complex enough to overlook various corner cases. Your design should make sense conceptually and then the tooling will be correct by construction and easy to reason about.

FWIW, real GPU targets map Default to Generic and as far as I concerned it doesn't cause problems.

How do you prove this though?

Right, but we use SPIR target and it uses 4.

I guess I still don't get your architecture, but you could also look at adding a pass that remaps the address spaces or extend the pass to work with different default address spaces?

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

clang/include/clang/AST/Type.h
491
clang/lib/AST/ItaniumMangle.cpp
2350
clang/lib/Basic/Targets/SPIR.h
73–78
clang/lib/CodeGen/TargetInfo.cpp
9997 ↗(On Diff #320761)

I called it out because it wasn't clear to me what the type actually was, but I see now that it's an Optional<LangAS>. I think it's useful to know that we're dealing with an optional, but it's not crucial.

bader updated this revision to Diff 328100.Mar 4 2021, 3:30 AM
bader marked 4 inline comments as done.

Apply suggestions from Aaron.

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

bader added a comment.Mar 10 2021, 9:42 AM

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

Just to be clear I am not suggesting a nice-to-have documentation. I would like to see the documentation that explains what behavior is expected from the compiler. I think this is normally a good starting point and it has been a common practice for many contributions I believe too. I guess we could first commit the implementation and then work out what we want it to be doing but that implies a bigger risks to the community. Also I am also not entirely happy with some directions taken deviating the original design in this patch. But it is hard to assess whether there are better practical alternatives and suggest anything without understanding what is being implemented.

Address spaces are used by many stakeholders in clang so having a clear understanding of various use cases publicly accessible would benefit the community and reduce the risks of disruptions and negative impact of each other's work.

Regarding the timing I believe it could take less than several months considering that there are some parts already available but it depends on various factors including everyone's availability. However, the fact that some areas would require collaborative exploration of design alternatives from stakeholders was to my memory highlighted on the original proposal to contribute SYCL support to clang quite a while ago. So the requirement here should not appear as unexpected.

This has been my suggestion to address the ping-ponging discussion issues and latencies/time incurred by them as I think it would be good for us to get the terminology sorted. But I am happy if there are other suggestions addressing the concerns.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

Just to be clear I am not suggesting a nice-to-have documentation. I would like to see the documentation that explains what behavior is expected from the compiler. I think this is normally a good starting point and it has been a common practice for many contributions I believe too.

If what you're suggesting is that Clang have a SYCL-specific page that serves a similar purpose to https://clang.llvm.org/docs/OpenCLSupport.html, then I agree we should have that (and it should be linked to from https://clang.llvm.org/docs/index.html the same as OpenCL, OpenMP, etc). However, I think that's orthogonal to this patch and should be done separately. (If the documentation already existed, then I'd request that this patch update it, but because those docs don't exist yet, I think it's unrelated enough to warrant being done separately.)

If you're requesting something different, I'd appreciate a bit more specific details.

I guess we could first commit the implementation and then work out what we want it to be doing but that implies a bigger risks to the community.

Given that we lack the dedicated documentation page for SYCL within Clang, I guess I view that as the status quo (we've already started some of the upstreaming work and none of it is documented in a convenient place for Clang users). Based on that, as someone who doesn't really do much with SYCL to date, I think it's reasonable to accept this patch because it extends existing functionality in a straight-forward manner with a reasonable explanation as to why.

Also I am also not entirely happy with some directions taken deviating the original design in this patch.

If there are technical concerns to be addressed, I'm not certain it's clear (to me, at least) what they are.

But it is hard to assess whether there are better practical alternatives and suggest anything without understanding what is being implemented.

That's reasonable, but I think we also have to trust the domain experts who are submitting patches to have done their design homework and to be willing to address concerns as they come up when there's concretely a problem. This is not an extremely experimental invention from an unknown lone contributor; this is a long-time maintainer upstreaming work from downstream repo with users who use this functionality.

Address spaces are used by many stakeholders in clang so having a clear understanding of various use cases publicly accessible would benefit the community and reduce the risks of disruptions and negative impact of each other's work.

Strongly agreed! But my point is: it seems like this is (at least to some extent) already publicly accessible information and I don't think we've identified any disruptions or negative impacts (yet).

Regarding the timing I believe it could take less than several months considering that there are some parts already available but it depends on various factors including everyone's availability. However, the fact that some areas would require collaborative exploration of design alternatives from stakeholders was to my memory highlighted on the original proposal to contribute SYCL support to clang quite a while ago. So the requirement here should not appear as unexpected.

This has been my suggestion to address the ping-ponging discussion issues and latencies/time incurred by them as I think it would be good for us to get the terminology sorted. But I am happy if there are other suggestions addressing the concerns.

I read this as "Maybe it won't take several more months depending on factors outside of the patch author's control" which doesn't really address my concerns that there's not a clear direction on how to unblock this patch in a timely fashion. That's not to imply that I think your concerns are invalid or should be hand-waved away! I agree that we should make sure there's comfort with the design and try to reduce risk as much as we can. It's more that I think I'm recognizing that trying to gain this clarity over Phabricator with long latency on the discussion is somewhat ineffective.

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

If what you're suggesting is that Clang have a SYCL-specific page that serves a similar purpose to https://clang.llvm.org/docs/OpenCLSupport.html, then I agree we should have that (and it should be linked to from https://clang.llvm.org/docs/index.html the same as OpenCL, OpenMP, etc). However, I think that's orthogonal to this patch and should be done separately. (If the documentation already existed, then I'd request that this patch update it, but because those docs don't exist yet, I think it's unrelated enough to warrant being done separately.)

If you're requesting something different, I'd appreciate a bit more specific details.

I don't really mind the format. It doesn't have to be in Clang documentation although perhaps it makes the most sense logically. What I would like to see is a transparancy within the commnity that is I believe a common practice as per clang contribution policy:

A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.

I don't find it great to continue reviewing the work where I don't have a clear picture. I also would like to avoid the situation the community has to work with code that we don't understand. We should not be in need to reverse engineer what is being implemented. It is not a comfortable position to be nor it is productive. Once we know the high-level language semantic it facilitates us to reason about implementation should we need to redesign anything or fix a bug. If we only have fragements of code it doesn't tell us a lot and there is very little we can do with it. Perhaps this is why good practices have been added by the community? There are also other areas whether it demonstrated to work really well - for very good reasons we no longer add undocumented attributes to clang.

Given that we lack the dedicated documentation page for SYCL within Clang, I guess I view that as the status quo (we've already started some of the upstreaming work and none of it is documented in a convenient place for Clang users). Based on that, as someone who doesn't really do much with SYCL to date, I think it's reasonable to accept this patch because it extends existing functionality in a straight-forward manner with a reasonable explanation as to why.

I am not aware of all work on SYCL, but I believe some documentation was provided, for example, for the new kernel attribute?

Personally I don't find the patch straight-forward mainly because it deviates from the original design wrt target address space map. But there are other concerns - for example the need for InferAddressSpace that is not required for embedded C and C++ functionality that to my understanding SYCL aims to implement. However, I don't know whether my understanding of the aim is correct as I don't have any reference to verify my understanding.

If there are technical concerns to be addressed, I'm not certain it's clear (to me, at least) what they are.

Well the issue I got stuck at was highlighted earlier - the need to add new address spaces map for SYCL. So far the design we were striking at is that every language would have a new entry in the map but adding multiple maps per language seems to defeat the purpose. Considering that OpenCL and SYCL would likely be supported by similar targets we should try to avoid this or at least try to understand why we ended up with such situation and what we might be able to do to improve. I would accept if we can't do anything with this immediately in case it requires big restructuring but perhaps at least this deserves a wider community communication.

That's reasonable, but I think we also have to trust the domain experts who are submitting patches to have done their design homework and to be willing to address concerns as they come up when there's concretely a problem. This is not an extremely experimental invention from an unknown lone contributor; this is a long-time maintainer upstreaming work from downstream repo with users who use this functionality.

I don't think this can be called a trust issue. For me personally, it is about making sure to align with the requirements from various domains that none of us has complete knowledge of. There is certainly no invention but however a case of substantially new functionality that is using the feature scattered all over the frontend source code. I am sorry that I just don't find it trivial. While I have worked on this topic for many years I wouldn't trust myself to make a new design decision alone because it is important to get things right in general not just for my product. As a matter of fact me and some other developers from this area were working quite a lot in the last years on the unification of the functionality for different dialects which results in better understanding and less maintenance.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts.

This is somewhat unfortunate because the fact that collaboration would be needed in this area has been highlighted straight from the beginning. Technically we would only need #3 to unblock #0 as creating a new page is very trivial and it could be done as a part of documenting the SYCL address space language sematic. I believe that with the use of references to existing documentations - SYCL, OpenCL, embedded C specifications, etc, such a goal can be accomplished without significant time investment.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

I am not opposed to this, but it would be good to agree on the exact agenda before to ensure we can make some progress.

Tyker added a subscriber: Tyker.Mar 18 2021, 8:24 AM
bader updated this revision to Diff 333247.Mar 25 2021, 3:44 AM

[NFC] Align address space map names with AMDGPU target.

Rename SPIRAddrSpaceMap and SYCLAddrSpaceMap to SPIRDefIsPrivMap and SPIRDefIsGenMap.

bader added a comment.Mar 25 2021, 5:21 AM

Let's take a look at compilation output for the following program:

c++
void foo(int *p) {}

OpenCL compiler produces following LLVM IR for SPIR target:

LLVM
define dso_local spir_func void @_Z3fooPU3AS4i(i32 addrspace(4)* nocapture %0) local_unnamed_addr #0 {
  ret void
}

Here is what SYCL device compiler produces:

LLVM
define dso_local spir_func void @_Z3fooPi(i32* nocapture %0) local_unnamed_addr #0 {
  ret void
}

We would like to get equivalent code produced by both compiler for SPIR target.
Currently SYCL device compiler emits pointer to address space 0 instead of expected 4. This lowering is defined by the address space map.
Having multiple maps is not something new to the clang community. Similar approach AMDGPU target applies to customize address space mapping for OpenCL language (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/AMDGPU.cpp#L354).

I updated address space map names to avoid confusion that mapping change is specific to the language mode. Now they use the same naming scheme as AMDGPU maps.

Using InferAddressSpace pass is not required for the targets supporting SPIR-V with pointers to generic address space, but it can be beneficial to produce more efficient native code.

I also created a review request with SYCL design documentation - https://reviews.llvm.org/D99190

Having multiple maps is not something new to the clang community. Similar approach AMDGPU target applies to customize address space mapping for OpenCL language (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/AMDGPU.cpp#L354).

I updated address space map names to avoid confusion that mapping change is specific to the language mode. Now they use the same naming scheme as AMDGPU maps.

Well AMDGPU is a specific target in contrast to SPIR. But the question is whether the map that has been renamed is only intended for SYCL or can it be used for OpenCL or other languages too? My worry is that the original design was intended to separate language and target information but somehow it is now driven towards tangling them together more and more.

Using InferAddressSpace pass is not required for the targets supporting SPIR-V with pointers to generic address space, but it can be beneficial to produce more efficient native code.

Ok, that's good to know that it is only intended for optimisations!

I also created a review request with SYCL design documentation - https://reviews.llvm.org/D99190

Thanks. I will take a look at the address space part. My first impression is that it would be good to abstract away from the implementation in clang and start from the language semantic that you are trying to get. Perhaps it will be easier if I provide concrete questions on the review.

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

@aaron.ballman, I've uploaded SYCL architecture design document with address space section (https://reviews.llvm.org/D99190).
I checked commits with SYCL functionality and we documented all extensions. In particular sycl_kernel attribute mentioned above is documented here - https://clang.llvm.org/docs/AttributeReference.html#sycl-kernel.
Are there any other steps I should do or we can consider it to be done?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

@Anastasia, @aaron.ballman, is there anything I can do for this patch to be accepted?

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

@aaron.ballman, I've uploaded SYCL architecture design document with address space section (https://reviews.llvm.org/D99190).
I checked commits with SYCL functionality and we documented all extensions. In particular sycl_kernel attribute mentioned above is documented here - https://clang.llvm.org/docs/AttributeReference.html#sycl-kernel.
Are there any other steps I should do or we can consider it to be done?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

@Anastasia, @aaron.ballman, is there anything I can do for this patch to be accepted?

I have provided my feedback regarding the address space section on the documentation review. I don't think I will be able to look at the other parts of documentation though. So if you would like to finalize the address spaces faster it might make sense to split it up into a separate review. This might also help to reduce the noise.

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

I'd like to provide a bit more detailed summary. I think meeting was very useful with the following basic outcome:

  • Need to add references to SYCL spec where applicable
  • Need to update documentation where references are not enough, in particular:
    • C++ address space attributes used to implement SYCL address space behavior defined by the specification
    • main issue is missing documentation for Clang extensions being added for SYCL.
  • @Anastasia needs some time to read all references/docs provided, some of them has been sent during the meeting

Do you agree on the list above or I missed anything?

This is what has been done already after the meeting based on the comments made:

I believe that currently this PR is in much better shape than it was and I think I addressed all critical missing items that have been mentioned.
Can you please review and let me know if anything missing?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

I'd like to provide a bit more detailed summary. I think meeting was very useful with the following basic outcome:

  • Need to add references to SYCL spec where applicable
  • Need to update documentation where references are not enough, in particular:
    • C++ address space attributes used to implement SYCL address space behavior defined by the specification
    • main issue is missing documentation for Clang extensions being added for SYCL.
  • @Anastasia needs some time to read all references/docs provided, some of them has been sent during the meeting

Do you agree on the list above or I missed anything?

This is what has been done already after the meeting based on the comments made:

I believe that currently this PR is in much better shape than it was and I think I addressed all critical missing items that have been mentioned.
Can you please review and let me know if anything missing?

Thanks, your summary look fairly accurate to me.

Anastasia added inline comments.Apr 7 2021, 10:54 AM
clang/include/clang/AST/Type.h
492

Ok if you allow implicit conversions both ways then this condition should be extended to also contain all named address spaces in A and Default in B. But actually, could you simplify by checking that you have Default on either side, so something like

(A == LangAS::Default || B == LangAS::Default)

?

clang/lib/AST/ItaniumMangle.cpp
2350

Any reason not to use OpenCL mangling? If you do then you might be able to link against libraries compiled for OpenCL. Also you will get more stable naming i.e. it would not differ from target to target.

clang/lib/Basic/Targets/AMDGPU.cpp
75

Would this map ever be used for SYCL? If not it would be better to add a comment about it and/or perhaps even just use dummy values.

clang/lib/Basic/Targets/SPIR.h
36

The same here. This map will never work for SYCL so let's just use dummy values like for CUDA and add a comment explaining this.

73–77

Ok so what I understand is that the only reason you need a separate map is that the semantics of Default is different for SYCL than for C/C++.

i.e. in SYCL (i.e. inherited from CUDA) it is a virtual/placeholder address space that can point to any of the named address spaces while in embedded C it is a physical address space which is the same as automatic storage / flat address space.

Since originally CUDA was not intended to compile for the same targets as C or C++ it hasn't been an issue as there was some sort of separation of concerns. But with wider adoption, it seems that this separation no longer exists and any target supporting CUDA style address space semantic together with C style would need to duplicate the address space map. This is not ideal especially considering the number of entities it contains and their constant growth. In reality, at least there won't be many such targets but if SYCL intends to support targets that OpenCL does then it is not uncommon. For example CPU targets can compile both OpenCL and C/C++...

If we adhere to the current architecture we should have C's Default and CUDA's Default as separate entries in the map. However, this might be quite an involved refactoring. Have you looked in this before? A less involved refactoring could be to add a special hook that would allow to alter the address space mapping only for Default address space. I am not sure whether this is something we could add to Targets, although it seems to be driven by the language semantic. Anyway we can't decide this in this review as it requires other community member involvement.

Considering that the workaround is relatively simple I think it would be reasonable to accept it as an initial step. However, I would prefer to simplify this patch further by removing the environment component for the time being and just conditionalizing the map on the language mode instead. You will probably need to reset the map in TargetInfo:adjust because this is where we have access to LangOpts.

https://clang.llvm.org/doxygen/Basic_2TargetInfo_8cpp_source.html#l00347

We already override lots of settings there and hopefully, this will only be needed temporarily. I would add a FIXME comment there explaining the design issue.

Then you won’t need to extend the triple component at least for now, until we find the right architectural solution for this. As I am not convinced this is a good approach. I would suggest to follow up on cfe-dev regarding this issue and how we should address this in the long term. Ideally, we should have started from this but I understand that you have some timing constraints.

127

Should the following code be legal? Perhaps you can replace this with a spec reference instead?

132

This only makes sense for SYCL so let's add a guard.

clang/lib/CodeGen/TargetInfo.cpp
9954 ↗(On Diff #333247)

Is this change related to address spaces?

9984 ↗(On Diff #333247)

I think this comment should be improved because OpenCL and CUDA are not really address space agnostic. Should this check also contain C or C++?

9985 ↗(On Diff #333247)

Since you are using SYCL address space you should probably guard this line by SYCL mode... Btw the same seems to apply to the code below as it implements SYCL sematics?

Can you add spec references here too.

Also there seems to be nothing target specific in the code here as you are implementing what is specified by the language semantics. Should this not be moved to GetGlobalVarAddressSpace along with the other language handling?

I am not very familiar with this part of address space handling though. I would be more comfortable if @rjmccall could take a look too.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
2

Is this comment relevant?

40

This code doesn't seem to be relevant to the testing.

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

Are you testing address space segment binding here? Could you extend to other use cases too?

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
2

How about adding explicit conversion here too and then renaming into something like address-space-conversions.cpp since the testing is not specific to a parameter.

81

What functionality in the patch does this test?

100

The same functionality seems to be tested above.

125

Does this test anything from the patch?

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

What do you test in this file?

clang/test/CodeGenSYCL/address-spaces.cpp
2

Does this test also check conversions? Could this be merged with another conversions test if there is anything missing there?

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
2

I think this should also be renamed to address-space-conversions.cpp

You shouldn't need to add -x c++ as it is default for cpp files.

You should add testing of diagnostics for disallowed conversions too.

26

Btw you don't seem to be testing the reverse conversion i.e. from named to Default.

48

Why?

54

This doesn't belong to conversions technically.

bader updated this revision to Diff 336543.Apr 9 2021, 12:24 PM
bader marked 32 inline comments as done.

Applied code review suggestions.

Rebased on ToT and updated commit message.

clang/include/clang/AST/Type.h
492

Ok if you allow implicit conversions both ways then this condition should be extended to also contain all named address spaces in A and Default in B. But actually, could you simplify by checking that you have Default on either side, so something like

(A == LangAS::Default || B == LangAS::Default)

?

According to the comment above isAddressSpaceSupersetOf function definition.

/// Returns true if address space A is equal to or a superset of B.

(A == LangAS::Default || B == LangAS::Default) <- this change makes Default address space a superset of all address spaces including OpenCL, which we were trying to avoid with adding SYCL address spaces. Another problem with this code is that make Default a sub-set of named address spaces (like sycl_local), which is not right.
If I understand it correctly defining "isSupersSetOf" relation is enough for the rest of framework to enable conversions. Am I right?

clang/lib/AST/ItaniumMangle.cpp
2350

Any reason not to use OpenCL mangling? If you do then you might be able to link against libraries compiled for OpenCL. Also you will get more stable naming i.e. it would not differ from target to target.

I'm not sure I understand your suggestion. Could you elaborate on "OpenCL mangling", please?

Let me clarify the problem this change addresses. The test case covering it is located in clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp lines 86-91.

template <typename T>
void tmpl(T t) {}

int *NoAS;
__attribute__((opencl_private)) int *PRIV;

tmpl(PRIV);
// CHECK-DAG: [[PRIV_LOAD5:%[a-zA-Z0-9]+]] = load i32*, i32* addrspace(4)* [[PRIV]].ascast
// CHECK-DAG: call spir_func void [[PRIV_TMPL:@[a-zA-Z0-9_]+]](i32* [[PRIV_LOAD5]])
tmpl(NoAS);
// CHECK-DAG: [[NoAS_LOAD5:%[a-zA-Z0-9]+]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[NoAS]].ascast
// CHECK-DAG: call spir_func void [[GEN_TMPL:@[a-zA-Z0-9_]+]](i32 addrspace(4)* [[NoAS_LOAD5]])

Clang has separate code paths for mangling types w/ and w/o address space attributes (i.e. using Default address space).

Address space is not mangled if there is no AS attribute (Default) or if address space attribute is maps to 0 target address space. SPIR target maps *_private address space to 0, which causes name conflict for the example above.

This change for SYCL compiler enables mangling for non-default address space attributes regardless of their mapping to target address space.

clang/lib/Basic/Targets/AMDGPU.cpp
75

I can't find an example of how to do this.
CUDA address spaces use valid values and I wasn't able to find similar comments.

Where do you think we can put a comment?

clang/lib/Basic/Targets/SPIR.h
36

I've set 0 for SYCL values.

73–77

Good suggestion. I see that AMDGPU target does the same trick with overriding ajust method.
I also added back SPIR target assertions for unknow environment and OS components to clang/lib/Basic/Targets.cpp.

132

This only makes sense for SYCL so let's add a guard.

Unfortunately, I don't know what can be used as a guard here. TargetInfo doesn't provide access to language options.
Any suggestions?

clang/lib/CodeGen/TargetInfo.cpp
9954 ↗(On Diff #333247)

I'll move this change to a separate patch.

9984 ↗(On Diff #333247)

I think this comment should be improved because OpenCL and CUDA are not really address space agnostic. Should this check also contain C or C++?

I think the assert says that OpenCL and CUDA are not address space agnostic languages.
Tagging @yaxunl as an author of the check I just borrowed from AMDGPU target.

9985 ↗(On Diff #333247)

This code assigns target address space "global variables w/o address space attribute".
SYCL says it's "implementation defined" (from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace):

Namespace scope
If the type is const, the address space the declaration is assigned to is implementation-defined. If the target of the SYCL backend can represent the generic address space, then the assigned address space must be compatible with the generic address space.
Namespace scope non-const declarations cannot be used within a kernel, as restricted in Section 5.4. This means that non-const global variables cannot be accessed by any device kernel or code called by the device kernel.

I added clarification that SPIR target allocates global variables in global address space to https://reviews.llvm.org/D99488 (see line #248).

@rjmccall, mentioned in the mailing list discussion that this callbacks were developed for compiling C++ to AMDGPU target, so this not necessary designed only for SYCL, but it works for SYCL as well.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
2

Is this comment relevant?

I think so. It's placed by the script and useful for test maintenance.

40

Probably not for this patch, but the next patch (https://reviews.llvm.org/D71016) makes it necessary to emit LLVM IR for this test. Is it okay to leave it as is?

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

Are you testing address space segment binding here? Could you extend to other use cases too?

This test checks that returned address space is generic and address space is correctly casted for character, array, reference and aggregate types.

Could you clarify how exactly do you want this test to be extended, please? I think I'm not getting it.

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
81

What functionality in the patch does this test?

As I mentioned in the comment for clang/lib/AST/ItaniumMangle.cpp, there was a problem with instantiating templates for parameters which differs only by address space attribute. Lines 79-91 cover mangling changes.

100

Removed.

125

Does this test anything from the patch?

It's the same case as for address-space-cond-op.cpp test. This function defines an entry point for the device code. If it's not called, SYCL device compiler won't emit any LLVM IR output. I uploaded the final version of the test, which assumes that SYCL device compiler code generation part is in place. I would prefer to keep it instead of adding it back with my next patch. Is this okay?

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

What do you test in this file?

This test was added to cover a bug in CodeGen with incompatible address spaces between pointers to a struct and its members.
As we replaced CodeGen changes with SPIR target callbacks I think it's not needed anymore. I removed this test.

BTW, the same applies to clang/test/CodeGenSYCL/address-space-cond-op.cpp. Should we remove it as well?

clang/test/CodeGenSYCL/address-spaces.cpp
2

Does this test also check conversions? Could this be merged with another conversions test if there is anything missing there?

I renamed this to address-space-deduction.cpp and address-space-parameter-conversions.cpp to address-space-conversions.cpp.

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
26

Btw you don't seem to be testing the reverse conversion i.e. from named to Default.

Don't bar2(*GLOB);, bar2(*LOC);, bar2(*PRIV); cover this conversion?

Anastasia added inline comments.Apr 12 2021, 7:56 AM
clang/include/clang/AST/Type.h
492

(A == LangAS::Default || B == LangAS::Default) <- this change makes Default address space a superset of all address spaces including OpenCL.

I see, yes this will break pretty much everything unless we guard by SYCL mode. But I don't think it is good to go this route though.

Another problem with this code is that make Default a sub-set of named address spaces (like sycl_local), which is not right.

Well, if you need implicit conversions to work both ways as you have written in the documentation then you don't really have a true super-/subsets between the named address spaces and the default one. They appear to be equivalent.

SYCL mode enables both explicit and implicit conversion to/from the default address space from/to
the address space-attributed type.

So do you actually need something like this to work?

int * genptr = ...;
__private int * privptr = genptr:
Anastasia added inline comments.Apr 12 2021, 7:56 AM
clang/lib/AST/ItaniumMangle.cpp
2350

It's just that all language address spaces are mangled with the source spelling in Italium ABI right now, if you check the else statement. I don't think it is part of the official spec yet but it might be better to stick to the same pattern if possible.

clang/lib/Basic/Targets/AMDGPU.cpp
75

I think you could add a comment right here. I bet you can use any values i.e. 0? They are not expected to be meaningful. It would be good if we could add an asset somewhere but perhaps it might be hard to find a good place...

clang/lib/Basic/Targets/SPIR.h
36

Great! Let's add a comment here that we use dummy values because the map can't be used for this language mode.

46

Ok let's set the OpenCL address spaces to 0 and add a comment that they can't be used with this map.

131

Ok, do you still plan to add a FIXME as mentioned previously explaining why we need to reset the map here?

clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

After all what objects are allowed to bind to non-default address space here is defined in SYCL spec even if the exact address spaces are not defined so it is not completely a target-specific behavior.

My understanding of the API you are extending (judging from its use) is that it allows you to extend the language sematics with some target-specific setup. I.e. you could add extra address spaces to C++ or OpenCL or any other language. But here you are setting the language address spaces instead that are mapped to the target at some point implicitly.

It seems like this change better fits to CodeGenModule::GetGlobalVarAddressSpace that already contains very similar logic?

Otherwise, it makes more sense to use target address spaces directly instead of SYCL language address spaces. But either way, we should guard it by SYCL mode somehow as we have not established this as a universal logic for SPIR.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
9

I am trying to understand what specifically you are testing from the patch and whether you need this whole IR output to be checked?

It helps for reviewing and maintenance purposes to minimize the IR patterns as much as possible.

40

I would prefer if we minimize the amount of noise as much as possible. This patch has enough functionality so I don't think adding extra helps in any way.

The testing should generally encapsulate what is relevant. Otherwise, I don't know how to assess it without reviewing other patches too.

clang/test/CodeGenSYCL/address-space-conversions.cpp
12 ↗(On Diff #336543)

Should you check that the function name has expected mangling?

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

It seems to me that you are just testing address space inference here similar to address-space-deduction.cpp? I don't see anything specific to a return type in either the test or your patch?

clang/test/CodeGenSYCL/convergent.cpp
2 ↗(On Diff #336543)

Is this change related? I thought we are not adding the environment component after all...

Anastasia added inline comments.Apr 12 2021, 8:05 AM
clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
81

You are not testing the mangling though?

125

If you are not testing any logic from this patch let's leave it out because it is hard to navigate in the large review.

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

Yeah if it is not testing anything in your patch then it's better to remove it.

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
26

Sorry I mean from Default to named AS.

bader marked an inline comment as done.Apr 12 2021, 10:57 AM
bader added inline comments.
clang/include/clang/AST/Type.h
487

BTW, we need enable global_device and global_host attributes from https://reviews.llvm.org/D82174 for SYCL USM feature. I have following question regarding this: should I create a follow-up patch or we can enable all attributes for SYCL at once?

clang/test/CodeGenSYCL/convergent.cpp
2 ↗(On Diff #336543)

Is this change related? I thought we are not adding the environment component after all...

While I was removing -sycldevice environment component from the patch, I noticed that one of the committed tests already uses it.
https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenSYCL/convergent.cpp#L2

Do you want to me to create a separate review request for this change?

Anastasia added inline comments.Apr 13 2021, 6:57 AM
clang/include/clang/AST/Type.h
487

It seems like they would just be extending the existing functionality and not redesigning what we do in this patch?

If that's the case let's keep it in a separate patch, but feel free to upload it even now.

clang/lib/Basic/Targets/SPIR.h
131

Btw I was just thinking of another alternative here.

What do you think about just setting a value in Default AS entry then we wouldn't need any extra map at all in this case? So something like:

AddrSpaceMap[LangAS::Default] = AddrSpaceMap[LangAS::opencl_generic];

with a good explanation in FIXME? :)

clang/test/CodeGenSYCL/convergent.cpp
2 ↗(On Diff #336543)

I see, this looks trivial enough test clean up. You could just commit it without any review perhaps?

bader updated this revision to Diff 337180.Apr 13 2021, 9:02 AM
bader marked 16 inline comments as done.

Applied more code review suggestions.

Rebased on ToT.

bader added inline comments.Apr 13 2021, 9:03 AM
clang/include/clang/AST/Type.h
492

I looked though the code base and I see that explicit cast is used when raw pointer is casted to address space annotated type. I think we can always use explicit cast from Default to named address space instead of implicit cast. It might be even useful to avoid unintended implicit casts causing UB.
@keryell, @Naghasan, what do you think if we update https://reviews.llvm.org/D99488 to disallow implicit casts from Default to named address space? I think it should be okay considering that current implementation doesn't use this type of casts (and I can't come up with a use case for it).

Meanwhile I've added checks for that to clang/test/SemaSYCL/address-space-conversions.cpp.

clang/lib/AST/ItaniumMangle.cpp
2350

It's just that all language address spaces are mangled with the source spelling in Italium ABI right now, if you check the else statement. I don't think it is part of the official spec yet but it might be better to stick to the same pattern if possible.

I would really love to avoid changes to the mangler (e.g. to be able to link binaries produced by different front-end like SYCL/OpenCL/CUDA), but I don't know the better way to address the issue
Sorry, I don't get what do you suggest here. Could you clarify what exactly I should change, please?

clang/lib/Basic/Targets/AMDGPU.cpp
75

I think you could add a comment right here. I bet you can use any values i.e. 0? They are not expected to be meaningful. It would be good if we could add an asset somewhere but perhaps it might be hard to find a good place...

I've set Generic value (i.e. 0) and added a comment.

clang/lib/Basic/Targets/SPIR.h
131

Btw I was just thinking of another alternative here.

What do you think about just setting a value in Default AS entry then we wouldn't need any extra map at all in this case? So something like:

AddrSpaceMap[LangAS::Default] = AddrSpaceMap[LangAS::opencl_generic];

with a good explanation in FIXME? :)

I think this won't work because AddrSpaceMap is a constant.
Regarding FIXME, could you point where it was previously mentioned, please? I think it might miss it. I can add a link to the SYCL spec - https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:genericAddressSpace. Will it be sufficient?

clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

It seems like this change better fits to CodeGenModule::GetGlobalVarAddressSpace that already contains very similar logic?

This was the original implementation (see https://reviews.llvm.org/D89909?id=299795), but @rjmccall suggested to use this callback instead.
Both ways work for me, but the implementation proposed by John is easier to maintain.

Otherwise, it makes more sense to use target address spaces directly instead of SYCL language address spaces. But either way, we should guard it by SYCL mode somehow as we have not established this as a universal logic for SPIR.

I've updated the code to use target address space. I also added an assertion for SYCL language mode, although I think SPIR doesn't support global variables in address spaces other than global or constant regardless of the language mode, so I think the logic is universal.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
9

I am trying to understand what specifically you are testing from the patch and whether you need this whole IR output to be checked?

It helps for reviewing and maintenance purposes to minimize the IR patterns as much as possible.

Removed this test.

40

Removed from all new tests.

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

It seems to me that you are just testing address space inference here similar to address-space-deduction.cpp? I don't see anything specific to a return type in either the test or your patch?

Right. Removed.

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
81

You are not testing the mangling though?

Right, what we are checking here is that each instantiation emits a different function in LLVM IR. Exact mangling is not relevant to the check.

clang/test/CodeGenSYCL/convergent.cpp
2 ↗(On Diff #336543)

I see, this looks trivial enough test clean up. You could just commit it without any review perhaps?

Done in https://github.com/llvm/llvm-project/commit/95c614afcd4de71d00a240d6a4a02c036c972ed0

bader edited the summary of this revision. (Show Details)Apr 13 2021, 9:04 AM
bader edited the summary of this revision. (Show Details)
bader marked 2 inline comments as done.
bader added inline comments.
clang/include/clang/AST/Type.h
487

It seems like they would just be extending the existing functionality and not redesigning what we do in this patch?

If that's the case let's keep it in a separate patch, but feel free to upload it even now.

Added in https://reviews.llvm.org/D100396.

Anastasia added inline comments.Apr 14 2021, 4:35 AM
clang/lib/AST/ItaniumMangle.cpp
2350

For now I am just trying to understand why you are not adopting similar mangling scheme as for other language address spaces since it gives more stable mangling irrespective from the target compiled for.

If you plan to link libraries from other frontends i.e. OpenCL or CUDA the mangling you use is different from what they produce. Just have a look at the line 2470 that explains OpenCL mangling or line 2494 explaining CUDA mangling. FYI similar scheme applies to other language address spaces, so the AS<num> was only really used for the address spaces that have no source spelling i.e. no language semantics.

clang/lib/Basic/Targets/SPIR.h
131

I mean the FIXME is about the issue with Clang design:
https://reviews.llvm.org/D89909#inline-941733

Basically, just explain why we need to reset the map and that it is only needed for Default address space.

clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

This was the original implementation (see https://reviews.llvm.org/D89909?id=299795), but @rjmccall suggested to use this callback instead.

Did you mean to link some particular conversation? Currently, it resets to the top of the review.

Both ways work for me, but the implementation proposed by John is easier to maintain.

I can't see why the same code would be harder to maintain in the caller. If anything it should reduce the maintenance because the same logic won't need to be implemented by every target.

I also added an assertion for SYCL language mode, although I think SPIR doesn't support global variables in address spaces other than global or constant regardless of the language mode, so I think the logic is universal.

Asserts don't guard this logic to be applied universally. And since the IR was generated like this for about 10 years I don't feel comfortable about just changing it silently.

To my memory SPIR spec never put restrictions to the address spaces. It only described the generation for OpenCL C. So if you compile from C you would have everything in the default address space. And even OpenCL rules doesn't seem to be quite accurate in your patch as in OpenCL C globals can be in __global, __constant or __local. However, the SPIR spec was discontinued quite a while ago and the implementation of SPIR has evolved so I am not sure how relevant the spec is now.

Personally, I feel the behavior you are implementing is governed by the language soI think it is more logical to encapsulate it to avoid interfering with other language modes.

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
81

Ok, this is the kind of change that we normally expect tested explicitly. I don't believe you would be ok that we modify that freely as we like after you commit the patch.

bader updated this revision to Diff 337441.Apr 14 2021, 6:59 AM
bader marked 5 inline comments as done.

Applied more comments.

clang/lib/AST/ItaniumMangle.cpp
2350

For now I am just trying to understand why you are not adopting similar mangling scheme as for other language address spaces since it gives more stable mangling irrespective from the target compiled for.

According to my understanding this code is used for other language spaces. For instance, per comments at lines 2455-2457 it's used for OpenCL and CUDA address spaces.
Do you mean some other mangling scheme?

If you plan to link libraries from other frontends i.e. OpenCL or CUDA the mangling you use is different from what they produce.

SYCL standard doesn't have such functionality. OpenCL C functions are not mangled (only built-ins), so there should be no problem to link with OpenCL C libraries.
I know that mangling difference causes some problems for SYCL built-ins implementations on CUDA, but I don't know all the details. @Naghasan knows about these. @Naghasan, do you have suggestions to fix the problems caused by mangling?

Just have a look at the line 2470 that explains OpenCL mangling or line 2494 explaining CUDA mangling. FYI similar scheme applies to other language address spaces, so the AS<num> was only really used for the address spaces that have no source spelling i.e. no language semantics.

Is this statement correct? https://godbolt.org/z/n7The38dz - AS<num> is used for OpenCL address spaces mangling when it's compiled for spir target.

clang/lib/Basic/Targets/SPIR.h
131

Added FIXME comment.

clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

This was the original implementation (see https://reviews.llvm.org/D89909?id=299795), but @rjmccall suggested to use this callback instead.

Did you mean to link some particular conversation? Currently, it resets to the top of the review.

I pointed to the patch version implementing address space deduction in CodeGenModule::GetGlobalVarAddressSpace.
Conversion pointer is RFC in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-tp4068754p4069039.html

Both ways work for me, but the implementation proposed by John is easier to maintain.

I can't see why the same code would be harder to maintain in the caller. If anything it should reduce the maintenance because the same logic won't need to be implemented by every target.

I also added an assertion for SYCL language mode, although I think SPIR doesn't support global variables in address spaces other than global or constant regardless of the language mode, so I think the logic is universal.

Asserts don't guard this logic to be applied universally. And since the IR was generated like this for about 10 years I don't feel comfortable about just changing it silently.

To my memory SPIR spec never put restrictions to the address spaces. It only described the generation for OpenCL C. So if you compile from C you would have everything in the default address space. And even OpenCL rules doesn't seem to be quite accurate in your patch as in OpenCL C globals can be in __global, __constant or __local. However, the SPIR spec was discontinued quite a while ago and the implementation of SPIR has evolved so I am not sure how relevant the spec is now.

Personally, I feel the behavior you are implementing is governed by the language soI think it is more logical to encapsulate it to avoid interfering with other language modes.

Added early exist for non-SYCL modes.

Anastasia added inline comments.Apr 16 2021, 4:58 AM
clang/lib/AST/ItaniumMangle.cpp
2350

Is this statement correct? https://godbolt.org/z/n7The38dz - AS<num> is used for OpenCL address spaces mangling when it's compiled for spir target.

I am very confident this is correct https://godbolt.org/z/8o6vdYEPc though perhaps comments could be made more clear and feel free to improve them if you find them misleading.

So if you follow the flow here you can see that if the target maps the address space to a distinct non-zero value then it is using the target scheme mangling otherwise it will use the language spelling.

I believe this is done to prevent the name clashing, so if let's say you have overloaded functions:

foo(__global int*){}
foo(__local int*){}

and your target doesn't have the target address spaces (i.e. you would end up with AS0 and AS0 in both) then you would want to mangle the functions differently to avoid generating two definitions with the same name?

I think though this flow still has a flaw for the cases the target would map language address spaces to non-distinct and non-zero values there could still be a clash. Therefore I think using source address spaces would have been more stable but it would reduce the ability to link functions produced from different language modes.

clang/lib/Basic/Targets/SPIR.h
131

Ok, let's be a bit more specific regarding the clang design flaw i.e. I would add something like

Currently, there is no way of representing SYCL's default address space language semantic along with the semantics of embedded C's default address space in the same address space map. Hence the map needs to be reset to allow mapping to the desired value of 'Default' entry for SYCL.

You could also mention that SYCL has the same semantics as CUDA which would hint that this logic should be generalised to multiple languages not just SYCL.

140

This needs a language guard too?

clang/lib/CodeGen/TargetInfo.cpp
10081 ↗(On Diff #337441)

Can you add an error message string to the assert?

Also should you just assert that that address space is 0 i.e. not specified yet? Otherwise I don't think you should override it. Although you actually handle this case below as a valid input? Do you actually need this assert then?

10086 ↗(On Diff #337441)

Do you need to query the target from CGM? It returns TargetInfo which member you are overriding here.

https://clang.llvm.org/doxygen/classclang_1_1CodeGen_1_1CodeGenModule.html#a09a815215ff67d2e52b7605af35c40cb

However, I would just change this code to avoid generalizing since the generalization is SYCL specific anyway. The SPIR target does have a constant address space segment, so I would find it strange that it maps to global as a general rule. Why not to make it simple and just return global address space for SYCL in all cases?

9985 ↗(On Diff #333247)

I pointed to the patch version implementing address space deduction in CodeGenModule::GetGlobalVarAddressSpace.
Conversion pointer is RFC in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-tp4068754p4069039.html

This looks actually very neat and I can't see that anyone had any concerns about this.

I think John's comment on RFC is to point out that there are also Target hooks available should you need to override the language semantics but there is no statement that you should prefer it instead of implementing the language rules. I think the language semantics should take precedence.

Added early exist for non-SYCL modes.

To improve the understanding I would prefer if you guard the logic with if statement and return the original address space as default right at the end:

if (CGM.getLangOpts().SYCLIsDevice) {
// do what you need for SYCL
}
// default case - just return original address space
return AddrSpace;
bader added inline comments.Apr 19 2021, 12:34 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

I pointed to the patch version implementing address space deduction in CodeGenModule::GetGlobalVarAddressSpace.
Conversion pointer is RFC in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-tp4068754p4069039.html

This looks actually very neat and I can't see that anyone had any concerns about this.

I think John's comment on RFC is to point out that there are also Target hooks available should you need to override the language semantics but there is no statement that you should prefer it instead of implementing the language rules. I think the language semantics should take precedence.

Do I understand it correctly that you suggest replacing Target hooks with the CodeGen library changes from the first version of the patch?
@rjmccall, are you okay with that?

Anastasia added inline comments.Apr 19 2021, 3:42 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

Yes, that's right. I suggest lifting this logic into CodeGenModule::GetGlobalVarAddressSpace, so something like

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -3849,6 +3849,12 @@
     return AddrSpace;
   }
 
+  if (LangOpts.SYCLIsDevice) {
+    if (!D || D->getType().getAddressSpace() == LangAS::Default) {
+      return LangAS::opencl_global;
+    }
+  }
+
   if (LangOpts.CUDA && LangOpts.CUDAIsDevice) {
     if (D && D->hasAttr<CUDAConstantAttr>())
       return LangAS::cuda_constant;
@@ -3874,8 +3880,19 @@
   // OpenCL v1.2 s6.5.3: a string literal is in the constant address space.
   if (LangOpts.OpenCL)
     return LangAS::opencl_constant;
+
+  // If we keep a literal string in constant address space, the following code
+  // becomes illegal:
+  //
+  //   const char *getLiteral() n{
+  //     return "AB";
+  //   }
+  if (LangOpts.SYCLIsDevice)
+    return LangAS::opencl_private;
+
   if (auto AS = getTarget().getConstantAddressSpace())
     return AS.getValue();
+
   return LangAS::Default;
 }

from your original patchset.

bader added inline comments.Apr 19 2021, 4:06 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

Okay. Just to make sure that all details are clear:

  • We drop all SPIRTargetCodeGenInfo changes and getConstantAddressSpace for SPIR target.
  • SYCL language specific changes in CodeGen are not limited to CodeGenModule. As you might notice CodeGenModule modifications lead to address space changes, which are not expected by other part of the library. We need update bitcast with addrspacecast instruction in a few places to. Basically it means "take CodeGen library changes from the first patch".
  • Most of the test cases removed during code review cover the cases mentioned in the previous item and should be added back.

Let me know if there any concerns with any of these items.

Anastasia added inline comments.Apr 20 2021, 7:15 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

SYCL language specific changes in CodeGen are not limited to CodeGenModule. As you might notice CodeGenModule modifications lead to address space changes, which are not expected by other part of the library. We need update bitcast with addrspacecast instruction in a few places to. Basically it means "take CodeGen library changes from the first patch".

I don't see how moving the same logic elsewhere would need extra address space casts? Your SPIR target changes have identical semantics to the diff I pasted above.

bader added inline comments.Apr 20 2021, 8:37 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

Probably I misunderstand your proposal. Do you suggest moving SPIRTargetCodeGenInfo::getGlobalVarAddressSpace to CodeGenModule::GetGlobalVarAddressSpace, but keep the rest of SPIRTargetCodeGenInfo and SPIRTargetInfo changes?

Anastasia added inline comments.Apr 21 2021, 3:19 AM
clang/lib/CodeGen/TargetInfo.cpp
9985 ↗(On Diff #333247)

Yes, indeed.

bader updated this revision to Diff 339484.Apr 21 2021, 11:29 PM
bader marked 7 inline comments as done.

Applied more review comments.

bader updated this revision to Diff 339515.Apr 22 2021, 1:42 AM
bader marked an inline comment as done.

Added SYCL address spaces mangling for targets without address space map

bader marked an inline comment as done.Apr 22 2021, 2:18 AM
bader added inline comments.
clang/lib/Basic/Targets/SPIR.h
140

I can re-write this method to avoid using language address space:

llvm::Optional<LangAS> getConstantAddressSpace() const override {
  return getLangASFromTargetAS(1);
}

Does it look okay to you?

I don't think we need a language guard here as this hook is already guarded by users. E.g. https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenModule.cpp#L4137-L4159.
Adding language guards for TargetInfo::getConstantAddressSpace method will require API change similar to adjust method i.e. explicit LangOptions type parameter.

Anastasia added inline comments.Apr 22 2021, 9:12 AM
clang/include/clang/AST/Type.h
492

Do you still plan to wait for extra input or otherwise we could just update the documentation for now?

If you discover that you need to allow the reverse conversions later it should not be problematic to add since it won't break anyone's code. It will only allow more code to compile!

clang/lib/Basic/Targets/SPIR.h
140

Well, OpenCL is not the only language mode though so it generally is an ABI change. But also there are seem to be other uses of this function, for example in CGExpr.cpp that are not guarded by the language mode and there it seems like constant address space might make more sense for OpenCL.

I think the situation is similar here as for global variables - we might need a constant alternative to CodeGenModule::GetGlobalVarAddressSpace. But for the time being if you only need this logic for the purposes of castStringLiteralToDefaultAddressSpace why not to just add a SYCL special case straight there? I don't think your current implementation would allow using any other address space for this anyway?

bader marked 2 inline comments as done.Apr 22 2021, 10:34 AM
bader added inline comments.
clang/include/clang/AST/Type.h
492

Okay. I'll update the document right away.

clang/lib/Basic/Targets/SPIR.h
140

This logic should be applied to all types of literals, so changing only castStringLiteralToDefaultAddressSpace is not enough. I think this is where the most of the CodeGen changes from the first version of the patch are coming from (if not all of them). I'll check if it can be refactored to something like CodeGenModule::GetGlobalVarAddressSpace.

bader updated this revision to Diff 339973.Apr 23 2021, 3:56 AM
bader marked an inline comment as done.

Generalize getStringLiteralAddressSpace to GetGlobalConstantAddressSpace

Rebased on ToT.

bader marked 5 inline comments as done.Apr 23 2021, 7:02 AM

@Anastasia, I've updated https://reviews.llvm.org/D99488 and refactored getStringLiteralAddressSpace to handle non-string constants as well. Please, take a look.

Anastasia accepted this revision.Apr 26 2021, 4:45 AM

LGTM! Thanks!

This revision is now accepted and ready to land.Apr 26 2021, 4:45 AM
aaron.ballman accepted this revision.Apr 26 2021, 4:49 AM

Just one drive-by nit from me.

clang/lib/CodeGen/CodeGenModule.cpp
3852–3856

Just simplifying the logic a bit, likely needs to be reformatted though.

This revision was landed with ongoing or failed builds.Apr 26 2021, 6:44 AM
This revision was automatically updated to reflect the committed changes.
bader marked an inline comment as done.