Page MenuHomePhabricator

[SYCL] Implement SYCL address space attributes handling
Needs ReviewPublic

Authored by bader on Oct 21 2020, 1:34 PM.

Details

Summary

Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCLDevice environment to triple.

Static variables without address space now reside in global address
space, unless they have an explicit address space qualifier in source
code.

Although OpenCL specification explicitly states that a string literal
should reside in constant address space, it does not work for SYCL
with "generic by default" address space rules.

For example:

const char *getLiteral() {
  return "A";
}

void func(bool AorB) {
  char B[] = {'B', '\0'};
  const char* C = AorB ? A : B;
}

If `A' reside in constant address space, it cannot be returned from a
function `getLiteral', because it returns a generic const char*.

When default address space values in C++ are CodeGen'ed into different
address spaces in LLVM IR, we must ensure that an addrspacecast is
emitted before a bitcast.

Pointers with the same address space in AST may end up in different
address spaces in IR. We cannot CodeGen a conditional
operator (ternary 'if') whose operands have different address spaces,
so we have to addrspacecast them to generic.

When a static variable is lowered from AST to LLVM IR, it can change an
address space depending on its storage class. For example, a static
variable declared in function scope may be assigned to global address
space if language rules allow. When this happens, original address space
of a variable (retuned by D.getType()) is no longer can be used as
target address space. Correct address space for a global can be obtained
from CodeGenModule::GetGlobalVarAddressSpace function, but in this case
we just copy it from the existing global (OldGV).

Pointers with the same address space in AST may end up in different
address spaces in IR. We cannot CodeGen a binary compare
operator (<, >, ==, etc.) whose operands have different address spaces,
so we have to addrspacecast them to generic.

Emit address space cast if return type and return value have different
address spaces.

Invalid bitcast with different address spaces was emitted as arument of
llvm.invariant.start intrinsic for work group scope constants.
This problem caused assertion fail.

Make sure that all operands of phi instruction that is generated while
lowering conditional operator have the same type. This fixes compiler
assertion.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

The mapping has been discussed in this comment: https://github.com/intel/llvm/pull/1039/#discussion_r369667791.

The discussion you cite is in the separate Github project, but you should have a discussion using LLVM channels about the design you propose for llvm-project repo. Also I don't think citing resources with many comments and no conclusions is making the review process very productive.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"?

LLVM has it's own sematic which toolchains rely on. Toolchains consuming LLVM IR doesn't know if this LLVM IR was emitted by C++ compiler or written manually - they can rely only on LLVM IR semantic defined by https://llvm.org/docs/LangRef.html.

If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that?

Technically clang changes SPIR target mapping for Default address space only for sycldevice environment component of the target triple, so it doesn't depend on the input language i.e. Default is mapped to the same LLVM address space for C++ and SYCL.

If you are suggesting to add SYCL as a component for other languages supported by clang this deserves a separate review probably with different reviewers and a wider discussion with the community.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

This is unfortunate. Do you know if there are any plans to provide up-to-date documentation? It's critical for non-clang front-ends targeting SPIR-V format.

I am not aware of any plan. FYI SPIR has not been added or designed for SPIR-V. It is only used for SPIR-V as a workaround by some external projects but it is not the only use case even if perhaps an important one for the community.

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references.

I agree with that, but I don't get how it's related to the mapping language AS to LLVM AS.

Every language address space has its own semantic that should be mapped appropriately to LLVM IR to make sure the semantic is preserved.

OpenCL generic is a concept governed by OpenCL C spec cited above

It can be used with pointer types and it represents a placeholder for any of the named address spaces - global, local or private. It signals that a pointer points to an object in one of these concrete named address spaces. The exact address space resolution can occur dynamically during the kernel execution.

and it is completely different to the concept of Default that is a flat address space of C and C++ as per Embedded C s5.1.1

For the purpose of this Technical Report, the C language is extended to support additional address spaces.  When not specified otherwise, objects are allocated by default in a generic address space, which corresponds to the single address space of ISO/IEC 9899:1999.  In addition to the generic address space, an implementation may support other, named address spaces.

I hope you see the difference between the two.

As SPIR is a virtual target, Default and OpenCL generic have distinct values because you need to preserve their semantic until the mapping to the physical device. I don't see why these two concepts should have the same logical mappping? For the generality tools consuming SPIR LLVM IR should have information for both address spaces and then decide how to represent or map those to a physical device. Then on a physical device target it makes sense that some or even all address spaces are mapped to the same value representing physical segments.

Default address space is where objects are put by default - for example in C++ everything is mapped to this address space. I don't see why you just replace Default by OpenCL generic. It has been added for a very different purpose.

SPIR-V Generic is the only address space allowing us to preserve correct semantic of the program (possibly global can be used, but we didn't investigate this option due to obvious performance implications).

FYI LLVM doesn't support SPIR-V yet, but nevertheless SPIR-V generic has the same semantic as OpenCL generic but it is not the same as Default in clang.

The objects from Default AS might be allocated in any physical AS, so LLVM IR AS 4 allows LLVM compiler to infer it using InferAddressSpace pass.

As far as I am aware you can also use any value with InferAddressSpace pass not only 4.

bader added a comment.Feb 3 2021, 8:08 AM

Anastasia added a comment.
In D89909#2536331 https://reviews.llvm.org/D89909#2536331, @bader wrote:

In D89909#2536157 https://reviews.llvm.org/D89909#2536157, @Anastasia wrote:

In D89909#2534790 https://reviews.llvm.org/D89909#2534790, @bader wrote:

Regarding SYCLDevice and SYCLAddrSpaceMap I am still not very convinced about the flow. Have you had any design discussion regarding this already that you could point to?

We discussed this with you in https://github.com/intel/llvm/pull/1039/.

I can't see.

The mapping has been discussed in this comment: https://github.com/intel/llvm/pull/1039/#discussion_r369667791.

The discussion you cite is in the separate Github project, but you should have a discussion using LLVM channels about the design you propose for llvm-project repo. Also I don't think citing resources with many comments and no conclusions is making the review process very productive.

Sorry, I forgot to mentioned that this change was also discussed in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html.
There are no objections from the community.

What is not covered by this thread and probably worth to discuss separately is changing OpenCL private mapping to non-zero LLVM AS to avoid the mangler component change. I expect significant impact on SPIR LLVM IR consumers and long transition period if the change is accepted.

My main concern is that I am not sure it fits with the main design of Clang - producing IR that is to be consumed by multiple targets rather than only one specific target or even an external tool. Why do you add a triple component that is used only for one target SPIR:

How would existing toolchains that consume LLVM-IR for SPIR cope with the fact that the Default address space is different for C++ and SYCL.

High level languages differences doesn't matter for toolchains consuming LLVM IR - it can python, D or SYCL. Toolchains are assuming that LLVM IR follows the sematic defined in SPIR-1.2 document - https://www.khronos.org/registry/SPIR/specs/spir_spec-1.2.pdf.

I can't understand what you mean by "doesn't matter"?

LLVM has it's own sematic which toolchains rely on. Toolchains consuming LLVM IR doesn't know if this LLVM IR was emitted by C++ compiler or written manually - they can rely only on LLVM IR semantic defined by https://llvm.org/docs/LangRef.html.

If you compiler from C++ you get one address space as Default and if you compile from SYCL you get a different one for the same target - how do you expect the toolchain consuming the IR to handle that?

Technically clang changes SPIR target mapping for Default address space only for sycldevice environment component of the target triple, so it doesn't depend on the input language i.e. Default is mapped to the same LLVM address space for C++ and SYCL.

If you are suggesting to add SYCL as a component for other languages supported by clang this deserves a separate review probably with different reviewers and a wider discussion with the community.

No, but clang doesn't couple input language with environment component of the target triple. As I mentioned above, community doesn't have any objections to this approach.

FYI, the document is no longer describing the current SPIR target adequately as implementation deviated quite a bit from the original documentation.

This is unfortunate. Do you know if there are any plans to provide up-to-date documentation? It's critical for non-clang front-ends targeting SPIR-V format.

I am not aware of any plan. FYI SPIR has not been added or designed for SPIR-V. It is only used for SPIR-V as a workaround by some external projects but it is not the only use case even if perhaps an important one for the community.

I'm aware of SPIR target history and design - I was on the team contributed SPIR-1.2 support to clang, although I was involved into compiler back-end development.
We designed and contributed SPIR-1.2 support to clang aiming to provide a tool producing stable portable binary format for OpenCL programs.
SPIR format evolved from SPIR-1.2 to SPIR-V, which is not based on LLVM-based, but AFAIK the goal of SPIR target in clang tool has not changed and it is still to produce binary format for Khronos APIs (even though part of the toolchain is hosted outside of LLVM project).
It's great if there are other uses of SPIR target in LLVM project. Do you know where I can learn more about this?

Why is changing of `the address space map necessary for SPIR but not the other targets?

The main reason is the difference in handling Default address space in C++ for OpenCL mode and SYCL. C++ for OpenCL doesn't always deduce address space and there are cases when Default is used and SPIR target maps it AS 0 (equivalent of SPIR-V Function storage class). This lead to inconsistencies like reported here: https://bugs.llvm.org/show_bug.cgi?id=45472.

SYCL doesn't deduce language address space at all relies on mapping Default in LLVM AS equivalent of SPIR-V generic.

NOTE: other GPU targets like AMDGPU and NVPTX do the same i.e. maps Default language AS to LLVM IR AS equivalent of SPIR-V generic, so there is no need to update. I believe SPIR target must apply the same mapping.

Generic address space in Clang is a logical concept added for OpenCL to implement the following logical concept .
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#the-generic-address-space

Targets are allowed to map it to any physical address space (it is very reasonable to map it to Default) but it is only used for restricted use cases i.e. pointers or references.

I agree with that, but I don't get how it's related to the mapping language AS to LLVM AS.

Every language address space has its own semantic that should be mapped appropriately to LLVM IR to make sure the semantic is preserved.

OpenCL generic is a concept governed by OpenCL C spec cited above

It can be used with pointer types and it represents a placeholder for any of the named address spaces - global, local or private. It signals that a pointer points to an object in one of these concrete named address spaces. The exact address space resolution can occur dynamically during the kernel execution.

and it is completely different to the concept of Default that is a flat address space of C and C++ as per Embedded C s5.1.1

For the purpose of this Technical Report, the C language is extended to support additional address spaces.  When not specified otherwise, objects are allocated by default in a generic address space, which corresponds to the single address space of ISO/IEC 9899:1999.  In addition to the generic address space, an implementation may support other, named address spaces.

I hope you see the difference between the two.

As SPIR is a virtual target, Default and OpenCL generic have distinct values because you need to preserve their semantic until the mapping to the physical device. I don't see why these two concepts should have the same logical mappping? For the generality tools consuming SPIR LLVM IR should have information for both address spaces and then decide how to represent or map those to a physical device. Then on a physical device target it makes sense that some or even all address spaces are mapped to the same value representing physical segments.

SPIR-V Generic is the only address space allowing us to preserve correct semantic of the program (possibly global can be used, but we didn't investigate this option due to obvious performance implications).

FYI LLVM doesn't support SPIR-V yet, but nevertheless SPIR-V generic has the same semantic as OpenCL generic but it is not the same as Default in clang.

I puzzled by existing mapping: Aren't Default and SPIR-V Function (or OpenCL private) semantically different address spaces?
I don't see what difference between Default and SPIR-V generic (or OpenCL generic) plays significant role at LLVM IR level.
Note that LLVM IR objects are not allocated neither in Default nor in Generic address spaces as described by John in this message.
This seems to be only property important for separate mapping of these two address spaces to LLVM IR.
Could you provide an example where mapping Default to SPIR-V generic might cause the problem, please? I have a problem with imagining such case, but I can be biased.

FWIW, real GPU targets map Default to Generic and as far as I concerned it doesn't cause problems.

The objects from Default AS might be allocated in any physical AS, so LLVM IR AS 4 allows LLVM compiler to infer it using InferAddressSpace pass.

As far as I am aware you can also use any value with InferAddressSpace pass not only 4.

Right, but we use SPIR target and it uses 4.

Sorry, I forgot to mentioned that this change was also discussed in the mailing list: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html.
There are no objections from the community.

I remember asking similar questions regarding the SYCL address space map on that thread, and I don't think I could get more than what we are discussing here. I generally don't think that RFC ended with any conclusions as a result this patch contained different approach to what I have suggested at the end.

SPIR format evolved from SPIR-1.2 to SPIR-V, which is not based on LLVM-based, but AFAIK the goal of SPIR target in clang tool has not changed and it is still to produce binary format for Khronos APIs (even though part of the toolchain is hosted outside of LLVM project).

I would like to highlight that SPIR was completely replaced by SPIR-V. I don't really see it as evolution. There were several attempts to add SPIR-V into LLVM and Clang - both of us were involved in some of those.

As of now the comments in the source code still says:

llvm/include/llvm/ADT/Triple.h:    spir,           // SPIR: standard portable IR for OpenCL 32-bit version
llvm/include/llvm/ADT/Triple.h:    spir64,         // SPIR: standard portable IR for OpenCL 64-bit version

It's great if there are other uses of SPIR target in LLVM project. Do you know where I can learn more about this?

I presume it is used for what SPIR was originally intended - a de-facto portable format? As a matter of fact not all OpenCL vendors support SPIR-V and we as a community won't and can't force this. If you want to learn more about use cases I would ask the community, cfe-dev could be a good start.

I puzzled by existing mapping: Aren't Default and SPIR-V Function (or OpenCL private) semantically different address spaces?

Yes they are. OpenCL private is just the most common case this is why mapping them to the same value often works.

Note that LLVM IR objects are not allocated neither in Default nor in Generic address spaces as described by John in this message.

Default can be used to allocate objects even if address spaces for alloca can be customized. There are other objects that are used with Default - arguments or return types. It is just so inherent in the parser. Most of objects use Default at the AST level even if not all of this ends up in the IR.

Could you provide an example where mapping Default to SPIR-V generic might cause the problem, please? I have a problem with imagining such case, but I can be biased.

Sorry I don't think I have enough time. But in general I think you should not base the design decision solely on the examples as both C and C++ are complex enough to overlook various corner cases. Your design should make sense conceptually and then the tooling will be correct by construction and easy to reason about.

FWIW, real GPU targets map Default to Generic and as far as I concerned it doesn't cause problems.

How do you prove this though?

Right, but we use SPIR target and it uses 4.

I guess I still don't get your architecture, but you could also look at adding a pass that remaps the address spaces or extend the pass to work with different default address spaces?

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

clang/include/clang/AST/Type.h
492
clang/lib/AST/ItaniumMangle.cpp
2465
clang/lib/Basic/Targets/SPIR.h
75–76
clang/lib/CodeGen/TargetInfo.cpp
10076

I called it out because it wasn't clear to me what the type actually was, but I see now that it's an Optional<LangAS>. I think it's useful to know that we're dealing with an optional, but it's not crucial.

bader updated this revision to Diff 328100.Mar 4 2021, 3:30 AM
bader marked 4 inline comments as done.

Apply suggestions from Aaron.

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

bader added a comment.Mar 10 2021, 9:42 AM

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

Just a few minor nits from me, but I'm mostly wondering: where are we at with this and are there still substantive changes required? (I looked through the comments, but there's a lot of back-and-forth since Oct and I'm not certain what's holding the patch back currently.)

To make it short, from my side I am not very clear about the overall design. From the SYCL spec side, there is no indication of what compiler extensions are needed and if at all. As a result, some of the design choices are unclear to me - in particular why SPIR target would need a separate address space map for SYCL. This is not how it was intended originally and I am worried that this will create issues for the consumers of IR to handle two different formats. But in general, if the community is now to maintain this code we should at least have some deeper understanding of it.

I would suggest starting from some high-level documentation that provides the details of the compiler extension being implemented. Perhaps the documentation that @bader has linked earlier could be used as a starting point with some more details that would allow assessing and reviewing the changes.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

Just to be clear I am not suggesting a nice-to-have documentation. I would like to see the documentation that explains what behavior is expected from the compiler. I think this is normally a good starting point and it has been a common practice for many contributions I believe too. I guess we could first commit the implementation and then work out what we want it to be doing but that implies a bigger risks to the community. Also I am also not entirely happy with some directions taken deviating the original design in this patch. But it is hard to assess whether there are better practical alternatives and suggest anything without understanding what is being implemented.

Address spaces are used by many stakeholders in clang so having a clear understanding of various use cases publicly accessible would benefit the community and reduce the risks of disruptions and negative impact of each other's work.

Regarding the timing I believe it could take less than several months considering that there are some parts already available but it depends on various factors including everyone's availability. However, the fact that some areas would require collaborative exploration of design alternatives from stakeholders was to my memory highlighted on the original proposal to contribute SYCL support to clang quite a while ago. So the requirement here should not appear as unexpected.

This has been my suggestion to address the ping-ponging discussion issues and latencies/time incurred by them as I think it would be good for us to get the terminology sorted. But I am happy if there are other suggestions addressing the concerns.

@Anastasia, do you suggest we copy https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md document to clang/docs within this patch?

For the purpose of this specific topic, you could indeed move "Address spaces handling" section into the clang documentation. I would suggest creating a dedicated page where you could gather SYCL specific internals for the clang community. You could also link the github page to it for more comprehensive details.

I would recommend extending the documentation slightly to focus on what behavior is expected to be implemented rather than how you propose to implement it in clang too. This is a general guideline for the clang contributions that suggest that the documentation should be detailed such that it would be possible to implement it in other frontend/compiler. You could for example include some more information on expected language semantic i.e. what you inherit from embedded C and what you inherent from OpenCL or any other available language features with relevant spec/doc references including SYCL spec. This should facilitate anyone who needs to understand the implementation to find further details.

It would probably make sense to create a separate review to avoid adding too much noise here. Once you create a review we can link it here and also refine any necessary details as we go along.

These all seem like good suggestions, from my limited perspective, so thank you for them! I'm not opposed to the documentation requests, however, I think that is work that is separable from the proposed changes in this review and could be done with post-commit follow-ups, unless you think the current documentation is wrong as it relates to this patch. I worry that back-and-forth on documentation clarity (while certainly important) is going to hold this patch up for several more months, which is a burden.

Just to be clear I am not suggesting a nice-to-have documentation. I would like to see the documentation that explains what behavior is expected from the compiler. I think this is normally a good starting point and it has been a common practice for many contributions I believe too.

If what you're suggesting is that Clang have a SYCL-specific page that serves a similar purpose to https://clang.llvm.org/docs/OpenCLSupport.html, then I agree we should have that (and it should be linked to from https://clang.llvm.org/docs/index.html the same as OpenCL, OpenMP, etc). However, I think that's orthogonal to this patch and should be done separately. (If the documentation already existed, then I'd request that this patch update it, but because those docs don't exist yet, I think it's unrelated enough to warrant being done separately.)

If you're requesting something different, I'd appreciate a bit more specific details.

I guess we could first commit the implementation and then work out what we want it to be doing but that implies a bigger risks to the community.

Given that we lack the dedicated documentation page for SYCL within Clang, I guess I view that as the status quo (we've already started some of the upstreaming work and none of it is documented in a convenient place for Clang users). Based on that, as someone who doesn't really do much with SYCL to date, I think it's reasonable to accept this patch because it extends existing functionality in a straight-forward manner with a reasonable explanation as to why.

Also I am also not entirely happy with some directions taken deviating the original design in this patch.

If there are technical concerns to be addressed, I'm not certain it's clear (to me, at least) what they are.

But it is hard to assess whether there are better practical alternatives and suggest anything without understanding what is being implemented.

That's reasonable, but I think we also have to trust the domain experts who are submitting patches to have done their design homework and to be willing to address concerns as they come up when there's concretely a problem. This is not an extremely experimental invention from an unknown lone contributor; this is a long-time maintainer upstreaming work from downstream repo with users who use this functionality.

Address spaces are used by many stakeholders in clang so having a clear understanding of various use cases publicly accessible would benefit the community and reduce the risks of disruptions and negative impact of each other's work.

Strongly agreed! But my point is: it seems like this is (at least to some extent) already publicly accessible information and I don't think we've identified any disruptions or negative impacts (yet).

Regarding the timing I believe it could take less than several months considering that there are some parts already available but it depends on various factors including everyone's availability. However, the fact that some areas would require collaborative exploration of design alternatives from stakeholders was to my memory highlighted on the original proposal to contribute SYCL support to clang quite a while ago. So the requirement here should not appear as unexpected.

This has been my suggestion to address the ping-ponging discussion issues and latencies/time incurred by them as I think it would be good for us to get the terminology sorted. But I am happy if there are other suggestions addressing the concerns.

I read this as "Maybe it won't take several more months depending on factors outside of the patch author's control" which doesn't really address my concerns that there's not a clear direction on how to unblock this patch in a timely fashion. That's not to imply that I think your concerns are invalid or should be hand-waved away! I agree that we should make sure there's comfort with the design and try to reduce risk as much as we can. It's more that I think I'm recognizing that trying to gain this clarity over Phabricator with long latency on the discussion is somewhat ineffective.

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

If what you're suggesting is that Clang have a SYCL-specific page that serves a similar purpose to https://clang.llvm.org/docs/OpenCLSupport.html, then I agree we should have that (and it should be linked to from https://clang.llvm.org/docs/index.html the same as OpenCL, OpenMP, etc). However, I think that's orthogonal to this patch and should be done separately. (If the documentation already existed, then I'd request that this patch update it, but because those docs don't exist yet, I think it's unrelated enough to warrant being done separately.)

If you're requesting something different, I'd appreciate a bit more specific details.

I don't really mind the format. It doesn't have to be in Clang documentation although perhaps it makes the most sense logically. What I would like to see is a transparancy within the commnity that is I believe a common practice as per clang contribution policy:

A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.

I don't find it great to continue reviewing the work where I don't have a clear picture. I also would like to avoid the situation the community has to work with code that we don't understand. We should not be in need to reverse engineer what is being implemented. It is not a comfortable position to be nor it is productive. Once we know the high-level language semantic it facilitates us to reason about implementation should we need to redesign anything or fix a bug. If we only have fragements of code it doesn't tell us a lot and there is very little we can do with it. Perhaps this is why good practices have been added by the community? There are also other areas whether it demonstrated to work really well - for very good reasons we no longer add undocumented attributes to clang.

Given that we lack the dedicated documentation page for SYCL within Clang, I guess I view that as the status quo (we've already started some of the upstreaming work and none of it is documented in a convenient place for Clang users). Based on that, as someone who doesn't really do much with SYCL to date, I think it's reasonable to accept this patch because it extends existing functionality in a straight-forward manner with a reasonable explanation as to why.

I am not aware of all work on SYCL, but I believe some documentation was provided, for example, for the new kernel attribute?

Personally I don't find the patch straight-forward mainly because it deviates from the original design wrt target address space map. But there are other concerns - for example the need for InferAddressSpace that is not required for embedded C and C++ functionality that to my understanding SYCL aims to implement. However, I don't know whether my understanding of the aim is correct as I don't have any reference to verify my understanding.

If there are technical concerns to be addressed, I'm not certain it's clear (to me, at least) what they are.

Well the issue I got stuck at was highlighted earlier - the need to add new address spaces map for SYCL. So far the design we were striking at is that every language would have a new entry in the map but adding multiple maps per language seems to defeat the purpose. Considering that OpenCL and SYCL would likely be supported by similar targets we should try to avoid this or at least try to understand why we ended up with such situation and what we might be able to do to improve. I would accept if we can't do anything with this immediately in case it requires big restructuring but perhaps at least this deserves a wider community communication.

That's reasonable, but I think we also have to trust the domain experts who are submitting patches to have done their design homework and to be willing to address concerns as they come up when there's concretely a problem. This is not an extremely experimental invention from an unknown lone contributor; this is a long-time maintainer upstreaming work from downstream repo with users who use this functionality.

I don't think this can be called a trust issue. For me personally, it is about making sure to align with the requirements from various domains that none of us has complete knowledge of. There is certainly no invention but however a case of substantially new functionality that is using the feature scattered all over the frontend source code. I am sorry that I just don't find it trivial. While I have worked on this topic for many years I wouldn't trust myself to make a new design decision alone because it is important to get things right in general not just for my product. As a matter of fact me and some other developers from this area were working quite a lot in the last years on the unification of the functionality for different dialects which results in better understanding and less maintenance.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts.

This is somewhat unfortunate because the fact that collaboration would be needed in this area has been highlighted straight from the beginning. Technically we would only need #3 to unblock #0 as creating a new page is very trivial and it could be done as a part of documenting the SYCL address space language sematic. I believe that with the use of references to existing documentations - SYCL, OpenCL, embedded C specifications, etc, such a goal can be accomplished without significant time investment.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

I am not opposed to this, but it would be good to agree on the exact agenda before to ensure we can make some progress.

Tyker added a subscriber: Tyker.Thu, Mar 18, 8:24 AM
bader updated this revision to Diff 333247.Thu, Mar 25, 3:44 AM

[NFC] Align address space map names with AMDGPU target.

Rename SPIRAddrSpaceMap and SYCLAddrSpaceMap to SPIRDefIsPrivMap and SPIRDefIsGenMap.

bader added a comment.Thu, Mar 25, 5:21 AM

Let's take a look at compilation output for the following program:

c++
void foo(int *p) {}

OpenCL compiler produces following LLVM IR for SPIR target:

LLVM
define dso_local spir_func void @_Z3fooPU3AS4i(i32 addrspace(4)* nocapture %0) local_unnamed_addr #0 {
  ret void
}

Here is what SYCL device compiler produces:

LLVM
define dso_local spir_func void @_Z3fooPi(i32* nocapture %0) local_unnamed_addr #0 {
  ret void
}

We would like to get equivalent code produced by both compiler for SPIR target.
Currently SYCL device compiler emits pointer to address space 0 instead of expected 4. This lowering is defined by the address space map.
Having multiple maps is not something new to the clang community. Similar approach AMDGPU target applies to customize address space mapping for OpenCL language (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/AMDGPU.cpp#L354).

I updated address space map names to avoid confusion that mapping change is specific to the language mode. Now they use the same naming scheme as AMDGPU maps.

Using InferAddressSpace pass is not required for the targets supporting SPIR-V with pointers to generic address space, but it can be beneficial to produce more efficient native code.

I also created a review request with SYCL design documentation - https://reviews.llvm.org/D99190

Having multiple maps is not something new to the clang community. Similar approach AMDGPU target applies to customize address space mapping for OpenCL language (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/AMDGPU.cpp#L354).

I updated address space map names to avoid confusion that mapping change is specific to the language mode. Now they use the same naming scheme as AMDGPU maps.

Well AMDGPU is a specific target in contrast to SPIR. But the question is whether the map that has been renamed is only intended for SYCL or can it be used for OpenCL or other languages too? My worry is that the original design was intended to separate language and target information but somehow it is now driven towards tangling them together more and more.

Using InferAddressSpace pass is not required for the targets supporting SPIR-V with pointers to generic address space, but it can be beneficial to produce more efficient native code.

Ok, that's good to know that it is only intended for optimisations!

I also created a review request with SYCL design documentation - https://reviews.llvm.org/D99190

Thanks. I will take a look at the address space part. My first impression is that it would be good to abstract away from the implementation in clang and start from the language semantic that you are trying to get. Perhaps it will be easier if I provide concrete questions on the review.

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

@aaron.ballman, I've uploaded SYCL architecture design document with address space section (https://reviews.llvm.org/D99190).
I checked commits with SYCL functionality and we documented all extensions. In particular sycl_kernel attribute mentioned above is documented here - https://clang.llvm.org/docs/AttributeReference.html#sycl-kernel.
Are there any other steps I should do or we can consider it to be done?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

@Anastasia, @aaron.ballman, is there anything I can do for this patch to be accepted?

Based on the discussion so far, would these be acceptable steps to take?

0) Complete review on this patch for any technical concerns related to it and commit when it's ready (this unblocks some downstream needs quickly, hopefully).

  1. @bader (or someone else on the SYCL team) creates a bare-bones SYCL documentation page that is quickly accepted as a placeholder for us to put more documentation.
  2. Each new SYCL related review that needs user-facing documentation outside of what the SYCL standard documents will update the Clang SYCL doc.
  3. @bader (or someone else on the SYCL team) writes address-space mapping documentation and adds it to the Clang SYCL doc.
  4. @bader (or someone else on the SYCL team) looks at other commits that have already gone into Clang to write missing documentation.

Some of these steps can be done in parallel, of course. I recognize we could have the order be #1 -> #3 -> #0 (so the docs for this change are written as part of this patch), but my concern with that approach is that this patch is blocking other SYCL efforts (@bader can correct me if I'm wrong about this) and no one has identified a definite issue with it yet beyond the lack of Clang documentation.

If those steps seem unreasonable or like I've totally missed the point on something, perhaps we could reach an understanding more quickly via a meeting (we could summarize the decisions from the meeting here so the community is aware of the end results)?

@aaron.ballman, I've uploaded SYCL architecture design document with address space section (https://reviews.llvm.org/D99190).
I checked commits with SYCL functionality and we documented all extensions. In particular sycl_kernel attribute mentioned above is documented here - https://clang.llvm.org/docs/AttributeReference.html#sycl-kernel.
Are there any other steps I should do or we can consider it to be done?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

@Anastasia, @aaron.ballman, is there anything I can do for this patch to be accepted?

I have provided my feedback regarding the address space section on the documentation review. I don't think I will be able to look at the other parts of documentation though. So if you would like to finalize the address spaces faster it might make sense to split it up into a separate review. This might also help to reduce the noise.

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

I'd like to provide a bit more detailed summary. I think meeting was very useful with the following basic outcome:

  • Need to add references to SYCL spec where applicable
  • Need to update documentation where references are not enough, in particular:
    • C++ address space attributes used to implement SYCL address space behavior defined by the specification
    • main issue is missing documentation for Clang extensions being added for SYCL.
  • @Anastasia needs some time to read all references/docs provided, some of them has been sent during the meeting

Do you agree on the list above or I missed anything?

This is what has been done already after the meeting based on the comments made:

I believe that currently this PR is in much better shape than it was and I think I addressed all critical missing items that have been mentioned.
Can you please review and let me know if anything missing?

Yesterday, we chatted offline and agreed that the main issue is missing documentation for Clang extensions being added for SYCL. To address this issue we are adding SYCL architecture design document, which we are going to update along with adding new features.

I'd like to provide a bit more detailed summary. I think meeting was very useful with the following basic outcome:

  • Need to add references to SYCL spec where applicable
  • Need to update documentation where references are not enough, in particular:
    • C++ address space attributes used to implement SYCL address space behavior defined by the specification
    • main issue is missing documentation for Clang extensions being added for SYCL.
  • @Anastasia needs some time to read all references/docs provided, some of them has been sent during the meeting

Do you agree on the list above or I missed anything?

This is what has been done already after the meeting based on the comments made:

I believe that currently this PR is in much better shape than it was and I think I addressed all critical missing items that have been mentioned.
Can you please review and let me know if anything missing?

Thanks, your summary look fairly accurate to me.

Anastasia added inline comments.Wed, Apr 7, 10:54 AM
clang/include/clang/AST/Type.h
493

Ok if you allow implicit conversions both ways then this condition should be extended to also contain all named address spaces in A and Default in B. But actually, could you simplify by checking that you have Default on either side, so something like

(A == LangAS::Default || B == LangAS::Default)

?

clang/lib/AST/ItaniumMangle.cpp
2465

Any reason not to use OpenCL mangling? If you do then you might be able to link against libraries compiled for OpenCL. Also you will get more stable naming i.e. it would not differ from target to target.

clang/lib/Basic/Targets/AMDGPU.cpp
74

Would this map ever be used for SYCL? If not it would be better to add a comment about it and/or perhaps even just use dummy values.

clang/lib/Basic/Targets/SPIR.h
36

The same here. This map will never work for SYCL so let's just use dummy values like for CUDA and add a comment explaining this.

75

Ok so what I understand is that the only reason you need a separate map is that the semantics of Default is different for SYCL than for C/C++.

i.e. in SYCL (i.e. inherited from CUDA) it is a virtual/placeholder address space that can point to any of the named address spaces while in embedded C it is a physical address space which is the same as automatic storage / flat address space.

Since originally CUDA was not intended to compile for the same targets as C or C++ it hasn't been an issue as there was some sort of separation of concerns. But with wider adoption, it seems that this separation no longer exists and any target supporting CUDA style address space semantic together with C style would need to duplicate the address space map. This is not ideal especially considering the number of entities it contains and their constant growth. In reality, at least there won't be many such targets but if SYCL intends to support targets that OpenCL does then it is not uncommon. For example CPU targets can compile both OpenCL and C/C++...

If we adhere to the current architecture we should have C's Default and CUDA's Default as separate entries in the map. However, this might be quite an involved refactoring. Have you looked in this before? A less involved refactoring could be to add a special hook that would allow to alter the address space mapping only for Default address space. I am not sure whether this is something we could add to Targets, although it seems to be driven by the language semantic. Anyway we can't decide this in this review as it requires other community member involvement.

Considering that the workaround is relatively simple I think it would be reasonable to accept it as an initial step. However, I would prefer to simplify this patch further by removing the environment component for the time being and just conditionalizing the map on the language mode instead. You will probably need to reset the map in TargetInfo:adjust because this is where we have access to LangOpts.

https://clang.llvm.org/doxygen/Basic_2TargetInfo_8cpp_source.html#l00347

We already override lots of settings there and hopefully, this will only be needed temporarily. I would add a FIXME comment there explaining the design issue.

Then you won’t need to extend the triple component at least for now, until we find the right architectural solution for this. As I am not convinced this is a good approach. I would suggest to follow up on cfe-dev regarding this issue and how we should address this in the long term. Ideally, we should have started from this but I understand that you have some timing constraints.

125

Should the following code be legal? Perhaps you can replace this with a spec reference instead?

130

This only makes sense for SYCL so let's add a guard.

clang/lib/CodeGen/TargetInfo.cpp
10034

Is this change related to address spaces?

10063

I think this comment should be improved because OpenCL and CUDA are not really address space agnostic. Should this check also contain C or C++?

10064

Since you are using SYCL address space you should probably guard this line by SYCL mode... Btw the same seems to apply to the code below as it implements SYCL sematics?

Can you add spec references here too.

Also there seems to be nothing target specific in the code here as you are implementing what is specified by the language semantics. Should this not be moved to GetGlobalVarAddressSpace along with the other language handling?

I am not very familiar with this part of address space handling though. I would be more comfortable if @rjmccall could take a look too.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
2

Is this comment relevant?

40

This code doesn't seem to be relevant to the testing.

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

Are you testing address space segment binding here? Could you extend to other use cases too?

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
1 ↗(On Diff #333247)

How about adding explicit conversion here too and then renaming into something like address-space-conversions.cpp since the testing is not specific to a parameter.

80 ↗(On Diff #333247)

What functionality in the patch does this test?

99 ↗(On Diff #333247)

The same functionality seems to be tested above.

124 ↗(On Diff #333247)

Does this test anything from the patch?

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

What do you test in this file?

clang/test/CodeGenSYCL/address-spaces.cpp
1 ↗(On Diff #333247)

Does this test also check conversions? Could this be merged with another conversions test if there is anything missing there?

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
1 ↗(On Diff #333247)

I think this should also be renamed to address-space-conversions.cpp

You shouldn't need to add -x c++ as it is default for cpp files.

You should add testing of diagnostics for disallowed conversions too.

25 ↗(On Diff #333247)

Btw you don't seem to be testing the reverse conversion i.e. from named to Default.

47 ↗(On Diff #333247)

Why?

53 ↗(On Diff #333247)

This doesn't belong to conversions technically.

bader updated this revision to Diff 336543.Fri, Apr 9, 12:24 PM
bader marked 32 inline comments as done.

Applied code review suggestions.

Rebased on ToT and updated commit message.

clang/include/clang/AST/Type.h
493

Ok if you allow implicit conversions both ways then this condition should be extended to also contain all named address spaces in A and Default in B. But actually, could you simplify by checking that you have Default on either side, so something like

(A == LangAS::Default || B == LangAS::Default)

?

According to the comment above isAddressSpaceSupersetOf function definition.

/// Returns true if address space A is equal to or a superset of B.

(A == LangAS::Default || B == LangAS::Default) <- this change makes Default address space a superset of all address spaces including OpenCL, which we were trying to avoid with adding SYCL address spaces. Another problem with this code is that make Default a sub-set of named address spaces (like sycl_local), which is not right.
If I understand it correctly defining "isSupersSetOf" relation is enough for the rest of framework to enable conversions. Am I right?

clang/lib/AST/ItaniumMangle.cpp
2465

Any reason not to use OpenCL mangling? If you do then you might be able to link against libraries compiled for OpenCL. Also you will get more stable naming i.e. it would not differ from target to target.

I'm not sure I understand your suggestion. Could you elaborate on "OpenCL mangling", please?

Let me clarify the problem this change addresses. The test case covering it is located in clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp lines 86-91.

template <typename T>
void tmpl(T t) {}

int *NoAS;
__attribute__((opencl_private)) int *PRIV;

tmpl(PRIV);
// CHECK-DAG: [[PRIV_LOAD5:%[a-zA-Z0-9]+]] = load i32*, i32* addrspace(4)* [[PRIV]].ascast
// CHECK-DAG: call spir_func void [[PRIV_TMPL:@[a-zA-Z0-9_]+]](i32* [[PRIV_LOAD5]])
tmpl(NoAS);
// CHECK-DAG: [[NoAS_LOAD5:%[a-zA-Z0-9]+]] = load i32 addrspace(4)*, i32 addrspace(4)* addrspace(4)* [[NoAS]].ascast
// CHECK-DAG: call spir_func void [[GEN_TMPL:@[a-zA-Z0-9_]+]](i32 addrspace(4)* [[NoAS_LOAD5]])

Clang has separate code paths for mangling types w/ and w/o address space attributes (i.e. using Default address space).

Address space is not mangled if there is no AS attribute (Default) or if address space attribute is maps to 0 target address space. SPIR target maps *_private address space to 0, which causes name conflict for the example above.

This change for SYCL compiler enables mangling for non-default address space attributes regardless of their mapping to target address space.

clang/lib/Basic/Targets/AMDGPU.cpp
74

I can't find an example of how to do this.
CUDA address spaces use valid values and I wasn't able to find similar comments.

Where do you think we can put a comment?

clang/lib/Basic/Targets/SPIR.h
36

I've set 0 for SYCL values.

75

Good suggestion. I see that AMDGPU target does the same trick with overriding ajust method.
I also added back SPIR target assertions for unknow environment and OS components to clang/lib/Basic/Targets.cpp.

130

This only makes sense for SYCL so let's add a guard.

Unfortunately, I don't know what can be used as a guard here. TargetInfo doesn't provide access to language options.
Any suggestions?

clang/lib/CodeGen/TargetInfo.cpp
10034

I'll move this change to a separate patch.

10063

I think this comment should be improved because OpenCL and CUDA are not really address space agnostic. Should this check also contain C or C++?

I think the assert says that OpenCL and CUDA are not address space agnostic languages.
Tagging @yaxunl as an author of the check I just borrowed from AMDGPU target.

10064

This code assigns target address space "global variables w/o address space attribute".
SYCL says it's "implementation defined" (from https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace):

Namespace scope
If the type is const, the address space the declaration is assigned to is implementation-defined. If the target of the SYCL backend can represent the generic address space, then the assigned address space must be compatible with the generic address space.
Namespace scope non-const declarations cannot be used within a kernel, as restricted in Section 5.4. This means that non-const global variables cannot be accessed by any device kernel or code called by the device kernel.

I added clarification that SPIR target allocates global variables in global address space to https://reviews.llvm.org/D99488 (see line #248).

@rjmccall, mentioned in the mailing list discussion that this callbacks were developed for compiling C++ to AMDGPU target, so this not necessary designed only for SYCL, but it works for SYCL as well.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
2

Is this comment relevant?

I think so. It's placed by the script and useful for test maintenance.

40

Probably not for this patch, but the next patch (https://reviews.llvm.org/D71016) makes it necessary to emit LLVM IR for this test. Is it okay to leave it as is?

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

Are you testing address space segment binding here? Could you extend to other use cases too?

This test checks that returned address space is generic and address space is correctly casted for character, array, reference and aggregate types.

Could you clarify how exactly do you want this test to be extended, please? I think I'm not getting it.

clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
80 ↗(On Diff #333247)

What functionality in the patch does this test?

As I mentioned in the comment for clang/lib/AST/ItaniumMangle.cpp, there was a problem with instantiating templates for parameters which differs only by address space attribute. Lines 79-91 cover mangling changes.

99 ↗(On Diff #333247)

Removed.

124 ↗(On Diff #333247)

Does this test anything from the patch?

It's the same case as for address-space-cond-op.cpp test. This function defines an entry point for the device code. If it's not called, SYCL device compiler won't emit any LLVM IR output. I uploaded the final version of the test, which assumes that SYCL device compiler code generation part is in place. I would prefer to keep it instead of adding it back with my next patch. Is this okay?

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

What do you test in this file?

This test was added to cover a bug in CodeGen with incompatible address spaces between pointers to a struct and its members.
As we replaced CodeGen changes with SPIR target callbacks I think it's not needed anymore. I removed this test.

BTW, the same applies to clang/test/CodeGenSYCL/address-space-cond-op.cpp. Should we remove it as well?

clang/test/CodeGenSYCL/address-spaces.cpp
1 ↗(On Diff #333247)

Does this test also check conversions? Could this be merged with another conversions test if there is anything missing there?

I renamed this to address-space-deduction.cpp and address-space-parameter-conversions.cpp to address-space-conversions.cpp.

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
25 ↗(On Diff #333247)

Btw you don't seem to be testing the reverse conversion i.e. from named to Default.

Don't bar2(*GLOB);, bar2(*LOC);, bar2(*PRIV); cover this conversion?

Anastasia added inline comments.Mon, Apr 12, 7:56 AM
clang/include/clang/AST/Type.h
493

(A == LangAS::Default || B == LangAS::Default) <- this change makes Default address space a superset of all address spaces including OpenCL.

I see, yes this will break pretty much everything unless we guard by SYCL mode. But I don't think it is good to go this route though.

Another problem with this code is that make Default a sub-set of named address spaces (like sycl_local), which is not right.

Well, if you need implicit conversions to work both ways as you have written in the documentation then you don't really have a true super-/subsets between the named address spaces and the default one. They appear to be equivalent.

SYCL mode enables both explicit and implicit conversion to/from the default address space from/to
the address space-attributed type.

So do you actually need something like this to work?

int * genptr = ...;
__private int * privptr = genptr:
clang/lib/AST/ItaniumMangle.cpp
2465

It's just that all language address spaces are mangled with the source spelling in Italium ABI right now, if you check the else statement. I don't think it is part of the official spec yet but it might be better to stick to the same pattern if possible.

clang/lib/Basic/Targets/AMDGPU.cpp
74

I think you could add a comment right here. I bet you can use any values i.e. 0? They are not expected to be meaningful. It would be good if we could add an asset somewhere but perhaps it might be hard to find a good place...

clang/lib/Basic/Targets/SPIR.h
36

Great! Let's add a comment here that we use dummy values because the map can't be used for this language mode.

46

Ok let's set the OpenCL address spaces to 0 and add a comment that they can't be used with this map.

129

Ok, do you still plan to add a FIXME as mentioned previously explaining why we need to reset the map here?

clang/lib/CodeGen/TargetInfo.cpp
10064

After all what objects are allowed to bind to non-default address space here is defined in SYCL spec even if the exact address spaces are not defined so it is not completely a target-specific behavior.

My understanding of the API you are extending (judging from its use) is that it allows you to extend the language sematics with some target-specific setup. I.e. you could add extra address spaces to C++ or OpenCL or any other language. But here you are setting the language address spaces instead that are mapped to the target at some point implicitly.

It seems like this change better fits to CodeGenModule::GetGlobalVarAddressSpace that already contains very similar logic?

Otherwise, it makes more sense to use target address spaces directly instead of SYCL language address spaces. But either way, we should guard it by SYCL mode somehow as we have not established this as a universal logic for SPIR.

clang/test/CodeGenSYCL/address-space-cond-op.cpp
8

I am trying to understand what specifically you are testing from the patch and whether you need this whole IR output to be checked?

It helps for reviewing and maintenance purposes to minimize the IR patterns as much as possible.

40

I would prefer if we minimize the amount of noise as much as possible. This patch has enough functionality so I don't think adding extra helps in any way.

The testing should generally encapsulate what is relevant. Otherwise, I don't know how to assess it without reviewing other patches too.

clang/test/CodeGenSYCL/address-space-conversions.cpp
12

Should you check that the function name has expected mangling?

clang/test/CodeGenSYCL/address-space-of-returns.cpp
9

It seems to me that you are just testing address space inference here similar to address-space-deduction.cpp? I don't see anything specific to a return type in either the test or your patch?

clang/test/CodeGenSYCL/convergent.cpp
2

Is this change related? I thought we are not adding the environment component after all...

Anastasia added inline comments.Mon, Apr 12, 8:05 AM
clang/test/CodeGenSYCL/address-space-parameter-conversions.cpp
80 ↗(On Diff #333247)

You are not testing the mangling though?

124 ↗(On Diff #333247)

If you are not testing any logic from this patch let's leave it out because it is hard to navigate in the large review.

clang/test/CodeGenSYCL/address-spaces-struct.cpp
1 ↗(On Diff #333247)

Yeah if it is not testing anything in your patch then it's better to remove it.

clang/test/SemaSYCL/address-space-parameter-conversions.cpp
25 ↗(On Diff #333247)

Sorry I mean from Default to named AS.

bader marked an inline comment as done.Mon, Apr 12, 10:57 AM
bader added inline comments.
clang/include/clang/AST/Type.h
488

BTW, we need enable global_device and global_host attributes from https://reviews.llvm.org/D82174 for SYCL USM feature. I have following question regarding this: should I create a follow-up patch or we can enable all attributes for SYCL at once?

clang/test/CodeGenSYCL/convergent.cpp
2

Is this change related? I thought we are not adding the environment component after all...

While I was removing -sycldevice environment component from the patch, I noticed that one of the committed tests already uses it.
https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenSYCL/convergent.cpp#L2

Do you want to me to create a separate review request for this change?