Page MenuHomePhabricator

[SYCL] Implement SYCL address space attributes handling
Needs ReviewPublic

Authored by bader on Oct 21 2020, 1:34 PM.

Details

Summary

Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCLDevice environment to triple.

Static variables without address space now reside in global address
space, unless they have an explicit address space qualifier in source
code.

Although OpenCL specification explicitly states that a string literal
should reside in constant address space, it does not work for SYCL
with "generic by default" address space rules.

For example:

const char *getLiteral() {
  return "A";
}

void func(bool AorB) {
  char B[] = {'B', '\0'};
  const char* C = AorB ? A : B;
}

If `A' reside in constant address space, it cannot be returned from a
function `getLiteral', because it returns a generic const char*.

When default address space values in C++ are CodeGen'ed into different
address spaces in LLVM IR, we must ensure that an addrspacecast is
emitted before a bitcast.

Pointers with the same address space in AST may end up in different
address spaces in IR. We cannot CodeGen a conditional
operator (ternary 'if') whose operands have different address spaces,
so we have to addrspacecast them to generic.

When a static variable is lowered from AST to LLVM IR, it can change an
address space depending on its storage class. For example, a static
variable declared in function scope may be assigned to global address
space if language rules allow. When this happens, original address space
of a variable (retuned by D.getType()) is no longer can be used as
target address space. Correct address space for a global can be obtained
from CodeGenModule::GetGlobalVarAddressSpace function, but in this case
we just copy it from the existing global (OldGV).

Pointers with the same address space in AST may end up in different
address spaces in IR. We cannot CodeGen a binary compare
operator (<, >, ==, etc.) whose operands have different address spaces,
so we have to addrspacecast them to generic.

Emit address space cast if return type and return value have different
address spaces.

Invalid bitcast with different address spaces was emitted as arument of
llvm.invariant.start intrinsic for work group scope constants.
This problem caused assertion fail.

Make sure that all operands of phi instruction that is generated while
lowering conditional operator have the same type. This fixes compiler
assertion.

Diff Detail

Unit TestsFailed

TimeTest
350 mslinux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp

Event Timeline

bader created this revision.Oct 21 2020, 1:34 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 21 2020, 1:34 PM
bader requested review of this revision.Oct 21 2020, 1:34 PM
rjmccall added inline comments.
clang/lib/Basic/Targets/AMDGPU.cpp
57

I feel like we may be outgrowing these address-space map arrays.

clang/lib/CodeGen/CGCall.cpp
4596

This seems problematic; code like this shouldn't be necessary in every place that uses a pointer value. The general rule needs to be that expression emission produces a value of its expected type, so the places that emit pointers need to be casting to the generic address space. We can avoid that in some special expression emitters, like maybe EmitLValue, as long as the LValue records what happened and can apply the appropriate transform later.

Also, in IRGen we work as much as possible with AST address spaces rather than IR address spaces. That is, code like this that's checking for address space mismatches needs to be comparing the source AST address space with the destination AST address space and calling a method on the TargetInfo when a mismatch is detected, rather than comparing IR address spaces and directly building an addrspacecast. The idea is that in principle a target should be able to have two logical address spaces that are actually implemented in IR with the same underlying address space, with some arbitrary transform between them.

Anastasia added a comment.EditedOct 26 2020, 10:00 AM

In the RFC it has been discussed to either use target address spaces or perhaps to introduce a new attribute to reflect a semantic needed for SYCL, but it seems to me in this change you are building on top of the existing language address space attribute with new entries for SYCL.

http://lists.llvm.org/pipermail/cfe-dev/2020-August/066632.html

Has there been any other discussion regarding the approach which is not captured in the cfe-dev thread I have linked?

clang/lib/CodeGen/CGExpr.cpp
4559

What language rules?

In the RFC it has been discussed to either use target address spaces or perhaps to introduce a new attribute to reflect a semantic needed for SYCL, but it seems to me in this change you are building on top of the existing language address space attribute with new entries for SYCL.

http://lists.llvm.org/pipermail/cfe-dev/2020-August/066632.html

Right, this patch adds new semantic attributes for SYCL and I left a comment to decide whether we need a separate representation for parsed attribute and spelling as well. Do you suggest adding SYCL specific spelling for these attributes?

WRT to a solution built on top of https://reviews.llvm.org/D62574, I'm hesitate investing into this direction as it's not clear to me when we can get this patch in (it's on review for 17 months already). I'd like to make some progress with address space attributes to unblock upstreaming of other changes depending on this change.

Has there been any other discussion regarding the approach which is not captured in the cfe-dev thread I have linked?

I think everything is captured in the mailing list thread.

clang/lib/CodeGen/CGCall.cpp
4596

@erichkeane, @asavonic, could you help with addressing this concern, please?

clang/lib/CodeGen/CGExpr.cpp
4559

I suppose it's a generic note and author didn't meant some particular language here. This change was contributed by @sdmitriev.
@sdmitriev, could you clarify, please?

sdmitriev added inline comments.Thu, Nov 12, 12:52 AM
clang/lib/CodeGen/CGExpr.cpp
4559

This comment was originally added by @asavonic to clang/lib/CodeGen/CGExprScalar.cpp (lines 2962-2965 in this patch), and later I reused his code along with the comment in this file while fixing a very similar issue. So, I guess it would be more correct to ask Andrew to clarify. @asavonic, can you please comment on this?

asavonic added inline comments.Fri, Nov 13, 1:41 AM
clang/lib/CodeGen/CGExpr.cpp
4559

This comment was originally added by @asavonic to clang/lib/CodeGen/CGExprScalar.cpp (lines 2962-2965 in this patch), and later I reused his code along with the comment in this file while fixing a very similar issue.

There is if (Opts.SYCLIsDevice) In CGExprScalar.cpp, so we handle SYCL language rules.
In SYCL, (almost) all pointers have default address space in AST, but in CG some pointers become private/local/global, and some are lowered as generic pointers. We need to add an implicit addrspacecast in CG to emit an expression that expects the same type for its operands.

For example:

const char *str = ... ;
const char *phi_str = i > 2 ? str : "hello world!";

In AST types of str and "hello world" are the same (const char *). In CG we emit str as i8 addrspace(4)* and "hello world" as i8*. At this point we have to decide what type phi_str should have, and addrspacecast the operands accordingly.

bader added inline comments.Fri, Nov 13, 5:49 AM
clang/lib/CodeGen/CGExpr.cpp
4559

Is it possible to avoid fixing address space mismatch at this point by adjusting AST as suggested by John in this comment: https://reviews.llvm.org/D89909#inline-837495?

I wonder if it can be done w/o impacting C++ template metaprogramming as noted in this thread: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html.

asavonic added inline comments.Fri, Nov 13, 7:20 AM
clang/lib/CodeGen/CGExpr.cpp
4559

Is it possible to avoid fixing address space mismatch at this point by adjusting AST

Some address space mismatches are present in CG already, see CodeGenModule::getStringLiteralAddressSpace and CodeGenModule::GetGlobalVarAddressSpace. This should be moved to AST (or replaced) as well, right?

bader updated this revision to Diff 305165.Fri, Nov 13, 8:27 AM

Remove support for sycl_constant address space attribute.

bader updated this revision to Diff 305167.Fri, Nov 13, 8:29 AM

Upload full patch after removing sycl_constant address space attribute support.

In the RFC it has been discussed to either use target address spaces or perhaps to introduce a new attribute to reflect a semantic needed for SYCL, but it seems to me in this change you are building on top of the existing language address space attribute with new entries for SYCL.

http://lists.llvm.org/pipermail/cfe-dev/2020-August/066632.html

Right, this patch adds new semantic attributes for SYCL and I left a comment to decide whether we need a separate representation for parsed attribute and spelling as well. Do you suggest adding SYCL specific spelling for these attributes?

WRT to a solution built on top of https://reviews.llvm.org/D62574, I'm hesitate investing into this direction as it's not clear to me when we can get this patch in (it's on review for 17 months already). I'd like to make some progress with address space attributes to unblock upstreaming of other changes depending on this change.

Has there been any other discussion regarding the approach which is not captured in the cfe-dev thread I have linked?

I think everything is captured in the mailing list thread.

Did anyone conclude there that the language address spaces should be added for SYCL? I can't see any of this. In fact I don't even think there was any conclusion on the RFC. You should first make your design clear and agreed before going ahead with the implementation. I personally still think that address space AST attribute is not the right path for SYCL. And I haven't seen any evidence that it is the best solution to what you are trying to achieve.

bader added inline comments.Wed, Nov 18, 5:19 AM
clang/lib/CodeGen/CGExpr.cpp
4559

Some address space mismatches are present in CG already, see CodeGenModule::getStringLiteralAddressSpace and CodeGenModule::GetGlobalVarAddressSpace. This should be moved to AST (or replaced) as well, right?

Kind of. If I'm not mistaken the idea is to redefine some TargetInfo hooks for SPIR target (similar to AMDGPU target) and IRGen should be able to align address spaces without additional customization like this. John provided a little bit more details in the RFC thread: http://clang-developers.42468.n3.nabble.com/RFC-Re-use-OpenCL-address-space-attributes-for-SYCL-td4068754.html#a4069039.

bader added a comment.Wed, Nov 18, 8:49 AM

Did anyone conclude there that the language address spaces should be added for SYCL? I can't see any of this. In fact I don't even think there was any conclusion on the RFC. You should first make your design clear and agreed before going ahead with the implementation. I personally still think that address space AST attribute is not the right path for SYCL. And I haven't seen any evidence that it is the best solution to what you are trying to achieve.

I'll try to summarize my understanding of the current state. So far there were three options discussed in the mailing list thread:

  1. We started with re-using OpenCL address space attributes (https://reviews.llvm.org/D80932), but there was a concern regarding deviation from OpenCL semantics in SYCL mode for these attributes. In same review, there was a proposal to implement a separate set of attributes to address this concern, which lead to the solution #2.
  2. This patch (https://reviews.llvm.org/D89909) implements proposal from https://reviews.llvm.org/D80932 review comments to add SYCL specific attributes to avoid interference with OpenCL mode.
  3. Bevin Hansson suggested to resolve semantic difference between SYCL and OpenCL mode for OpenCL address space attributes handling by re-using this work: https://reviews.llvm.org/D62574, which provides an infrastructure to customize address space conversion rules based on information available in ASTContext. This seems to be viable approach to implement the difference between OpenCL and SYCL modes.

Open source SYCL implementation from https://github.com/intel/llvm/tree/sycl currently implements option #1 and option #2 was used in the past, so I'm quite confident that these will work. I haven't tried to prototype option #3, but as mentioned in RFC, https://reviews.llvm.org/D62574 provides necessary infrastructure to customize address space attributes already supported by the compiler.
As we already agreed in the RFC, we should have address space attributes in AST for SYCL and the only question we need to answer is whether SYCL mode can re-using existing attributes or add something new.

My current plan is to continue with option #2, unless Bevin's patch is going to be committed soon. I'd like to have something committed to unblock other changes relying on address space attributes. Let me know if you have any concerns regarding this plan.

Did anyone conclude there that the language address spaces should be added for SYCL? I can't see any of this. In fact I don't even think there was any conclusion on the RFC. You should first make your design clear and agreed before going ahead with the implementation. I personally still think that address space AST attribute is not the right path for SYCL. And I haven't seen any evidence that it is the best solution to what you are trying to achieve.

I'll try to summarize my understanding of the current state. So far there were three options discussed in the mailing list thread:

  1. We started with re-using OpenCL address space attributes (https://reviews.llvm.org/D80932), but there was a concern regarding deviation from OpenCL semantics in SYCL mode for these attributes. In same review, there was a proposal to implement a separate set of attributes to address this concern, which lead to the solution #2.
  2. This patch (https://reviews.llvm.org/D89909) implements proposal from https://reviews.llvm.org/D80932 review comments to add SYCL specific attributes to avoid interference with OpenCL mode.
  3. Bevin Hansson suggested to resolve semantic difference between SYCL and OpenCL mode for OpenCL address space attributes handling by re-using this work: https://reviews.llvm.org/D62574, which provides an infrastructure to customize address space conversion rules based on information available in ASTContext. This seems to be viable approach to implement the difference between OpenCL and SYCL modes.

Open source SYCL implementation from https://github.com/intel/llvm/tree/sycl currently implements option #1 and option #2 was used in the past, so I'm quite confident that these will work. I haven't tried to prototype option #3, but as mentioned in RFC, https://reviews.llvm.org/D62574 provides necessary infrastructure to customize address space attributes already supported by the compiler.
As we already agreed in the RFC, we should have address space attributes in AST for SYCL and the only question we need to answer is whether SYCL mode can re-using existing attributes or add something new.

My current plan is to continue with option #2, unless Bevin's patch is going to be committed soon. I'd like to have something committed to unblock other changes relying on address space attributes. Let me know if you have any concerns regarding this plan.

I didn't sense any agreement or any sort of a conclusion on the RFC. Moreover, I didn't feel at all that the existing address space attribute was a good fit. I was suggesting to add a different attribute that isn't an address space attribute from Embedded C. I don't understand what you gain from the existing address space attribute at the moment. It was mentioned that the changes in the type system with address spaces is undesirable for SYCL because you indend to parse existing C++ code as-is. This contradicts the intended semantic of address spaces where the whole point of it is to modify the standard types and therefore a compilation of C++ with the standard semantic is only guaranteed when the attribute is not used at all.

I understand that you workaround this issue by adding the implicit conversions to obtain a flat address space and then recover the address spaces back in CodeGen, but this is doesn't seem like a good use of this attribute. If we all start to alter the semantic of the attribute then we won't be able to make any sense out of it. This will impact future development as any refactoring, improvements and evolution would require the understanding of the special cases that we add. I am worried about the impact on the community for future work. This is why I was suggesting to add a completely new attribute.

Perhaps to unblock your work it would be good to have a summary of what functionality you require from the address space attribute and what needs to work differently. Then at least it would be more clear if the attribute is a good fit here and if so what difference in its semantic will be brought by SYCL customizations. Whichever route we decide to go ahead, the documentation of intended language semantic should be added somewhere publicly accessible to facilitate adequate code review and maintenance because as far as I am aware it is not documented as part of the SYCL spec or any other documentation resource.