Page MenuHomePhabricator

[OpenCL] Use __generic addr space when generating internal representation of lambda
ClosedPublic

Authored by Anastasia on Nov 7 2019, 3:41 AM.

Details

Summary

Since lambdas are represented by callable objects, we need to adjust addr space of implicit obj parameter.

This patch suggests to use __generic for OpenCL mode. Then any lambda variable declared in __constant addr space (which is not convertible to __generic) would fail to compile with a diagnostic.

Towards generic C++ mode it might be better to just use add space from the lambda variable to qualify internal class member functions of lambda? However, I am not clear how to propagate the addr space since we are handling the initializers before the variable declaration. Alternatively it might make sense to just disallow qualifying lambdas with an addr space since they have internal compiler representation defined by the implementation.

Diff Detail

Event Timeline

Anastasia created this revision.Nov 7 2019, 3:41 AM

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

The right place to write the qualifier is the same place where you have to write attributes and/or mutable, i.e. in the lambda declarator, something like []() __local { ... }.

Arguably there ought to be a special-case rule allowing lambdas with no captures to be called in an arbitrary address space, but I don't think we need to go there in this patch.

clang/test/SemaOpenCLCXX/address-space-lambda.cl
14

Can we get a better diagnostic here when a candidate would have succeeded except for an address-space mismatch? Something like "candidate function not viable: cannot convert 'this' from 'constant' address space to 'generic'"? You can do that in a separate patch, of course.

Anastasia marked an inline comment as done.Nov 11 2019, 10:33 AM

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

The right place to write the qualifier is the same place where you have to write attributes and/or mutable, i.e. in the lambda declarator, something like []() __local { ... }.

Arguably there ought to be a special-case rule allowing lambdas with no captures to be called in an arbitrary address space, but I don't think we need to go there in this patch.

I am trying to understand what are the limitations here. Once we add ability to qualify lambdas with an addr space it should just work? But also it should be ok for all lambdas? Apart from captures won'twork for __constant addr space in OpenCL.

To summarize what has to be done overall to support addr spaces with lambdas:

  1. Add default AS on object param of internal lambda class member functions.
  1. Allow addr space qualifier on lambda expression.

Example:

[&]() __local  {i++;};
  1. Diagnose address space mismatch between variable decl and lambda expr qualifier

Example:

__private auto  llambda = [&]() __local {i++;}; // error no legal conversion b/w __private and __local

I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

clang/test/SemaOpenCLCXX/address-space-lambda.cl
14

Sure. Makes sense!

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

The right place to write the qualifier is the same place where you have to write attributes and/or mutable, i.e. in the lambda declarator, something like []() __local { ... }.

Arguably there ought to be a special-case rule allowing lambdas with no captures to be called in an arbitrary address space, but I don't think we need to go there in this patch.

I am trying to understand what are the limitations here. Once we add ability to qualify lambdas with an addr space it should just work? But also it should be ok for all lambdas? Apart from captures won'twork for __constant addr space in OpenCL.

I think it should just work, yes. Let's ignore my off-hand suggestion about the no-captures special case.

To summarize what has to be done overall to support addr spaces with lambdas:

  1. Add default AS on object param of internal lambda class member functions.

Right, the same default AS as is used for methods generally.

  1. Allow addr space qualifier on lambda expression. Example: [&]() __local {i++;};

Right.

  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Anastasia updated this revision to Diff 228851.Nov 12 2019, 3:56 AM
  • Added missing handling of lambda w/o param list.

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

Ok, makes sense so something like this should compile fine:

__global auto glob = [&] __global { ... };
  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Ok, since we can't qualify lambda expr with addr spaces yet there is not much I can test at this point. __constant lambda variable seems to be the only case with a diagnostic for OpenCL.

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

Ok, makes sense so something like this should compile fine:

__global auto glob = [&] __global { ... };

Right, a call to glob() should work in this case. I don't know if __global would parse here without (), though; in the grammar, it looks like attributes and so on have to follow an explicit parameter clause.

  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Ok, since we can't qualify lambda expr with addr spaces yet there is not much I can test at this point. __constant lambda variable seems to be the only case with a diagnostic for OpenCL.

You can certainly copy a lambda type into a different address space, or construct a pointer to a qualified lambda type, e.g. (*(__constant decltype(somelambda) *) nullptr)().

John.

Anastasia updated this revision to Diff 229308.Nov 14 2019, 7:13 AM
  • Added pointer to lambda test case.

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

Ok, makes sense so something like this should compile fine:

__global auto glob = [&] __global { ... };

Right, a call to glob() should work in this case. I don't know if __global would parse here without (), though; in the grammar, it looks like attributes and so on have to follow an explicit parameter clause.

  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Ok, since we can't qualify lambda expr with addr spaces yet there is not much I can test at this point. __constant lambda variable seems to be the only case with a diagnostic for OpenCL.

You can certainly copy a lambda type into a different address space, or construct a pointer to a qualified lambda type, e.g. (*(__constant decltype(somelambda) *) nullptr)().

Perhaps I am not thinking of the right use cases, but I am struggling to understand what would be the difference to variables of other type here.

I have added (*(__constant decltype(somelambda) *) nullptr)() as a test case however. It is rejected due to multiple addr spaces.

Also generally do you think there is any situation in which having lambda object and lambda call operator in different addr spaces would be useful?

John.

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

Ok, makes sense so something like this should compile fine:

__global auto glob = [&] __global { ... };

Right, a call to glob() should work in this case. I don't know if __global would parse here without (), though; in the grammar, it looks like attributes and so on have to follow an explicit parameter clause.

  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Ok, since we can't qualify lambda expr with addr spaces yet there is not much I can test at this point. __constant lambda variable seems to be the only case with a diagnostic for OpenCL.

You can certainly copy a lambda type into a different address space, or construct a pointer to a qualified lambda type, e.g. (*(__constant decltype(somelambda) *) nullptr)().

Perhaps I am not thinking of the right use cases, but I am struggling to understand what would be the difference to variables of other type here.

I have added (*(__constant decltype(somelambda) *) nullptr)() as a test case however. It is rejected due to multiple addr spaces.

Hmm. That makes sense — decltype is picking up the address-space qualification of somelambda — but seems unfortunate; is the only way to change the address space of a type to specifically strip the address space with a template metaprogram? I would hope that the multiple-address-spaces error would only apply in the case of multiple explicit qualifiers, and otherwise the new address space qualifier would replace the old one. I won't ask you to do that just to do this patch, but you could write the template metaprogram:

template <class T> struct remove_address_space { using type = T; };
template <class T> struct remove_address_space<__private T> { using type = T; };
...
auto globalLambdaPtr = (__global remove_address_space<decltype(somelambda)>::type*) nullptr;
globalLambdaPtr();

Also generally do you think there is any situation in which having lambda object and lambda call operator in different addr spaces would be useful?

Absolutely; remember that all lambda objects begin their life as temporaries and are thus in the private address space, so if you want the lambda to be storable in any other address space you will have at least a brief mismatch. The user can address-space-qualify the lambda all they want, but we can't generally automatically allocate memory for it that address space.

It does make logical sense to be able to qualify the call operator of a lambda. The lambda always starts as a temporary, so presumably we want the default address space qualifier on lambdas to be compatible with the address space of temporaries. However, that's probably also true of the default address qualifier on methods in general or you wouldn't be able to call them on temporaries and locals. Thus, we should be able to use the same default in both cases, which seems to be what you've implemented.

It even makes sense to be able to qualify a lambda with an address space that's not compatible with the address space of temporaries; that just means that the lambda has to be copied elsewhere before it can be invoked.

Just to clarify... Do you mean we should be able to compile this example:

[&] __global {
      i++;
} ();

Or should this result in a diagnostic about an addr space mismatch?

It should result in a diagnostic on the call. But if you assigned that lambda into global memory (somehow) it should work.

Ok, makes sense so something like this should compile fine:

__global auto glob = [&] __global { ... };

Right, a call to glob() should work in this case. I don't know if __global would parse here without (), though; in the grammar, it looks like attributes and so on have to follow an explicit parameter clause.

  1. Diagnose address space mismatch between variable decl and lambda expr qualifier Example: private auto llambda = [&]() local {i++;}; // error no legal conversion b/w private and local

    I think 1 is covered by this patch. I would like to implement 2 as a separate patch though to be able to commit fix for 1 earlier and unblock some developers waiting for the fix. I think 3 already works and I will just update the diagnostic in a separate patch too.

I agree that 3 should just fall out. And yeah, implementing 2 as a separate patch should be fine. Please make sure 3 is adequately tested in this patch.

Ok, since we can't qualify lambda expr with addr spaces yet there is not much I can test at this point. __constant lambda variable seems to be the only case with a diagnostic for OpenCL.

You can certainly copy a lambda type into a different address space, or construct a pointer to a qualified lambda type, e.g. (*(__constant decltype(somelambda) *) nullptr)().

Perhaps I am not thinking of the right use cases, but I am struggling to understand what would be the difference to variables of other type here.

I have added (*(__constant decltype(somelambda) *) nullptr)() as a test case however. It is rejected due to multiple addr spaces.

Hmm. That makes sense — decltype is picking up the address-space qualification of somelambda — but seems unfortunate; is the only way to change the address space of a type to specifically strip the address space with a template metaprogram?

I can't think of any other way. The main reason is that in OpenCL we add address spaces implicitly to all objects if they are not specified. Unless we decide to modify decltype but I think in other places it is useful to copy addr space qualifier too. For C++ however it might work, I will look into it.

I would hope that the multiple-address-spaces error would only apply in the case of multiple explicit qualifiers, and otherwise the new address space qualifier would replace the old one. I won't ask you to do that just to do this patch, but you could write the template metaprogram:

template <class T> struct remove_address_space { using type = T; };
template <class T> struct remove_address_space<__private T> { using type = T; };
...
auto globalLambdaPtr = (__global remove_address_space<decltype(somelambda)>::type*) nullptr;
globalLambdaPtr();

I was already planning to add this. I could look into it next and maybe just a add FIXME in the test for now.

Also generally do you think there is any situation in which having lambda object and lambda call operator in different addr spaces would be useful?

Absolutely; remember that all lambda objects begin their life as temporaries and are thus in the private address space, so if you want the lambda to be storable in any other address space you will have at least a brief mismatch. The user can address-space-qualify the lambda all they want, but we can't generally automatically allocate memory for it that address space.

Btw global lambda objects are in __global address space in OpenCL.

I was already planning to add this. I could look into it next and maybe just a add FIXME in the test for now.

Sure.

Btw global lambda objects are in __global address space in OpenCL.

Just based on being written outside of a function body or in a static-local initializer? I guess it's supportable; you can reasonably allocate global memory for those temporaries.

I was already planning to add this. I could look into it next and maybe just a add FIXME in the test for now.

Sure.

Btw global lambda objects are in __global address space in OpenCL.

Just based on being written outside of a function body or in a static-local initializer? I guess it's supportable; you can reasonably allocate global memory for those temporaries.

If I compile this kernel:

auto glambda = [](auto a)  { return 1; };

__kernel void foo() {
  glambda(1);
}

I will get the following IR:

%class.anon = type { i8 }

@glambda = internal addrspace(1) global %class.anon undef, align 1

; Function Attrs: convergent noinline nounwind optnone
define spir_kernel void @foo() #0 !kernel_arg_addr_space !3 !kernel_arg_access_qual !3 !kernel_arg_type !3 !kernel_arg_base_type !3 !kernel_arg_type_qual !3 {
entry:
  %call = call spir_func i32 @"_ZNU3AS43$_0clIiEEDaT_"(%class.anon addrspace(4)* addrspacecast (%class.anon addrspace(1)* @glambda to %class.anon addrspace(4)*), i32 1) #2
  ret void
}

; Function Attrs: convergent noinline nounwind optnone
define internal spir_func i32 @"_ZNU3AS43$_0clIiEEDaT_"(%class.anon addrspace(4)* %this, i32 %a) #1 align 2 {
entry:
  %this.addr = alloca %class.anon addrspace(4)*, align 4
  %a.addr = alloca i32, align 4
  store %class.anon addrspace(4)* %this, %class.anon addrspace(4)** %this.addr, align 4
  store i32 %a, i32* %a.addr, align 4
  %this1 = load %class.anon addrspace(4)*, %class.anon addrspace(4)** %this.addr, align 4
  ret i32 1
}

and the AST is:

|-VarDecl 0x55e8d2789500 <test.cl:1:1, col:39> col:6 used glambda '__global (lambda at test.cl:1:16)':'__global (lambda at test.cl:1:16)' cinit
| `-ImplicitCastExpr 0x55e8d278a1f0 <col:16, col:39> '__global (lambda at test.cl:1:16)':'__global (lambda at test.cl:1:16)' <NoOp>
|   `-LambdaExpr 0x55e8d2789f90 <col:16, col:39> '(lambda at test.cl:1:16)'
|     |-CXXRecordDecl 0x55e8d2789808 <col:16> col:16 implicit class definition
|     | |-DefinitionData generic lambda pass_in_registers empty standard_layout trivially_copyable literal can_const_default_init
|     | | |-DefaultConstructor defaulted_is_constexpr
|     | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
|     | | |-MoveConstructor exists simple trivial needs_implicit
|     | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
|     | | |-MoveAssignment
|     | | `-Destructor simple irrelevant trivial
|     | |-FunctionTemplateDecl 0x55e8d2789aa8 <<invalid sloc>, col:39> col:16 operator()
|     | | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
|     | | |-CXXMethodDecl 0x55e8d27899f8 <col:25, col:39> col:16 constexpr operator() 'auto (auto) const __generic' inline
|     | | | |-ParmVarDecl 0x55e8d27896a8 <col:19, col:24> col:24 a 'auto'
|     | | | `-CompoundStmt 0x55e8d2789b48 <col:27, col:39>
|     | | |   `-ReturnStmt 0x55e8d2789b38 <col:29, col:36>
|     | | |     `-IntegerLiteral 0x55e8d2789b18 <col:36> 'int' 1
|     | | `-CXXMethodDecl 0x55e8d27b9750 <col:25, col:39> col:16 used constexpr operator() 'int (int) const __generic' inline
|     | |   |-TemplateArgument type 'int'
|     | |   |-ParmVarDecl 0x55e8d27b9608 <col:19, col:24> col:24 a 'int':'int'
|     | |   `-CompoundStmt 0x55e8d27b9978 <col:27, col:39>
|     | |     `-ReturnStmt 0x55e8d27b9968 <col:29, col:36>
|     | |       `-IntegerLiteral 0x55e8d2789b18 <col:36> 'int' 1
|     | |-FunctionTemplateDecl 0x55e8d2789e18 <<invalid sloc>, col:39> col:16 implicit operator auto (*)(type-parameter-0-0)
|     | | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
|     | | `-CXXConversionDecl 0x55e8d2789d70 <col:16, col:39> col:16 implicit constexpr operator auto (*)(type-parameter-0-0) 'auto (*() const noexcept)(auto)' inline
|     | |-FunctionTemplateDecl 0x55e8d2789f20 <<invalid sloc>, col:39> col:16 implicit __invoke
|     | | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
|     | | `-CXXMethodDecl 0x55e8d2789e78 <col:16, col:39> col:16 implicit __invoke 'auto (auto)' static inline
|     | |   `-ParmVarDecl 0x55e8d2789d08 <col:19, col:24> col:24 a 'auto'
|     | `-CXXDestructorDecl 0x55e8d2789fc0 <col:16> col:16 implicit referenced ~ 'void () __generic noexcept' inline default trivial
|     `-CompoundStmt 0x55e8d2789b48 <col:27, col:39>
|       `-ReturnStmt 0x55e8d2789b38 <col:29, col:36>
|         `-IntegerLiteral 0x55e8d2789b18 <col:36> 'int' 1
|-CXXRecordDecl 0x55e8d2789808 <col:16> col:16 implicit class definition
| |-DefinitionData generic lambda pass_in_registers empty standard_layout trivially_copyable literal can_const_default_init
| | |-DefaultConstructor defaulted_is_constexpr
| | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveConstructor exists simple trivial needs_implicit
| | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveAssignment
| | `-Destructor simple irrelevant trivial
| |-FunctionTemplateDecl 0x55e8d2789aa8 <<invalid sloc>, col:39> col:16 operator()
| | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
| | |-CXXMethodDecl 0x55e8d27899f8 <col:25, col:39> col:16 constexpr operator() 'auto (auto) const __generic' inline
| | | |-ParmVarDecl 0x55e8d27896a8 <col:19, col:24> col:24 a 'auto'
| | | `-CompoundStmt 0x55e8d2789b48 <col:27, col:39>
| | |   `-ReturnStmt 0x55e8d2789b38 <col:29, col:36>
| | |     `-IntegerLiteral 0x55e8d2789b18 <col:36> 'int' 1
| | `-CXXMethodDecl 0x55e8d27b9750 <col:25, col:39> col:16 used constexpr operator() 'int (int) const __generic' inline
| |   |-TemplateArgument type 'int'
| |   |-ParmVarDecl 0x55e8d27b9608 <col:19, col:24> col:24 a 'int':'int'
| |   `-CompoundStmt 0x55e8d27b9978 <col:27, col:39>
| |     `-ReturnStmt 0x55e8d27b9968 <col:29, col:36>
| |       `-IntegerLiteral 0x55e8d2789b18 <col:36> 'int' 1
| |-FunctionTemplateDecl 0x55e8d2789e18 <<invalid sloc>, col:39> col:16 implicit operator auto (*)(type-parameter-0-0)
| | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
| | `-CXXConversionDecl 0x55e8d2789d70 <col:16, col:39> col:16 implicit constexpr operator auto (*)(type-parameter-0-0) 'auto (*() const noexcept)(auto)' inline
| |-FunctionTemplateDecl 0x55e8d2789f20 <<invalid sloc>, col:39> col:16 implicit __invoke
| | |-TemplateTypeParmDecl 0x55e8d27895d8 <col:19> col:19 implicit class depth 0 index 0
| | `-CXXMethodDecl 0x55e8d2789e78 <col:16, col:39> col:16 implicit __invoke 'auto (auto)' static inline
| |   `-ParmVarDecl 0x55e8d2789d08 <col:19, col:24> col:24 a 'auto'
| `-CXXDestructorDecl 0x55e8d2789fc0 <col:16> col:16 implicit referenced ~ 'void () __generic noexcept' inline default trivial
`-FunctionDecl 0x55e8d278a258 <line:2:1, line:11:1> line:2:15 foo 'void ()'
  |-CompoundStmt 0x55e8d27b9a98 <col:21, line:11:1>
  | `-CXXOperatorCallExpr 0x55e8d27b9a60 <line:4:3, col:12> 'int':'int'
  |   |-ImplicitCastExpr 0x55e8d27b9a08 <col:10, col:12> 'int (*)(int) const __generic' <FunctionToPointerDecay>
  |   | `-DeclRefExpr 0x55e8d27b9990 <col:10, col:12> 'int (int) const __generic' lvalue CXXMethod 0x55e8d27b9750 'operator()' 'int (int) const __generic'
  |   |-ImplicitCastExpr 0x55e8d27b9a48 <col:3> 'const __generic (lambda at test.cl:1:16)' lvalue <AddressSpaceConversion>
  |   | `-DeclRefExpr 0x55e8d278a338 <col:3> '__global (lambda at test.cl:1:16)':'__global (lambda at test.cl:1:16)' lvalue Var 0x55e8d2789500 'glambda' '__global (lambda at test.cl:1:16)':'__global (lambda at test.cl:1:16)'
  |   `-IntegerLiteral 0x55e8d278a358 <col:11> 'int' 1
  `-OpenCLKernelAttr 0x55e8d278a2f8 <line:2:1>

It seems we don't actually have temporaries.

Anastasia updated this revision to Diff 230070.Nov 19 2019, 7:21 AM
  • Added FIXME to enhance testing.

Yes, in that case copy-elision into the global variable is guaranteed. You can write arbitrary expressions in global initializers, however, and those can use temporary lambdas.

Yes, in that case copy-elision into the global variable is guaranteed. You can write arbitrary expressions in global initializers, however, and those can use temporary lambdas.

I guess you mean something like this?

auto glambda = []() { return 1; }();

I don't see a way to change the address space of a lambda object however. It would only be possible to specify addr space qualifiers for a call operator of such lambdas.

Yes, in that case copy-elision into the global variable is guaranteed. You can write arbitrary expressions in global initializers, however, and those can use temporary lambdas.

I guess you mean something like this?

auto glambda = []() { return 1; }();

Yes, or just passing it as an argument to something.

I don't see a way to change the address space of a lambda object however. It would only be possible to specify addr space qualifiers for a call operator of such lambdas.

Yes, I think the language has to say what address space the lambda temporary is in. Among other things, this can affect how captures are initialized, since some constructors are only usable in certain address spaces. (Lambdas in global contexts can't capture surrounding local variables, but they can still have explicitly-initialized captures, like [x=foo()] { ... }.) That language rule could be that lambda temporaries are always in the private address space, or it could be that lambdas in global contexts are in the global address space. The latter might be easier because it's compatible with lifetime-extension: we don't necessarily know when processing the lambda whether its temporary will be lifetime-extended, and if it is, it needs to be in the global address space. The only disadvantage of that is that we'd have to use global memory even for non-extended temporaries in global initializers.

Richard might have thoughts about this. I don't know if there's been any language work around copy-elision and/or lifetime-extension that might be stretchable to allow us to delay the address-space decision until later.

Yes, in that case copy-elision into the global variable is guaranteed. You can write arbitrary expressions in global initializers, however, and those can use temporary lambdas.

I guess you mean something like this?

auto glambda = []() { return 1; }();

Yes, or just passing it as an argument to something.

I don't see a way to change the address space of a lambda object however. It would only be possible to specify addr space qualifiers for a call operator of such lambdas.

Yes, I think the language has to say what address space the lambda temporary is in. Among other things, this can affect how captures are initialized, since some constructors are only usable in certain address spaces. (Lambdas in global contexts can't capture surrounding local variables, but they can still have explicitly-initialized captures, like [x=foo()] { ... }.) That language rule could be that lambda temporaries are always in the private address space, or it could be that lambdas in global contexts are in the global address space. The latter might be easier because it's compatible with lifetime-extension: we don't necessarily know when processing the lambda whether its temporary will be lifetime-extended, and if it is, it needs to be in the global address space. The only disadvantage of that is that we'd have to use global memory even for non-extended temporaries in global initializers.

Richard might have thoughts about this. I don't know if there's been any language work around copy-elision and/or lifetime-extension that might be stretchable to allow us to delay the address-space decision until later.

Ok, I see. Thanks! I agree for OpenCL __global might make more sense for the program scope lambdas. I will look into this. However, would it be ok to proceed with this patch adding __generic address space as a qualifier to lambda call operators by default. Is there anything that should be added to complete it?

rjmccall added inline comments.Nov 25 2019, 2:00 PM
clang/lib/Sema/SemaLambda.cpp
921

This should probably check that there isn't already an address space, right?

Anastasia marked an inline comment as done.Nov 26 2019, 2:41 AM
Anastasia added inline comments.
clang/lib/Sema/SemaLambda.cpp
921

EPI has just been constructed newly here or do you mean if addr space has been set in the constructor? That is currently not the case but might happen in the future perhaps.

rjmccall added inline comments.Nov 26 2019, 10:18 AM
clang/lib/Sema/SemaLambda.cpp
921

Oh, sorry, I see now that this is the place where we synthesize a signature. This is fine, nevermind.

Actually, could you go ahead and extract out some sort of getDefaultCXXMethodAddrSpace() function on Sema that you can use consistently here and in the other places?

Anastasia updated this revision to Diff 231247.Nov 27 2019, 6:48 AM

Added getDefaultCXXMethodAddrSpace helper to Sema

rjmccall added inline comments.Nov 27 2019, 10:52 AM
clang/lib/Sema/SemaLambda.cpp
922

This should just be AS.

Could you go ahead and change the other, existing places that should use this method?

Anastasia updated this revision to Diff 231409.Nov 28 2019, 4:53 AM

Switched to using getAddrSpaceQualType in the entire code base.

rjmccall accepted this revision.Nov 28 2019, 11:34 AM

Thanks, that looks great.

This revision is now accepted and ready to land.Nov 28 2019, 11:34 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2019, 8:14 AM