It seems relying on isStandardLayoutType helpers is fragile when kernel arguments are references/pointers to templated types.
Clang uses lazy template instantiation in this case and therefore types are not fully instantiated when those are used.
Example:
template<typename T> struct outer{ public: struct inner{ int size; }; }; void foo(outer<int>::inner& p) { }
will have the following AST when p is a reference:
| `-ClassTemplateSpecializationDecl <line:1:1, line:7:1> line:2:8 struct outer definition | |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init | | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment exists simple trivial needs_implicit | | `-Destructor simple irrelevant trivial needs_implicit | |-TemplateArgument type 'int' | | `-BuiltinType 'int' | |-CXXRecordDecl <col:1, col:8> col:8 implicit struct outer | |-AccessSpecDecl <line:3:1, col:7> col:1 public | `-CXXRecordDecl <line:4:5, col:12> col:12 referenced struct inner
and when it is not a reference
| `-ClassTemplateSpecializationDecl <line:1:1, line:7:1> line:2:8 struct outer definition | |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init | | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment exists simple trivial needs_implicit | | `-Destructor simple irrelevant trivial needs_implicit | |-TemplateArgument type 'int' | | `-BuiltinType 'int' | |-CXXRecordDecl <col:1, col:8> col:8 implicit struct outer | |-AccessSpecDecl <line:3:1, col:7> col:1 public | `-CXXRecordDecl <line:4:5, line:6:5> line:4:12 referenced struct inner definition | |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable pod trivial literal | | |-DefaultConstructor exists trivial needs_implicit | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment exists simple trivial needs_implicit | | `-Destructor simple irrelevant trivial needs_implicit | |-CXXRecordDecl <col:5, col:12> col:12 implicit struct inner | `-FieldDecl <line:5:9, col:13> col:13 size 'int'
In the first case for reference type inner is incomplete while if we use a non-reference type the definition of template specialization is complete.
I have attempted to workaround this issue for the reported test cases, however it still doesn't quite work when the type is created from the template parameter (see FIXME test case in the patch). I presume we want to allow this? If so we might need to disable lazy template instantiation in this case. My guess is the only issue this this is that we will have performance penalty for the code of this format:
template<class T> struct Dummy { T i; }; extern kernel void foo(Dummy<int>& i);
which doesn't technically need the full instantiation.
Also I have discovered that in OpenCL C we have allowed passing kernel parameters with pointers to forward declared structs, but in C++ for OpenCL that no longer compiles. Not sure whether we should keep this restriction but if not it might interfere with the idea of always instantiating the structs definitions fully and even providing the safe diagnostics. So another approach we could take is to change this diagnostics into warnings and then if we can't fully detect the type provide the messaging that we can't detect whether the type is safe...
Another aspect to consider is that we have an extension that allows disabling all of this safe checks completely: https://clang.llvm.org/docs/LanguageExtensions.html#cl-clang-non-portable-kernel-param-types
While this matter might need more thoughts and investigations I wonder whether it makes sense to commit this fix for the time being since it's fixing the reported test case at least?
A comment explaining what we're trying to do here would be nice.