This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Normalize handling of defauled dtor.
ClosedPublic

Authored by tra on Jan 14 2021, 4:37 PM.

Details

Summary

Defaulted destructor was treated inconsistently, compared to other compiler-generated functions.

When Sema::IdentifyCUDATarget() got called on just-created dtor which didn't have
implicit __host__ __device__ attributes applied yet, it would treat it as a host function.
That happened to (sometimes) hide the error when dtor referred to a host-only functions.

Even when we had identified defaulted dtor as a HD function, we still treated it inconsistently during
selection of usual deallocators, where we did not allow referring to wrong-side functions, while it
is allowed for other HD functions.

This change brings handling of defaulted dtors in line with other HD functions.

Diff Detail

Event Timeline

tra created this revision.Jan 14 2021, 4:37 PM
tra requested review of this revision.Jan 14 2021, 4:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2021, 4:37 PM
rsmith edited reviewers, added: rsmith; removed: HAPPY.Jan 14 2021, 5:03 PM
rsmith added inline comments.Jan 19 2021, 5:50 PM
clang/lib/Sema/SemaDeclCXX.cpp
15162–15170

Is this safe? What happens if the destructor for the variable is a template, and instantiating that template results in a reference to a device function? Eg:

template<typename T> __device__ void f() {}
template<typename T> struct A {
  ~A() { f<<<>>>(); }
};
A a;
clang/test/SemaCUDA/default-ctor.cu
28

I don't think we should accept this -- only functions that are defaulted on their first declaration should get special treatment. Instead of checking for isDefaulted() above, you should check for !isUserProvided() instead.

tra updated this revision to Diff 317946.Jan 20 2021, 11:04 AM

Addressed Richard's comments.

tra added inline comments.Jan 20 2021, 11:05 AM
clang/lib/Sema/SemaDeclCXX.cpp
15162–15170

This is business as usual -- we catch it during host compilation, where a is instantiated.

h.cu:3:10: error: no matching function for call to 'f'
  ~A() { f<T>(); }
         ^~~~
h.cu:5:8: note: in instantiation of member function 'A<int>::~A' requested here
A<int> a;
       ^
h.cu:1:51: note: candidate function not viable: call to __device__ function from __host__ function
template<typename T> __attribute__((device)) void f() {}

1 error generated when compiling for host.

If it were a __device__ A<int> a; , then we catch it during GPU compilation and also complain that we can't have dynamic initializers.

clang/test/SemaCUDA/default-ctor.cu
28

Fixed.

rsmith added inline comments.Jan 20 2021, 2:16 PM
clang/lib/Sema/SemaDeclCXX.cpp
15162–15170

Sorry, testcase wasn't quite right; I meant for f to be __global__ not __device__ so that the kernel call to it works. Fixed example:

extern "C" int cudaConfigureCall(int a, int b);
template<typename T> __attribute__((__global__)) void f(T) {}
template<typename T> struct A {
  ~A() { f<<<1, 1>>>(T()); }
};
A<int> a;

I think that this is valid. In order for it to work, we need to trigger instantiation of f<int> on the device side of the compilation. In order to do that, we need to trigger instantiation of A<int>::~A(), so we need to mark it referenced on the device side. (This is, I think, in line with the general principle that we want to do the same template instantiations of host functions on both sides of the compilation, so that both sides agree on which kernel functions are referenced.)

tra added inline comments.Jan 20 2021, 2:32 PM
clang/lib/Sema/SemaDeclCXX.cpp
15162–15170

You're right. To think of it this particular change is not needed at all any more. The real issue is fixed by the better selection of the usual deallocator. We do not need to skip dtor checks here.

tra updated this revision to Diff 318018.Jan 20 2021, 2:34 PM

Removed unneeded changes.

rsmith accepted this revision.Jan 20 2021, 2:51 PM
This revision is now accepted and ready to land.Jan 20 2021, 2:51 PM
tra updated this revision to Diff 318052.Jan 20 2021, 4:13 PM

Added a test for the corner case Richard has pointed out in the comments.

This revision was automatically updated to reflect the committed changes.