Deallocation functions are called from deleting destructors that are declared virtual. Issue an error if an aligned deallocation function is selected but is not available.
Details
- Reviewers
rsmith vsapsai EricWF erik.pilkington tra - Commits
- rG71645c2febab: [Sema] Produce diagnostics when C++17 aligned allocation/deallocation functions…
rC349890: [Sema] Produce diagnostics when C++17 aligned allocation/deallocation
rL349890: [Sema] Produce diagnostics when C++17 aligned allocation/deallocation
Diff Detail
- Repository
- rC Clang
Event Timeline
I think we should sink the DiagnoseUnavailableAlignedAllocation into DiagnoseUseOfDecl, and then adds call's in the few places they're needed. Most paths we care about already pass through there.
include/clang/Sema/Sema.h | ||
---|---|---|
5203 | DiagnoseUnavailableAlignedAllocation. Also, I think we can drop the IsDelete parameter and instead deduce it using FD.getDeclName().getCXXOverloadedOperator(). |
Sink diagnoseUnavailableAlignedAllocation into DiagnoseUseOfDecl and add calls to it in a few other places so that diagnostics are produced for calls to aligned operator and builtin operator new/delete in addition to aligned deallocation functions called from deleting destructors.
I had to make changes to two test cases (dr2xx.cpp and call-host-fn-from-device.cu) that have nothing to do with aligned allocation/deallocation functions. The warning in dr2xx.cpp seems correct to me judging from the comment left few lines above it ("We're also missing the -Wused-but-marked-unused "). I'm not sure about the diagnostic in call-host-fn-from-device.cu. The original comment says the sized delete function annotated with __device__ is called, but it seems that the non-sized __host__ function is being called, in which case I think the diagnostic is correct.
@jlebar, is the change I made to call-host-fn-from-device.cu correct?
I don't think so -- that's a change in overloading behavior afaict.
The original comment said "call sized device delete even though host has preferable non-sized version", but it seems that the non-sized host version 'T::operator delete(void*)' is being called in the IR, not the sized device version of delete. Is that a bug in overload resolution?
This is the command I used:
"clang -cc1 -internal-isystem --std=c++11 -triple nvptx-unknown-unknown -fcuda-is-device -emit-llvm -o - test.cu"
$ cat test.cu
#define __constant__ __attribute__((constant)) #define __device__ __attribute__((device)) #define __host__ __attribute__((host)) typedef __SIZE_TYPE__ size_t; struct T { __host__ void operator delete(void*); __device__ void operator delete(void*, size_t); }; __host__ __device__ void class_specific_delete(T *t) { delete t; }
I mean ToT clang (without my patch applied) seems to select the non-sized host version 'T::operator delete(void*)'.
OK, if this is just making an error out of something which previously silently didn't work (and should result in a compile error further down the line when we try to and can't resolve that function), then this is totally fine.
Yes, that is the case. It doesn't change overloading resolution, it is just producing a diagnostic.
test/SemaCUDA/call-host-fn-from-device.cu | ||
---|---|---|
88 ↗ | (On Diff #150211) | The C++ magic is way above my paygrade, but as far as CUDA goes this is a regression, compared to what nvcc does. This code in NVCC produced a warning and clang should not error out at this point in time either as |
lib/Sema/SemaExprCXX.cpp | ||
---|---|---|
3545 | Are we also missing a MarkFunctionReferenced call here? (I don't think it matters much for the predefined new/delete, since they can't be inline or templated, but it's still wrong to not mark the function the builtin will call as referenced.) | |
test/CXX/drs/dr2xx.cpp | ||
721–722 | These are -> This is | |
test/SemaCUDA/call-host-fn-from-device.cu | ||
88 ↗ | (On Diff #150211) | We're in -fcuda-is-device mode, so IIUC it's correct to reject a call to a host function here (because __host__ __device__ is treated as basically meaning __device__ in that mode for the purpose of checking whether a call is valid), right? However, the comment suggests that the intent was that this would instead call the device version. Did that actually previously happen (in which case this patch is somehow affecting overload resolution and should be fixed), or is the comment prior to this patch wrong and we were silently calling a host function from a device function (in which case this patch is fine, but we should add a FIXME here to select the device delete function if we think that's appropriate)? |
test/SemaCUDA/call-host-fn-from-device.cu | ||
---|---|---|
88 ↗ | (On Diff #150211) | OK, I see from prior review comments (that phab is helpfully hiding from view) that this is just adding a diagnostic and the overload resolution behavior is unchanged. So I think this change is correct. @tra, can you confirm? My testing shows that __host__ void f(); __host__ __device__ void g() { f(); } is accepted by default but rejected in -fcuda-is-device mode, which is consistent with the behavior after this patch is applied. |
Call MarkFunctionReferenced to mark the function __builtin_operator_new or __builtin_operator_delete will call as referenced.
test/SemaCUDA/call-host-fn-from-device.cu | ||
---|---|---|
88 ↗ | (On Diff #150211) | "-fcuda-is-device" does not necessarily mean that the host device function will be used. In the example above the error is indeed correct, as g() is considered to be externally visible and we will attempt to generate code for it, and we can't call f() on device. However, if you make it static, there should be no error: __host__ void f(); static __host__ __device__ void g() { f(); } CUDA is somewhat weird when it comes to what's considered available and what is not. @jlebar has details on how we handle the errors in cases like that: |
Doing that in a separate patch is OK, provided that that patch will be committed along with this one.
It's a regression. There's a decent chance it breaks someone and this patch, if committed by itself, will end up being rolled back.
Is the regression you are referring to about the static function case? I don't see a difference between ToT clang and my patch in the diagnostics they produce when I compile the following code:
__host__ void f(); static __host__ __device__ void g() { f(); } __host__ __device__ void g2() { g(); }
Both error out when -fcuda-is-device is provided. If I comment out the definition of g2, it compiles fine.
The example above *is* expected to produce the error on device side, bacause g2() is externally visible, uses g(), which in turn uses host-only f().
I'm talking about a case where g() {f()} is present in the source code, but will not be codegen'ed on device side.
The code below is expected to compile. Note that g2() is host-only.
__host__ void f(); static __host__ __device__ void g() { f(); } __host__ void g2() { g(); }
The code you showed does compile with or without -fcuda-is-device after applying my patch.
Talked to @ahatanak over IRC. It appears that this patch may have exposed a preexisting bug.
Apparently delete t; in test/SemaCUDA/call-host-fn-from-device.cu does actually end up calling __host__ operator delete. It should've picked __device__ operator delete, but it does not, so reporting an error here appears to be correct.
It's visible in AST and the IR.
@rsmith -- the original change was done a while back in rL283830. I assume it worked at that time and wonder if it's a (possibly not-so-)recent regression.
I've confirmed that the patch does not break anything in our CUDA code, so it's good to go as far as CUDA is concerned.
I'll fix the exposed CUDA issue in a separate patch.
Rebase & ping.
I've reverted the changes I made to test/SemaCUDA/call-host-fn-from-device.cu since r342749 fixed the overload resolution bug.
lib/Sema/SemaExpr.cpp | ||
---|---|---|
54 | Does this also need to be updated? |
Check whether the declaration passed to Sema::CanUseDecl is an aligned allocation/deallocation function that is unavailable.
The change itself looks correct. Cannot really tell if you need to make changes in other places. For that I rely on Richard's opinion.
DiagnoseUnavailableAlignedAllocation.
Also, I think we can drop the IsDelete parameter and instead deduce it using FD.getDeclName().getCXXOverloadedOperator().