This is an archive of the discontinued LLVM Phabricator instance.

[C++4OpenCL] Add support for multiple address spaced destructors
Needs ReviewPublic

Authored by olestrohm on Sep 10 2021, 9:08 AM.

Details

Summary

This patch aims to add initial support for multiple address spaced destructors.
Currently this parses fine and the destructors are correctly added to the class,
but there is no logic to choose the correct destructor for a variable in a specific address space.
Currently it always picks the first destructor defined in the source code, regardless of the
address space of the variable.

This is not aiming to be a complete implementation, as destructors are used in a very large
number of locations, and so doing all of it in one patch would be very difficult to review and code.

The goals for this patch is thus to add basic support for destructors, and to build a framework for
additional work to be done on this front.
Since this feature won't be completed in this patch, I have attempted to ensure that no C++
functionality is changed, to ensure that the changes will only affect C++ for OpenCL.
This also has the effect that whenever destructors aren't correctly implemented it will fallback
to default behaviour, which is exactly the same currently happens.

In summary destructors in C++ for OpenCL are currently unusable for multiple destructors. With
this patch the feature will be usable in the most common situation, and every bug that currently
exists that isn't covered by the changes I have made here will continue to be bugs. The hope is that
this patch therefore is almost entirely a positive, since while it doesn't fix every bug, it will make the
feature actually work, and will create a base to fix the other bugs as they are discovered.

This is why it's mostly implemented through default parameters, as that will allow this change to
remain as small as it is, while also allowing further fixes to come along later.

The feature is also C++ for OpenCL specific to let the fast path remain when not utilizing the
address spaces, as this new implementation will be slower than the bitfields that C++ currently
uses, but I hope this can also be optimized in the future if it turns out to be very slow.

Diff Detail

Event Timeline

olestrohm created this revision.Sep 10 2021, 9:08 AM
olestrohm requested review of this revision.Sep 10 2021, 9:08 AM

The feature is also C++ for OpenCL specific to let the fast path remain when not utilizing the
address spaces, as this new implementation will be slower than the bitfields that C++ currently
uses, but I hope this can also be optimized in the future if it turns out to be very slow.

In general we try to align OpenCL's address spaces with C/C++ general direction but we haven't completed support of address space methods qualifiers in C++ yet. Therefore supporting dtors with address spaces won't be possible in generic C++. However, it would be good to check that we don't entirely misalign with the C++ requirements.

I would like to see if @rjmccall has any suggestions at this point.

clang/include/clang/AST/DeclCXX.h
983

According to the current OpenCL strategy, we only declare implicit members in default address space (i.e. __generic). It might be that we can declare them for all concrete address spaces but we are not there yet to start adding this feature. So it would make sense, for now, to simplify this logic and only add implicit dtors in the generic address space.

This implies that objects in concrete address spaces would still be able to use such implicit dtors but the compiler would add an implicit conversion to __generic for the implicit object parameter.

clang/include/clang/Sema/Sema.h
4048

it feels to me that this should be named ASThis following the convention?

clang/lib/AST/DeclCXX.cpp
1901

For ctors and other special members the logic for the selection of the overload candidates seems to be implemented at the end of Sema::LookupSpecialMember:
https://clang.llvm.org/doxygen/SemaLookup_8cpp_source.html#l03114

There it creates an implicit parameter this and performs regular overload resolution.

Currently dtors don't hit that point because the implementation fast paths them in that function.

Do you think it makes sense to follow the same logic for dtors (by preventing the early exit there), even though perhaps we would then only do it for OpenCL to avoid performance regressions for pure C++ implementation? However, I am not entirely convinced when and how that code is called for the special members since it doesn't have address space handling anywhere but clang's behavior seems correct for ctors and assignments at least. Although it is possible that it is only used for some corner cases we haven't encountered yet and hence we have a general bug for special members with address spaces in some situations.

Alternatively, we could just add similar logic here if it makes implementation more efficient. After all the special member implementation doesn't seem to reuse fully common member function overload resolution code so perhaps it is not unreasonable to add a special case for dtors. I am not saying though we need to do it in this path... but it would be good to understand how the evolution of your approach would look like towards completion.

clang/lib/Sema/SemaLookup.cpp
3082

Btw ctors and assignments don't seem to need this but they seem to work fine though...

For example here the right assignment overload with __local address space is selected
https://godbolt.org/z/aYKj4W6rc
or ctor overload with __global is selected here correctly:
https://godbolt.org/z/1frheezE5

So it seems like there is some other logic to handle address spaces for special members elsewhere? Although it is very strange and rather confusing.

3099

Currently, all other special members added implicitly use default address space (i.e. __generic for OpenCL) for this via Sema::setupImplicitSpecialMemberType helper so I think we should mirror it for DeclareImplicitDestructor too... then perhaps we can avoid changes in Sema::setupImplicitSpecialMemberType?

FYI in the past we have discussed that we might want to change the design and put special members in concrete address spaces but we haven't gone that far yet and it would make sense to have all special members working coherently... so it would be better for dtors to just mirror logic from the other special members for now.

keryell added a subscriber: keryell.EditedSep 13 2021, 1:21 AM

While it might be possible to extend arbitrarily C++, I have the feeling that having just 1 destructor and having a different code path-code according to the address space could be enough.
It could be possible to write:

~MyDestructor() {
  if constexpr (SomeAPIReturningAddressSpace() == ...::Global) {
    /* Global address space code */
  }
  else if constexpr (...) {
  }
}

It is possible to build higher-level dispatching constructs from this in OpenCL C++ library, for example à la std::visit from std::variant.

olestrohm updated this revision to Diff 373218.Sep 17 2021, 7:43 AM

I made the implicit destructor always be created in __generic address space.

I couldn't manage to properly figure a case that would trigger LookupSpecialMember,
so I couldn't figure out how or if address spaces are handled there.

re: @keryell
This might work, though I have no idea how that SomeAPIReturningAddrSpace would work, since at this point the variable would likely be cast to be in __generic addrspace.
Also I'm not sure how that would interact with deleted destructors.

olestrohm marked 2 inline comments as done.Sep 17 2021, 7:45 AM
olestrohm added inline comments.
clang/lib/Sema/SemaLookup.cpp
3082

Yes, it seems other special members are handled somewhere else according to my preliminary investigations, so that might explain why address spaces where never introduced into this function.

olestrohm updated this revision to Diff 373225.Sep 17 2021, 8:10 AM
olestrohm marked an inline comment as done.

re: @keryell
This might work, though I have no idea how that SomeAPIReturningAddrSpace would work, since at this point the variable would likely be cast to be in __generic addrspace.
Also I'm not sure how that would interact with deleted destructors.

You mean for example say that the constructor in global address-space would have some definition but the constructor in constant is deleted?
Ouch. You are killing my suggestion. :-)