Page MenuHomePhabricator

[OpenCL][PR41727] Prevent ICE on global dtors
ClosedPublic

Authored by Anastasia on May 24 2019, 11:15 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

Anastasia created this revision.May 24 2019, 11:15 AM
rjmccall added inline comments.May 28 2019, 12:23 PM
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

Should this code be conditional to OpenCL? And why does _cxa_atexit take a __global pointer instead of, say, a __generic one?

Anastasia updated this revision to Diff 202708.Jun 3 2019, 6:48 AM
  • Guard OpenCL specific IR generation under OpenCL lang mode
Anastasia marked an inline comment as done.Jun 3 2019, 6:53 AM
Anastasia added inline comments.
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

The only objects that are destructible globally in OpenCL are __global and __constant. However __constant isn't convertible to __generic. Therefore, I am adding __global directly to avoid extra conversion. I am not yet sure how to handle __constantthough and how much destructing objects in read-only memory segments would make sense anyway. I think I will address this separately.

rjmccall added inline comments.Jun 10 2019, 3:43 PM
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

The pointer argument to __cxa_atexit is just an arbitrary bit of context and doesn't have to actually be the address of a global. It's *convenient* to use the address of a global sometimes; e.g. you can use the global as the pointer and its destructor as the function, and then __cxa_atexit will just call the destructor for you without any additional code. But as far as the runtime is concerned, the pointer could be malloc'ed or something; we've never had a need to do that in the ABI, but it's good future-proofing to allow it.

So there are three ways to get a global destructor to destroy a variable in __constant:

  • You can pass the pointer bitcast'ed as long as sizeof(__constant void*) <= sizeof(__cxa_atexit_context_pointer).
  • You can ignore the argument and just materialize the address separately within the destructor function.
  • You can allocate memory for a context and then store the pointer in that.

Obviously you should go with the one of the first two, but you should make sure your ABI doesn't preclude doing the latter in case it's useful for some future language feature. In other words, it doesn't really matter whether this argument is notionally in __global as long as that's wide enough to pass a more-or-less arbitrary pointer through.

Anastasia marked an inline comment as done.Jun 21 2019, 7:48 AM
Anastasia added inline comments.
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

Ok, I see. I guess option 1 would be fine since we can't setup pointer width per address space in clang currently. However, spec doesn't provide any clarifications in this regard.

So I guess using either __global or __generic for the pointer parameter would be fine... Or perhaps even leave it without any address space (i.e. _`_private`) and just addrspacecast from either __global or __constant. Do you have any preferences?

As for malloc I am not sure that will work for OpenCL since we don't allow mem allocation on the device. Unless you mean the memory is allocated on a host... then I am not sure how it should work.

rjmccall added inline comments.Jun 27 2019, 11:32 PM
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

Ok, I see. I guess option 1 would be fine since we can't setup pointer width per address space in clang currently.

Really? What's missing there? It looks to me like getPointerSize does take an address space.

So I guess using either global or generic for the pointer parameter would be fine... Or perhaps even leave it without any address space (i.e. _`_private`) and just addrspacecast from either global or constant. Do you have any preferences?

__private is likely to be a smaller address space, right? I would recommend using the fattest pointer that you want to actually support at runtime — you shouldn't go all the way to __generic if the target relies on eliminating that statically. If you want a target hook for the address space of the notional __cxa_atexit_context_pointer typedef, I think that would be reasonable.

As for malloc I am not sure that will work for OpenCL since we don't allow mem allocation on the device. Unless you mean the memory is allocated on a host... then I am not sure how it should work.

Well, maybe not actually heap-allocated. I just think you should design the ABI so that it's reasonably future-proof against taking any specific sort of reasonable pointer.

Anastasia updated this revision to Diff 207574.Jul 2 2019, 9:02 AM

Added a target hook to query address spaces with the largest pointer width.

rjmccall added inline comments.Jul 8 2019, 1:49 PM
lib/CodeGen/TargetInfo.h
274 ↗(On Diff #207574)

This target hook should just return the address space of the __cxa_atexit argument, not to claim anything specific about the relation between that address space and others. That is a global and permanent property of the target ABI. We should *advise* targets to use an address space that other major address spaces can be bitcast to, but ultimately that's just advisory.

This would be a good thing to include in a PR to the Itanium C++ ABI, but it would be reasonable to bundle all the address-space-related changes you need in one PR.

Changed API for target hook.

Anastasia marked an inline comment as done.Jul 10 2019, 10:33 AM
Anastasia added inline comments.
lib/CodeGen/TargetInfo.h
274 ↗(On Diff #207574)

This would be a good thing to include in a PR to the Itanium C++ ABI, but it would be reasonable to bundle all the address-space-related changes you need in one PR.

I can try to do it but I would need some explanation of the process. :)

rjmccall added inline comments.Jul 10 2019, 11:04 AM
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

This cast only works if the address space is a subspace of the __cxa_atexit address space, right? Should we be checking that and emitting a diagnostic if that's not true? I think an IRGen-level diagnostic is fine here.

Anastasia marked an inline comment as done.Jul 11 2019, 1:28 AM
Anastasia added inline comments.
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

Then it would fail to compile for __constant.

You can pass the pointer bitcast'ed as long as sizeof(constant void*) <= sizeof(cxa_atexit_context_pointer).

Do you think I should leave a bitcast then? Not sure if something might assert in LLVM though if there is a bitcast between pointers to different address space... so I am confused...

rjmccall added inline comments.Jul 11 2019, 10:49 AM
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

I think LLVM doesn't allow direct bitcasts between different address spaces (to help eliminate obvious bugs), but you can do it with ptrtoint. For generality, though, you should emit a diagnostic if the __cxa_atexit pointer size is actually smaller than the target pointer.

Alternatively, I'm not sure we actually rely on this pointer for anything right now, so you might be able to just use null if the address spaces are different. That decision will eventually need to be able to affect how we generate the global destructor function, though.

Anastasia updated this revision to Diff 209463.Jul 12 2019, 5:00 AM

Generate NULL for pointer param of __cxa_atexit on mismatching addr spaces

Anastasia marked an inline comment as done.Jul 12 2019, 5:03 AM
Anastasia added inline comments.
lib/CodeGen/CGDeclCXX.cpp
132 ↗(On Diff #201293)

Thanks! I will go for NULL for now to prevent ICE and unblock some further work.

But I added a FIXME to make sure to come back to this. I would then either change the dtor generation of emit some meaningful non-NULL value as an argument.

If you're interested in working on this, great. I actually think there's zero reason to emit a non-null argument here on any target unless we're going to use the destructor as the function pointer — but we can do that on every Itanium target, so we should. So where we want to be is probably:

  • use the complete-object destructor and the object pointer as its argument when destroying a (non-array) object of C++ class type
  • otherwise use a custom function that materializes the object pointer on its own and uses a null pointer as the argument

And then the address-spaces tweak to that is that we use the second option when we either (1) the destructor isn't in the __cxa_atexit address space or (2) the object pointer can't be converted to that address space.

rjmccall accepted this revision.Jul 12 2019, 10:37 AM

Current patch LGTM

This revision is now accepted and ready to land.Jul 12 2019, 10:37 AM
Closed by commit rL366059: [OpenCL][PR41727] Prevent ICE on global dtors (authored by stulova, committed by ). · Explain WhyJul 15 2019, 4:58 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2019, 4:58 AM