This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Permit address space casts with 'reinterpret_cast' in C++
Needs RevisionPublic

Authored by jhuber6 on May 22 2023, 6:00 AM.

Details

Summary

Previously, D58346 changed the rules to not permit address space casts
in C++. The cited reason was that reinterpret_cast is not allowed to
remove qualifiers, however the standard only explicitly says that
const and volatile qualifiers cannot be removed, see
http://eel.is/c++draft/expr.reinterpret.cast#7. The current behaviour is
suboptimal as it means there is no way in C++ to change address spaces
and we need to rely on unsafe C style casts that aren't permitted by
many styles. This patch changes the handling to only apply to OpenCL.

Diff Detail

Event Timeline

jhuber6 created this revision.May 22 2023, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 6:00 AM
jhuber6 requested review of this revision.May 22 2023, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2023, 6:00 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

What would be the semantics of such an operation if the address spaces are disjoint? Or, if the underlying pointer widths aren't the same?

What would be the semantics of such an operation if the address spaces are disjoint? Or, if the underlying pointer widths aren't the same?

It would most likely invalid, but I'm not asserting that clang should be responsible for diagnosing misuse in these cases. Especially because in generic freestanding C++ we don't have any language options to suggest the actual semantics.

What would be the semantics of such an operation if the address spaces are disjoint? Or, if the underlying pointer widths aren't the same?

It would most likely invalid, but I'm not asserting that clang should be responsible for diagnosing misuse in these cases. Especially because in generic freestanding C++ we don't have any language options to suggest the actual semantics.

If the pointer sizes don’t match I think it has to be rejected since it’s no longer a pure bitcast

It would most likely invalid, but I'm not asserting that clang should be responsible for diagnosing misuse in these cases. Especially because in generic freestanding C++ we don't have any language options to suggest the actual semantics.

If the pointer sizes don’t match I think it has to be rejected since it’s no longer a pure bitcast

I don't think that's something we can diagnose here with just the address space number. it would require information from the underlying target for the expected pointer qualities to the address space.

What would be the semantics of such an operation if the address spaces are disjoint? Or, if the underlying pointer widths aren't the same?

It would most likely invalid, but I'm not asserting that clang should be responsible for diagnosing misuse in these cases. Especially because in generic freestanding C++ we don't have any language options to suggest the actual semantics.

That's fair. I would like clang to improve and formalize the semantics for generic address space behavior a bit, which was part of the point with D62574. But there don't seem to be enough people who need something like it to make it happen.

Honestly, looking at the patch again does suggest to me that your use case would be covered. It just wouldn't be done with a reinterpret_cast, but an addrspace_cast. Since every target by default would permit explicit casts with isExplicitAddrSpaceConversionLegal, your desired behavior should work.

That's fair. I would like clang to improve and formalize the semantics for generic address space behavior a bit, which was part of the point with D62574. But there don't seem to be enough people who need something like it to make it happen.

Honestly, looking at the patch again does suggest to me that your use case would be covered. It just wouldn't be done with a reinterpret_cast, but an addrspace_cast. Since every target by default would permit explicit casts with isExplicitAddrSpaceConversionLegal, your desired behavior should work.

The problem is we don't have addrspace_cast in freestanding C++, so as it stands we currently have no way to perform this operation in C++ which is preventing me from implementing things in the LLVM LibC port for GPUs I'm working on.

The problem is we don't have addrspace_cast in freestanding C++, so as it stands we currently have no way to perform this operation in C++ which is preventing me from implementing things in the LLVM LibC port for GPUs I'm working on.

By "freestanding C++" I assume you mean "C++ without the OpenCL C++ extension" and not "C++ without extensions at all", because in the latter case, you don't have address spaces either.

This is what I meant in D58346 by "There's many places in the codebase where OpenCL flags restrict generic address space behavior." Isn't the solution here to allow addrspace_cast even when OpenCL C++ isn't turned on, as a more generic extension?

jhuber6 added a comment.EditedMay 22 2023, 7:57 AM

By "freestanding C++" I assume you mean "C++ without the OpenCL C++ extension" and not "C++ without extensions at all", because in the latter case, you don't have address spaces either.

This is what I meant in D58346 by "There's many places in the codebase where OpenCL flags restrict generic address space behavior." Isn't the solution here to allow addrspace_cast even when OpenCL C++ isn't turned on, as a more generic extension?

You can have address spaced in freestanding C++, they just need to be assigned according to the backens, e.g. https://godbolt.org/z/ahazae6Ta. And I don't think that's a desirable solution because it would require a language extension to cover a use-case that isn't disallowed by the C++ standard anyway. I don't see a reason why we shouldn't be able to do this directly.

I don't think that's something we can diagnose here with just the address space number. it would require information from the underlying target for the expected pointer qualities to the address space.

Yes, this is mandatory. Numbered address spaces are broken and just work by accident and I don't love seeing new code rely on whatever random behavior they have now. Right now they happen to codegen to something that appears to work, while bypassing any kind of sensible semantic checking for what you're doing with them. If you want to really use this, we need to find some way to map numbers back into language address spaces and make them fully aware of the target properties

I don't think that's something we can diagnose here with just the address space number. it would require information from the underlying target for the expected pointer qualities to the address space.

Yes, this is mandatory. Numbered address spaces are broken and just work by accident and I don't love seeing new code rely on whatever random behavior they have now. Right now they happen to codegen to something that appears to work, while bypassing any kind of sensible semantic checking for what you're doing with them. If you want to really use this, we need to find some way to map numbers back into language address spaces and make them fully aware of the target properties

How are they broken? The expectation is just that they line up with what the backend defines them as, which should be a stable target. We could potentially use target info to map the numbered address spaces into something sensible. E.g. 1 = opencl_lobal, 3 = opencl_local, 5 = opencl_private. I think that's commend between NVPTX and AMDGPU. But I still don't think that we should be preventing from even doing wrong things with them if the user explicitly requests it.

How are they broken? The expectation is just that they line up with what the backend defines them as, which should be a stable target. We could potentially use target info to map the numbered address spaces into something sensible. E.g. 1 = opencl_lobal, 3 = opencl_local, 5 = opencl_private. I think that's commend between NVPTX and AMDGPU. But I still don't think that we should be preventing from even doing wrong things with them if the user explicitly requests it.

They bypass all semantic checks. For example if you declare something as address space 4, it will let you write to it unlike constant. It will let you place stack objects in globals and don't interact correctly with address space typed pointers in builtins (and any other context where you would want to interact with the matching language addrspace). Don't think sizeof works as you would expect either. Whatever is happening is not by design

They bypass all semantic checks. For example if you declare something as address space 4, it will let you write to it unlike constant. It will let you place stack objects in globals and don't interact correctly with address space typed pointers in builtins (and any other context where you would want to interact with the matching language addrspace). Don't think sizeof works as you would expect either. Whatever is happening is not by design

I'm not sure how much work it would be to line up the numbers with the equivalent OpenCL type using the target info, or if that would break anything. However, I think this is out of scope, because even if we did have such semantic checks we'd still want to be able to reinterpret_cast on them in C++ which currently isn't allowed.

I like the current behaviour of bypass semantic checking and emit IR with the same number on it as my primary use case for feeding freestanding C++ to clang is as a convenient way to emit specific IR and I'm not that bothered about clang detecting mistakes along the way.

Funny that you mention addrspace(5) global as that's a pretty good approximation to what we'd like for errno. The semantics that would have been convenient is put it at the bottom of the stack in every kernel that can reach the variable and wire accesses to it through some base-of-stack built-in, which is presently not a thing but could be implemented.

You can have address spaced in freestanding C++, they just need to be assigned according to the backens, e.g. https://godbolt.org/z/ahazae6Ta. And I don't think that's a desirable solution because it would require a language extension to cover a use-case that isn't disallowed by the C++ standard anyway. I don't see a reason why we shouldn't be able to do this directly.

The problem with reinterpret_cast is that it isn't sufficient for arbitrary address space configurations. The first issue is that reinterpret_cast inherently performs a bitcast, which is not allowed if the pointers to address spaces being casted between are of different sizes. Then, the reinterpret_cast would suddenly have different semantics than a bitcast.

Another (contrived, but possible) case is if you have address spaces on a target where the spaces overlap, but with different semantics between them, and the pointers have the same size. What is now a reinterpret_cast? An address space conversion, or a bitcast? It's not as straightforward as it might seem.

Clang hasn't needed to formalize all of the address space behavior because it's managed to piggyback off of the language semantics provided by OpenCL, and no targets really have had a need for it. Once you start looking at the target AS stuff on its own, you realize it's not really that well defined, and making it even less defined by allowing arbitrary conversions isn't the solution.

Clang hasn't needed to formalize all of the address space behavior because it's managed to piggyback off of the language semantics provided by OpenCL, and no targets really have had a need for it. Once you start looking at the target AS stuff on its own, you realize it's not really that well defined, and making it even less defined by allowing arbitrary conversions isn't the solution.

I'd rather have an operation whose semantics are a little dangerous than something that doesn't work at all. As it stands we need to use C-style casts for this and I don't think there's a good reason to forbid this at least from the C++ standard point of view. For OpenCL where we have the concept of address spaces it makes sense, but not for C++.

I'd rather have an operation whose semantics are a little dangerous than something that doesn't work at all. As it stands we need to use C-style casts for this and I don't think there's a good reason to forbid this at least from the C++ standard point of view. For OpenCL where we have the concept of address spaces it makes sense, but not for C++.

I don't think the standard argument really holds. It doesn't mention restrictions on address spaces because it doesn't have to, since they don't exist in C++. If they did, I'm pretty sure that reinterpret_cast would disallow arbitrary address space-modifying casts, since they wouldn't necessarily be bitcasts.

Like @arsenm said, any behaviors that you're using or observing regarding conversion of target address spaces in both C and C++ are purely coincidental. I don't think it's a great idea to add more such coincidental behaviors. The result will be that future code will become dependent on these arbitrary, less restrictive behaviors, and it will be much harder to properly define sensible semantics later on.

I don't think the standard argument really holds. It doesn't mention restrictions on address spaces because it doesn't have to, since they don't exist in C++. If they did, I'm pretty sure that reinterpret_cast would disallow arbitrary address space-modifying casts, since they wouldn't necessarily be bitcasts.

The reason cited for this change was that the standard does not allow reinterpret_cast to drop qualifiers, I don't think that's true so we're free to define our own behavior. Whether or not this is safe or desirable is up to us, and I'm arguing that for C++ it should be permitted.

Like @arsenm said, any behaviors that you're using or observing regarding conversion of target address spaces in both C and C++ are purely coincidental. I don't think it's a great idea to add more such coincidental behaviors. The result will be that future code will become dependent on these arbitrary, less restrictive behaviors, and it will be much harder to properly define sensible semantics later on.

The behavior isn't coincidental, it does exactly what you expect if you were to use the same address space in LLVM-IR. The problem is that there are no semantic checks on them so if you use them incorrectly it will break. I feel like this is a separate issue and I don't know why it would prevent us from doing *any* kind of address space cast in C++. There is no alternative in C++ and we already permit this with C-style casts, we cannot use OpenCL extensions like addrspace_cast so that leaves us unable to use C++ to write programs we want to write. We already use numerical address spaces in the OpenMP GPU runtime, and internally there we need to use C to do all the address space casts because of this.

There would be definite value in using the target's information to map address spaces to the known OpenCL ones so we could share some of their checks, but I feel like that's a separate issue.

I don't feel reinterpret_cast is the right fit for this as it's expected to do just reinterpretation but addrspacecast LLVM instruction which Clang generates might result in some more operations in particular some special instructions for address calculation. Would it work if you enable addrspace_cast with double underscore prefix directly in C++ or something like this? Otherwise C-cast would be just as good as changing reinterpret cast in this way.

It's weird to have C-style casts that can't be done with any combination of named casts, so I think this makes some sense. I do think it should be limited to numbered address spaces of the same size, though, rather than basing it on whether we're in OpenCL mode. Target address spaces should not generally be forced into this rule.

The address space feature is governed by TR 18037 (https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip), see section 5. The TR says (in part):

A non-null pointer into an address space A can be cast to a pointer into another address space B,
but such a cast is undefined if the source pointer does not point to a location in B. Note that if A is a
subset of B, the cast is always valid; however, if B is a subset of A, the cast is valid only if the
source pointer refers to a location in B. A null pointer into one address space can be cast to a null
pointer into any overlapping address space.

It's pretty well-established that C++ provides a named cast to perform almost any operation that can be done in a C-style cast, so I think this is a good discussion on what named cast to surface this functionality under.

reinterpret_cast in C++ is intended to give some measure of better guarantees than a C-style cast, but it is not allowed to "fail" (in the sense of a dynamic_cast). Because not all address space casts can be correctly handled (due to pointer size differences, for example), it seems to me that allowing the conversion through reinterpret_cast would not be appropriate; it stops being a bit-cast operation and starts being a conversion operation instead.

I looked to see if WG21 has made any mention about address spaces within the context of the standard, and I did not see anything relevant (but it's possible I missed something on one of the various study group mailing lists). So I did the next best thing and started quietly asking a few relevant WG21 folks for opinions and learned some things.

The current SG1 position within WG21 is that address spaces are not something that ISO C++ should have. However, this is not a universally held opinion, some SG1 members would like to see address spaces added. That said, the address space is part of the pointer's provenance and only a few address space casts can produce provenance that works at all. You could build up from casts to and from uintptr_t/intptr_t that round-trip and then document which address spaces can be converted through this mechanism (with the default answer being "none"). When the address spaces are nested, then the casts are desirable and should be reasonably ergonomic. There was sentiment to go with __addrspace_cast that doesn't let you change any other property of the pointer but its address space and make it UB at the point of cast if that conversion is going to do bad things (so we can diagnose immediately at the cast site).

Based on all this, I think we should go with __addrspace_cast as a named cast and not allow the conversion through reinterpret_cast unless going to/from a [u]intptr_t.

What is now a reinterpret_cast? An address space conversion, or a bitcast? It's not as straightforward as it might seem.

This is the most straightforward part. It's a bitcast.

We should at least be able to reinterpret_cast between cases we know are compatible, as the OpenCL check does. One of the problems with the numerical address space is it doesn't have any information to know if that's strictly legal or not. I'm not sure if casting away an address space is always legal, it's generally what we do in LLVM-IR.

Based on all this, I think we should go with __addrspace_cast as a named cast and not allow the conversion through reinterpret_cast unless going to/from a [u]intptr_t.

I think this sounds good. Most of the building blocks for it should already be in place in the form of OpenCL's addrspace_cast. It would remove reinterpret_cast's ability to perform safe conversions, though. Could that affect something else inadvertently? ICS or SCS?

There are other C++ casts that deal with address spaces today. static_cast can also do it when converting from a void pointer, for example. Should it also lose the ability?

What is now a reinterpret_cast? An address space conversion, or a bitcast? It's not as straightforward as it might seem.

This is the most straightforward part. It's a bitcast.

And because of that, it wouldn't be possible to perform address space conversions between such pointers using reinterpret_cast.

I'm not sure if casting away an address space is always legal, it's generally what we do in LLVM-IR.

The TR says this:

There is no requirement that named address spaces (intrinsic or otherwise) be subsets of the
generic address space.

but also what Aaron pasted in his earlier comment:

A non-null pointer into an address space A can be cast to a pointer into another address space B,
but such a cast is undefined if the source pointer does not point to a location in B.

So, according to the report, converting to the generic address space would always be valid, but might be undefined. I've never understood why the TR did it this way; there's no reason to allow such conversions if we know they are undefined at compile time. Conversions that are undecidable (from superset to subset) are one thing, but the report makes no validity restrictions on converting between disjoint spaces.

Undefined could technically mean "emit an error", but this isn't what Clang does today, and lots of tests and probably code depends on it working like that now.

Based on all this, I think we should go with __addrspace_cast as a named cast and not allow the conversion through reinterpret_cast unless going to/from a [u]intptr_t.

I think this sounds good. Most of the building blocks for it should already be in place in the form of OpenCL's addrspace_cast.

I think having some kind of addrspacecast operator is unavoidable. reinterpret_cast needs to lower to something like ptrtoint/inttoptr and isn't interchangeable

Based on all this, I think we should go with __addrspace_cast as a named cast and not allow the conversion through reinterpret_cast unless going to/from a [u]intptr_t.

I think this sounds good. Most of the building blocks for it should already be in place in the form of OpenCL's addrspace_cast. It would remove reinterpret_cast's ability to perform safe conversions, though. Could that affect something else inadvertently? ICS or SCS?

There are other C++ casts that deal with address spaces today. static_cast can also do it when converting from a void pointer, for example. Should it also lose the ability?

I think this is another design approach we could consider -- splitting the functionality out between the various C++ named casts.

According to the TR, we should know at compile time which address spaces exist and whether space B is a subset of space A or whether they're disjoint. There can be implicit conversions from A to B if A is a subset of B, but it might not be bit-pattern preserving. That's similar to a derived-to-base conversion. static_cast can reverse implicit conversions, so it could be used to reverse the A->B conversion. It'd be undefined behavior if the pointer isn't representable in A's address space, but that's also the same as a base-to-derived conversion when the object doesn't have a derived type. For disjoint conversions of A to B, static_cast should fail (and perhaps C-style casts should as well -- there's no meaningful conversion between disjoint address spaces). And then dynamic_cast could be used for doing reverse implicit conversions that can detect invalid conversion attempts and throw an exception/return a nullptr. reinterpret_cast could then be used for bit-pattern preserving same-sized casts so it remains a bitcast operation.

(This idea came via a different WG21 member I was also discussing this topic with.)

rjmccall added a comment.EditedMay 25 2023, 11:18 AM

Right, that's been my thinking, too. The CVR qualifiers are all "view" qualifiers: both the underlying object and the reference to it are assumed to be the same independent of qualifiers, and the qualifiers just affect the rules around accesses. C++'s rules around qualifiers in conversions/casts make sense for view qualifiers, but a lot of our extended qualifiers aren't view qualifiers, so it makes sense that the rules would be different. Conversions between overlapping address spaces make sense to do with static_cast (or even an implicit conversion if the source AS is a subset). Conversions between non-overlapping address spaces should probably just not be allowed, honestly, the same way we don't allow conversions between different ObjC ARC qualifiers. There are plenty of less dangerous ways to reinterpret bits if someone's sure they know what they're doing.

C already formalizes this distinction to a degree: _Atomic behaves like a qualifier in many ways, but the language doesn't formally consider it one, and part of the reason for that is that it *doesn't* behave like a qualifier in some *other* ways. For example, you cannot convert a T * to an _Atomic(T) *, and the object representations don't necessarily match.

arsenm requested changes to this revision.Jul 7 2023, 5:26 AM

Conclusion seems to be this should have a separate cast operation

This revision now requires changes to proceed.Jul 7 2023, 5:26 AM