If agreed, some work will be needed to comply with these rules.
Thanks for working on this, it's a seriously underspecified part of LLVM.
What does 'is legal' actually mean? For example, if I have a 32-bit VA space and a 64-bit VA space in the same process with a subset of the program understanding only one and a shim layer understanding both, then the cast from 32-bit to 64-bit will always succeed but the converse cast is always a statically valid operation but may dynamically fail (and the failure case may be to return null).
Does this not preclude using address space casts to make pointers visible to GC (or, conversely, to notify the GC that a pointer has escaped)? I think we may need to define what 'capture' means in this context. I believe the goal of this is to state that the compiler is free to elide address space casts with no uses?
I believe that this holds only if both address spaces are permutations. Again, the simple 32/64-bit case, a round trip starting in the 32-bit world is always possible, but the converse does not apply.
I was thinking legal means a dereferencable result. If you were to access the result when the cast failed, it would be undefined
I'm trying to reiterate that this is not an operation that touches memory in any way. We do allow addrspacecasts in constant expressions, and they can't trap. Elision should be OK.
The intent is addrspacecast has stronger aliasing properties than a ptrtoint/inttoptr pair, which was part of the original intent of adding a new instruction.
This probably needs an additional constraint that the result was also dereferencable / legal as stated before. The goal is it should be legal to insert new addrspacecasts back to the source addrspace in certain contexts. The interesting ones are cases where you are dereferencing the pointer anyway, so it would have been undefined if the cast failed
Apparently I forgot to hit submit on the comments I made yesterday, sorry!
I'm happy with that clarification. We probably want to explicitly talk about dynamic failure for addrspacecast somewhere in this documentation.
Makes sense, we probably want to clarify what capturing means from the perspective of allowed transforms here. It may be that an addrsoacecast has static side effects (if that's a valid term) causing some extra metadata to be emitted about roots on the stack or in registers, for example, but the compiler is free to move address space casts across other operations and to elide them.
In the CHERI case, we do have one operation that it is not safe to move address space casts across (writes to the default data capability change the meaning of the current address space), but it's also not safe to reorder loads / stores across that and so we probably need to add an explicit address space change barrier intrinsic for things like that.
I think we need to be very careful about eliding user-expected trapping behaviour here. CHERI has similar behaviour, but consider a simple 32/64-bit example where:
If I have a sequence of:
%0 is some AS1 pointer
This would be expected to trap, but with the text as written it would be valid to elide %1 and have %2 be a load from %0. This is probably not what the programmer expected. Would you need to mark one of these as non-integral to achieve those guarantees? Or have some ordering of permissiveness? In this scenario, it's fine to do the elision the other way around:
%0 is some AS0 pointer
In this version, eliding %1 and making %2 load from %0 is totally fine. I think we really need to model three kinds of cast:
I'd suggest making the last case the default and adding metadata for the first two. Front ends generally know which of these are the case and can add the metadata and some later analyses may be able to once some more is known about the values. Optimisers are then free to elide any always-valid addrspacecast instructions. If the metadata is accidentally dropped then that's also fine.
In the CHERI case, an AS0 to AS200 cast is always valid (and clang could be taught to add the metadata). An AS200 to AS0 cast is always valid if you can statically prove that it was the result of a cast in the opposite direction, or if it came from a specific allocator. Most AS0 to AS200 casts are followed by a call to an intrinsic that sets bounds / permissions, and this is the only use of the addrspacecast instruction, which prohibits these optimisations (correctly).
Loading from the failed cast would be undefined behavior, so transforming the load and avoiding the trap should be legal. I wouldn't expect this to be any different from the compiler eliminating the a trap from a store to NULL. I do think this is an argument for not allowing changing the address space of volatile operations (see D63401).
I'd change the last sentence, maybe as shown below, but the meaning is fine.
"In general, "null-pointers" have to be assumed dereferenceable in non-zero address spaces."
dereferencable -> dereferenceable
I'm not happy with this wording,... but I can't quite put my finger on it.
Is it necessarily the same memory location or the same "logical" memory location? This is an honest question.
I am also unhappy with this as a claim. null pointers should always be non-dereferenceable. The semantics of inttoptr of 0 may not yield null for a given address space and the semantics of ptrtoint of null may not yield the integer 0, but the definition of a null pointer is a guaranteed non-dereferenceable pointer.
I'm unhappy with the claim, but I'm also unhappy with the state of null in llvm. "i8 addrspace(N)* null" is always interpreted as bit pattern 0. I would like to fix this by defining the invalid pointer value in the datalayout, and remove all special treatment of addrspace(0). As-is I don't think there is a way to describe a constant, known-invalid/null pointer in the IR.
Sorry for the delay.
I have a few more notes/questions.
Why, "In general,"? Do we have exceptions?
Again, why "possible" and "may". Don't we "just" want to say "Exceptions apply for `volatile` operations"?
I'm confused by:
why are address spaces pointer types (or am I reading this wrong)?
Do we want "is assumed" and "should"? As above, that sounds like the are alternative options.
Just chiming in here. We are planning to use addrspacecast for pointer width conversion (32 to 64 bit and vice-versa). Does not change the memory in anyway, i.e. if we lowered the addrspacecast into an IR call, that call could be marked argmemonly readnone. However, it is *not* a simple sign-extend/truncate either.
I've added some comments inline. Also, I think we have the scope to improve optimizations for addrspacecast at IR level, which is currently limited because different targets have varying lowering for addrspacecast.
Specifically, things like "looking through" an addrspacecast when the source operand of AS is nonnull and noting that the result is nonnull as well (i.e. improvements to isKnownNonZero and computeKnownBits). I was initially leaning towards adding a metadata for special addrspacecast semantics. However, @arsenm mentioned his ideas on adding information in the datalayout, i.e. 0 is a null pointer value for specific addrspaces. The latter may flow nicely with existing optimizations.
I believe this by itself is a very good statement/property to add to the langref and Verifier.
This rule allows us to optimize away redundant addrspacecasts.
%v.32 = addrspacecast i8 addrspace(1)* %v.64 to i8 addrspace(2)* %v.64.copy = addrspacecast i8 addrspace(2)* to i8 addrspace(1)*