This is an archive of the discontinued LLVM Phabricator instance.

LLVM IR should allow bitcast between address spaces with the same size.
Needs ReviewPublic

Authored by skc7 on Nov 24 2021, 5:58 AM.

Details

Summary

When the addrspacecast instruction was added, the ability to bitcast between pointers from different address spaces was removed.
There are cases, where after analysis, cast between pointers from different address spaces can be concluded to be a no-op cast.
If bitcast can be allowed in these scenarios, it would help further optimise the IR in Transform passes since its a no-op cast.
This enhancement to bitcast will require that pointers to the two address spaces have the same bit widths(can be queried from DataLayout).
Frontend should never misuse the bitcast wherever addrspace cast would have been more appropriate.

For example consider the below IR:

GVN pass has discovered a reinterpretation of bits via a store followed by a load.

%struct.S.coerce = type { i32 addrspace(1)* }
%s.sroa.0 = alloca i32*, align 8, addrspace(5)
%0 = extractvalue %struct.S.coerce %s.coerce, 0
%1 = bitcast i32* addrspace(5)* %s.sroa.0 to i32 addrspace(1)* addrspace(5)*
%2 = addrspacecast i32 addrspace(1)* addrspace(5)* %1 to i32 addrspace(1)
store i32 addrspace(1)* %0, i32 addrspace(1)
%2, align 8

%3 = load i32*, i32* addrspace(5)* %s.sroa.0, align 8, !tbaa !2

GVN pass currently introduces no-op ptrotoint/inttoptr for load.
%3 = ptrtoint i32 addrspace(1)* %0 to i64
%4 = inttoptr i64 %3 to i32*

If bitcast of pointers from different address is allowed, load can be replaced with no-op bitcast
%3 = bitcast i32 addrspace(1)* %0 to i32*

Diff Detail

Event Timeline

skc7 created this revision.Nov 24 2021, 5:58 AM
skc7 requested review of this revision.Nov 24 2021, 5:58 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 24 2021, 5:58 AM

Is there an RFC? This quite seems like asking for problems.

Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

skc7 edited the summary of this revision. (Show Details)Nov 24 2021, 6:59 AM

Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

That is a good point, but all the motivational cases seem to be about chaining the address space of the pointee pointer.
Does this come up for changing the address space of the pointer itself? I'm just wondering if this is still relevant with opaque pointers.

Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

That is a good point, but all the motivational cases seem to be about chaining the address space of the pointee pointer.
Does this come up for changing the address space of the pointer itself? I'm just wondering if this is still relevant with opaque pointers.

Nothing about address spaces changes with opaque pointers. The address space is a property of the pointer itself, not the memory it points to. This is for dealing with cases where we end up with a bit reinterpret of a pointer in another address space, and need to do a true no-op cast to convert them.

jrtc27 added a subscriber: jrtc27.Nov 25 2021, 7:24 AM

This seems like it should not apply to non-integral address spaces?

llvm/docs/LangRef.rst
10999

This should be illegal, not legal and undefined, you really want the IR verifier to be able to pick up on this at least, if not an assertion at creation time

llvm/include/llvm/IR/InstrTypes.h
729

Typo "e"

This seems like it should not apply to non-integral address spaces?

No, it shouldn’t care about the individual address spaces. It’s a reinterpret of the bits, the type meanings don’t matter

This seems like it should not apply to non-integral address spaces?

No, it shouldn’t care about the individual address spaces. It’s a reinterpret of the bits, the type meanings don’t matter

Well, they absolutely do for us downstream, but we also don't use the non-integral address space feature, since it didn't exist when our fork started and also has overly restrictive semantics. Normally the places we need to stop these kinds of casts/reinterpretations are a subset of those for non-integral address spaces, but maybe this is one of those exceptions then. I certainly have no clue about what the non-integral address spaces are actually used for and mean :)

skc7 updated this revision to Diff 389983.Nov 26 2021, 3:55 AM

Updated diff with changes suggested by jrtc27

@jrtc27 is correct. This absolutely must not apply to non-integral address spaces. It is not legal for LLVM to introduce additional ptrtoint instructions for non-integral address spaces that were not present in the original input IR. That doesn't change if the spelling of a ptrtoint/inttoptr pair is changed to bitcast. There is a bit of a larger issue here that LLVM IR isn't really rich enough to currently describe what operations are legal for the optimizer to introduce and what aren't. Every frontend/backend that uses them has their own rules that appear obvious for a particular use case, but aren't necessarily general. A similar issue applies to commuting GEPs with addrspacecasts. I'm wondering if we need something like the datalayout but for describing relationships between addrspaces and what things are legal and what are not.

skc7 updated this revision to Diff 391836.Dec 4 2021, 5:55 AM

updated test IR files.

tra added a subscriber: tra.Jan 21 2022, 4:33 PM
skc7 updated this revision to Diff 406351.Feb 7 2022, 1:37 AM

Rebase

pmatos added a subscriber: pmatos.Mar 10 2022, 3:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 3:07 AM