This is an archive of the discontinued LLVM Phabricator instance.

Canonicalize addrspacecast between different pointer types.
AbandonedPublic

Authored by arsenm on Nov 15 2013, 2:24 AM.

Details

Reviewers
None
Summary

Cast the address space with the same pointer element type
and then bitcast the type in the new address space.

addrspacecast X addrspace(M)* to Y addrspace(N)*
->
bitcast X addrspace(N)* (addrspacecast X addrspace(M)* to X addrspace(N)*) to Y addrspace(N)*

Diff Detail

Event Timeline

I would like to better understand the motivation of this patch: what are the benefit obtained doing a split between the change of address space and the pointer type within the new address space?
Being a bitcast just a reintepretation of the input value (no change in the bits) this behavior has been included in the addrspacecast instruction (nobody reported any objection on this).

Thanks in advance.

Hi,

I would like to follow up on this patch because it can simplify some addrspacecast-related optimizations. I recently worked on a patch (http://reviews.llvm.org/D3586) to eliminate unnecessary addrspacecasts from non-generic address spaces to the generic one for the NVPTX backend. For instance, we may want to optimize "load i32* addrspacecast (i64 addrspace(1)* to i32*)" to "load i32 addrspace(1)* bitcast (i64 addrspace(1)* to i32 addrspace(1)*)" because loading from non-generic address spaces is typically faster. If addrspacecasts are "canonicalized" as changing only the address space, the logic in my patch can be simplified a lot: I can simply remove the addrspacecast without introducing the bitcast.

In general, I believe this canonicalization is beneficial, and would love to see it pushed in. I guess Matt has more examples that can benefit from this transformation. In terms of implementation, I am open to other alternatives such as implementing this canonicalization as a separate pass that can be enabled/disabled by backends.

Thanks,
Jingyue

Hi Matt,

I found an issue with this patch that causes instcombine to run into an
infinite loops.

The test case is attached. The loop begins with the addrspacecast
instruction:

addrspacecast [16 x i32] addrspace(1)* %arr to i32

--> D2186:

%0 = addrspacecast [16 x i32] addrspace(1)* %arr to [16 x i32]*
bitcast [16 x i32]* %0 to i32*

--> visitBitCast

%0 = addrspacecast [16 x i32] addrspace(1)* %arr to [16 x i32]*
getelementptr [16 x i32]* %0, i64 0, i64 0

--> visitGetElementPtr

%0 = getelementptr [16 x i32] addrspace(1)* %arr, i64 0, i64 0
addrspacecast i32 addrspace(1)* %0 to i32*

--> commonPointerCastTransforms

addrspacecast [16 x i32] addrspace(1)* %arr to i32

and we have a loop.

I'll try to figure out the root cause today.

Jingyue

Matt,

I think the root cause for this loop is at InstCombineCasts.cpp:1438. The code there considers a GEP with all zero indices as a bitcast, and merges it with CI. While this merging is fine for bitcast, it undoes the canonicalization if CI is addrspacecast.

To fix this issue, I modified the code around there: if CI is addrspacecast and the getelementptr changes the pointer type, do not merge them.

I uploaded my changes to http://reviews.llvm.org/differential/diff/10068/. These changes include:

  • Applied D2186
  • Fixed the infinite loop issue and added the failed test to addrspacecast.ll
  • As discussed offline, I put bitcast before addrspacecast instead of after because I slightly prefer this way.
  • Updated all affected tests. One affected test (@test2_addrspacecast) in memcpy-from-global.ll is actually better optimized because of this canonicalization. See how alloca %T being transformed to alloca [128 x i8]

I am unable to update this diff directly probably because it's not created by me. If my changes conceptually look good to you, I can create a separate revision for further review.

Thanks,
Jingyue

arsenm abandoned this revision.Jun 6 2014, 3:25 PM

Abandoning since superseded by r210375