This is an archive of the discontinued LLVM Phabricator instance.

Canonicalize addrspacecast between different pointer types.
Abandoned, Public

Authored by arsenm on Nov 15 2013, 2:24 AM.

Details

Reviewers: None

Summary

Cast the address space with the same pointer element type
and then bitcast the type in the new address space.

addrspacecast X addrspace(M)* to Y addrspace(N)*
->
bitcast X addrspace(N)* (addrspacecast X addrspace(M)* to X addrspace(N)*) to Y addrspace(N)*
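
The rewrite above can be sketched outside of LLVM with a toy model. This is only an illustration under stated assumptions: `PtrTy`, `Cast`, `Op`, and `canonicalizeAddrSpaceCast` are hypothetical names invented here, not LLVM APIs, and a pointer type is reduced to an element type name plus an address space number.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a pointer type: element type name plus address space.
// Hypothetical; this is not LLVM's PointerType.
struct PtrTy {
  std::string elem;
  unsigned addrSpace;
  bool operator==(const PtrTy &o) const {
    return elem == o.elem && addrSpace == o.addrSpace;
  }
};

// One cast step in the toy model.
enum class Op { AddrSpaceCast, BitCast };
struct Cast {
  Op op;
  PtrTy from;
  PtrTy to;
};

// Split "addrspacecast X addrspace(M)* to Y addrspace(N)*" into an
// addrspacecast that keeps X, followed by a bitcast within addrspace(N).
// If the element types already match, the single addrspacecast is returned.
std::vector<Cast> canonicalizeAddrSpaceCast(const PtrTy &src,
                                            const PtrTy &dst) {
  assert(src.addrSpace != dst.addrSpace &&
         "addrspacecast must change the address space");
  if (src.elem == dst.elem)
    return {Cast{Op::AddrSpaceCast, src, dst}};
  PtrTy mid{src.elem, dst.addrSpace}; // X addrspace(N)*
  return {Cast{Op::AddrSpaceCast, src, mid}, Cast{Op::BitCast, mid, dst}};
}
```

For example, casting `i32 addrspace(1)*` to `float addrspace(2)*` yields two steps in this model: an addrspacecast to `i32 addrspace(2)*`, then a bitcast to `float addrspace(2)*`.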

Event Timeline

I would like to better understand the motivation for this patch: what benefit is obtained by splitting the change of address space from the change of pointer type within the new address space?
Since a bitcast is just a reinterpretation of the input value (no change in the bits), this behavior was included in the addrspacecast instruction (nobody reported any objection to this).

Thanks in advance.

Hi,

I would like to follow up on this patch because it can simplify some addrspacecast-related optimizations. I recently worked on a patch (http://reviews.llvm.org/D3586) to eliminate unnecessary addrspacecasts from non-generic address spaces to the generic one for the NVPTX backend. For instance, we may want to optimize "load i32* addrspacecast (i64 addrspace(1)* to i32*)" to "load i32 addrspace(1)* bitcast (i64 addrspace(1)* to i32 addrspace(1)*)" because loading from non-generic address spaces is typically faster. If addrspacecasts are "canonicalized" as changing only the address space, the logic in my patch can be simplified a lot: I can simply remove the addrspacecast without introducing the bitcast.

In general, I believe this canonicalization is beneficial, and would love to see it pushed in. I guess Matt has more examples that can benefit from this transformation. In terms of implementation, I am open to other alternatives such as implementing this canonicalization as a separate pass that can be enabled/disabled by backends.

Thanks,
Jingyue

Hi Matt,

I found an issue with this patch that causes instcombine to run into an
infinite loop.

The test case is attached. The loop begins with the addrspacecast
instruction:

addrspacecast [16 x i32] addrspace(1)* %arr to i32*

--> D2186:

%0 = addrspacecast [16 x i32] addrspace(1)* %arr to [16 x i32]*
bitcast [16 x i32]* %0 to i32*

--> visitBitCast

%0 = addrspacecast [16 x i32] addrspace(1)* %arr to [16 x i32]*
getelementptr [16 x i32]* %0, i64 0, i64 0

--> visitGetElementPtr

%0 = getelementptr [16 x i32] addrspace(1)* %arr, i64 0, i64 0
addrspacecast i32 addrspace(1)* %0 to i32*

--> commonPointerCastTransforms

addrspacecast [16 x i32] addrspace(1)* %arr to i32*

and we have a loop.

I'll try to figure out the root cause today.

Jingyue

Matt,

I think the root cause for this loop is at InstCombineCasts.cpp:1438. The code there considers a GEP with all zero indices as a bitcast, and merges it with CI. While this merging is fine for bitcast, it undoes the canonicalization if CI is addrspacecast.

To fix this issue, I modified the code around there: if CI is addrspacecast and the getelementptr changes the pointer type, do not merge them.
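
The ping-pong between the two transforms, and the effect of the guard, can be illustrated with a toy fixpoint driver. This is a deliberately simplified sketch, not InstCombine: it collapses the four-step trace above into two shapes, and `Shape`, `canonicalize`, `mergeIntoCast`, and `runToFixpoint` are hypothetical names made up for this example.

```cpp
#include <cassert>

// Toy model of the two shapes the pointer cast keeps flipping between.
// Fused: one addrspacecast that also changes the pointee type.
// Split: an addrspacecast that keeps the pointee type, plus a separate
//        type-changing bitcast / all-zero-index GEP.
enum class Shape { Fused, Split };

// Stand-in for the canonicalization in this patch: always produce Split.
inline Shape canonicalize(Shape) { return Shape::Split; }

// Stand-in for the merge of an all-zero-index GEP into the cast.
// With the guard enabled, the merge is skipped for addrspacecast.
inline Shape mergeIntoCast(Shape s, bool guardAddrSpaceCast) {
  if (s == Shape::Split && !guardAddrSpaceCast)
    return Shape::Fused;
  return s;
}

// Apply both rules round after round until neither changes anything.
// Returns the number of rounds used, or -1 if maxRounds is exhausted.
inline int runToFixpoint(Shape start, bool guardAddrSpaceCast, int maxRounds) {
  Shape cur = start;
  for (int round = 1; round <= maxRounds; ++round) {
    bool changed = false;
    Shape a = canonicalize(cur);
    changed |= (a != cur);
    cur = a;
    Shape b = mergeIntoCast(cur, guardAddrSpaceCast);
    changed |= (b != cur);
    cur = b;
    if (!changed)
      return round;
  }
  return -1;
}
```

In this model, running without the guard never reaches a fixpoint because each rule undoes the other, while enabling the guard lets the driver settle on the split form.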

I uploaded my changes to http://reviews.llvm.org/differential/diff/10068/. These changes include:

- Applied D2186.
- Fixed the infinite loop issue and added the failing test to addrspacecast.ll.
- As discussed offline, I put the bitcast before the addrspacecast instead of after it, because I slightly prefer it this way.
- Updated all affected tests. One affected test (@test2_addrspacecast) in memcpy-from-global.ll is actually better optimized because of this canonicalization: see how alloca %T is transformed into alloca [128 x i8].

I am unable to update this diff directly, probably because I did not create it. If my changes conceptually look good to you, I can create a separate revision for further review.

Thanks,
Jingyue

Abandoning since superseded by r210375

Revision Contents

Path                                               Size
lib/IR/Instructions.cpp                            39 lines
lib/Transforms/InstCombine/InstCombineCasts.cpp    19 lines
test/Transforms/InstCombine/addrspacecast.ll       69 lines

Diff 5564

lib/IR/Instructions.cpp

@@ ... @@ case 11: {
       unsigned PtrSize = MidIntPtrTy->getScalarSizeInBits();
       unsigned SrcSize = SrcTy->getScalarSizeInBits();
       unsigned DstSize = DstTy->getScalarSizeInBits();
       if (SrcSize <= PtrSize && SrcSize == DstSize)
         return Instruction::BitCast;
       return 0;
     }
     case 12: {
-      // addrspacecast, addrspacecast -> bitcast, if SrcAS == DstAS
       // addrspacecast, addrspacecast -> addrspacecast, if SrcAS != DstAS
       if (SrcTy->getPointerAddressSpace() != DstTy->getPointerAddressSpace())
         return Instruction::AddrSpaceCast;
-      return Instruction::BitCast;
+      return 0;
     }
     case 13:
-      // FIXME: this state can be merged with (1), but the following assert
-      // is useful to check the correcteness of the sequence due to semantic
-      // change of bitcast.
-      assert(
-        SrcTy->isPtrOrPtrVectorTy() &&
-        MidTy->isPtrOrPtrVectorTy() &&
-        DstTy->isPtrOrPtrVectorTy() &&
-        SrcTy->getPointerAddressSpace() != MidTy->getPointerAddressSpace() &&
-        MidTy->getPointerAddressSpace() == DstTy->getPointerAddressSpace() &&
-        "Illegal addrspacecast, bitcast sequence!");
-      // Allowed, use first cast's opcode
-      return firstOp;
+      // addrspacecast, bitcast -> addrspacecast if the pointer element types
+      // are the same.
+      if (SrcTy->getPointerElementType() == DstTy->getPointerElementType())
+        return Instruction::AddrSpaceCast;
+      return 0;
     case 14:
-      // FIXME: this state can be merged with (2), but the following assert
-      // is useful to check the correcteness of the sequence due to semantic
-      // change of bitcast.
-      assert(
-        SrcTy->isPtrOrPtrVectorTy() &&
-        MidTy->isPtrOrPtrVectorTy() &&
-        DstTy->isPtrOrPtrVectorTy() &&
-        SrcTy->getPointerAddressSpace() == MidTy->getPointerAddressSpace() &&
-        MidTy->getPointerAddressSpace() != DstTy->getPointerAddressSpace() &&
-        "Illegal bitcast, addrspacecast sequence!");
-      // Allowed, use second cast's opcode
-      return secondOp;
+      // bitcast, addrspacecast -> addrspacecast if the pointer element types
+      // are the same.
+      if (MidTy->getPointerElementType() == DstTy->getPointerElementType())
+        return Instruction::AddrSpaceCast;
+      return 0;
     case 15:
       // FIXME: this state can be merged with (1), but the following assert
       // is useful to check the correcteness of the sequence due to semantic
       // change of bitcast.
       assert(
         SrcTy->isIntOrIntVectorTy() &&
         MidTy->isPtrOrPtrVectorTy() &&
         DstTy->isPtrOrPtrVectorTy() &&

lib/Transforms/InstCombine/InstCombineCasts.cpp

@@ ... @@ Instruction *InstCombiner::visitBitCast(BitCastInst &CI) {
   }

   if (SrcTy->isPointerTy())
     return commonPointerCastTransforms(CI);
   return commonCastTransforms(CI);
 }

 Instruction *InstCombiner::visitAddrSpaceCast(AddrSpaceCastInst &CI) {
-  return commonCastTransforms(CI);
+  // If the destination pointer element type is not the the same as the source's
+  // do the addrspacecast to the same type, and then the bitcast in the new
+  // address space. This allows the cast to be exposed to other transforms.
+  Value *Src = CI.getOperand(0);
+  PointerType *SrcTy = cast<PointerType>(Src->getType()->getScalarType());
+  PointerType *DestTy = cast<PointerType>(CI.getType()->getScalarType());
+
+  Type *SrcElemTy = SrcTy->getElementType();
+  if (SrcElemTy != DestTy->getElementType()) {
+    Type *MidTy = PointerType::get(SrcElemTy, DestTy->getAddressSpace());
+    if (CI.getType()->isVectorTy()) // Handle vectors of pointers.
+      MidTy = VectorType::get(MidTy, CI.getType()->getVectorNumElements());
+
+    Value *NewASCast = Builder->CreateAddrSpaceCast(Src, MidTy);
+    return new BitCastInst(NewASCast, CI.getType());
+  }
+
+  return commonPointerCastTransforms(CI);
 }

test/Transforms/InstCombine/addrspacecast.ll

 ; CHECK-NEXT: ret
   %y = addrspacecast <4 x i32 addrspace(1)*> %x to <4 x i32 addrspace(3)*>
   %z = addrspacecast <4 x i32 addrspace(3)*> %y to <4 x i32*>
   ret <4 x i32*> %z
 }

 define float* @combine_redundant_addrspacecast_types(i32 addrspace(1)* %x) nounwind {
 ; CHECK-LABEL: @combine_redundant_addrspacecast_types(
-; CHECK: addrspacecast i32 addrspace(1)* %x to float*
+; CHECK-NEXT: addrspacecast i32 addrspace(1)* %x to i32*
+; CHECK-NEXT: bitcast i32* %1 to float*
 ; CHECK-NEXT: ret
   %y = addrspacecast i32 addrspace(1)* %x to i32 addrspace(3)*
   %z = addrspacecast i32 addrspace(3)* %y to float*
   ret float* %z
 }

+define <4 x float*> @combine_redundant_addrspacecast_types_vector(<4 x i32 addrspace(1)*> %x) nounwind {
+; CHECK-LABEL: @combine_redundant_addrspacecast_types_vector(
+; CHECK-NEXT: addrspacecast <4 x i32 addrspace(1)*> %x to <4 x i32*>
+; CHECK-NEXT: bitcast <4 x i32*> %1 to <4 x float*>
+; CHECK-NEXT: ret
+  %y = addrspacecast <4 x i32 addrspace(1)*> %x to <4 x i32 addrspace(3)*>
+  %z = addrspacecast <4 x i32 addrspace(3)*> %y to <4 x float*>
+  ret <4 x float*> %z
+}
+
+define float addrspace(2)* @combine_addrspacecast_bitcast_1(i32 addrspace(1)* %x) nounwind {
+; CHECK-LABEL: @combine_addrspacecast_bitcast_1(
+; CHECK-NEXT: addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+; CHECK-NEXT: bitcast i32 addrspace(2)* %y to float addrspace(2)*
+; CHECK-NEXT: ret
+  %y = addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+  %z = bitcast i32 addrspace(2)* %y to float addrspace(2)*
+  ret float addrspace(2)* %z
+}
+
+define i32 addrspace(2)* @combine_addrspacecast_bitcast_2(i32 addrspace(1)* %x) nounwind {
+; CHECK-LABEL: @combine_addrspacecast_bitcast_2(
+; CHECK: addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+; CHECK-NEXT: ret
+  %y = addrspacecast i32 addrspace(1)* %x to float addrspace(2)*
+  %z = bitcast float addrspace(2)* %y to i32 addrspace(2)*
+  ret i32 addrspace(2)* %z
+}
+
+define i32 addrspace(2)* @combine_bitcast_addrspacecast_1(i32 addrspace(1)* %x) nounwind {
+; CHECK-LABEL: @combine_bitcast_addrspacecast_1(
+; CHECK: addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+; CHECK-NEXT: ret
+  %y = bitcast i32 addrspace(1)* %x to i8 addrspace(1)*
+  %z = addrspacecast i8 addrspace(1)* %y to i32 addrspace(2)*
+  ret i32 addrspace(2)* %z
+}
+
+define float addrspace(2)* @combine_bitcast_addrspacecast_2(i32 addrspace(1)* %x) nounwind {
+; CHECK-LABEL: @combine_bitcast_addrspacecast_2(
+; CHECK: addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+; CHECK: bitcast i32 addrspace(2)* %1 to float addrspace(2)*
+; CHECK-NEXT: ret
+  %y = bitcast i32 addrspace(1)* %x to i8 addrspace(1)*
+  %z = addrspacecast i8 addrspace(1)* %y to float addrspace(2)*
+  ret float addrspace(2)* %z
+}
+
+define float addrspace(2)* @combine_addrspacecast_types(i32 addrspace(1)* %x) nounwind {
+; CHECK-LABEL: @combine_addrspacecast_types(
+; CHECK-NEXT: addrspacecast i32 addrspace(1)* %x to i32 addrspace(2)*
+; CHECK-NEXT: bitcast i32 addrspace(2)* %1 to float addrspace(2)*
+; CHECK-NEXT: ret
+  %y = addrspacecast i32 addrspace(1)* %x to float addrspace(2)*
+  ret float addrspace(2)* %y
+}
+
+define <4 x float addrspace(2)*> @combine_addrspacecast_types_vector(<4 x i32 addrspace(1)*> %x) nounwind {
+; CHECK-LABEL: @combine_addrspacecast_types_vector(
+; CHECK-NEXT: addrspacecast <4 x i32 addrspace(1)*> %x to <4 x i32 addrspace(2)*>
+; CHECK-NEXT: bitcast <4 x i32 addrspace(2)*> %1 to <4 x float addrspace(2)*>
+; CHECK-NEXT: ret
+  %y = addrspacecast <4 x i32 addrspace(1)*> %x to <4 x float addrspace(2)*>
+  ret <4 x float addrspace(2)*> %y
+}