In https://reviews.llvm.org/D30114, support for mismatching address spaces was introduced to CodeGenPrepare's optimizeMemoryInst, using an addrspacecast, as it was argued that only no-op addrspacecasts would be encountered when constructing the address mode. However, by going through an inttoptr/ptrtoint pair, it's possible to get CGP to emit an addrspacecast that's not actually a no-op, introducing a miscompilation:

```
define void @kernel(i8* %julia_ptr) {
  %intptr = ptrtoint i8* %julia_ptr to i64
  %ptr = inttoptr i64 %intptr to i32 addrspace(3)*
  br label %end

end:
  store atomic i32 1, i32 addrspace(3)* %ptr unordered, align 4
  ret void
}
```
Gets compiled to:
```
define void @kernel(i8* %julia_ptr) {
end:
  %0 = addrspacecast i8* %julia_ptr to i32 addrspace(3)*
  store atomic i32 1, i32 addrspace(3)* %0 unordered, align 4
  ret void
}
```
In the case of NVPTX, this introduces a cvta.to.shared instruction, whereas leaving out the %end block and branch doesn't trigger this optimization. This results in illegal memory accesses, as seen in https://github.com/JuliaGPU/CUDA.jl/issues/558
In this change, I introduced a check before the pointer cast that verifies the address spaces are the same. If they are not, it emits a ptrtoint/inttoptr combination to get a no-op cast between address spaces. I decided against disallowing ptrtoint/inttoptr with a non-default AS in matchOperationAddr, because this way it's still possible to look through sequences of them that ultimately do not result in an address space mismatch (i.e. the second lit test).
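With that check in place, the mismatching-AS case from the example above would be lowered with a no-op ptrtoint/inttoptr pair instead of an addrspacecast (a sketch of the expected output under this change, not verbatim compiler output):

```
define void @kernel(i8* %julia_ptr) {
end:
  %0 = ptrtoint i8* %julia_ptr to i64
  %1 = inttoptr i64 %0 to i32 addrspace(3)*
  store atomic i32 1, i32 addrspace(3)* %1 unordered, align 4
  ret void
}
```

This preserves the semantics of the original inttoptr/ptrtoint sequence: no address-space conversion instruction (such as cvta.to.shared on NVPTX) is introduced.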
I'd state this comment differently:
There are two reasons the types might not match: a no-op addrspacecast, or a ptrtoint/inttoptr pair. Either way, we emit a ptrtoint/inttoptr pair, to ensure we match the original semantics.
Ideally, we don't want to convert an addrspacecast to a ptrtoint/inttoptr pair; it's semantically valid, but we lose information. At this point in the pipeline, it doesn't matter that much, but it does hurt alias analysis a bit. I'm okay with starting with a conservative fix, and revisiting later.