This is an archive of the discontinued LLVM Phabricator instance.


Differential D15982

[rs4gc] Optionally directly relocated vector of pointers
Closed, Public

Authored by reames on Jan 7 2016, 4:37 PM.


Details

Reviewers

pgavlin
sanjoy
mjacob
JosephTremoulet

Commits

rG5715f576ea2f: [rs4gc] Optionally directly relocated vector of pointers
rL257244: [rs4gc] Optionally directly relocated vector of pointers

Summary

This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates.

The motivation for this is that we recently found a bug in the vector splitting code where, depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is an incredibly common order in which to visit instructions in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications.
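A toy model of the visit-order dependence described above, not the pass's actual data structures. It assumes two safepoints in straight-line code, where safepoint 0 dominates safepoint 1, and every name in it (`Safepoint`, `countMissedRelocations`, the `.rebuilt` suffix) is hypothetical. Processing a safepoint renames the vector seen by every dominated safepoint but leaves their recorded live tables untouched, which is the stale-side-table situation in miniature.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: each safepoint remembers the name of the live vector it
// saw during liveness analysis (Recorded) and the name that actually reaches
// it in the current IR (Actual).
struct Safepoint {
  std::string Recorded;
  std::string Actual;
};

// Process two safepoints in the given order and count those whose recorded
// live value no longer matches the IR, i.e. where the vector would be skipped.
// Safepoint I is assumed to dominate every safepoint J > I.
int countMissedRelocations(const std::vector<std::size_t> &Order) {
  std::vector<Safepoint> SPs(2, Safepoint{"v", "v"});
  int Missed = 0;
  for (std::size_t I : Order) {
    assert(I < SPs.size() && "unknown safepoint index");
    if (SPs[I].Recorded != SPs[I].Actual) {
      ++Missed; // stale side table: the live vector is not relocated here
      continue;
    }
    // Splitting rebuilds the vector after safepoint I, so dominated
    // safepoints now see a new value. The recorded tables of those
    // safepoints are deliberately left alone, as in the buggy code.
    for (std::size_t J = I + 1; J < SPs.size(); ++J)
      SPs[J].Actual = SPs[I].Actual + ".rebuilt";
  }
  return Missed;
}
```

In this model, visiting in dominator order (`{0, 1}`) misses one relocation, while post order (`{1, 0}`) misses none, which matches the observation that post order happens to hide the problem.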

Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR.

At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested.

Diff Detail

Repository: rL LLVM

Event Timeline

reames updated this revision to Diff 44290. Jan 7 2016, 4:37 PM

reames retitled this revision to [rs4gc] Optionally directly relocated vector of pointers.

reames updated this object.

reames added reviewers: pgavlin, JosephTremoulet, sanjoy, mjacob.

reames added a subscriber: llvm-commits.

Herald added a subscriber: sanjoy. Jan 7 2016, 4:37 PM

mjacob added inline comments. Jan 7 2016, 6:12 PM

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
1331–1342 ↗	(On Diff #44290)	You can hoist the construction of the i8 pointer in the correct address space before the conditional by using `Type::getInt8PtrTy(M->getContext(), Ty->getPointerAddressSpace())`. Then you could further simplify the code by sinking the call to `Intrinsic::getDeclaration()` to a common path. The resulting code would be similar to this:

    Type *Int8Ptr = Type::getInt8PtrTy(M->getContext(), Ty->getPointerAddressSpace());
    if (VectorType *VT = dyn_cast<VectorType>(Ty))
      Int8Ptr = VectorType::get(Int8Ptr, VT->getNumElements());
    return Intrinsic::getDeclaration(M, Intrinsic::experimental_gc_relocate, {Int8Ptr});
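The shape of that suggestion can be sketched without LLVM headers. The `Ty` struct, `canonicalRelocateType`, and `toString` below are hypothetical stand-ins, not LLVM's `Type` API: build the `i8` pointer in the right address space once, then wrap it in a vector only when the input is a vector.

```cpp
#include <string>

// Hypothetical stand-in for a pointer or vector-of-pointers type.
struct Ty {
  std::string Pointee;  // e.g. "i64"
  unsigned AddrSpace;   // e.g. 1 for addrspace(1)
  unsigned NumElements; // 0 means a scalar pointer, N > 0 means <N x ptr>
};

// Canonicalize to i8 pointers in the same address space, preserving the
// vector length if there is one.
Ty canonicalRelocateType(const Ty &T) {
  Ty Result{"i8", T.AddrSpace, 0}; // common path: scalar i8 pointer
  if (T.NumElements != 0)          // vector case: keep the element count
    Result.NumElements = T.NumElements;
  return Result;
}

// Render the type in LLVM IR's typed-pointer syntax for inspection.
std::string toString(const Ty &T) {
  std::string P =
      T.Pointee + " addrspace(" + std::to_string(T.AddrSpace) + ")*";
  if (T.NumElements == 0)
    return P;
  return "<" + std::to_string(T.NumElements) + " x " + P + ">";
}
```

Under these assumptions, a `<2 x i64 addrspace(1)*>` input canonicalizes to `<2 x i8 addrspace(1)*>`, and a scalar `i64 addrspace(1)*` to `i8 addrspace(1)*`, with a single construction path for both.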

address manuel's comments - thanks, that's a lot cleaner!

LGTM.

Maybe you could move the caching to the lambda. However I'm fine with it if you had good reasons not to do so.

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
1322–1327 ↗	(On Diff #44378)	This comment is a bit inaccurate now.

This revision is now accepted and ready to land. Jan 8 2016, 4:43 PM

In D15982#322785, @mjacob wrote:

Maybe you could move the caching to the lambda. However I'm fine with it if you had good reasons not to do so.

I generally find that having the generation function separate from the caching layer leads to fewer bugs. Maybe it's just me, but I've seen and written a *lot* of bugs where something didn't get cached correctly. I like to keep that portion of the code as simple and obviously correct as possible. I could use a second layer of lambda, but that seemed like overkill in this situation.
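One way to picture that separation, as a sketch rather than the patch's code: the generator below knows nothing about caching, and a small hypothetical helper (`getOrCreate`) owns the lookup. `std::map` and string keys stand in for `DenseMap` and LLVM types, and `generatorCallsFor` exists only to make the behavior observable.

```cpp
#include <map>
#include <string>

// Cache layer: consult the map first and call the generator only on a miss.
// The generator itself stays free of any caching logic.
template <typename Map, typename Gen>
typename Map::mapped_type &getOrCreate(Map &Cache,
                                       const typename Map::key_type &Key,
                                       Gen Generate) {
  auto It = Cache.find(Key);
  if (It == Cache.end())
    It = Cache.emplace(Key, Generate(Key)).first;
  return It->second;
}

// Count how many times the generator runs when Lookups requests for the same
// key go through the cache.
int generatorCallsFor(int Lookups) {
  std::map<std::string, std::string> Cache;
  int Calls = 0;
  auto MakeDecl = [&Calls](const std::string &Type) {
    ++Calls; // pure generation, no cache access here
    return "gc.relocate." + Type;
  };
  for (int I = 0; I < Lookups; ++I)
    getOrCreate(Cache, "p1i8", MakeDecl);
  return Calls;
}
```

With this split, repeated lookups of the same key run the generator once, and the caching logic stays small enough to check at a glance.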

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
1322–1327 ↗	(On Diff #44378)	Will tweak before submission.

Closed by commit rL257244: [rs4gc] Optionally directly relocated vector of pointers (authored by reames). Jan 8 2016, 5:34 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

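Path	Size
llvm/trunk/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp	41 lines
llvm/trunk/test/Transforms/RewriteStatepointsForGC/deopt-bundles/live-vector-nosplit.ll	112 lines
llvm/trunk/test/Transforms/RewriteStatepointsForGC/deopt-bundles/live-vector.ll	2 lines
llvm/trunk/test/Transforms/RewriteStatepointsForGC/live-vector.ll	2 lines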

Diff 44391

llvm/trunk/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp

[1,311 lines hidden] static void CreateGCRelocates(ArrayRef<Value *> LiveVariables,

   auto FindIndex = [](ArrayRef<Value *> LiveVec, Value *Val) {
     auto ValIt = std::find(LiveVec.begin(), LiveVec.end(), Val);
     assert(ValIt != LiveVec.end() && "Val not found in LiveVec!");
     size_t Index = std::distance(LiveVec.begin(), ValIt);
     assert(Index < LiveVec.size() && "Bug in std::find?");
     return Index;
   };
 
-  // All gc_relocate are set to i8 addrspace(1)* type. We originally generated
-  // unique declarations for each pointer type, but this proved problematic
-  // because the intrinsic mangling code is incomplete and fragile. Since
-  // we're moving towards a single unified pointer type anyways, we can just
-  // cast everything to an i8* of the right address space. A bitcast is added
-  // later to convert gc_relocate to the actual value's type.
   Module *M = StatepointToken->getModule();
-  auto AS = cast<PointerType>(LiveVariables[0]->getType())->getAddressSpace();
-  Type *Types[] = {Type::getInt8PtrTy(M->getContext(), AS)};
-  Value *GCRelocateDecl =
-      Intrinsic::getDeclaration(M, Intrinsic::experimental_gc_relocate, Types);
+  // All gc_relocate are generated as i8 addrspace(1)* (or a vector type whose
+  // element type is i8 addrspace(1)*). We originally generated unique
+  // declarations for each pointer type, but this proved problematic because
+  // the intrinsic mangling code is incomplete and fragile. Since we're moving
+  // towards a single unified pointer type anyways, we can just cast everything
+  // to an i8* of the right address space. A bitcast is added later to convert
+  // gc_relocate to the actual value's type.
+  auto getGCRelocateDecl = [&] (Type *Ty) {
+    assert(isHandledGCPointerType(Ty));
+    auto AS = Ty->getScalarType()->getPointerAddressSpace();
+    Type *NewTy = Type::getInt8PtrTy(M->getContext(), AS);
+    if (auto *VT = dyn_cast<VectorType>(Ty))
+      NewTy = VectorType::get(NewTy, VT->getNumElements());
+    return Intrinsic::getDeclaration(M, Intrinsic::experimental_gc_relocate,
+                                     {NewTy});
+  };
+
+  // Lazily populated map from input types to the canonicalized form mentioned
+  // in the comment above. This should probably be cached somewhere more
+  // broadly.
+  DenseMap<Type *, Value *> TypeToDeclMap;
 
   for (unsigned i = 0; i < LiveVariables.size(); i++) {
     // Generate the gc.relocate call and save the result
     Value *BaseIdx =
         Builder.getInt32(LiveStart + FindIndex(LiveVariables, BasePtrs[i]));
     Value *LiveIdx = Builder.getInt32(LiveStart + i);
 
+    Type *Ty = LiveVariables[i]->getType();
+    if (!TypeToDeclMap.count(Ty))
+      TypeToDeclMap[Ty] = getGCRelocateDecl(Ty);
+    Value *GCRelocateDecl = TypeToDeclMap[Ty];
+
     // only specify a debug name if we can give a useful one
     CallInst *Reloc = Builder.CreateCall(
         GCRelocateDecl, {StatepointToken, BaseIdx, LiveIdx},
         suffixed_name_or(LiveVariables[i], ".relocated", ""));
     // Trick CodeGen into thinking there are lots of free registers at this
     // fake call.
     Reloc->setCallingConv(CallingConv::Cold);
   }
[1,126 lines hidden] #ifndef NDEBUG
     }
 #endif
   }
   unique_unsorted(Live);
 
 #ifndef NDEBUG
   // sanity check
   for (auto *Ptr : Live)
-    assert(isGCPointerType(Ptr->getType()) && "must be a gc pointer type");
+    assert(isHandledGCPointerType(Ptr->getType()) &&
+           "must be a gc pointer type");
 #endif
 
   relocationViaAlloca(F, DT, Live, Records);
   return !Records.empty();
 }
 
 // Handles both return values and arguments for Functions and CallSites.
 template <typename AttrHolder>
[429 lines hidden]

llvm/trunk/test/Transforms/RewriteStatepointsForGC/deopt-bundles/live-vector-nosplit.ll

				; Test that we can correctly handle vectors of pointers in statepoint
				; rewriting.
				; RUN: opt %s -rewrite-statepoints-for-gc -rs4gc-use-deopt-bundles -rs4gc-split-vector-values=0 -S | FileCheck %s

				; A non-vector relocation for comparison
				define i64 addrspace(1)* @test(i64 addrspace(1)* %obj) gc "statepoint-example" {
				; CHECK-LABEL: test
				; CHECK: gc.statepoint
				; CHECK-NEXT: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret i64 addrspace(1)*
				; A base vector from a argument
				entry:
				call void @do_safepoint() [ "deopt"() ]
				ret i64 addrspace(1)* %obj
				}

				; A vector argument
				define <2 x i64 addrspace(1)*> @test2(<2 x i64 addrspace(1)*> %obj) gc "statepoint-example" {
				; CHECK-LABEL: test2
				; CHECK-NEXT: gc.statepoint
				; CHECK-NEXT: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*>
				call void @do_safepoint() [ "deopt"() ]
				ret <2 x i64 addrspace(1)*> %obj
				}

				; A load
				define <2 x i64 addrspace(1)*> @test3(<2 x i64 addrspace(1)*>* %ptr) gc "statepoint-example" {
				; CHECK-LABEL: test3
				; CHECK: load
				; CHECK-NEXT: gc.statepoint
				; CHECK-NEXT: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*>
				entry:
				%obj = load <2 x i64 addrspace(1)*>, <2 x i64 addrspace(1)*>* %ptr
				call void @do_safepoint() [ "deopt"() ]
				ret <2 x i64 addrspace(1)*> %obj
				}

				declare i32 @fake_personality_function()

				; When a statepoint is an invoke rather than a call
				define <2 x i64 addrspace(1)*> @test4(<2 x i64 addrspace(1)*>* %ptr) gc "statepoint-example" personality i32 ()* @fake_personality_function {
				; CHECK-LABEL: test4
				; CHECK: load
				; CHECK-NEXT: gc.statepoint
				entry:
				%obj = load <2 x i64 addrspace(1)*>, <2 x i64 addrspace(1)*>* %ptr
				invoke void @do_safepoint() [ "deopt"() ]
				to label %normal_return unwind label %exceptional_return

				normal_return: ; preds = %entry
				; CHECK-LABEL: normal_return:
				; CHECK: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*>
				ret <2 x i64 addrspace(1)*> %obj

				exceptional_return: ; preds = %entry
				; CHECK-LABEL: exceptional_return:
				; CHECK: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*>
				%landing_pad4 = landingpad token
				cleanup
				ret <2 x i64 addrspace(1)*> %obj
				}

				; A newly created vector
				define <2 x i64 addrspace(1)*> @test5(i64 addrspace(1)* %p) gc "statepoint-example" {
				; CHECK-LABEL: test5
				; CHECK: insertelement
				; CHECK-NEXT: gc.statepoint
				; CHECK-NEXT: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*> %vec.relocated.casted
				entry:
				%vec = insertelement <2 x i64 addrspace(1)*> undef, i64 addrspace(1)* %p, i32 0
				call void @do_safepoint() [ "deopt"() ]
				ret <2 x i64 addrspace(1)*> %vec
				}

				; A merge point
				define <2 x i64 addrspace(1)*> @test6(i1 %cnd, <2 x i64 addrspace(1)*>* %ptr) gc "statepoint-example" {
				; CHECK-LABEL: test6
				entry:
				br i1 %cnd, label %taken, label %untaken

				taken: ; preds = %entry
				%obja = load <2 x i64 addrspace(1)*>, <2 x i64 addrspace(1)*>* %ptr
				br label %merge

				untaken: ; preds = %entry
				%objb = load <2 x i64 addrspace(1)*>, <2 x i64 addrspace(1)*>* %ptr
				br label %merge

				merge: ; preds = %untaken, %taken
				; CHECK-LABEL: merge:
				; CHECK-NEXT: = phi
				; CHECK-NEXT: gc.statepoint
				; CHECK-NEXT: gc.relocate
				; CHECK-NEXT: bitcast
				; CHECK-NEXT: ret <2 x i64 addrspace(1)*>
				%obj = phi <2 x i64 addrspace(1)*> [ %obja, %taken ], [ %objb, %untaken ]
				call void @do_safepoint() [ "deopt"() ]
				ret <2 x i64 addrspace(1)*> %obj
				}

				declare void @do_safepoint()

llvm/trunk/test/Transforms/RewriteStatepointsForGC/deopt-bundles/live-vector.ll

 ; Test that we can correctly handle vectors of pointers in statepoint
 ; rewriting. Currently, we scalarize, but that's an implementation detail.
-; RUN: opt %s -rewrite-statepoints-for-gc -rs4gc-use-deopt-bundles -S | FileCheck %s
+; RUN: opt %s -rewrite-statepoints-for-gc -rs4gc-use-deopt-bundles -rs4gc-split-vector-values -S | FileCheck %s
 
 ; A non-vector relocation for comparison
 
 define i64 addrspace(1)* @test(i64 addrspace(1)* %obj) gc "statepoint-example" {
 ; CHECK-LABEL: test
 ; CHECK: gc.statepoint
 ; CHECK-NEXT: gc.relocate
 ; CHECK-NEXT: bitcast
[138 lines hidden]

llvm/trunk/test/Transforms/RewriteStatepointsForGC/live-vector.ll

 ; Test that we can correctly handle vectors of pointers in statepoint
 ; rewriting. Currently, we scalarize, but that's an implementation detail.
-; RUN: opt %s -rewrite-statepoints-for-gc -S | FileCheck %s
+; RUN: opt %s -rewrite-statepoints-for-gc -rs4gc-split-vector-values -S | FileCheck %s
 
 ; A non-vector relocation for comparison
 define i64 addrspace(1)* @test(i64 addrspace(1)* %obj) gc "statepoint-example" {
 ; CHECK-LABEL: test
 ; CHECK: gc.statepoint
 ; CHECK-NEXT: gc.relocate
 ; CHECK-NEXT: bitcast
 ; CHECK-NEXT: ret i64 addrspace(1)* %obj.relocated.casted
[141 lines hidden]