This is an archive of the discontinued LLVM Phabricator instance.

Differential D11819

[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
ClosedPublic

Authored by reames on Aug 6 2015, 2:06 PM.

Details

Reviewers

chenli
swaroop.sridhar
pgavlin
sanjoy

Commits

rG971dc3a82a01: [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
rL244821: [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints

Summary

To be clear: this is an *optimization* not a correctness change.

CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuse many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way: the branch terminating the original block ends up controlled by a condition computed from unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.

One simple fix would have been to just adjust PlaceSafepoints to place the poll one instruction earlier in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post-insertion fixup seems to be more robust.

I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information, because moving the icmp potentially extends the live range of its inputs.
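The live-range point can be illustrated with a toy model. The sketch below is not LLVM code: `Inst`, `liveAcross`, and the two example blocks are hypothetical names, and a block is assumed to be a straight-line list of instructions with a single safepoint. It reports which values are available before the safepoint and read after it, first with the icmp ahead of the safepoint and then with it sunk below.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: an optional result plus the values it reads.
struct Inst {
  std::string Def;                // value defined ("" if none)
  std::vector<std::string> Uses;  // values read
  bool IsSafepoint;               // true for the safepoint itself
};

// Values live across the first safepoint of a straight-line block: anything
// live-in or defined before the safepoint that is read after it.
std::set<std::string> liveAcross(const std::vector<Inst> &Block,
                                 const std::set<std::string> &LiveIn) {
  std::size_t SP = 0;
  while (SP < Block.size() && !Block[SP].IsSafepoint)
    ++SP;
  std::set<std::string> Avail = LiveIn;
  for (std::size_t I = 0; I < SP; ++I)
    if (!Block[I].Def.empty())
      Avail.insert(Block[I].Def);
  std::set<std::string> Live;
  for (std::size_t I = SP + 1; I < Block.size(); ++I)
    for (const std::string &U : Block[I].Uses)
      if (Avail.count(U))
        Live.insert(U);
  return Live;
}

// cond = icmp p, q; safepoint; br cond
std::vector<Inst> icmpBeforeSafepoint() {
  return {{"cond", {"p", "q"}, false}, {"", {}, true}, {"", {"cond"}, false}};
}

// safepoint; cond = icmp p, q; br cond
std::vector<Inst> icmpAfterSafepoint() {
  return {{"", {}, true}, {"cond", {"p", "q"}, false}, {"", {"cond"}, false}};
}
```

In this model only the i1 `cond` crosses the safepoint when the icmp sits ahead of it; once the icmp is sunk, `p` and `q` cross instead, and as GC pointers they would have to be added to the statepoint's live set. That is the update a CodeGenPrep-side fix would need to make on safepoints that have already been rewritten.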

Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target-specific lowering. In the long run, we may even want to combine the two, though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.

Diff Detail

Repository: rL LLVM

Event Timeline

reames updated this revision to Diff 31472. Aug 6 2015, 2:06 PM

reames retitled this revision from to [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints.

reames updated this object.

reames added reviewers: sanjoy, chenli, pgavlin, swaroop.sridhar.

reames added a subscriber: llvm-commits.

sanjoy requested changes to this revision. Aug 11 2015, 3:33 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
2424 ↗	(On Diff #31472)

A `switch` on an `icmp` is equivalent to a `br`; perhaps we should just teach LLVM to transform

  %c = icmp slt i32 %x, 0
  switch i1 %c, label %def [ i1 0, label %f
                             i1 1, label %t ]

to

  %c = icmp slt i32 %x, 0
  br i1 %c, label %t, label %f

so that we don't have to deal with `switch` separately here? (Right now LLVM does not do the said transform, which surprised me.)

If you remove the `switch` case (pun intended!) here then you should be able to use a simple expression from PatternMatch.h to match a conditional `br` on an `icmp`.

2426 ↗	(On Diff #31472)	I'd prefer typing the lambda with an explicit type, like [](TerminatorInst *TI) -> Instruction* { ...

This revision now requires changes to proceed. Aug 11 2015, 3:33 PM

sanjoy edited edge metadata. Aug 11 2015, 3:36 PM

chenli added inline comments. Aug 11 2015, 3:41 PM

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
2424 ↗	(On Diff #31472)	Hmm, I think -simplifycfg should do the transformation. Do we ever see SwitchInst in practice?

sanjoy added inline comments. Aug 11 2015, 3:51 PM

lib/Transforms/Scalar/RewriteStatepointsForGC.cpp

2424 ↗	(On Diff #31472)

Here's the full example

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @g()
declare void @h()
declare void @i()

define void @f(i32 %x) {
 entry:
  %c = icmp slt i32 %x, 0
  switch i1 %c, label %def [ i1 0, label %f
                             i1 1, label %t ]

 t:
  call void @g()
  ret void

 f:
  call void @h()
  ret void

 def:
  call void @i()
  ret void
}

opt -O3 does not transform the switch to a br.

Sanjoy, your example transform isn't directly legal. Your example switch is a three-way branch (default, true, false), not a two-way branch. There is no way to represent a three-way branch in a single br instruction. Your example does point out that we're failing to remove a provably unreachable default case. If we did so, the switch would become a two-way switch and then be converted to a branch.
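The reachability argument can be checked mechanically. The following is a sketch in plain C++, not LLVM's API: `defaultIsReachable` and `isTwoWayBranch` are hypothetical helpers that assume the switch condition is an unsigned integer of `BitWidth` bits (more than 0, fewer than 64) and that every case value fits in that width.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical helper: the default destination of a switch is reachable only
// if some value of the condition type is not covered by an explicit case.
// Assumes 0 < BitWidth < 64 and that all case values fit in BitWidth bits.
bool defaultIsReachable(unsigned BitWidth,
                        const std::set<std::uint64_t> &Cases) {
  const std::uint64_t NumValues = std::uint64_t{1} << BitWidth;
  return Cases.size() < NumValues;
}

// A switch behaves like a conditional br when its condition is an i1, both
// values have explicit cases, and the default is therefore dead.
bool isTwoWayBranch(unsigned BitWidth, const std::set<std::uint64_t> &Cases) {
  return BitWidth == 1 && !defaultIsReachable(BitWidth, Cases);
}
```

For the switch in the example, `isTwoWayBranch(1, {0, 1})` holds, so the `%def` destination is dead even though the instruction still names three targets.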

Given the switch logic in the patch is pretty much useless as you point out, I'm just going to remove it for the moment. Longer term, I want to extend this code to reason about other classes of instructions and whether they're legal to move. I want that to be a separate patch though.

I will take your other comment and apply it.

Revised per review comments.

Herald added a subscriber: sanjoy. Aug 12 2015, 2:09 PM

lgtm

This revision is now accepted and ready to land. Aug 12 2015, 2:21 PM

Closed by commit rL244821: [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints (authored by reames). Aug 12 2015, 3:12 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp	31 lines
llvm/trunk/test/Transforms/RewriteStatepointsForGC/codegen-cond.ll	74 lines

Diff 31990

llvm/trunk/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp

bool RewriteStatepointsForGC::runOnFunction(Function &F) {
  ...
  // of liveness sets for no good reason. It may be harder to do this post
  // insertion since relocations and base phis can confuse things.
  for (BasicBlock &BB : F)
    if (BB.getUniquePredecessor()) {
      MadeChange = true;
      FoldSingleEntryPHINodes(&BB);
    }

  // Before we start introducing relocations, we want to tweak the IR a bit to
  // avoid unfortunate code generation effects. The main example is that we
  // want to try to make sure the comparison feeding a branch is after any
  // safepoints. Otherwise, we end up with a comparison of pre-relocation
  // values feeding a branch after relocation. This is semantically correct,
  // but results in extra register pressure since both the pre-relocation and
  // post-relocation copies must be available in registers. For code without
  // relocations this is handled elsewhere, but teaching the scheduler to
  // reverse the transform we're about to do would be slightly complex.
  // Note: This may extend the live range of the inputs to the icmp and thus
  // increase the liveset of any statepoint we move over. This is profitable
  // as long as all statepoints are in rare blocks. If we had in-register
  // lowering for live values this would be a much safer transform.
  auto getConditionInst = [](TerminatorInst *TI) -> Instruction* {
    if (auto *BI = dyn_cast<BranchInst>(TI))
      if (BI->isConditional())
        return dyn_cast<Instruction>(BI->getCondition());
    // TODO: Extend this to handle switches
    return nullptr;
  };
  for (BasicBlock &BB : F) {
    TerminatorInst *TI = BB.getTerminator();
    if (auto *Cond = getConditionInst(TI))
      // TODO: Handle more than just ICmps here. We should be able to move
      // most instructions without side effects or memory access.
      if (isa<ICmpInst>(Cond) && Cond->hasOneUse()) {
        MadeChange = true;
        Cond->moveBefore(TI);
      }
  }

  MadeChange |= insertParsePoints(F, DT, this, ParsePointNeeded);
  return MadeChange;
}

// liveness computation via standard dataflow
// -------------------------------------------------------------------

// TODO: Consider using bitvectors for liveness, the set of potentially
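As a rough illustration of the loop added to runOnFunction, the sketch below replays it on a toy instruction list. Everything here (`ToyInst`, `sinkBranchCondition`) is hypothetical and stands in for LLVM's classes. Unlike the patch, which can move an icmp defined in another block, this simplified version only looks for the condition inside the block that holds the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR, not LLVM's classes.
struct ToyInst {
  std::string Name;    // result name, e.g. "cond" ("" if none)
  std::string Opcode;  // "icmp", "safepoint", "br", ...
  std::string Cond;    // for a conditional "br": name of its condition
  unsigned NumUses;    // number of users of the result
};

// If the block ends in a conditional branch whose condition is an icmp with
// exactly one use, move that icmp so it sits immediately before the branch.
// Returns true if the icmp was (re)placed, false if nothing qualified.
bool sinkBranchCondition(std::vector<ToyInst> &Block) {
  if (Block.empty() || Block.back().Opcode != "br" ||
      Block.back().Cond.empty())
    return false;
  const std::string Cond = Block.back().Cond;
  for (std::size_t I = 0; I + 1 < Block.size(); ++I) {
    if (Block[I].Name != Cond)
      continue;
    if (Block[I].Opcode != "icmp" || Block[I].NumUses != 1)
      return false;
    ToyInst Moved = Block[I];  // copy before erase invalidates the element
    Block.erase(Block.begin() + static_cast<std::ptrdiff_t>(I));
    Block.insert(Block.end() - 1, Moved);  // just before the terminator
    return true;
  }
  return false;
}
```

Applied to a block of the form icmp, safepoint, br, the icmp ends up between the safepoint and the branch; with a second user of the icmp the block is left untouched, matching the `hasOneUse()` guard in the patch.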

llvm/trunk/test/Transforms/RewriteStatepointsForGC/codegen-cond.ll

; RUN: opt -rewrite-statepoints-for-gc -S < %s | FileCheck %s

; A null test of a single value
define i1 @test(i8 addrspace(1)* %p, i1 %rare) gc "statepoint-example" {
; CHECK-LABEL: @test
entry:
  %cond = icmp eq i8 addrspace(1)* %p, null
  br i1 %rare, label %safepoint, label %continue, !prof !0
safepoint:
  call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @safepoint, i32 0, i32 0, i32 0, i32 0)
  br label %continue
continue:
; CHECK-LABEL: continue:
; CHECK: phi
; CHECK-DAG: [ %p.relocated, %safepoint ]
; CHECK-DAG [ %p, %entry ]
; CHECK: %cond = icmp
; CHECK: br i1 %cond
  br i1 %cond, label %taken, label %untaken
taken:
  ret i1 true
untaken:
  ret i1 false
}

; Comparing two pointers
define i1 @test2(i8 addrspace(1)* %p, i8 addrspace(1)* %q, i1 %rare)
    gc "statepoint-example" {
; CHECK-LABEL: @test2
entry:
  %cond = icmp eq i8 addrspace(1)* %p, %q
  br i1 %rare, label %safepoint, label %continue, !prof !0
safepoint:
  call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @safepoint, i32 0, i32 0, i32 0, i32 0)
  br label %continue
continue:
; CHECK-LABEL: continue:
; CHECK: phi
; CHECK-DAG: [ %q.relocated, %safepoint ]
; CHECK-DAG [ %q, %entry ]
; CHECK: phi
; CHECK-DAG: [ %p.relocated, %safepoint ]
; CHECK-DAG [ %p, %entry ]
; CHECK: %cond = icmp
; CHECK: br i1 %cond
  br i1 %cond, label %taken, label %untaken
taken:
  ret i1 true
untaken:
  ret i1 false
}

; Sanity check that nothing bad happens if already last instruction
; before terminator
define i1 @test3(i8 addrspace(1)* %p, i8 addrspace(1)* %q, i1 %rare)
    gc "statepoint-example" {
; CHECK-LABEL: @test3
entry:
  call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @safepoint, i32 0, i32 0, i32 0, i32 0)
; CHECK: gc.statepoint
; CHECK: %cond = icmp
; CHECK: br i1 %cond
  %cond = icmp eq i8 addrspace(1)* %p, %q
  br i1 %cond, label %taken, label %untaken
taken:
  ret i1 true
untaken:
  ret i1 false
}

declare void @safepoint()
declare i32 @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)

!0 = !{!"branch_weights", i32 1, i32 10000}