This is an archive of the discontinued LLVM Phabricator instance.

[GVNHoist] Fix: PR32821, add check for anticipability in case of infinite loops
AbandonedPublic

Authored by hiraditya on Apr 27 2017, 12:25 PM.

Details

Reviewers

sebpop
chandlerc
davide
gberry
dberlin

Summary

This fixes the case when gvn-hoist would hoist instructions when there is an infinite loop in the path from HoistBB to the end of the function. https://bugs.llvm.org/show_bug.cgi?id=32821

Event Timeline

hiraditya created this revision. Apr 27 2017, 12:25 PM

I'm at a loss to understand why you believe you need to compute joint post-dominance (which is what this is) to compute ANTIC.

In D32614#739947, @dberlin wrote:

I'm at a loss to understand why you believe you need to compute joint post-dominance (which is what this is) to compute ANTIC.

If the set of BasicBlocks (WL) does not jointly post-dominate the hoisting point (HoistBB), then the expression to be hoisted (from WL) cannot be ANTIC in HoistBB.
What I'm checking here is a necessary condition. The other check, the availability of operands in each basic block, is performed elsewhere. Maybe the function name is not appropriate.
Please share your views.

Thanks,

In D32614#740119, @hiraditya wrote:

In D32614#739947, @dberlin wrote:

I'm at a loss to understand why you believe you need to compute joint post-dominance (which is what this is) to compute ANTIC.

If the set of BasicBlocks (WL) does not jointly post-dominate the hoisting point (HoistBB), then the expression to be hoisted (from WL) cannot be ANTIC in HoistBB.

Sure.
That is definitely a way of computing it.
It is a generally slow and unused way of computing it, because there are only certain points in the SSA graph where ANTIC can actually change.

What I'm checking here is a necessary condition.

It is not necessary to perform graph reachability to do this.

https://pdfs.semanticscholar.org/6d0f/07ff330402b46e83d46142e202069d9aeceb.pdf

Stare at the down-safety step.
With a few bits, it is only actually necessary to compute and check antic at the possible phis you would insert to do your hoisting.

llvm/lib/Transforms/Scalar/GVNHoist.cpp
334	You can just use the return value of insert.
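
As a minimal illustration of the suggestion above: the sketch below uses std::unordered_set as a stand-in for llvm::SmallPtrSet (an assumption, so that it compiles without LLVM headers). Both containers return a pair from insert() whose second member is false when the element was already present. The helper name firstRepeat is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: return the index of the first repeated value in
// Visits, or -1 if no value repeats. The duplicate test and the insertion
// are one call, so no separate count()/find() lookup is needed.
int firstRepeat(const std::vector<int> &Visits) {
  std::unordered_set<int> Seen;
  for (std::size_t I = 0; I < Visits.size(); ++I) {
    // insert() returns {iterator, bool}; the bool is false when the value
    // was already in the set.
    if (!Seen.insert(Visits[I]).second)
      return static_cast<int>(I);
  }
  return -1;
}
```

For example, firstRepeat({1, 2, 1, 3}) reports index 2, the position where 1 is seen for the second time.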

Regardless of what you do with the code (I'm not an expert on this pass or the stuff Danny is already discussing there), please simplify this test case to a more readable and clear test case for the fundamental issue you're hitting. Having tons of Clang-specific bits in here really opacifies what scenario you're trying to test.

Relatedly, there are many different ways to arrive at an infinite loop, I would really hope to see a reasonable *collection* of test cases that thoroughly exercise the kinds of CFGs that can interfere with reachability analyses due to infinite loops. For example, irreducible control flows that infinitely cycle seem usefully different from regular infinite loops.

There is also the problem brought up in http://lists.llvm.org/pipermail/llvm-dev/2015-July/088095.html now almost two years ago which we still haven't made sense of yet. We really should be able to hoist across an infinite loop without side-effects but not across one with side-effects in C++-modeling LLVM IR, and we shouldn't be able to do so in Java-modeling LLVM IR. =/ You should at least leave test cases that exercise both paths (whichever behavior you choose) and clear comments in the test cases describing how they should be updated and expanded when we have a real answer here.

(just marking this as needing changes so it isn't on my phab dashboard)

This revision now requires changes to proceed. Apr 30 2017, 8:46 PM

In D32614#740129, @dberlin wrote:

https://pdfs.semanticscholar.org/6d0f/07ff330402b46e83d46142e202069d9aeceb.pdf

Stare at the down-safety step.
With a few bits, it is only actually necessary to compute and check antic at the possible phis you would insert to do your hoisting.

Thanks for the reference. I'll read the paper and try to implement it if that appears more efficient. I'm currently in the middle of preparing for a conference, so I'll get back to this in a few days.

llvm/lib/Transforms/Scalar/GVNHoist.cpp
334	I'll do that. Thanks.

In D32614#740129, @dberlin wrote:

https://pdfs.semanticscholar.org/6d0f/07ff330402b46e83d46142e202069d9aeceb.pdf

Stare at the down-safety step.
With a few bits, it is only actually necessary to compute and check antic at the possible phis you would insert to do your hoisting.

From the paper and the book (http://ssabook.gforge.inria.fr/latest/book.pdf, page 151), it seems the DownSafety algorithm does exactly what I'm doing in the function (anticReachable), IIUC.

DownSafety:
...
10: for each f ∈ {Φ's in the program} do
11:   if ∃ path P to program exit or alteration of expression along which f is not used
12:     downsafe(f) ← false
...

It checks every path P to the program exit. They have simplified the problem by assuming that the end of the function is reachable from every node, which is not the case with LLVM's representation of the CFG (and that caused the bug).
If I implement ANTIC the way they have, it would require iterating through all dominance frontiers to figure out the PHI insertion points.
Also, ensuring down-safety at the PHI alone is not sufficient to guarantee down-safety at HoistPt, because there might be safety issues between the definition of an instruction and its dominance frontier.

e.g.,

B -> D -> E
|         |
C         |
|         |
v         v
F <-------/

F will have the PHI for instructions in C and D, but E might throw etc.
Please correct me if I'm wrong because all this is based on my limited understanding of the paper and the book chapter.
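
To make the point about exit reachability concrete, here is a rough sketch (not LLVM code) that marks which blocks of a toy CFG can reach a function exit at all. The ToyCFG type, the integer block numbering, and the name canReachExit are all assumptions made for illustration; a block with no successors stands in for a return.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B. A block with no
// successors models a function exit. This is an illustrative stand-in,
// not LLVM's BasicBlock API.
using ToyCFG = std::vector<std::vector<int>>;

// Return CanExit, where CanExit[B] is true iff some path from B reaches a
// block without successors.
std::vector<bool> canReachExit(const ToyCFG &Succs) {
  const std::size_t N = Succs.size();
  std::vector<std::vector<int>> Preds(N);
  std::vector<int> Work;
  std::vector<bool> CanExit(N, false);
  for (std::size_t B = 0; B < N; ++B) {
    if (Succs[B].empty()) {
      CanExit[B] = true;
      Work.push_back(static_cast<int>(B));
    }
    for (int S : Succs[B])
      Preds[S].push_back(static_cast<int>(B));
  }
  // Walk predecessor edges backwards from every exit block.
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    for (int P : Preds[B]) {
      if (!CanExit[P]) {
        CanExit[P] = true;
        Work.push_back(P);
      }
    }
  }
  return CanExit;
}
```

With the CFG {{1, 2}, {3}, {2}, {}}, block 2 only branches to itself, so it is the one block not marked. A "path to program exit" check never visits such a block, which is the situation behind the bug discussed here.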

Thanks,

Addressed comments from @chandlerc and @dberlin

I don't see what would have addressed my concerns.

There still is only a single test case for an infinite loop. There are *several different* CFGs that fail to terminate. You should be testing them.

Examples off the top of my head:

A non-loop cycle in the CFG. Note that you need to set up the entire hoisting to be *inside* of the outer loop and then make it invalid due to an infinite non-loop cycle that exists within the loop.
A *potential* cycle due to indirect-br
An exceptional cycle due to cyclic exception edges from invokes
Either indirect-br or exceptions combined with a non-loop cycle.

I understand that you may naturally have an algorithm that handles all of these without special casing. That is good! But you should *test* them to ensure that when your code sees unexpected or rarely formed control flows it continues to behave well.

Put differently, it would be good to add fairly extensive testing that actively tries to construct corner cases to break the safety checks of this pass as it has had a long history of subtle and difficult to find correctness bugs.

Have you or anyone that cares about GVN hoist considered building a tool to synthetically generate random CFGs with some properties to see how the pass behaves? That might help more pro-actively find issues.
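
A very rough sketch of what such a generator could look like, on a toy adjacency-list CFG rather than real LLVM IR. Everything here (ToyCFG, randomCFG, reachableFrom, hasTrap, the choice of at most two successors per block) is a hypothetical design for illustration; a real tool would still have to emit IR and run the pass on it.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using ToyCFG = std::vector<std::vector<int>>;

// Hypothetical generator: block 0 is the entry; every block gets 0, 1, or 2
// successors drawn uniformly from all blocks, so self-loops, irreducible
// cycles, and exit-free regions can all occur. Requires NumBlocks >= 1.
ToyCFG randomCFG(std::size_t NumBlocks, unsigned Seed) {
  std::mt19937 Rng(Seed);
  std::uniform_int_distribution<int> NumSuccs(0, 2);
  std::uniform_int_distribution<int> Target(0,
                                            static_cast<int>(NumBlocks) - 1);
  ToyCFG Succs(NumBlocks);
  for (auto &Out : Succs) {
    int K = NumSuccs(Rng);
    for (int I = 0; I < K; ++I)
      Out.push_back(Target(Rng));
  }
  return Succs;
}

// Blocks reachable from From (including From itself).
std::vector<bool> reachableFrom(const ToyCFG &Succs, int From) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> Work{From};
  Seen[From] = true;
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    for (int S : Succs[B]) {
      if (!Seen[S]) {
        Seen[S] = true;
        Work.push_back(S);
      }
    }
  }
  return Seen;
}

// True iff some block reachable from the entry has no path to an exit
// (a block without successors), i.e. control can get trapped forever.
bool hasTrap(const ToyCFG &Succs) {
  std::vector<bool> FromEntry = reachableFrom(Succs, 0);
  for (std::size_t B = 0; B < Succs.size(); ++B) {
    if (!FromEntry[B])
      continue;
    std::vector<bool> FromB = reachableFrom(Succs, static_cast<int>(B));
    bool SeesExit = false;
    for (std::size_t X = 0; X < Succs.size(); ++X)
      if (FromB[X] && Succs[X].empty())
        SeesExit = true;
    if (!SeesExit)
      return true;
  }
  return false;
}
```

Graphs for which hasTrap returns true are the interesting inputs here, since they contain a region that never reaches the end of the function. Fixing the seed keeps a failing graph reproducible.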

This revision now requires changes to proceed. May 8 2017, 3:56 PM

In D32614#749208, @chandlerc wrote:

I don't see what would have addressed my concerns.

There still is only a single test case for an infinite loop. There are *several different* CFGs that fail to terminate. You should be testing them.

Examples off the top of my head:

A non-loop cycle in the CFG.

The code checks for cycles, not loops, so it detects all kinds of explicit cycles, as distinct from the other three cases you mentioned.

Note that you need to set up the entire hoisting to be *inside* of the outer loop and then make it invalid due to an infinite non-loop cycle that exists within the loop.

The dominance relation takes care of that (see partitionCandidates).

A *potential* cycle due to indirect-br

An exceptional cycle due to cyclic exception edges from invokes

Either indirect-br or exceptions combined with a non-loop cycle.

All the indirect branch targets in the path are excluded in the safety checks (see safeToHoistLdSt and safeToHoistScalar).
So all kinds of cycles are handled.

Put differently, it would be good to add fairly extensive testing that actively tries to construct corner cases to break the safety checks of this pass as it has had a long history of subtle and difficult to find correctness bugs.

I have bootstrapped clang and run SPEC2006. The pass was disabled on multiple occasions, I agree, but on several of these occasions it was disabled because it exposed bugs in other places, for example:
https://bugs.llvm.org//show_bug.cgi?id=30806
https://bugs.llvm.org//show_bug.cgi?id=32811
https://bugs.llvm.org/show_bug.cgi?id=32153

From what I understand from the cases you pointed out, this was the only unhandled case.
Thanks for the review.

You are continuing to argue that you do not need to add tests because they will pass. That seems to be missing the point. Tests are what verify and validate these kinds of arguments and assumptions, both now and in the future.

I already said in my response that I acknowledge that these tests will likely pass. It still seems extremely important to have specific, targeted test coverage of the different kinds of control flows. I will stop repeating myself after this message.

In D32614#749269, @chandlerc wrote:

You are continuing to argue that you do not need to add tests because they will pass. That seems to be missing the point. Tests are what verify and validate these kinds of arguments and assumptions, both now and in the future.

I'll add more tests that reflect different aspects of cyclic control flow. I was trying to explain which program points handle the specific cases you pointed out. It would help if you could suggest any resources for creating a variety of CFGs.

Addressed @chandlerc's comments: essentially, I've added test cases for cycles due to direct branches, indirect branches and invoke instructions.

The test cases show that:

hoisting does not happen when anticipability cannot be guaranteed because of an infinite loop in one of the paths
hoisting out of indirect branch targets and irreducible control flow is allowed when anticipability is satisfied
a cycle due to indirect branches prevents hoisting
no hoisting happens when there is a cycle due to invoke
an instruction can be hoisted out of catch blocks when legality is satisfied

Please let me know if I may have missed any test case.

Adding ':' for basic block labels in the testcase

sanjoy added a subscriber: sanjoy. May 14 2017, 1:58 PM

sanjoy added inline comments. May 14 2017, 2:07 PM

llvm/lib/Transforms/Scalar/GVNHoist.cpp
325	This is a shallow drive by comment, but do you have to worry about function calls that throw / infloop here?

hiraditya marked an inline comment as done. May 18 2017, 2:37 PM

hiraditya added inline comments.

llvm/lib/Transforms/Scalar/GVNHoist.cpp
325	That is handled by the function hasEHOnPath called later.

@dberlin
When SSAPRE inserts PHIs (expression PHIs), it does so at places where (potentially) multiple expressions may merge. If a PHI has a bottom (⊥) entry, the expression is only partially available.
The concept may work for GVN-Hoist to help factor out anticipability by working on the inverted graph. Suppose we insert an outgoing PHI (if I may call it that) in each basic block with multiple successors, since instructions are hoisted upwards to a nearest common dominator.
If we then walk the CFG and find that an outgoing PHI has a ⊥ entry, the expression is not anticipable in the basic block holding that outgoing PHI.
To minimize the number of outgoing PHIs, we would only insert them for expressions with multiple occurrences.
This would remove the need to check for reachability.

Please let me know if this would be a way to factor out anticipability of expressions.

Thanks,
-Aditya

Drive-by comment.

llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll
287–290	These tests can be reduced quite a bit, IMHO (for example you don't need the attributes).

dberlin resigned from this revision. Jun 19 2017, 12:01 AM

Merged with GVNHoist a long time back.

Revision Contents

Path                                                      Size
llvm/lib/Transforms/Scalar/GVNHoist.cpp                   54 lines
llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll     97 lines
llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll   290 lines

Diff 98638

llvm/lib/Transforms/Scalar/GVNHoist.cpp

Context not available.
   }

   // Return true when all paths from HoistBB to the end of the function pass
-  // through one of the blocks in WL.
-  bool hoistingFromAllPaths(const BasicBlock *HoistBB,
-                            SmallPtrSetImpl<const BasicBlock *> &WL) {
-
-    // Copy WL as the loop will remove elements from it.
-    SmallPtrSet<const BasicBlock *, 2> WorkList(WL.begin(), WL.end());
-
-    for (auto It = df_begin(HoistBB), E = df_end(HoistBB); It != E;) {
-      // There exists a path from HoistBB to the exit of the function if we are
-      // still iterating in DF traversal and we removed all instructions from
-      // the work list.
-      if (WorkList.empty())
-        return false;
-
-      const BasicBlock *BB = *It;
-      if (WorkList.erase(BB)) {
-        // Stop DFS traversal when BB is in the work list.
-        It.skipChildren();
-        continue;
-      }
+  // through one of the blocks in WL. Check for anticipability via graph
+  // reachability when dominance (HoistBB dominates WL) has already been
+  // established. If a leaf node in CFG can be reached from HoistBB without
+  // crossing an element from WL, that means any expression in WL cannot be
+  // anticipable at HoistBB. We do not check for availability of operands at
+  // this stage because in some cases operands can be made available.
+  bool anticReachable(const BasicBlock *HoistBB,
+                      const SmallPtrSetImpl<const BasicBlock *> &WL) {
+    SmallPtrSet<const BasicBlock *, 2> Remaining;
+    SmallVector<const BasicBlock*, 8> Queue;
+    Queue.push_back(HoistBB);
+    // Perform BFS from HoistBB on its successors until an element from
+    // WL is found. A path where no element from WL is found indicates
+    // it may be unsafe to hoist to HoistBB i.e., not-anticipable.
+    while(!Queue.empty()) {
+      const BasicBlock *BB = Queue.back();
+      Queue.pop_back();
+      if (WL.count(BB))
+        continue;

       // We reached the leaf Basic Block => not all paths have this instruction.
       if (!BB->getTerminator()->getNumSuccessors())
         return false;

-      // When reaching the back-edge of a loop, there may be a path through the
-      // loop that does not pass through B or C before exiting the loop.
-      if (successorDominate(BB, HoistBB))
-        return false;
-
-      // Increment DFS traversal when not skipping children.
-      ++It;
+      for (const BasicBlock *Succ : BB->getTerminator()->successors()) {
+        if (!Remaining.insert(Succ).second) // Loop.
+          return false;
+        Queue.push_back(Succ);
+      }
     }

     return true;
   }
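
For readers without an LLVM checkout, the traversal in the new anticReachable can be sketched on a toy CFG as below. This is an approximation under stated assumptions: ToyCFG, the integer block numbering and the name anticReachableSketch are hypothetical, and std::unordered_set stands in for SmallPtrSet. Because the worklist is popped from the back, the visit order is depth-first.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B; a block with no
// successors is a leaf (function exit). Not LLVM's API.
using ToyCFG = std::vector<std::vector<int>>;

// Returns true iff every walk from HoistBB hits a block in WL before it
// reaches a leaf block or a block that was already seen.
bool anticReachableSketch(const ToyCFG &Succs, int HoistBB,
                          const std::unordered_set<int> &WL) {
  std::unordered_set<int> Visited;
  std::vector<int> Work{HoistBB};
  while (!Work.empty()) {
    int BB = Work.back(); // Popping from the back gives depth-first order.
    Work.pop_back();
    if (WL.count(BB))
      continue; // This path is covered; do not look past a WL block.
    if (Succs[BB].empty())
      return false; // Leaf reached without passing through WL.
    for (int Succ : Succs[BB]) {
      if (!Visited.insert(Succ).second)
        return false; // Successor seen before: treated as a cycle.
      Work.push_back(Succ);
    }
  }
  return true;
}
```

On a CFG shaped like the @bazv1 test (0 -> {1, 2}, 1 -> 3, 2 -> {3, 4}, 4 -> 3, 3 -> 3, with WL = {1, 4}) the sketch returns false because block 3 loops on itself. Note that in this sketch a block reached along two different acyclic paths also fails the insert check, so the answer is conservative for join blocks as well as for real cycles.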

Context not available.
                          SmallPtrSetImpl<const BasicBlock *> &WL,
                          int &NBBsOnAllPaths) {
     // Check that the hoisted expression is needed on all paths.
-    if (!hoistingFromAllPaths(HoistBB, WL))
+    if (!anticReachable(HoistBB, WL))
       return false;

     for (const BasicBlock *BB : WL)
Context not available.
         // loading from the same address: for instance there may be a branch on
         // which the address of the load may not be initialized.
         if ((HoistBB == NewHoistBB || BB == NewHoistBB ||
-             hoistingFromAllPaths(NewHoistBB, WL)) &&
+             anticReachable(NewHoistBB, WL)) &&
             // Also check that it is safe to move the load or store from HoistPt
             // to NewHoistPt, and from Insn to NewHoistPt.
             safeToHoistLdSt(NewHoistPt, HoistPt, UD, K, NumBBsOnAllPaths) &&
Context not available.

llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll

This file was added.

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of infinite loops and irreducible control flow.

				; Check that bitcast is not hoisted because down safety is not guaranteed.
				; CHECK-LABEL: @bazv1
				; CHECK: if.then.i:
				; CHECK: bitcast
				; CHECK-NEXT: load
				; CHECK: if.then4.i:
				; CHECK: bitcast
				; CHECK-NEXT: load

				%class.bar = type { i8*, %class.base* }
				%class.base = type { i32 (...)** }

				; Function Attrs: noreturn nounwind uwtable
				define void @bazv1() local_unnamed_addr {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x.sroa.2.0..sroa_idx2 = getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				store %class.base* null, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				call void @_Z3foo3bar(%class.bar* nonnull %agg.tmp)
				%0 = load %class.base*, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %if.then.i, label %if.else.i

				if.then.i: ; preds = %entry
				%2 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable.i = load void (%class.base*)**, void (%class.base*)*** %2, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable.i, i64 2
				%3 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				call void %3(%class.base* %0)
				br label %while.cond.preheader

				if.else.i: ; preds = %entry
				%tobool.i = icmp eq %class.base* %0, null
				br i1 %tobool.i, label %while.cond.preheader, label %if.then4.i

				if.then4.i: ; preds = %if.else.i
				%4 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable6.i = load void (%class.base*)**, void (%class.base*)*** %4, align 8
				%vfn7.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable6.i, i64 3
				%5 = load void (%class.base*)*, void (%class.base*)** %vfn7.i, align 8
				call void %5(%class.base* nonnull %0)
				br label %while.cond.preheader

				while.cond.preheader: ; preds = %if.then.i, %if.else.i, %if.then4.i
				br label %while.cond

				while.cond: ; preds = %while.cond.preheader, %while.cond
				%call = call i32 @sleep(i32 10)
				br label %while.cond
				}

				declare void @_Z3foo3bar(%class.bar*) local_unnamed_addr

				declare i32 @sleep(i32) local_unnamed_addr

				; Check that the load is hoisted even if it is inside an irreducible control flow
				; because the load is anticipable on all paths.

				; CHECK-LABEL: @bazv
				; CHECK: bb2:
				; CHECK-NOT: load
				; CHECK-NOT: bitcast

				define void @bazv() {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%0 = load %class.base*, %class.base** %x, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %bb1, label %bb4

				bb1:
				%b1 = bitcast %class.base* %0 to void (%class.base*)***
				%i = load void (%class.base*)**, void (%class.base*)*** %b1, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %i, i64 2
				%cmp.j = icmp eq %class.base* %0, %1
				br i1 %cmp.j, label %bb2, label %bb3

				bb2:
				%l1 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb3

				bb3:
				%l2 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb2

				bb4:
				%b2 = bitcast %class.base* %0 to void (%class.base*)***
				ret void
				}

llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll

This file was added.

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of indirect branches.

				; Check that the bitcast is not hoisted because it is after an indirect call
				; CHECK-LABEL: @foo
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1
				; CHECK: bitcast

				%class.bar = type { i8*, %class.base* }
				%class.base = type { i32 (...)** }

				@bar = local_unnamed_addr global i32 ()* null, align 8
				@bar1 = local_unnamed_addr global i32 ()* null, align 8

				define i32 @foo(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%call = tail call i32 %1()
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}


				; However, when the instruction is before the indirect call it is completely
				; safe to do so because the loop preheader already has such an instruction.

				; CHECK-LABEL: @foo1
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1:
				; CHECK-NOT: bitcast

				define i32 @foo1(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%y1 = load %class.base*, %class.base** %x, align 8
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%y2 = load %class.base*, %class.base** %x, align 8
				%call = tail call i32 %1()
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}

				; Check that the bitcast is hoisted even when they are indirect
				; branch targets because both the targets are known and anticipability
				; is guaranteed.

				; CHECK-LABEL: @test13
				; CHECK: bitcast
				; CHECK-LABEL: B2:
				; CHECK-NOT: bitcast

				define i32 @test13(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				ret i32 123
				F:
				ret i32 1422
				}

				; Check that the bitcast is not hoisted because anticipability
				; cannot be guaranteed here as one of the indirect branch targets
				; does not have the bitcast instruction.

				; CHECK-LABEL: @test14
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test14(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2, label %T]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				%pi = load i32, i32* %i, align 4
				ret i32 %pi
				F:
				%pl = load i32, i32* %P
				ret i32 %pl
				}


				; Check that the bitcast is not hoisted because of a cycle
				; due to indirect branches
				; CHECK-LABEL: @test16
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test16(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%0 = load i32, i32* %i, align 4
				store i32 %0, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				indirectbr i32* %P, [label %BrBlock, label %B2]

				F:
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]
				}


				@_ZTIi = external constant i8*

				; Check that an instruction is not hoisted out of landing pad (%lpad4)
				; Also within a landing pad no redundancies are removed by gvn-hoist,
				; however an instruction may be hoisted into a landing pad if
				; landing pad has direct branches (e.g., %lpad to %catch1, %catch)
				; This CFG has a cycle (%lpad -> %catch1 -> %lpad4 -> %lpad)

				; CHECK-LABEL: @foo2
				; Check that nothing gets hoisted out of %lpad
				; CHECK-LABEL: lpad:
				; CHECK: %bc1 = add i32 %0, 10
				; CHECK: %bc7 = add i32 %0, 10

				; Check that the add is hoisted
				; CHECK-LABEL: catch1:
				; CHECK-NEXT: invoke

				; Check that the add is hoisted
				; CHECK-LABEL: catch:
				; CHECK-NEXT: load

				; Check that other adds are not hoisted
				; CHECK-LABEL: lpad4:
				; CHECK: %bc5 = add i32 %0, 10
				; CHECK-LABEL: unreachable:
				; CHECK: %bc2 = add i32 %0, 10

				; Function Attrs: noinline uwtable
				define i32 @foo2(i32* nocapture readonly %i) local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				%0 = load i32, i32* %i, align 4
				%cmp = icmp eq i32 %0, 0
				br i1 %cmp, label %try.cont, label %if.then

				if.then:
				%exception = tail call i8* @__cxa_allocate_exception(i64 4) #2
				%1 = bitcast i8* %exception to i32*
				store i32 %0, i32* %1, align 16
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				lpad:
				%2 = landingpad { i8*, i32 }
				catch i8* bitcast (i8** @_ZTIi to i8*)
				catch i8* null
				%bc1 = add i32 %0, 10
				%3 = extractvalue { i8*, i32 } %2, 0
				%4 = extractvalue { i8*, i32 } %2, 1
				%5 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #2
				%matches = icmp eq i32 %4, %5
				%bc7 = add i32 %0, 10
				%6 = tail call i8* @__cxa_begin_catch(i8* %3) #2
				br i1 %matches, label %catch1, label %catch

				catch1:
				%bc3 = add i32 %0, 10
				invoke void @__cxa_rethrow() #3
				to label %unreachable unwind label %lpad4

				catch:
				%bc4 = add i32 %0, 10
				%7 = load i32, i32* %i, align 4
				%add = add nsw i32 %7, 1
				tail call void @__cxa_end_catch()
				br label %try.cont

				lpad4:
				%8 = landingpad { i8*, i32 }
				cleanup
				%bc5 = add i32 %0, 10
				tail call void @__cxa_end_catch() #2
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				try.cont:
				%k.0 = phi i32 [ %add, %catch ], [ 0, %entry ]
				%bc6 = add i32 %0, 10
				ret i32 %k.0

				unreachable:
				%bc2 = add i32 %0, 10
				ret i32 %bc2
				}

				declare i8* @__cxa_allocate_exception(i64) local_unnamed_addr

				declare void @__cxa_throw(i8*, i8*, i8*) local_unnamed_addr

				declare i32 @__gxx_personality_v0(...)

				; Function Attrs: nounwind readnone
				declare i32 @llvm.eh.typeid.for(i8*) #1

				declare i8* @__cxa_begin_catch(i8*) local_unnamed_addr

				declare void @__cxa_end_catch() local_unnamed_addr

				declare void @__cxa_rethrow() local_unnamed_addr

				attributes #0 = { noinline uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
				attributes #1 = { nounwind readnone }
				attributes #2 = { nounwind }
				attributes #3 = { noreturn }