This is an archive of the discontinued LLVM Phabricator instance.

Avoid infinite loops in branch folding
ClosedPublic

Authored by andrew.w.kaylor on Dec 8 2016, 11:02 AM.

Details

Reviewers

MatzeB
rnk

Commits

rGff6a1edfa8e6: Avoid infinite loops in branch folding
rL289486: Avoid infinite loops in branch folding

Summary

This reintroduces a fix for a problem where the branch folding optimization can go into an infinite loop shuffling blocks at the end of a function. This problem was originally fixed in D14996, but that change was reverted (see D22839) because it was causing compile time to blow up in some circumstances. At the time of the revert I believed that the original problem could no longer be reproduced, but I have since seen it with a particular set of optimization options with the spec2006/483 benchmark.

The test case I am adding here is a version of the 483 code that was reduced using bugpoint. I am not clear as to exactly why this particular case is triggering the problem, but I wasn't able to reduce it any further. I suspect that the state it is in now just happens to trigger different behavior prior to branch folding when optimizing for size.

In any event, the important thing is that the potential for the problem still exists. The root cause is a particular block layout at the point where the branch folding code runs: a block near the end of the function contains a function call followed by a branch to a block earlier in the function, and that block is followed by three or more exception handling blocks that are possible successors by way of exceptions in the called function. In this case, the code prior to this patch would rotate the exception blocks to the end of the function in infinite succession.
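The rotation described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyBlock`, `makeToyLayout`, and `countEndOfFunctionMoves` are hypothetical names, not LLVM APIs, and the model keeps only the "move the current block to the end if the previous block could reach the next one" condition while ignoring everything else the real pass checks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine basic block (not LLVM's class).
struct ToyBlock {
  int id;
  bool isEHPad;
  std::vector<int> successors; // ids of possible successors
};

bool isSuccessor(const ToyBlock &from, int id) {
  return std::find(from.successors.begin(), from.successors.end(), id) !=
         from.successors.end();
}

// Layout from the summary: a call block followed by three EH pads that are
// all possible successors of the call block by way of exceptions.
std::vector<ToyBlock> makeToyLayout() {
  return {{0, false, {1, 2, 3}}, {1, true, {}}, {2, true, {}}, {3, true, {}}};
}

// Repeats the simplified "move to end of function" step and returns how many
// moves were made, stopping at maxMoves so the demonstration terminates.
std::size_t countEndOfFunctionMoves(std::vector<ToyBlock> layout,
                                    std::size_t prevIndex,
                                    bool skipEHFallThrough,
                                    std::size_t maxMoves) {
  assert(prevIndex < layout.size());
  std::size_t moves = 0;
  while (moves < maxMoves) {
    const std::size_t cur = prevIndex + 1;  // block being considered
    const std::size_t next = prevIndex + 2; // its layout successor
    if (next >= layout.size())
      break; // nothing follows the current block
    if (skipEHFallThrough && layout[next].isEHPad)
      break; // guard in the spirit of this patch: give up on EH pads
    if (!isSuccessor(layout[prevIndex], layout[next].id))
      break; // previous block could not fall through to it
    // Move the current block to the end of the layout.
    const auto first = layout.begin() + static_cast<std::ptrdiff_t>(cur);
    std::rotate(first, first + 1, layout.end());
    ++moves;
  }
  return moves;
}
```

In this model, calling `countEndOfFunctionMoves(makeToyLayout(), 0, false, 50)` uses up the whole move budget because the three EH pads simply cycle, while passing `true` for `skipEHFallThrough` stops before the first move.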

The compilation time problem was caused by the previous attempt to find a non-exception handling block beyond the block to be moved. In a function containing a large number of exception handling blocks this led to quadratic or worse behavior. The new fix simply gives up if the block following the block to be moved is an exception handling block.
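The difference in cost between the two strategies can be sketched with a rough counting model. The functions below are hypothetical and are not the code from D14996 or from this patch; they only count how many blocks would be inspected if every block triggered a forward scan past EH pads, compared with looking at the immediately following block alone.

```cpp
#include <cstddef>
#include <vector>

// Assumed cost model for the earlier approach: for each block, scan forward
// until a non-EH block is found (or the function ends).
std::size_t inspectionsWithForwardScan(const std::vector<bool> &isEHPad) {
  std::size_t inspected = 0;
  for (std::size_t i = 0; i < isEHPad.size(); ++i) {
    for (std::size_t j = i + 1; j < isEHPad.size(); ++j) {
      ++inspected;
      if (!isEHPad[j])
        break; // found a non-EH block, stop scanning
    }
  }
  return inspected;
}

// Assumed cost model for the new approach: look only at the next block and
// give up if it is an EH pad.
std::size_t inspectionsWithEarlyExit(const std::vector<bool> &isEHPad) {
  std::size_t inspected = 0;
  for (std::size_t i = 0; i + 1 < isEHPad.size(); ++i)
    ++inspected; // one look at the immediate layout successor
  return inspected;
}
```

Under this model, a run of 1,000 consecutive EH pads costs 499,500 inspections with the forward scan and 999 with the early exit. The real pass also re-runs its block loop after making changes, which is one way the earlier behavior could end up worse than quadratic.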

Diff Detail

Repository: rL LLVM

Event Timeline

andrew.w.kaylor updated this revision to Diff 80784. Dec 8 2016, 11:02 AM

andrew.w.kaylor retitled this revision to Avoid infinite loops in branch folding.

andrew.w.kaylor updated this object.

andrew.w.kaylor added reviewers: rnk, MatzeB.

andrew.w.kaylor set the repository for this revision to rL LLVM.

andrew.w.kaylor added a subscriber: llvm-commits.

rnk added inline comments. Dec 8 2016, 4:07 PM

lib/CodeGen/BranchFolding.cpp
1637 ↗	(On Diff #80784)	Let's check isEHPad before analyzeBranch. It's cheaper, and it makes no sense to try to arrange a fallthrough to an EH pad. The analogy with landingpads would be this situation:

	  invoke ... to %unreachable unwind %lpad
	next: ; cannot arrange fallthrough for some reason
	  ret void
	lpad:
	  landingpad ...

	Without your change, it looks like we'll try to get lpad to follow the invoke, which isn't useful.
test/CodeGen/X86/branchfolding-catchpads.ll
98 ↗	(On Diff #80784)	Can we manually make this simpler?

andrew.w.kaylor added inline comments. Dec 9 2016, 12:03 PM

lib/CodeGen/BranchFolding.cpp
1637 ↗	(On Diff #80784)	OK, that definitely makes sense.
test/CodeGen/X86/branchfolding-catchpads.ll
98 ↗	(On Diff #80784)	You'd really think so, wouldn't you? I tried removing most of these pieces and even taking out one of the arguments to the g() call or removing one of the switch branches to unreachable blocks avoids the infinite loop. I strongly suspect that this combination of things is just pushing it over some threshold for a size optimization, but I have to admit that I don't completely understand it. I'm running the test without optimizations (llc does default to OptNone, right?), but even just removing the optsize function attribute causes this test case not to hang. I'll try a run that dumps before every pass to see if I can figure out what's really going on.

	Honestly, I'm not sure how useful this is as a regression test. It reproduces the failure right now, but I have no reason to think that it will still reproduce the failure a month from now. What it does accomplish, however, is documenting the kind of complexity that can trigger this bug. When I reverted the previous fix, I did so because I had convinced myself that this failure wasn't possible anymore after trying a few variations on the other two test cases in this file. I'd like to hope that anyone looking at this test case in the future would just think "OK, let's just assume that it's still possible." Maybe that's better done by adding some foreboding comments in the code.

I moved the isEHPad() check, as suggested.
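The effect of that reordering can be sketched with a toy predicate. Everything here is a hypothetical stand-in (`ToyCandidate`, `AnalysisCounter`, `toyAnalyzeBranchFails`, `shouldMoveToEnd`); the only property borrowed from LLVM is that the branch analysis hook reports failure by returning true, which is why the patch negates it. With the cheap EH pad test first, the `&&` short-circuits and the more expensive analysis is never run for EH pads.

```cpp
#include <cassert>

// Hypothetical summary of the facts the condition depends on.
struct ToyCandidate {
  bool fallThroughIsEHPad;   // cheap flag lookup in the real code
  bool prevBranchAnalyzable; // result the expensive analysis would produce
};

// Counts how often the (assumed expensive) analysis actually runs.
struct AnalysisCounter {
  int calls = 0;
};

// Stand-in for the branch analysis; returns true when analysis FAILS.
bool toyAnalyzeBranchFails(const ToyCandidate &c, AnalysisCounter &counter) {
  ++counter.calls;
  return !c.prevBranchAnalyzable;
}

// Cheap check first: for EH pads the analysis is skipped entirely.
bool shouldMoveToEnd(const ToyCandidate &c, AnalysisCounter &counter) {
  assert(counter.calls >= 0);
  return !c.fallThroughIsEHPad && !toyAnalyzeBranchFails(c, counter);
}
```

Calling `shouldMoveToEnd` with an EH pad candidate leaves `counter.calls` unchanged, whereas a non-EH candidate increments it once.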

I was able to reduce the test case a little more. The arguments to g() were only necessary to prevent a tail call optimization. Simply removing 'tail' from the call instruction dealt with that. I was also able to remove a few of the switch statement entries and simplify the control flow slightly. The remaining complexity is necessary to prevent intermediate passes (I think mostly the instruction selector) from simplifying a few remaining branches and reordering blocks such that the conditions for the failure are no longer met.

I updated the comments to better explain the failure condition.

lgtm

test/CodeGen/X86/branchfolding-catchpads.ll
98 ↗	(On Diff #80784)	In theory we have MIR so that we can dump the MI right before the problematic branch folding pass. It might be possible to do an MIR test that shows we don't attempt to fall through to EH pads, instead of trying to test that we don't infinite loop. Anyway, these are suggestions, I'm happy with the test as is.

This revision is now accepted and ready to land. Dec 12 2016, 8:28 AM

Closed by commit rL289486: Avoid infinite loops in branch folding (authored by akaylor). Dec 12 2016, 3:15 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/CodeGen/BranchFolding.cpp	14 lines
llvm/trunk/test/CodeGen/WinEH/wineh-noret-cleanup.ll	12 lines
llvm/trunk/test/CodeGen/X86/branchfolding-catchpads.ll	64 lines

Diff 81149

llvm/trunk/lib/CodeGen/BranchFolding.cpp

if (!CurFallsThru) {		if (!CurFallsThru) {
MBB->moveBefore(SuccBB);		MBB->moveBefore(SuccBB);
MadeChange = true;		MadeChange = true;
goto ReoptimizeBlock;		goto ReoptimizeBlock;
}		}
}		}

// Okay, there is no really great place to put this block. If, however,		// Okay, there is no really great place to put this block. If, however,
// the block before this one would be a fall-through if this block were		// the block before this one would be a fall-through if this block were
// removed, move this block to the end of the function.		// removed, move this block to the end of the function. There is no real
		// advantage in "falling through" to an EH block, so we don't want to
		// perform this transformation for that case.
		//
		// Also, Windows EH introduced the possibility of an arbitrary number of
		// successors to a given block. The analyzeBranch call does not consider
		// exception handling and so we can get in a state where a block
		// containing a call is followed by multiple EH blocks that would be
		// rotated infinitely at the end of the function if the transformation
		// below were performed for EH "FallThrough" blocks. Therefore, even if
		// that appears not to be happening anymore, we should assume that it is
		// possible and not remove the "!FallThrough()->isEHPad" condition below.
MachineBasicBlock *PrevTBB = nullptr, *PrevFBB = nullptr;		MachineBasicBlock *PrevTBB = nullptr, *PrevFBB = nullptr;
SmallVector<MachineOperand, 4> PrevCond;		SmallVector<MachineOperand, 4> PrevCond;
if (FallThrough != MF.end() &&		if (FallThrough != MF.end() &&
		!FallThrough->isEHPad() &&
!TII->analyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&		!TII->analyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
PrevBB.isSuccessor(&*FallThrough)) {		PrevBB.isSuccessor(&*FallThrough)) {
MBB->moveAfter(&MF.back());		MBB->moveAfter(&MF.back());
MadeChange = true;		MadeChange = true;
return MadeChange;		return MadeChange;
}		}
}		}
}		}

llvm/trunk/test/CodeGen/WinEH/wineh-noret-cleanup.ll

	; CXX-LABEL: test:			; CXX-LABEL: test:
	; CXX-LABEL: $ip2state$test:			; CXX-LABEL: $ip2state$test:
	; CXX-NEXT: .long .Lfunc_begin0@IMGREL			; CXX-NEXT: .long .Lfunc_begin0@IMGREL
	; CXX-NEXT: .long -1			; CXX-NEXT: .long -1
	; CXX-NEXT: .long .Ltmp0@IMGREL+1			; CXX-NEXT: .long .Ltmp0@IMGREL+1
	; CXX-NEXT: .long 1			; CXX-NEXT: .long 1
	; CXX-NEXT: .long .Ltmp1@IMGREL+1			; CXX-NEXT: .long .Ltmp1@IMGREL+1
	; CXX-NEXT: .long -1			; CXX-NEXT: .long -1
	; CXX-NEXT: .long "?catch$2@?0?test@4HA"@IMGREL			; CXX-NEXT: .long "?catch$3@?0?test@4HA"@IMGREL
	; CXX-NEXT: .long 2			; CXX-NEXT: .long 2
	; CXX-NEXT: .long .Ltmp2@IMGREL+1			; CXX-NEXT: .long .Ltmp2@IMGREL+1
	; CXX-NEXT: .long 3			; CXX-NEXT: .long 3
	; CXX-NEXT: .long .Ltmp3@IMGREL+1			; CXX-NEXT: .long .Ltmp3@IMGREL+1
	; CXX-NEXT: .long 2			; CXX-NEXT: .long 2
	; CXX-NEXT: .long "?catch$4@?0?test@4HA"@IMGREL			; CXX-NEXT: .long "?catch$5@?0?test@4HA"@IMGREL
	; CXX-NEXT: .long 4			; CXX-NEXT: .long 4

	; SEH-LABEL: test:			; SEH-LABEL: test:
	; SEH-LABEL: .Llsda_begin0:			; SEH-LABEL: .Llsda_begin0:
	; SEH-NEXT: .long .Ltmp0@IMGREL+1			; SEH-NEXT: .long .Ltmp0@IMGREL+1
	; SEH-NEXT: .long .Ltmp1@IMGREL+1			; SEH-NEXT: .long .Ltmp1@IMGREL+1
	; SEH-NEXT: .long dummy_filter@IMGREL			; SEH-NEXT: .long dummy_filter@IMGREL
	; SEH-NEXT: .long .LBB0_2@IMGREL			; SEH-NEXT: .long .LBB0_3@IMGREL
	; SEH-NEXT: .long .Ltmp0@IMGREL+1			; SEH-NEXT: .long .Ltmp0@IMGREL+1
	; SEH-NEXT: .long .Ltmp1@IMGREL+1			; SEH-NEXT: .long .Ltmp1@IMGREL+1
	; SEH-NEXT: .long dummy_filter@IMGREL			; SEH-NEXT: .long dummy_filter@IMGREL
	; SEH-NEXT: .long .LBB0_4@IMGREL			; SEH-NEXT: .long .LBB0_5@IMGREL
	; SEH-NEXT: .long .Ltmp2@IMGREL+1			; SEH-NEXT: .long .Ltmp2@IMGREL+1
	; SEH-NEXT: .long .Ltmp3@IMGREL+1			; SEH-NEXT: .long .Ltmp3@IMGREL+1
	; SEH-NEXT: .long "?dtor$5@?0?test@4HA"@IMGREL			; SEH-NEXT: .long "?dtor$2@?0?test@4HA"@IMGREL
	; SEH-NEXT: .long 0			; SEH-NEXT: .long 0
	; SEH-NEXT: .long .Ltmp2@IMGREL+1			; SEH-NEXT: .long .Ltmp2@IMGREL+1
	; SEH-NEXT: .long .Ltmp3@IMGREL+1			; SEH-NEXT: .long .Ltmp3@IMGREL+1
	; SEH-NEXT: .long dummy_filter@IMGREL			; SEH-NEXT: .long dummy_filter@IMGREL
	; SEH-NEXT: .long .LBB0_4@IMGREL			; SEH-NEXT: .long .LBB0_5@IMGREL
	; SEH-NEXT: .Llsda_end0:			; SEH-NEXT: .Llsda_end0:

llvm/trunk/test/CodeGen/X86/branchfolding-catchpads.ll


	; This test verifies the case where three funclet blocks all meet the old			; This test verifies the case where three funclet blocks all meet the old
	; criteria to be placed at the end. The order of the blocks is not important			; criteria to be placed at the end. The order of the blocks is not important
	; for the purposes of this test. The failure mode is an infinite loop during			; for the purposes of this test. The failure mode is an infinite loop during
	; compilation.			; compilation.
	;			;
	; CHECK-LABEL: .def test2;			; CHECK-LABEL: .def test2;

				declare void @g()

				define void @test3() optsize personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
				entry:
				switch i32 undef, label %if.end57 [
				i32 64, label %sw.bb
				i32 128, label %sw.epilog
				i32 256, label %if.then56
				i32 1024, label %sw.bb
				i32 4096, label %sw.bb33
				i32 16, label %sw.epilog
				i32 8, label %sw.epilog
				i32 32, label %sw.bb44
				]

				sw.bb:
				unreachable

				sw.bb33:
				br i1 undef, label %if.end57, label %while.cond.i163.preheader

				while.cond.i163.preheader:
				unreachable

				sw.bb44:
				%temp0 = load void ()*, void ()** undef
				invoke void %temp0()
				to label %if.end57 unwind label %catch.dispatch

				sw.epilog:
				%temp1 = load i8, i8* undef
				br label %if.end57

				catch.dispatch:
				%cs = catchswitch within none [label %catch1, label %catch2, label %catch3] unwind to caller

				catch1:
				%c1 = catchpad within %cs [i8* null, i32 8, i8* null]
				unreachable

				catch2:
				%c2 = catchpad within %cs [i8* null, i32 32, i8* null]
				unreachable

				catch3:
				%c3 = catchpad within %cs [i8* null, i32 64, i8* null]
				unreachable

				if.then56:
				call void @g()
				br label %if.end57

				if.end57:
				ret void
				}

				; This test exercises a complex case that produced an infinite loop during
				; compilation when the two cases above did not. The multiple targets from the
				; entry switch are not actually fundamental to the failure, but they are
				; necessary to suppress various control flow optimizations that would prevent
				; the conditions that lead to the failure.
				;
				; CHECK-LABEL: .def test3;