This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/trunk/
-
trunk/
-
lib/CodeGen/
-
CodeGen/
-
BranchFolding.cpp
-
test/Transforms/BranchFolding/
-
Transforms/
-
BranchFolding/
-
single-block-funclets.ll

Differential D14996

[WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets
ClosedPublic

Authored by andrew.w.kaylor on Nov 25 2015, 1:34 PM.

Download Raw Diff

Details

Reviewers

majnemer
rnk
JosephTremoulet

Commits

rG9efb2332e254: [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets
rL254629: [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets

Summary

When a function contains multiple funclets that are represented by a single block, we can get into a situation where the BranchFolding pass goes into an infinite loop trying to place each of these blocks at the end of the function. This patch adds a simple selection criteria to avoid this problem.

Diff Detail

Repository: rL LLVM

Event Timeline

andrew.w.kaylor updated this revision to Diff 41178.Nov 25 2015, 1:34 PM

andrew.w.kaylor retitled this revision from to [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets.

andrew.w.kaylor updated this object.

andrew.w.kaylor added reviewers: JosephTremoulet, rnk, majnemer.

andrew.w.kaylor set the repository for this revision to rL LLVM.

andrew.w.kaylor added a subscriber: llvm-commits.

It seems odd to me that the fix specifically checks for EHFuncletEntry blocks. Isn't this a more general problem with the algorithm? I.e., wouldn't it fail for any two blocks that can't have fall-through to or from them, and won't it still fail with your fix if the same function has a single-block funclet and a non-funclet block that can't have fall-through to or from it?

JosephTremoulet added inline comments.Nov 30 2015, 8:41 AM

lib/CodeGen/BranchFolding.cpp
1578 ↗	(On Diff #41178)	Basically I'm wishing this could be `InsertAfter is a block which will reach this codepath` instead of `InsertAfter->IsEHFuncletEntry()`, though I don't know how one would (or if one could) express that...

In D14996#298570, @JosephTremoulet wrote:

It seems odd to me that the fix specifically checks for EHFuncletEntry blocks. Isn't this a more general problem with the algorithm? I.e., wouldn't it fail for any two blocks that can't have fall-through to or from them, and won't it still fail with your fix if the same function has a single-block funclet and a non-funclet block that can't have fall-through to or from it?

I agree that it seems like this could be a generic problem, but I've only seen it happen with the new Windows EH constructs.

I'll poke at it a bit and see if I can come up with something that will fail with non-EH blocks as well as a generalized way to avoid the problem.

After looking at this closer, I don't think the problem can arise without EH pads.

The condition for moving a block to the end looks like this:

if (FallThrough != MF.end() &&
          !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
          PrevBB.isSuccessor(&*FallThrough)) {

There are some complications regarding where FallThrough gets its value, but I think that as a reasonable approximation, this condition means that the candidate block to be moved must have a block before it (PrevBB) and a block after it (FallThrough) such that PrevBB ends with an analyzable branch and FallThrough is a successor of PrevBB, and for the infinite looping condition to occur, there must be multiple blocks that meet this condition with the same PrevBB. This happens in the Windows EH case (and currently only Windows EH) because PrevBB can end with an unconditional branch to a normal block and still have multiple EH pad successors.

Simple if-else conditions are handled earlier in the loop where there is already a check against swapping two blocks back and forth (isBetterFallthrough).

In D14996#298891, @andrew.w.kaylor wrote:

After looking at this closer, I don't think the problem can arise without EH pads.

The condition for moving a block to the end looks like this:
...

I can't think of another example myself, but then that leaves me confused as to why this clause is here in the first place. Looking at rL31149 where it was introduced, I don't see mention or tests for this specific clause, so it seems plausible that this case was added just for completeness but hasn't actually been exercised before...

lib/CodeGen/BranchFolding.cpp
1568 ↗	(On Diff #41178)	What would you think of changing this to `assert(MBB->isEHFuncletEntry())` to verify your analysis and ensure we won't loop infinitely with other cases (if they ever arise)?

The "move-to-end" condition definitely does happen with non-EH blocks. I had convinced myself that we couldn't get into a situation where it happened in such a way that we would cycle through multiple blocks moving them past one another, but I'm not entirely certain of that anymore.

OK, I'm back to having convinced myself that we can't get into the infinite loop condition with non-exception code.

Here is the basic pattern when I'm seeing the "move-to-end" code being hit in non-exception code:

BB1:
  ...
  <conditional branch> BB3
  ; else fall through to BB2

BB2:
  <conditional branch> BB4
  <unconditional branch> BB5

BB3:
  ; Anything that doesn't fall through

BB4:
  ; Anything

BB5:
  ; Anything

In this case, BB3 will be moved to the end because (1) the previous block doesn't fall through to BB3, (2) BB3 doesn't fall through to any block, and (3) the previous block has an analyzable branch condition and the block immediately following BB3 is one of the previous block's successors.

The infinite loop condition can only occur when the following pattern becomes present at the end of a function:

BB1:
  ; non-fallthrough successors BB2 and BB3

BB2:
  ; Whatever

BB3:
  ; Whatever

If BB1 jumps unconditionally to either of these blocks, the edge will be rewritten to fall through. The code will consider the previous block to not have an analyzable branch if it contains conditional branches to more than one destination, so the previous block can have at most two destinations in non-exceptional code and one of them must be unconditional.

QED

So, I think the code is correct as I have it in the review.

Thanks for the explanation. I buy your argument that the code as you have it is correct, modulo some minor (non-blocking) qualms about the possibility of other pseudo-op terminators that have the same qualities we're seeing with cleanupret but aren't EH-related.

However, this also gets me wondering whether the optimization really ought to be firing in this situation in the first place. The comment mentions that we're looking for situations where "the block before this one would be a fall-through if this block were removed", but then the check for that is just that there is a flow edge with an analyzable terminator from the previous block to the later block, and I think what you're saying is that with these nonterminating EH cases it turns out that even when you make those blocks adjacent, the edge can't become fallthrough. So maybe it would be simpler and lose no optimization opportunities if we just reject cases where the later block is an EH Funclet entry, since those can never be fallen into?

lib/CodeGen/BranchFolding.cpp
1563 ↗	(On Diff #41178)	i.e. maybe we just want to add `&& !FallThrough->isEHFuncletEntry()` to this condition? (or if there's some predicate like `canBeFallenInto(&*FallThrough)` that could be used, all the better)

In D14996#300165, @JosephTremoulet wrote:

Thanks for the explanation. I buy your argument that the code as you have it is correct, modulo some minor (non-blocking) qualms about the possibility of other pseudo-op terminators that have the same qualities we're seeing with cleanupret but aren't EH-related.

However, this also gets me wondering whether the optimization really ought to be firing in this situation in the first place. The comment mentions that we're looking for situations where "the block before this one would be a fall-through if this block were removed", but then the check for that is just that there is a flow edge with an analyzable terminator from the previous block to the later block, and I think what you're saying is that with these nonterminating EH cases it turns out that even when you make those blocks adjacent, the edge can't become fallthrough. So maybe it would be simpler and lose no optimization opportunities if we just reject cases where the later block is an EH Funclet entry, since those can never be fallen into?

Yeah, that works too. I don't see an existing way of checking to see if an MBB can be fallen into, but I think if this meets the criteria we're looking for then FallThrough should be either PrevTBB or PrevFBB. I'll do some testing to make sure that is true and then I'll update my changes one way or another.

It turns out that there was a case I was overlooking. It looks more or less like this:

BB1:
  call ; from invoke, handled by BB2 and BB3
  jmp BB4
BB2:
  ; EH pad
BB3:
  ; EH pad
BB4:
  ; whatever

We'd like BB1 to fall through to BB4 in this case. To do that, we need to look past any EH pads to see if the opportunity is present.

Also, it appears that isEHFuncletEntry isn't reliable at this point for some reason but isEHPad is.

LGTM, thanks.

This revision is now accepted and ready to land.Dec 2 2015, 6:48 PM

Closed by commit rL254629: [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets (authored by akaylor). · Explain WhyDec 3 2015, 10:58 AM

This revision was automatically updated to reflect the committed changes.

jevinskie added a subscriber: jevinskie.Dec 3 2015, 11:43 AM

This change leads to compile time explosion in some cases where we have a sequence with a LOT of landing pads in the function. Because the BranchFolding code itself is already somewhat quadratic with the do { foreachblock() } while(!changed); pattern, this change makes the compiletime cubic in my example.

I also do not quite understand what the code intends to do. Glancing at the code this purely looks like an optimization to me (if that is true we could maybe bound it to a fixed number of blocks that is skipped).
However the reasoning given here is that this fixed endless loops, but at least with current ToT the testcase passes without a problem when I remove the loop.

The purpose of this change was to avoid a case where the old algorithm was getting into an infinite loop moving different blocks to the end of the function. As I recall, this situation arose because the old code was sometimes mistakenly deciding that control flow could fall through from a normal block to an EH block that was reachable from that block via an exception. If multiple EH blocks were reachable, the old code wanted to place each of them immediately after the block from which they were reached.

But basically I think you're right that this is inherently an optimization in as much as the change here was meant to preserve the code that was trying to move a non-EH fall through block into an optimized position while avoiding the infinite loop problem.

As to the test case not failing without the code change for ToT, it looks like when the EH IR was redefined and the test case was updated, it no longer met the conditions that triggered the infinite loop. I'm not sure it isn't possible to create the conditions for the infinite loop though.

I think the safest change would be to eliminate the code I introduced to skip over EH blocks but add a condition so that it didn't move the Fallthrough block if it is an EH block.

if (FallThrough != MF.end() &&
    !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&

PrevBB.isSuccessor(&*FallThrough)) {

+ PrevBB.isSuccessor(&*FallThrough) &&
+ !FallThrough->isEHPad()) {

I'll play around with the test case and see if I can get it back to where it fails without this change.

In D14996#475871, @andrew.w.kaylor wrote:
The purpose of this change was to avoid a case where the old algorithm was getting into an infinite loop moving different blocks to the end of the function. As I recall, this situation arose because the old code was sometimes mistakenly deciding that control flow could fall through from a normal block to an EH block that was reachable from that block via an exception. If multiple EH blocks were reachable, the old code wanted to place each of them immediately after the block from which they were reached.

But basically I think you're right that this is inherently an optimization in as much as the change here was meant to preserve the code that was trying to move a non-EH fall through block into an optimized position while avoiding the infinite loop problem.

As to the test case not failing without the code change for ToT, it looks like when the EH IR was redefined and the test case was updated, it no longer met the conditions that triggered the infinite loop. I'm not sure it isn't possible to create the conditions for the infinite loop though.

I think the safest change would be to eliminate the code I introduced to skip over EH blocks but add a condition so that it didn't move the Fallthrough block if it is an EH block.
if (FallThrough != MF.end() &&
    !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
PrevBB.isSuccessor(&*FallThrough)) {

+ PrevBB.isSuccessor(&*FallThrough) &&
+ !FallThrough->isEHPad()) {

I'll play around with the test case and see if I can get it back to where it fails without this change.

Any news on this?

In D14996#483565, @MatzeB wrote:

Any news on this?

Sorry, no, I've been distracted with other things. Would you like me to commit the change I suggested or just revert this change completely until I can prove it is needed?

In D14996#483566, @andrew.w.kaylor wrote:

In D14996#483565, @MatzeB wrote:

Any news on this?

Sorry, no, I've been distracted with other things. Would you like me to commit the change I suggested or just revert this change completely until I can prove it is needed?

I'm fine with either.

andrew.w.kaylor mentioned this in D22839: Revert EH-specific branch folding changes to avoid compilation slow down.Jul 26 2016, 5:36 PM

andrew.w.kaylor mentioned this in D27582: Avoid infinite loops in branch folding.Dec 8 2016, 11:02 AM

Revision Contents

Path

Size

llvm/

trunk/

lib/

CodeGen/

BranchFolding.cpp

8 lines

test/

Transforms/

BranchFolding/

single-block-funclets.ll

110 lines

Diff 41783

llvm/trunk/lib/CodeGen/BranchFolding.cpp

Show First 20 Lines • Show All 1,558 Lines • ▼ Show 20 Lines	if (!CurFallsThru) {
}		}
}		}

// Okay, there is no really great place to put this block. If, however,		// Okay, there is no really great place to put this block. If, however,
// the block before this one would be a fall-through if this block were		// the block before this one would be a fall-through if this block were
// removed, move this block to the end of the function.		// removed, move this block to the end of the function.
MachineBasicBlock PrevTBB = nullptr, PrevFBB = nullptr;		MachineBasicBlock PrevTBB = nullptr, PrevFBB = nullptr;
SmallVector<MachineOperand, 4> PrevCond;		SmallVector<MachineOperand, 4> PrevCond;
		// We're looking for cases where PrevBB could possibly fall through to
		// FallThrough, but if FallThrough is an EH pad that wouldn't be useful
		// so here we skip over any EH pads so we might have a chance to find
		// a branch target from PrevBB.
		while (FallThrough != MF.end() && FallThrough->isEHPad())
		++FallThrough;
		// Now check to see if the current block is sitting between PrevBB and
		// a block to which it could fall through.
if (FallThrough != MF.end() &&		if (FallThrough != MF.end() &&
!TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&		!TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
PrevBB.isSuccessor(&*FallThrough)) {		PrevBB.isSuccessor(&*FallThrough)) {
MBB->moveAfter(&MF.back());		MBB->moveAfter(&MF.back());
MadeChange = true;		MadeChange = true;
return MadeChange;		return MadeChange;
}		}
}		}
▲ Show 20 Lines • Show All 306 Lines • Show Last 20 Lines

llvm/trunk/test/Transforms/BranchFolding/single-block-funclets.ll

				; RUN: llc -mtriple=x86_64-pc-windows-msvc < %s \| FileCheck %s

				declare i32 @__CxxFrameHandler3(...)

				declare void @throw()
				declare i16 @f()

				define i16 @test1(i16 %a, i8* %b) personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
				entry:
				%cmp = icmp eq i16 %a, 10
				br i1 %cmp, label %if.then, label %if.else

				if.then:
				%call1 = invoke i16 @f()
				to label %cleanup unwind label %catch.dispatch

				if.else:
				%call2 = invoke i16 @f()
				to label %cleanup unwind label %catch.dispatch

				catch.dispatch:
				catchpad [i8* null, i32 8, i8* null]
				to label %catch unwind label %catch.dispatch.2

				catch:
				invoke void @throw() noreturn
				to label %unreachable unwind label %catchendblock

				catch.dispatch.2:
				catchpad [i8* null, i32 64, i8* null]
				to label %catch.2 unwind label %catchendblock

				catch.2:
				store i8 1, i8* %b
				invoke void @throw() noreturn
				to label %unreachable unwind label %catchendblock

				catchendblock:
				catchendpad unwind to caller

				cleanup:
				%retval = phi i16 [ %call1, %if.then ], [ %call2, %if.else ]
				ret i16 %retval

				unreachable:
				unreachable
				}

				; This test verifies the case where two funclet blocks meet the old criteria
				; to be placed at the end. The order of the blocks is not important for the
				; purposes of this test. The failure mode is an infinite loop during
				; compilation.
				;
				; CHECK-LABEL: .def test1;

				define i16 @test2(i16 %a, i8* %b) personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
				entry:
				%cmp = icmp eq i16 %a, 10
				br i1 %cmp, label %if.then, label %if.else

				if.then:
				%call1 = invoke i16 @f()
				to label %cleanup unwind label %catch.dispatch

				if.else:
				%call2 = invoke i16 @f()
				to label %cleanup unwind label %catch.dispatch

				catch.dispatch:
				catchpad [i8* null, i32 8, i8* null]
				to label %catch unwind label %catch.dispatch.2

				catch:
				invoke void @throw() noreturn
				to label %unreachable unwind label %catchendblock

				catch.dispatch.2:
				%c2 = catchpad [i8* null, i32 32, i8* null]
				to label %catch.2 unwind label %catch.dispatch.3

				catch.2:
				store i8 1, i8* %b
				catchret %c2 to label %cleanup

				catch.dispatch.3:
				%c3 = catchpad [i8* null, i32 64, i8* null]
				to label %catch.3 unwind label %catchendblock

				catch.3:
				store i8 2, i8* %b
				catchret %c3 to label %cleanup

				catchendblock:
				catchendpad unwind to caller

				cleanup:
				%retval = phi i16 [ %call1, %if.then ], [ %call2, %if.else ], [ -1, %catch.2 ], [ -1, %catch.3 ]
				ret i16 %retval

				unreachable:
				unreachable
				}

				; This test verifies the case where three funclet blocks all meet the old
				; criteria to be placed at the end. The order of the blocks is not important
				; for the purposes of this test. The failure mode is an infinite loop during
				; compilation.
				;
				; CHECK-LABEL: .def test2;