This is an archive of the discontinued LLVM Phabricator instance.

[WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets
ClosedPublic

Authored by andrew.w.kaylor on Nov 25 2015, 1:34 PM.

Details

Summary

When a function contains multiple funclets that are represented by a single block, we can get into a situation where the BranchFolding pass goes into an infinite loop trying to place each of these blocks at the end of the function. This patch adds a simple selection criteria to avoid this problem.

Diff Detail

Repository
rL LLVM

Event Timeline

andrew.w.kaylor retitled this revision from to [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets.
andrew.w.kaylor updated this object.
andrew.w.kaylor set the repository for this revision to rL LLVM.
andrew.w.kaylor added a subscriber: llvm-commits.

It seems odd to me that the fix specifically checks for EHFuncletEntry blocks. Isn't this a more general problem with the algorithm? I.e., wouldn't it fail for any two blocks that can't have fall-through to or from them, and won't it still fail with your fix if the same function has a single-block funclet and a non-funclet block that can't have fall-through to or from it?

lib/CodeGen/BranchFolding.cpp
1578 ↗(On Diff #41178)

Basically I'm wishing this could be InsertAfter is a block which will reach this codepath instead of InsertAfter->IsEHFuncletEntry(), though I don't know how one would (or if one could) express that...

It seems odd to me that the fix specifically checks for EHFuncletEntry blocks. Isn't this a more general problem with the algorithm? I.e., wouldn't it fail for any two blocks that can't have fall-through to or from them, and won't it still fail with your fix if the same function has a single-block funclet and a non-funclet block that can't have fall-through to or from it?

I agree that it seems like this could be a generic problem, but I've only seen it happen with the new Windows EH constructs.

I'll poke at it a bit and see if I can come up with something that will fail with non-EH blocks as well as a generalized way to avoid the problem.

After looking at this closer, I don't think the problem can arise without EH pads.

The condition for moving a block to the end looks like this:

if (FallThrough != MF.end() &&
          !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
          PrevBB.isSuccessor(&*FallThrough)) {

There are some complications regarding where FallThrough gets its value, but I think that as a reasonable approximation, this condition means that the candidate block to be moved must have a block before it (PrevBB) and a block after it (FallThrough) such that PrevBB ends with an analyzable branch and FallThrough is a successor of PrevBB, and for the infinite looping condition to occur, there must be multiple blocks that meet this condition with the same PrevBB. This happens in the Windows EH case (and currently only Windows EH) because PrevBB can end with an unconditional branch to a normal block and still have multiple EH pad successors.

Simple if-else conditions are handled earlier in the loop where there is already a check against swapping two blocks back and forth (isBetterFallthrough).

After looking at this closer, I don't think the problem can arise without EH pads.

The condition for moving a block to the end looks like this:
...

I can't think of another example myself, but then that leaves me confused as to why this clause is here in the first place. Looking at rL31149 where it was introduced, I don't see mention or tests for this specific clause, so it seems plausible that this case was added just for completeness but hasn't actually been exercised before...

lib/CodeGen/BranchFolding.cpp
1568 ↗(On Diff #41178)

What would you think of changing this to assert(MBB->isEHFuncletEntry()) to verify your analysis and ensure we won't loop infinitely with other cases (if they ever arise)?

The "move-to-end" condition definitely does happen with non-EH blocks. I had convinced myself that we couldn't get into a situation where it happened in such a way that we would cycle through multiple blocks moving them past one another, but I'm not entirely certain of that anymore.

OK, I'm back to having convinced myself that we can't get into the infinite loop condition with non-exception code.

Here is the basic pattern when I'm seeing the "move-to-end" code being hit in non-exception code:

BB1:
  ...
  <conditional branch> BB3
  ; else fall through to BB2

BB2:
  <conditional branch> BB4
  <unconditional branch> BB5

BB3:
  ; Anything that doesn't fall through

BB4:
  ; Anything

BB5:
  ; Anything

In this case, BB3 will be moved to the end because (1) the previous block doesn't fall through to BB3, (2) BB3 doesn't fall through to any block, and (3) the previous block has an analyzable branch condition and the block immediately following BB3 is one of the previous block's successors.

The infinite loop condition can only occur when the following pattern becomes present at the end of a function:

BB1:
  ; non-fallthrough successors BB2 and BB3

BB2:
  ; Whatever

BB3:
  ; Whatever

If BB1 jumps unconditionally to either of these blocks, the edge will be rewritten to fall through. The code will consider the previous block to not have an analyzable branch if it contains conditional branches to more than one destination, so the previous block can have at most two destinations in non-exceptional code and one of them must be unconditional.

QED

So, I think the code is correct as I have it in the review.

Thanks for the explanation. I buy your argument that the code as you have it is correct, modulo some minor (non-blocking) qualms about the possibility of other pseudo-op terminators that have the same qualities we're seeing with cleanupret but aren't EH-related.

However, this also gets me wondering whether the optimization really ought to be firing in this situation in the first place. The comment mentions that we're looking for situations where "the block before this one would be a fall-through if this block were removed", but then the check for that is just that there is a flow edge with an analyzable terminator from the previous block to the later block, and I think what you're saying is that with these nonterminating EH cases it turns out that even when you make those blocks adjacent, the edge can't become fallthrough. So maybe it would be simpler and lose no optimization opportunities if we just reject cases where the later block is an EH Funclet entry, since those can never be fallen into?

lib/CodeGen/BranchFolding.cpp
1563 ↗(On Diff #41178)

i.e. maybe we just want to add && !FallThrough->isEHFuncletEntry() to this condition? (or if there's some predicate like canBeFallenInto(&*FallThrough) that could be used, all the better)

Thanks for the explanation. I buy your argument that the code as you have it is correct, modulo some minor (non-blocking) qualms about the possibility of other pseudo-op terminators that have the same qualities we're seeing with cleanupret but aren't EH-related.

However, this also gets me wondering whether the optimization really ought to be firing in this situation in the first place. The comment mentions that we're looking for situations where "the block before this one would be a fall-through if this block were removed", but then the check for that is just that there is a flow edge with an analyzable terminator from the previous block to the later block, and I think what you're saying is that with these nonterminating EH cases it turns out that even when you make those blocks adjacent, the edge can't become fallthrough. So maybe it would be simpler and lose no optimization opportunities if we just reject cases where the later block is an EH Funclet entry, since those can never be fallen into?

Yeah, that works too. I don't see an existing way of checking to see if an MBB can be fallen into, but I think if this meets the criteria we're looking for then FallThrough should be either PrevTBB or PrevFBB. I'll do some testing to make sure that is true and then I'll update my changes one way or another.

andrew.w.kaylor updated this revision to Diff 41688.EditedDec 2 2015, 4:07 PM
andrew.w.kaylor edited edge metadata.

It turns out that there was a case I was overlooking. It looks more or less like this:

BB1:
  call ; from invoke, handled by BB2 and BB3
  jmp BB4
BB2:
  ; EH pad
BB3:
  ; EH pad
BB4:
  ; whatever

We'd like BB1 to fall through to BB4 in this case. To do that, we need to look past any EH pads to see if the opportunity is present.

Also, it appears that isEHFuncletEntry isn't reliable at this point for some reason but isEHPad is.

JosephTremoulet accepted this revision.Dec 2 2015, 6:48 PM
JosephTremoulet edited edge metadata.

LGTM, thanks.

This revision is now accepted and ready to land.Dec 2 2015, 6:48 PM
This revision was automatically updated to reflect the committed changes.
MatzeB added a subscriber: MatzeB.Jul 5 2016, 3:14 PM

This change leads to compile time explosion in some cases where we have a sequence with a LOT of landing pads in the function. Because the BranchFolding code itself is already somewhat quadratic with the do { foreachblock() } while(!changed); pattern, this change makes the compiletime cubic in my example.

MatzeB added a comment.Jul 5 2016, 3:31 PM

I also do not quite understand what the code intends to do. Glancing at the code this purely looks like an optimization to me (if that is true we could maybe bound it to a fixed number of blocks that is skipped).
However the reasoning given here is that this fixed endless loops, but at least with current ToT the testcase passes without a problem when I remove the loop.

The purpose of this change was to avoid a case where the old algorithm was getting into an infinite loop moving different blocks to the end of the function. As I recall, this situation arose because the old code was sometimes mistakenly deciding that control flow could fall through from a normal block to an EH block that was reachable from that block via an exception. If multiple EH blocks were reachable, the old code wanted to place each of them immediately after the block from which they were reached.

But basically I think you're right that this is inherently an optimization in as much as the change here was meant to preserve the code that was trying to move a non-EH fall through block into an optimized position while avoiding the infinite loop problem.

As to the test case not failing without the code change for ToT, it looks like when the EH IR was redefined and the test case was updated, it no longer met the conditions that triggered the infinite loop. I'm not sure it isn't possible to create the conditions for the infinite loop though.

I think the safest change would be to eliminate the code I introduced to skip over EH blocks but add a condition so that it didn't move the Fallthrough block if it is an EH block.

if (FallThrough != MF.end() &&
    !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
  • PrevBB.isSuccessor(&*FallThrough)) {

+ PrevBB.isSuccessor(&*FallThrough) &&
+ !FallThrough->isEHPad()) {

I'll play around with the test case and see if I can get it back to where it fails without this change.

The purpose of this change was to avoid a case where the old algorithm was getting into an infinite loop moving different blocks to the end of the function. As I recall, this situation arose because the old code was sometimes mistakenly deciding that control flow could fall through from a normal block to an EH block that was reachable from that block via an exception. If multiple EH blocks were reachable, the old code wanted to place each of them immediately after the block from which they were reached.

But basically I think you're right that this is inherently an optimization in as much as the change here was meant to preserve the code that was trying to move a non-EH fall through block into an optimized position while avoiding the infinite loop problem.

As to the test case not failing without the code change for ToT, it looks like when the EH IR was redefined and the test case was updated, it no longer met the conditions that triggered the infinite loop. I'm not sure it isn't possible to create the conditions for the infinite loop though.

I think the safest change would be to eliminate the code I introduced to skip over EH blocks but add a condition so that it didn't move the Fallthrough block if it is an EH block.

if (FallThrough != MF.end() &&
    !TII->AnalyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
  • PrevBB.isSuccessor(&*FallThrough)) {

+ PrevBB.isSuccessor(&*FallThrough) &&
+ !FallThrough->isEHPad()) {

I'll play around with the test case and see if I can get it back to where it fails without this change.

Any news on this?

Any news on this?

Sorry, no, I've been distracted with other things. Would you like me to commit the change I suggested or just revert this change completely until I can prove it is needed?

Any news on this?

Sorry, no, I've been distracted with other things. Would you like me to commit the change I suggested or just revert this change completely until I can prove it is needed?

I'm fine with either.