This is an archive of the discontinued LLVM Phabricator instance.

[WinEH] Update CoreCLR EH state numbering
ClosedPublic

Authored by JosephTremoulet on Dec 7 2015, 11:11 PM.

Details

Summary

Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:

  1. Each state has a "HandlerParentState", which is the state of the next outer handler enclosing this state's handler (same as nearest ancestor per the ParentPad linkage on EH pads, but skipping over catchswitches).
  2. Each state has a "TryParentState", which: a) for a catchpad that's not the last handler on its catchswitch, is the state of the next catchpad on that catchswitch. b) for all other pads, is the state of the pad whose try region is the next outer try region enclosing this state's try region. The "try regions are not present as such in the IR, but will be inferred based on the placement of invokes and pads which reach each other by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.

Diff Detail

Event Timeline

JosephTremoulet retitled this revision from to [WinEH] Update CoreCLR EH state numbering.
JosephTremoulet updated this object.
JosephTremoulet updated this object.
  • Update worklist seeding to include all pads that unwind to caller (or are cleanuppads with no cleanupret)
  • fix wineh-coreclr test to report proper outer scope for nested fault cleanuppad
  • adjust getEHPadFromPredecessor and calls to it for CLR EH so it behaves correctly for nested CLR funclets
lib/CodeGen/WinEHPrepare.cpp
452

I have to admit that I'm not familiar enough with Core CLR to know what's possible and what isn't, so maybe this isn't an issue, but it strikes me here that a nested cleanup pad that never returns will be treated like a top level cleanup pad. I'm thinking of something like this:

try {
  try {
    f()
  } catch (...) {
    try {
      g()
    } finally {
      throw() // noreturn
    }
  }
} catch (...) {
}

The throw call here would unwind to the outer catch handler, but I think CPI->getCleanupRetUnwindDest() would return null.

456

I'm probably not understanding this correctly. Can you talk me through how a case like the following would be handled?

try {
  try {
  } catch { // inner catch
  }
} catch { // outer catch
  try {
  } catch { // catch catch
  }
}

If I have everything straight, both "outer catch" and "catch catch" will have null unwind destinations. So that would make the parent state of "catch catch" -1, I think. Is that what you want?

lib/CodeGen/WinEHPrepare.cpp
452

Yes, good catch, that's a case that wouldn't be handled correctly as written. We've discussed a plan, which I think we now need for cases like this and the irreducible loop, and which I am working on, to add "bundles" to calls and invokes in EH handlers indicating their handler membership (i.e. "color"), analogous to the "within" annotation that D15139 adds to the pads themselves. In that world, the IR for your example would look like:

invoke void @f()
  to label %whatever unwind label %inner_catch_switch
inner_catch_switch:
  %ics = catchswitch within none [label %inner_catch] unwind label %outer_catch_switch
inner_catch:
  %ic = catchpad within %ics [...]
  invoke void @g() ["within" token %ic]
    to label %whatever2 unwind label %finally
finally:
  %f = cleanuppad within %ic []
  invoke void @throw() ["within" token %f]
    to label %unreachable unwind label %outer_catch_switch

I'd expect this code to be updated to look through uses of the cleanuppad (%f) for not just cleanuprets but also "within" bundles on calls/invokes, and find the "unwind label %outer_catch_switch" on the invoke.

As far as staging the changes, I was thinking it makes sense to go ahead and check in this code that gets us from "completely broken" to "correct except in cleanups that have lost their cleanupret", once D15139 lands, since that will be enough to unblock downstreaming changes into LLILC while the bundle change iterates. Though I probably should at least include a FIXME here in this change.

It would of course be possible to hunt for invokes in the cleanup without the bundles making it easy, but since we seem to need the bundles for exotic-but-theoretically-possible cases where the correct color is irresolvably ambiguous, I'd rather not bother with that just to rewrite it to use the token bundles; but I would still like to checkpoint this step.

And, of course, I should be more concrete about "cases where I think we need the bundles". Here's an example (courtesy @majnemer): Suppose we started with:

...
 } finally {
    if (_) {
      left();
      __unreachable;
    } else {
      right();
      __unreachable;
    }
  }
...
left();
try {
  foo();
} catch { // catch 1
  /* stuff */
}
...
right();
try {
  bar();
} catch { // catch 2
  /* stuff */
}

which would become IR like

finally:
  %fin = cleanuppad []
  br i1 %_, label %left_unreachable, label %right_unreachable
left_unreachable:
  call void @left()
  unreachable
right_unreachable:
  call void @right()
  unreachable
...
left_bait:
  call void @left()
  invoke void @foo()
    to label %whatever unwind label %catchswitch_1
...
right_bait:
  call void @right()
  invoke void @bar()
    to label %whatever_else unwind label %catchswitch_2

then, although I'm pretty sure we don't actually have this transformation, it seems reasonable to expect to be able to rewrite that to:

finally:
  %fin = cleanuppad []
  br i1 %_, label %left_bait, label %right_bait
...
left_bait:
  call void @left()
  invoke void @foo()
    to label %whatever unwind label %catchswitch_1
...
right_bait:
  call void @right()
  invoke void @bar()
    to label %whatever_else unwind label %catchswitch_2

Then the "finally" cleanup has no cleanupret, and if you try to figure out where the cleanup's "unwind dest" is by scanning its invokes, you'll find both %catchswitch_1 and %catchswitch_2, and picking either of them would lead you to generate bad code if dynamically the other path is taken. Having the front-end tag the calls/invokes in a cleanup with the token of the cleanup they're in makes it unambiguous when you want to infer a cleanup's "unwind destination" from an invoke -- the inference is valid if and only if the invoke has a matching token.

Of course, this code would still have to deal with the case that there isn't a cleanupret and aren't any invokes either. I've convinced myself that in that case it's correct to treat the cleanup as at the topmost level inside its enclosing funclet.

456

Yes, that's what I want. The relationship I need is:

  • If a catchpad is not the first on its catchswitch, then the immediately prior catchpad on its same catchswitch is its parent
  • otherwise, if the try region for EH pad A is the innermost try region enclosing the try region for EH pad B, then A is the parent of B

Since the try region for "outer catch" is disjoint with the try region for "catch catch", neither is an ancestor of the other; since no try region encloses the try region for "catch catch", its parent is -1.

lib/CodeGen/WinEHPrepare.cpp
456

Oops, got the first rule reversed. It's:

  • If a catchpad is not the last on its catchswitch, then the immediately following catchpad on its same catchswitch is its parent
  • otherwise, if the try region for EH pad A is the innermost try region enclosing the try region for EH pad B, then A is the parent of B
JosephTremoulet updated this object.
  • refactor toplevel test into isToplevelForCLR, give a fuller implementation
  • rebase
lib/CodeGen/WinEHPrepare.cpp
439

This comment makes me nervous. Besides indicating where exceptions that occur inside the catchpads unwind the unwind clause of a catchswitch also indicates where an exception will go if it is not caught by any of the handlers in the catchswitch. So in order to change this clause from some EH pad to "unwinds to caller" we'd need to know both that none of the catch pads ever throws an exception and that all exceptions that come to this catch switch are caught by one of the catch pads (and even then, it seems like a bad idea). It doesn't seem like any optimization outside of the WinEH-specific handling should be able to confirm this second condition since it depends on the personality function. Is this really happening?

453

Does the phrase "exceptions propogating out of it" include both exceptions that cause the cleanup pad to be executed but reach the cleanupret instruction and exceptions that occur within the cleanup handler? It seems from the code that follows that it does, but it wasn't entirely clear to me.

466

This doesn't seem quite right. Consider the following scenario:

cleanup:
  %cl = cleanuppad within none []
  invoke void @g()
    to label %cleanup.cont unwind label %cleanup.cs
cleanup.cs:
  %cs = catchswitch within %cl [label %cleanup.catch] unwind label %nested.cleanup
cleanup.catch:
  %cp = catchpad within %cs []
  catchret from %cp to label %cleanup.cont
nested.cleanup:
  %nc = cleanuppad within %cl []
  call void @h()
  cleanupret from %nc unwind to caller
cleanup.cont:
  cleanupret from %cl unwind to caller

This case may return either true or false, depending on whether it encounters the nested catchswitch or the outer cleanupret first in the user list.

481

Shouldn't this be "if (!UselessCleanups.count(CPI))"?

lib/CodeGen/WinEHPrepare.cpp
439

The only place I'm aware of making this transformation is in SimplifyCFG's SimplifyUnreachable -- if the catchswitch (and invokes etc within the catches) unwinds to a cleanuppad that immediately its unreachable, we'll removeUnwindEdge each of those preds -- so in that case,
a) the inference is coming from the unwind destination, not analysis of the switch clauses, and
b) the same transformation will get applied to the invokes and such in the catches
which may be of some comfort.

Point well taken that preserving the agreement between unwind destinations in catches w.r.t. each other and the catchswitch in general requires careful non-local analysis.

453

Yeah, it needs to mean both, I'll try to make that more clear.

466

Yes, this was supposed to check if the catchswitch dest's parent is also the cleanup and continue to the next use if so. Good catch, thanks, I'll fix and expand the test.

481

Whoops, yes, of course. I'll maybe just remove this loop, though -- when the invokes are tagged with the cleanup, getting to this point means no exceptions escape the cleanup that aren't supposed to reach all the way up to caller, and so we can report it as top-level by the same reasoning that we use for faux-unwinds to caller catchswitches on line 454.

lib/CodeGen/WinEHPrepare.cpp
439

OK. I got the impression from this comment that this was happening if nothing was unwinding from within the catch switch.

JosephTremoulet updated this object.
  • Rebase on top of funclet bundle changes
  • Extend to handle missing cases; rework as a top-down pass and then a bottom-up pass instead of using the IsToplevel/getEHPred scheme
  • Split ClrEHUnwindMapEntry's Parent field into HandlerParentState and TryParentState; update comments and naming to keep the distinction clear
  • Expand lit test to cover several of the corner cases
majnemer added inline comments.Dec 25 2015, 2:36 AM
lib/CodeGen/WinEHPrepare.cpp
513–514

Is calculateStateNumbersForInvokes correct for your personality? I'm surprised it went through untouched. Specifically, the bit where we use getCleanupRetUnwindDest seems a little sketchy...

lib/CodeGen/WinEHPrepare.cpp
513–514

Yes, it works out, although we're not really making use of its logic. This bit is the part that's irrelevant for CLR EH:

BasicBlock *FuncletUnwindDest;
auto *FuncletPad =
    dyn_cast<FuncletPadInst>(FuncletEntryBB->getFirstNonPHI());
assert(FuncletPad || FuncletEntryBB == &Fn->getEntryBlock());
if (!FuncletPad)
  FuncletUnwindDest = nullptr;
else if (auto *CatchPad = dyn_cast<CatchPadInst>(FuncletPad))
  FuncletUnwindDest = CatchPad->getCatchSwitch()->getUnwindDest();
else if (auto *CleanupPad = dyn_cast<CleanupPadInst>(FuncletPad))
  FuncletUnwindDest = getCleanupRetUnwindDest(CleanupPad);
else
  llvm_unreachable("unexpected funclet pad!");

BasicBlock *InvokeUnwindDest = II->getUnwindDest();
int BaseState = -1;
if (FuncletUnwindDest == InvokeUnwindDest) {
  auto BaseStateI = FuncInfo.FuncletBaseStateMap.find(FuncletPad);
  if (BaseStateI != FuncInfo.FuncletBaseStateMap.end())
    BaseState = BaseStateI->second;
}

You're right that FuncletUnwindDest will take on a meaningless value for cleanupret-less cleanups, but since we never populate FuncletBaseStateMap when targeting CLR, we never reach BaseState = BaseStateI->second, regardless of whether FuncletUnwindDest happens to match InvokeUnwindDest.

majnemer accepted this revision.Jan 3 2016, 11:04 PM
majnemer edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jan 3 2016, 11:04 PM