In particular, this handles SSA values that are live *out* of a handler.
The existing code only handles values that are live *in* to a handler.
Details
Diff Detail
Event Timeline
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
452–457 | Yes, it will produce a dangling pointer. It has the same semantics as the source one would write for it: void f() { int *data = nullptr; try { throw 42; } catch (int e) { data = (int*)alloca(e); } data[0] = 42; // store to dead stack memory } Rejecting this isn't our job, it's Clang's. | |
521–522 | Won't that require that the intrinsic call has no arguments? | |
1468–1471 | Let's make this errs(). I really want this check to stay in for a release build. It's not expensive, and it's the #1 place where outlining can fall down and go off the rails. We should emit a reasonable crash diagnostic. |
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
452–457 | What if the programmer wasn't at fault? Let's say we had: int n; void f() { int *x = (int *)alloca(n); try { throw 42; } catch (int e) { do_something(x); return; } } Is it unreasonable for LLVM to sink the dynamic alloca into the catch block? | |
521–522 | ||
1468–1471 | Sounds good. |
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
383 | Can you add a comment here explaining that the blocks have been split to ensure that the endcatch call will immediately precede the terminator? | |
427 | This doesn't seem right. A nested landing pad block generally begins with a call to endcatch, but it isn't a normal block. | |
460 | Does this demote arguments in EH blocks even if the value is defined in an EH block? Also, isn't this going to demote the landing pad values which we will later promote? | |
508 | Is this inserting the store in the entry block? That doesn't sound right. |
The next patch has a lot of changes because I realized I had to do more work for phi nodes using constants in EH return points.
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
383 | Sure. | |
427 | I think we should stop emitting those endcatch calls in the frontend. I don't think they help identify regions to outline. They way I see it, barring code commoning optimizations, the handlers are just the BBs delimited by the begincatch / endcatch calls. A nested landingpad simply starts a new set of handlers. | |
452–457 | I think in general that's too hard for LLVM. If you add a cleanup to your example, do_something(x) could escape x, and then the cleanup would run in the parent frame and use the escaped x, which is now dead. What's wrong with leaving it alone and calling DemoteRegToStack on the dynamic alloca? If it doesn't do the right thing (I haven't tested it) it sounds like a bug in DemoteRegToStack. | |
460 | This all runs before outlining, so the only arguments it will demote are ones in the parent function. This shouldn't demote selector and ehptr values because they typically aren't live *out* of a landingpad. Even if we demote them by accident in a corner case, I'm not worried about the overhead, so long as we don't demote and then promote in the common cases. | |
508 | Why not? Arguments are SSA values that are defined once at function entry. This is only less than optimal when you have paths through the function that don't reach an invoke with a handler using the argument. We could try to sink the store somewhere closer to a dominating invoke to stay out of any fast paths through the funciton, but we'd want to stay hoisted out of any loop bodies. This analysis seems hard and not really worth it. If you're still concerned, an easier direction might be to have llvm.argument.escape and llvm.argument.recover intrinsics to recover the address of the argument and ensure that the parent homes escaped register parameters. | |
521–522 | done |
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
427 | That's a further divergence from the Itanium model, which I believe does need that call, but I guess we've got enough of those now that one more wouldn't matter. We're jumping through a fair number of hoops to maintain at least a semblance of conformance to the basic structure of the Itanium model. My understanding is that the primary benefit of this is that the back end passes know how to handle landing pads and we don't need to teach them something new. Is that right? I suppose it doesn't matter much which intrinsics we use along the way. | |
508 | Oh, nevermind. I was seeing 'argument' but thinking 'operand' because of where this set was originally populated, which also explains the irrelevancy of my comment above. Sorry about that. |
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
427 | Yes, it is another Itanium divergence. We probably want to rename these intrinsics, as they are pretty WinEH specific at this point. By backend passes, I'm assuming you mean the LLVM IR mid-level optimization passes? Everything in lib/CodeGen had to be redone, so we didn't save anything there. At this point, now that I can see the difficulties entailed in implementing WinEHPrepare, I'm happy to move to some other mid-level representation that doesn't resemble Itanium EH, even if we have to change the mid-level passes somewhat. The main thing that I want to preserve is that we should outline late to support optimization. If we can't run SROA on objects with destructors because they are all escaped to an uninlineable call to the destructor on the exceptional codepath, all follow-on optimization is dead in the water. I don't want to have to make an interprocedural version of SROA or teach SROA to be (slightly) interprocedural. If we can't run SROA on objects with destructors, we will never be able to promote the heap memory from unique_ptr<int> to an i32 alloca by matching up the new/delete pairs, for example. Ultimately, I think we can save ourselves the Itanium-style EH hoop jumping by switching to landingswitch (alternatively named "dispatchblock"), or something like it. |
I have one question, but I expect you have a good answer.
Looks good.
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
417 | Does this put the endcatch call in the split successor or leave it in the original block? |
lib/CodeGen/WinEHPrepare.cpp | ||
---|---|---|
417 | Good catch! This code splits before the call, but I didn't want that. It's not easy to observe in a test because we remove these calls and branches after outlining. |
Please sort this include with the others.