This is an archive of the discontinued LLVM Phabricator instance.

Initial SEH IR generation implementation
ClosedPublic

Authored by rnk on Oct 3 2014, 1:59 PM.

Details

Summary

Most of the complexity here is emitting filter expressions as standalone
LLVM functions. The filter functions are used in catch clauses in place
of type_info globals, and LLVM does the rest of the lowering.

Major aspects that still need work:

  • __try body outlining with noinline and optnone.
  • Cleanup support for __finally and destructors. LLVM won't be able to lower the current IR to anything useful.
  • Local variable access in filter expressions.
  • Don't use TLS to figure out which filter function fired. This requires work in LLVM.

Diff Detail

Event Timeline

rnk updated this revision to Diff 14405.Oct 3 2014, 1:59 PM
rnk retitled this revision from to Initial SEH IR generation implementation.
rnk updated this object.
rnk added reviewers: rjmccall, rsmith, majnemer.
rnk added a subscriber: Unknown Object (MLST).
ABataev added a subscriber: ABataev.Oct 5 2014, 8:17 PM

Reid,

__try body outlining with noinline and optnone.

You can use CapturedStmt for body outlining. It works just like Lambdas, but all captured vars are passed by reference

rnk added a comment.Oct 7 2014, 5:05 PM
In D5607#5, @ABataev wrote:

Reid,

__try body outlining with noinline and optnone.

You can use CapturedStmt for body outlining. It works just like Lambdas, but all captured vars are passed by reference

Thanks, I actually tried that in a subsequent patch layered on this one. It worked great, but Richard isn't particularly happy with the direction. He feels that CapturedStmt is an implementation artifact that shouldn't be in the AST because it doesn't reflect anything user-written. On the other hand, IMO the AST already has explicit nodes for implicit things that the user didn't write, such as MaterializeTemporaryExpr. I will probably try doing outlining during IRGen to compare the approaches.

He feels that CapturedStmt is an implementation artifact that shouldn't be in the AST because it doesn't reflect anything user-written.

Actually, initially this class was designed to be a basic class for other statements, which may require function outlining. But Doug Gregor recommended to make it a separate statement.

rnk added a comment.Oct 14 2014, 3:49 PM

Ping. I'm mostly interested in coming to consensus on direction.

rjmccall edited edge metadata.Nov 8 2014, 11:56 PM

I think the best solution here is definitely to let the backend handle outlining handler functions: to get reasonable code, the outlining has to be done by something that understands the layout of the outer function's frame, and only the backend can do that. No high-level intrinsic can capture the relationship between the functions in any sort of reliable way without completely disabling inlining of the outer function. And we really want to still be able to do data-flow optimizations in the outer function; we just have to ensure that any live-in values have been spilled to the stack at the call site.

Unfortunately, I think the same principle is going to end up applying to filter functions. If they don't use values from the enclosing frame, then fine, you can outline them in the frontend; but if they do there's just no way that a frontend solution is ever going to work. This might mean that we need to change the invoke IR pattern again so that the filter functions are just basic blocks within the function (which end in some special intrinsic/instruction that guides outlining).

I think you need to have that conversation on the LLVM side before you can make any useful progress on the Clang side. Please CC me on that conversation when you start it.

lib/CodeGen/CGException.cpp
104

Seems like a reasonable thing for the CXXABI to return.

438

Don't do this. Make a flag somehow.

rnk added a comment.Nov 10 2014, 1:38 PM
In D5607#13, @rjmccall wrote:

I think the best solution here is definitely to let the backend handle outlining handler functions: to get reasonable code, the outlining has to be done by something that understands the layout of the outer function's frame, and only the backend can do that. No high-level intrinsic can capture the relationship between the functions in any sort of reliable way without completely disabling inlining of the outer function. And we really want to still be able to do data-flow optimizations in the outer function; we just have to ensure that any live-in values have been spilled to the stack at the call site.

I agree, this is the right representation for catches and cleanups for the reasons you mention. I also don't want to introduce a new EH representation that is only used by SEH.

Unfortunately, I think the same principle is going to end up applying to filter functions. If they don't use values from the enclosing frame, then fine, you can outline them in the frontend; but if they do there's just no way that a frontend solution is ever going to work. This might mean that we need to change the invoke IR pattern again so that the filter functions are just basic blocks within the function (which end in some special intrinsic/instruction that guides outlining).

OK. I was thinking that cleanups are common (C++ destructors) while __try and __except filter expressions are rare. Therefore we can do something slow and dumb in the frontend for them. But, if we have to implement late outlining for EH handlers, we might as well extend it to filter expressions. __try bodies will still be outlined for simplicity.

I think you need to have that conversation on the LLVM side before you can make any useful progress on the Clang side. Please CC me on that conversation when you start it.

Sure.

rnk updated this revision to Diff 18273.Jan 15 2015, 4:18 PM
rnk edited edge metadata.
  • Rewrite without frontend outlining
rjmccall added inline comments.Jan 16 2015, 2:28 AM
lib/CodeGen/CGException.cpp
1702

Can you at least document what the point of this is?

1772

Doesn't the finally handler need to be active within the catch handler? You probably need to push it first.

1821

Shouldn't you be branching on the filter result or something? It feels wrong for the control flow here to be implicit.

1830

Yeah, you're popping it after you pop the catch. That can't possibly work if you pushed the catch first.

This also suggests that you aren't testing that adequately. :)

lib/Sema/SemaChecking.cpp
489

This checking can run in template instantiations, but getCurScope() isn't meaningful there.

rnk added inline comments.Jan 16 2015, 2:14 PM
lib/CodeGen/CGException.cpp
1702

Sure, check out the comment.

1702

Sure, see the new comment.

1772

A __try can only have one handler, __except or __finally, and not both. I'll straighten out the control flow to clarify that.

1821

The idea in this patch is that we enter the landing pad, and we sort of non-determinstically already know what the selector is. The @llvm.eh.seh.filter call acts as a barrier against code motion of side effects. It's kind of like a suspend / resume point in a coroutine. What happens in practice is that the code between the typeid check and the @llvm.eh.seh.filter call gets outlined.

Now that I'm thinking about this harder, I'll admit this is kind of broken for nested cleanups, as the cleanups will run before the filter, which is not what actually happens in practice (filters run during phase 1). =/

What do you think about going back to outlining filters in the frontend? I've had many weeks of coffee and whiteboard discussions trying to figure out the least disruptive way to fit filter functions into LLVM's EH scheme, and we haven't come up with anything good. We can still do backend outlining for cleanups and C++ EH. LLVM's EH model just doesn't have a place for code that gets executed during phase 1 EH unwinding.

1830

As mentioned, this is impossible, but it's a reasonable comment. :)

lib/Sema/SemaChecking.cpp
489

Right. :( This is always a syntactic nesting property, though. Instantiation cannot change it. Should I just check it during the parse (if so, where?), or just suppress this Sema check during instantiation, since it will always give the same result?

rnk updated this revision to Diff 18321.Jan 16 2015, 2:15 PM
  • straighten control flow
rnk updated this revision to Diff 18473.Jan 20 2015, 5:58 PM
  • Go back to outlined filter expressions with no captures
rjmccall added inline comments.Jan 20 2015, 6:13 PM
lib/CodeGen/CGException.cpp
1772

Okay, makes sense.

1821

I think outlining filters is an acceptable implementation model given the oddity of the control flow. Can we get away with just disallowing references to local variables in them for now, or is that an important use case?

lib/Sema/SemaChecking.cpp
489

In theory, you can construct builtin calls in template instantiations with e.g. non-type template arguments. It's probably safe to just not diagnose that well as long as you don't crash during IRGen if you miss a case.

You don't really have a choice about where you detect this; there *are* often separate Sema entrypoints for parsing vs. instantiation, but we don't want to burden every function call parse with an extra check just for this. I think Sema tracks that's currently instantiating a function body, but I don't remember where.

rnk added inline comments.Jan 21 2015, 10:37 AM
lib/CodeGen/CGException.cpp
1821

We can definitely get away with rejecting local variable capture for now. I can't find any instances of it in the code I have available.

However, eventually we will need to support it, at which point I would want to use the @llvm.frameallocate @llvm.framerecover intrinsics. Those inhibit inlining, which I think is OK for this particular corner case. We already support similarly heinous extensions like labels as values that inhibit inlining. :)

lib/Sema/SemaChecking.cpp
489

OK. We'll produce undef for these during codegen if they sneak in some how.

rnk added a comment.Jan 21 2015, 3:43 PM

I think we're in reasonable agreement about the direction here, so I'm going to go ahead and commit this to unblock Andy who is working on the EH preparation changes.

With this commit, basic __try { } __except() { } will start to compile and behave correctly.

rnk updated this revision to Diff 18566.Jan 21 2015, 4:02 PM
  • Final version of the patch
This revision was automatically updated to reflect the committed changes.