Page MenuHomePhabricator

Outline functions for SEH with the MSVC environment
AbandonedPublic

Authored by andrew.w.kaylor on Dec 5 2014, 3:34 PM.

Details

Reviewers
rnk
Summary

This is a preliminary implementation of a pass to outline filter functions and exception handlers to support exception handling when compiling for the MSVC environment (that is, using the MSVC runtime personality functions). Because the MS personality functions are designed to call filter functions and exception handlers outside of the frame context in which the exception handler appears, the functions are extracted from their original location and inserted into an external function with the signature dictated by the personality function.

This patch represents a work in progress. It is not yet intended as a candidate to commit for the following reasons:

  • It only outlines SEH filter functions (not finally handlers and not C++ EH handlers) but the rest should be just a matter of different heuristics to find the end points of the region to be outlined.
  • It doesn’t remove the code after it has been outlined. That should be fairly simple but I didn’t want to delay discussion while implementing it.
  • It depends on IR semantics that are being implemented by other in-flight patches.
  • I haven’t implemented tests for it yet. So far I’ve just been manually verifying functionality in the debugger.

I hope that it will be a reasonable basis for discussion and that once the necessary design decisions have been made I will be able to quickly refactor it to become product-ready.

Diff Detail

Repository
rL LLVM

Event Timeline

andrew.w.kaylor retitled this revision from to Outline functions for SEH with the MSVC environment.
andrew.w.kaylor updated this object.
andrew.w.kaylor edited the test plan for this revision. (Show Details)
andrew.w.kaylor added a reviewer: rnk.

Sorry about the lack of contexts. I don't have arcanist set up yet and the version of svn I have installed doesn't seem to want to give me extra lines of context. Most of the interesting stuff is in the new file though so hopefully this will be OK.

andrew.w.kaylor set the repository for this revision to rL LLVM.Dec 5 2014, 5:14 PM
andrew.w.kaylor added a subscriber: Unknown Object (MLST).
rnk edited edge metadata.Dec 8 2014, 1:30 PM

Cool! I have some ideas for making this more robust.

include/llvm/Transforms/Utils/Cloning.h
144–146

These should probably merge into maybeEndCloning() or something like that, given that you had to thread state from shouldStopAt into createTerminateInst.

lib/CodeGen/MSVCEHPrepare.cpp
164

At the end of the day, EH lowering should not fail if the IR passes verification. I have some ideas for how cut back the fragile pattern matching, but I'm sure there are some edge cases I haven't thought of. For now, it's OK to assert on really uncommon ones.

184–191

This makes sense, but I'd just make it an assert, since we'll want that assert in the final code anyway.

194–221

This code can be made a lot more robust by reusing some of the techniques in the SjLj EH preparation code, which replaces the landingpad aggregate with something else. It optimistically iterates the use chain of the landingpad and replaces extractvalue instrs with the actual pointer and selector values. However, if it there are uses remaining, it synthesizes an aggregate and replaces all uses of the aggregate with that.

For the outliner, we don't want to replace things in place, we want to add to the value map instead. We probably want to produce something like:

Instruction *EHAggregate; // Null if there are no uses of the aggregate other than extracts.
SmallVector<Instruction*, 4> EHPtrExtracts;
SmallVector<Instruction*, 4> EHSelExtracts;

And in each outlined filter, add mappings for all of those instructions.

237

I think this can be handled more robustly by setting up the value map for the selector and these intrinsic calls to return the constant 1 for the active selector and 0 for inactive selectors. The cloner already knows how to prune conditional branches like the ones you're pattern matching away, so we'd get that more or less for free. If the pruning analysis fails, we'd get something that is conservatively correct.

In this model, we'd start the cloning just after the landing pad instruction. Filters would have to be emitted before cleanups. This creates some challenges for clang IR emission, but I think I can handle it.

305

Yeah, this analysis is going to be fun. We can set it aside for later. I think there is some interesting prior art in SjLj preparation that will help.

327

We should be able to insert multiple return instructions to handle this case.

test/CodeGen/X86/seh-outline.ll
2

I think you can run opt -codegenprepare and FileCheck the output. Grep around test/ for -codegenprep to see what the other preparation tests do.

andrew.w.kaylor added inline comments.Dec 8 2014, 3:27 PM
include/llvm/Transforms/Utils/Cloning.h
144–146

Yeah, that makes sense.

lib/CodeGen/MSVCEHPrepare.cpp
164

At this point the assertions are as much documentation of my assumptions as anything else. Basically, I put assertions in for any condition that would cause this code not to do what I meant for it to do. I think that once this code is sufficiently robust the assertions will either go away or will be things that we genuinely don't believe can happen.

237

I think I understand what you're suggesting here. I'll rework the code and update the review.

327

I was thinking that this is an indication of poorly formed IR, but maybe I'm being overly narrow in my thinking. I was viewing this as code that clang would generate to represent the code between the parens of an __except statement, which should always evaluate to a single value, but I suppose I can see an arbitrary client of llvm calling the llvm.eh.seh.filter intrinsic to terminate any number of branches. Since this code will replace each of them with a return instruction, that should be fine.

The real problem I'm trying to solve here is finding the beginning and end of the filter handler. Following your suggestion of starting at the landing pad with mapped values to reduce the following instructions to the flow for the selector I mean to be handling, what I want to be sure of is that all control paths from the landing pad instruction will lead to a call to the llvm.eh.seh.filter intrinsic (or something that terminates in the case where an exception occurs within the filter function?). But I can probably handle that better by asserting that this code doesn't see a return instruction as input.

andrew.w.kaylor edited edge metadata.
  • Re-implemented filter outlining in a more robust manner
  • Added support for cleanup outlining
  • Added code necessary to invoke the pass from opt for testing
  • Updated the test to use opt and check outlined functions

I'm not sure the cleanup handling is correct. MSVC disallows __try in a block that requires unwinding, so I expected that we'd do the same in clang. Does the MSVC SEH personality function even support exception handlers and cleanup in the same block? Also, it isn't clear to me how the cleanup handlers will be connected to the landingpad after they've been outlined.

I'm still not handling references to frame variables within the handlers, but I have an idea how I can do that with the CloningDirector class as I've revised it here.

I'm also still not removing code that has been outlined.

rnk added a comment.Dec 12 2014, 3:50 PM

Neat, so these tests pass? Seems like this approach is working! =D

The MSVCEHPrepare.cpp file has a bunch of Allman style braces, which should be fixed.

I feel like this is probably a good chunk of logic to commit, but we're a bit blocked on review. Long term, will need to do analysis to share regions of cleanup code in separate outlined functions rather than duplicating code.

lib/Transforms/Utils/CloneFunction.cpp
327

This comment seems like it got mis-edited. :) Maybe "..., we want to stop everything. The cloning director is reponsible for inserting a terminator here."

383

ditto

Yeah, the test passes (...now...there was a change missing in the last revision I uploaded). It was kind of a beautiful thing to see it come together. One minute it was producing a garbled mess full of bad references, then after one more change it all simplified to the expected results.

These latest changes include a fix for an unmapped variable problem during outlining. I also fixed the comments that were wrong and ran the new source file through clang-format.

I expect that this will need to wait until the related changes are committed and the intended form of the IR is settled, but I think they are otherwise ready to go.

rnk added a comment.Jan 13 2015, 4:58 PM

OK, so the MSVC EH type is landed now and the x64 SEH tables are available. Want to rebase and I'll take another look? I can start adding seh or C++ EH stuff to Clang to support this as well.

Rebased and updated the form of the expected input IR.

I am proposing replacing uses of the llvm.eh.typeid.for intrinsic with new intrinsics -- llvm.seh.filterid.for if the comparison is intended to lead to a filter sequence (before it is outlined) and llvm.seh.handlerid.for if the comparison is intended to lead to the actual SEH exception handler body. The main thing this accomplishes is that it gives me a way to determine whether or not filter function outlining has already been done. I think it will also help with removing the filter code from the main function body after it has been outlined. As an added benefit, we will no longer be pretending that function pointers are type IDs.

I have not implemented the intrinsics proposed above and so the code is currently looking for them by function name. If you like this suggestion, I'll submit a patch to implement these intrinsics that will go in before this one.

My seh-outline.ll test case asserts at Win64Exception.cpp:224 if you try passing it to llc (because the number of landing pad clauses doesn't match the number of typeids in the landing pad instruction). That failure goes away if I remove the "cleanup" clause from my landing pad instruction in the test case IR. It seemed to me that this indicated that SEH cleanup isn't fully implemented yet at that level.

In general, I'm not feeling terribly comfortable about the way that I'm handling cleanup in this patch. More to the point, I don't have a clear idea of what the end-to-end solution is supposed to look like for cleanup. Maybe we can discuss this more on the LLVMDev list. With regards to this patch, the biggest hole seems to be that I'm creating a cleanup function but not linking the landing pad instruction to it in any way. I don't have a proposed solution for that yet.

rnk added a comment.Jan 15 2015, 3:20 PM

Rebased and updated the form of the expected input IR.

I am proposing replacing uses of the llvm.eh.typeid.for intrinsic with new intrinsics -- llvm.seh.filterid.for if the comparison is intended to lead to a filter sequence (before it is outlined) and llvm.seh.handlerid.for if the comparison is intended to lead to the actual SEH exception handler body. The main thing this accomplishes is that it gives me a way to determine whether or not filter function outlining has already been done. I think it will also help with removing the filter code from the main function body after it has been outlined. As an added benefit, we will no longer be pretending that function pointers are type IDs.

David suggested simply renaming this intrinsic to llvm.eh.selector.for(i8*) to generalize it over SEH. Continuing to use typeid is a bit crazy. Do you think it's better to generalize the typeid intrinsic or add a new special purpose intrinsic? We'd have to teach GVN about it.

I have not implemented the intrinsics proposed above and so the code is currently looking for them by function name. If you like this suggestion, I'll submit a patch to implement these intrinsics that will go in before this one.

My seh-outline.ll test case asserts at Win64Exception.cpp:224 if you try passing it to llc (because the number of landing pad clauses doesn't match the number of typeids in the landing pad instruction). That failure goes away if I remove the "cleanup" clause from my landing pad instruction in the test case IR. It seemed to me that this indicated that SEH cleanup isn't fully implemented yet at that level.

In general, I'm not feeling terribly comfortable about the way that I'm handling cleanup in this patch. More to the point, I don't have a clear idea of what the end-to-end solution is supposed to look like for cleanup. Maybe we can discuss this more on the LLVMDev list. With regards to this patch, the biggest hole seems to be that I'm creating a cleanup function but not linking the landing pad instruction to it in any way. I don't have a proposed solution for that yet.

Yeah, I'm still not sure how to represent prepared cleanups in IR. Maybe just @llvm.seh.cleanup(void ()* @mycleanup)?

David suggested simply renaming this intrinsic to llvm.eh.selector.for(i8*) to generalize it over SEH. Continuing to use typeid is a bit crazy. Do you think it's better to generalize the typeid intrinsic or add a new special purpose intrinsic? We'd have to teach GVN about it.

I like the idea of a generalized selector intrinsics, but I have two problems that I'm not sure it solves.

  1. I need a way to determine whether or not the outlining has already been done. I suppose I could use the presence of calls to "llvm.eh.seh.filter" (or whatever else ends up in that place) to do that, but that's a new intrinsic too. There's also still the problem of outlining the clean handlers (and recognizing when that has been done) and native Windows support for C++ EH.
  1. I was kind of hoping to use a dead code elimination trick similar to what you suggested for the outlined functions to get rid of the filter code after it has been outlined. I could possibly still do that if the filter dispatch and handler dispatch were calling the same intrinsic. I haven't tried that out yet.

Having said that, both of these problems can probably be solved without having two different intrinsics that essentially mean the same thing just in different places. So let me see if I can work this out with the general selector intrinsic.

I'll also think about the big picture a bit more and try to look ahead to how Windows support for C++ EH will fit into the plan.

Yeah, I'm still not sure how to represent prepared cleanups in IR. Maybe just @llvm.seh.cleanup(void ()* @mycleanup)?

How would that be attached to the landing pad instruction? Specifically, how would we find it when generating the .xdata table?

Updated the patch to expect a single llvm.eh.selector.for intrinsic.
Removed cleanup outlining because that seems not to be settled yet.
Added support for pruning filter code that has been outlined.
Updated test to verify that outlined code has been removed from its original location.

andrew.w.kaylor abandoned this revision.Feb 10 2015, 3:27 PM

These changes are no longer needed for SEH, since we are now outlining filter functions in the front end.

The bulk of these changes were reworked for use in D7363.