This is an archive of the discontinued LLVM Phabricator instance.

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2
ClosedPublic

Authored by tentzen on May 19 2021, 5:07 PM.

Details

Summary

This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:

  • First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary.
  • Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees).
  • Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs.

The impact to C++ code:

Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding process.

Design:

A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.

One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others.  If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.

Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad701522988e212495285dade8efac41a24d4).

Part-2 : LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
gAlfonso-bit added inline comments.
llvm/lib/Analysis/EHPersonalities.cpp
82–83 ↗(On Diff #347591)

I would check for EHA first before checking if the personality is Asynchronous

tentzen added inline comments.Aug 15 2021, 12:37 AM
llvm/lib/Analysis/EHPersonalities.cpp
82–83 ↗(On Diff #347591)

OK, make sense. will do.
thank you for the feedback!

aganea added a subscriber: aganea.Sep 20 2021, 10:47 AM
majnemer added inline comments.Sep 21 2021, 10:18 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1305

nit: bcause -> because

Thank David.
Does anyone have any more feedbacks?
We would like to land this patch in shortly.
thanks,

Looks like these patches (yes, the previous one too) are failing to capture the audience;
at least from the phabricator, it doesn't look like there has been an/any substantial review. :/

The previous did get lots of substantial and quality reviews (especially those from rjmccall).
Roman, you asked about LLVM design & implementation when reviewing part-1. Here you go; Do you have any feedback?
thanks,

llvm/include/llvm/CodeGen/SelectionDAGISel.h
332

In the previous checkin, you went through and replaced the term EHa with asyncEH or similar in all identifiers and comments, didn't you? If so, I think it would be good to do the same with this one.

llvm/include/llvm/CodeGen/WinEHFuncInfo.h
128–132

Here you've got the signature and the name both indicating that this works at a block scope, and a comment indicating that it is for EHa. I wonder if it wouldn't be better to have the signature indicate block scope and use the name to indicate EHa, something like calculateCXXStateForAsyncEH.

llvm/include/llvm/IR/BasicBlock.h
198

To me, the word "faulty" implies that a fault definitely exists. I think that getFirstMayFaultInst, though perhaps more awkward, would be more clear.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1305

what if the divisior is not a constant but evaluates to 0 at run-time? I assume that's handled differently, but without context I wouldn't know where, so maybe mentioning in the comment would be good.

llvm/lib/CodeGen/BranchFolding.cpp
1207

do the tests (either the new ones you're adding or existing ones) exercise the case where we need this?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

How do we know that this is a CleanupPad and not a Catchswitch?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1325–1327

How do we know this doesn't wind back out of the block we care about? Is it worth adding an assertion?

llvm/lib/CodeGen/WinEHPrepare.cpp
253–256

I'm surprised this doesn't cause a problem for super-deep CFGs (pathological cases); consider using a worklist.

267–269

I think you're saying here that it's legal for a side exit to target a successor with any ancestor state. But also you're assigning each such successor the exact parent state of its least predecessor ("least" by comparing state numbers). How do you know that it shouldn't be the grandparent (or any other ancestor) of the least predecessor?

269

nit: should be "trumps others" or "triumphs over others"

272

Oh, I didn't notice this when I wrote the comment above. I agree with clang-tidy on this one (for both of these functions).

llvm/lib/IR/BasicBlock.cpp
212

I didn't notice when commenting on this name earlier that this method is on the IR BasicBlock class. I don't know if there's a better term to use for this than "fault". Regardless, I think it would be good to make the comment on the declaration very explicit that this checks for loads/stores (which may dereference a null pointer) and calls/invokes (which may propagate exceptions), since I don't think we have a well-defined IR concept for this particular property of those instructions. (FWIW, The last time I went looking, the closest I could find was the logic in SimplifyCFG that decides how much code it can remove before an unconditional branch to unreachable, which was open-coded.)

216

Did you mean CallBase here? This will skip InvokeInsts (and CallBrInsts).

Thank Joseph's in-depth code review and many great suggestions. will work on it and reply back later..

tentzen updated this revision to Diff 386688.Nov 11 2021, 4:55 PM

Update per Joseph Tremoulet's feedback.

Joseph,
I greatly appreciate your thorough review and insightful feedbacks.
new revision is provided. also review it again.
thank you!

llvm/include/llvm/CodeGen/SelectionDAGISel.h
332

good point. done.

llvm/include/llvm/CodeGen/WinEHFuncInfo.h
128–132

yes makes sense. thanks.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1305

as stated in last line comment, in that case the DIV will not be the first instruction in the EH region. we can skip Nop..

llvm/lib/CodeGen/BranchFolding.cpp
1207

Yes, the **Dtor one. it's how we found this bug. thanks.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

fixed. thanks.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1325–1327

good point. fixed in the new patch.

llvm/lib/CodeGen/WinEHPrepare.cpp
253–256

done. recursion is replaced by a worklist loop..

267–269

the new state only assigned to successor when it's smaller than successor's current state. See the first line in the while loop.
if ( ... && EHInfo.BlockToStateMap[BB] <= State) ..

272

done. recursion is replaced by a worklist loop..

llvm/lib/IR/BasicBlock.cpp
212

renamed it with getFirstMayFaultInst(). also added more comments in header. thanks!

216

good catch! done.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

Nit: variable name still indicates cleanup. I'd suggest EHPadMBB.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Yes I see that you're taking the state of the least connected predecessor. What I'm not following is how you know that the least connected predecessor is a sibling, rather than say a neice -- what if the sibling ends in unreached, for example? Something like:

B1: (state = 0
{ // new scope
B2: (state = 1)
  { // new scope
  B2: (state = 2)
     { // new scope
     B3: (state = 3)
        if (_) {
        B4:
            goto B7;
        }
        B5: (state = 3)
        __assume(0);
     } // exit scope
     B6: (state TBD)
  } // exit scope
 B7: (state TBD)
} // exit scope
B8: (state TBD)

And maybe optimization turned __assume(0) into unreachable and got rid of B6 entirely. So then B4 is the only predecessor of B7. Doesn't that mean we'll assign state 2 to B7? But B7 should have state 1. How do we know this doesn't happen?

Hi Joseph, all good points. see replies below.
thank you!

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

good catch. will fix it. thanks.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Very good point. In this case, B7 actually is a side-entry that is led by a block of seh_scope_begin() intrinsic (see CpodeGen/CGStmt.cpp change in Part-1 patch). As such, the state will be reset via EHInfo.InvokeStateMap[] when real code blocks are proceeded.

dpaoliello added inline comments.Dec 6 2021, 3:46 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1686

Is this still required? LLVM is no longer adding 1 to the addresses in its SEH tables (although it does for its IP2State table).

tentzen added inline comments.Dec 9 2021, 6:54 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1686

Yes, this is required. -EHa is applied to C++ code too. Without this Nop, exception occurs on the instruction immediately following an EH_LABEL won't be caught due to current "+1+offset" mechanism.

This comment was removed by gmh5225.
gmh5225 added a comment.EditedMar 1 2022, 12:32 AM

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Sorry for late reply (thank Aaron Smith for bringing this up to my attention).
Looks like there is a corner bug in Part-1 Clang FE patch (https://reviews.llvm.org/D80344), not directly related to this LLVM part. I will look into it and work on a separate patch shortly.
thank you for trying this feature in advance.

Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2022, 4:30 PM
efriedma added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1681

If the issue here that the "end" label for a region can accidentally include the following instruction after the region? Would it make sense to explicitly insert a nop earlier, in that case?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

I assume this is trying to trick optimizations into avoiding optimizations on the cleanup block if the __try involves multiple basic blocks.

Have you considered trying to fix the control flow graph? I mean, I understand the difficulty of expressing the control flow at the IR level, given we don't have IR versions of "load" and "store" with an unwind edge, but those concerns don't really apply in the same way in a MachineFunction. You can mark the cleanup block as a successor of any block that can throw an exception, without explicitly marking up the control flow on each individual instruction. (This is how exceptions normally work.)

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1304

getBasicBlock() can return null, in general. Maybe since this is running during isel, that isn't an issue, though. Maybe worth adding a comment.

1322

If a load is folded into the terminator, it can fault... but I guess that's unlikely to come up in practice. The most likely scenario probably involves indirectbr. Most other indirect jumps probably aren't terminators. I guess a jump table could theoretically fault, but if you cared you could probably just disable those.

bowang added a subscriber: bowang.Apr 22 2022, 12:43 AM
tentzen added inline comments.Apr 22 2022, 1:39 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1681

I don't see the value of emitting it earlier. I think the closer to emission is safer. Or there is always a risk of being changed or reordered by other phases. Besides, there are a couple of similar cases of emitting Nop in that file (like functionPrfix). I think inserting the nop where the EH_Label is emitted is safest and most straight-forward.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

The Cleanup block is turned into a dtor funclet at the end. And the beginning address of that funclet will be recorded in EH table in .xdata section. So it's really address-taken by definition. If you verify this by checking Linker relocation records or the EH table in .asm file. IMO this flag should be set regardless of the SEH feature.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1304

ok, will add a comment. thanks.

1322

Right, it's unlikely in practice. I don't think it's concern here.
thank you for pointing out.

efriedma added inline comments.Apr 22 2022, 12:52 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1681

Could you at least clarify the comment to clarify what aspect of the SEH region is problematic? We currently use EH_LABEL for both the beginning and end of a region, and it isn't clear which one this is meant to apply to.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

That's not what "address-taken" is supposed to mean.

It's supposed to be equivalent to BasicBlock::hasAddressTaken, which means that the block is the target of a blockaddress expression (used for indirectbr). Primarily, this restricts certain optimizations because in general, we can't split a critical edge from an indirectbr.

hasAddressTaken() is not a general "will the address of the block eventually be emitted somewhere in the assembly". That can happen for a variety of unrelated reasons, including exception handling, debug info, jump tables, and branch lowering.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

I apologize for the ridiculous delay in responding, until just now I was under the mistaken impression that you hadn't replied.

I'm still confused by the logic here. Your comment states "Side exits can ONLY jump into parent scopes (lower state number)". My example had a jump to a grandparent, and you pointed out that its destination would be marked as a side entry. Are those the only two cases, would it be equally correct to say "Side exits can ONLY jump to either the immediate parent scope or a side-entry of another ancestor" ? Or is there a way to go straight to grandparent that doesn't get a side entry?

I think an equivalent question is: at the continue on line 300, could you assert EHInfo.BlockToStateMap[BB] == State || (block is side entry) ?

tentzen added inline comments.Apr 25 2022, 5:29 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1681

It's for both.
EH_LABEL marks the end of the previous region and the beginning of the following region. The Nop is added to prevent a potential-faulty-instruction from being included in the previous region and to ensure this potential faulty instruction still in the present region.
Yes, I will add more comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

I'm not sure about your interpretation. the comment of this flag from the header is:

/// Returns true if there are any uses of this basic block other than
/// direct branches, switches, etc. to it.
bool hasAddressTaken() const {
  return getBasicBlockBits().BlockAddressRefCount != 0;
}

it does include "switch" which I think refers to jump-table.
Besides, the dtor-funclet in EH table is invoked via indirectbr/indirect-call. Does not that also meet your interpretation?

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Sorry for the confusion. the comment "jump into parent scopes" here does not imply it's the immediate parent. it can be grandparents, etc as in your example. Will fix the comment.
Those two are actually refer to the same scenario. Jumping into a protected parant/grandparent's scope induces a warning, and under that situation the target label is marked as side-entry (JumpDiagnostics.cpp change in Part-1 patch).

efriedma added inline comments.Apr 27 2022, 3:15 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

We intentionally do not check for hasAddressTaken() here. Even if something refers to the address, it's not legal to branch to the block, so it can't contain any useful instructions. AsmPrinter::emitFunctionBody has some code to print a fake label if there's a reference to an erased block.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

"other than direct branches, switches, etc." means MachineBasicBlock terminators, exception pads, and jump tables. Anything which is *not* one of those, we mark up, because target-independent code can't analyze it. But in almost all cases, the above list covers basically everything except indirectbr until you get very late in the compile process.

If we do end up performing some transform that materializes the address of a block outside of the above cases, it could make sense to mark a block address-taken. (Some target-specific code does this. Not sure if that code is really doing it correctly, but none of those calls are relevant to this patch.) But we should not be preemptively marking basic block address-taken: we should wait until such a transform is actually performed.

For a dtor-funclet, we don't actually "take the address" of the funclet until the AsmPrinter, when we compute the exception tables. We could add an address-taken marking at that point... but nothing would check it that late in the pipeline.

Thanks for the explanations and your patience. This patch LGTM, modulo resolution of @efriedma's concerns (so I'll leave it to him to mark accepted).

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Jumping into a protected [ancestor scope other than the immediate parent scope] ... [can't occur unless] the target label is marked as side-entry

That is the key point I was missing. Yes, I think it would be great to include that in the comment and/or as an assertion.

a protected parant/grandparent's

What happens if an unprotected block is only reachable from side-exits of protected regions? Do we end up protecting it, and is that okay? This too would be good to include in comment.

tentzen added inline comments.Apr 28 2022, 5:54 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Hmm, really? there are at least three places where hasAddressTaken() is used to guard optimizations in the same pass (BranchFolding.cpp). why is OptimizeBranches() special?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

This is exactly the case that a target-independent code (like BranchFolding.cpp) must be aware of it.
The dtor-funclet cleanup block is genuinely address-taken the moment it's created.
Yes, it's not emitted until AsmPrinter. but it does not mean the block is not address-taken until AsmPrinter.

efriedma added inline comments.Apr 28 2022, 8:33 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

The other places in the pass are checking hasAddressTaken() to try to figure out if they can rewrite the predecessors. Here, we can trivially rewrite all the predecessors: there aren't any.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

There's not really any point to arguing what the person who originally wrote the comment meant. Sorry, I shouldn't have tried to take the discussion in that direction.

But I still think this is confusing: there's multiple different use-cases smashed together into one boolean.

How about this: we split up isBlockAddressTarget(), isAsyncSEHTarget(), and maybe isMachineLabelUsed() for the target-specific stuff that messes with MachineBasicBlock labels in weird ways. And then as a transitional measure hasAddressTaken() checks all of them, maybe? Really, I expect most of the relevant places that would check hasAddressTaken(), or something like that.

Then we can go through later and refine various places to check exactly what they need. I suspect most places don't actually need all three.

tentzen added inline comments.Apr 29 2022, 1:55 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

I feel doing that is over-engineering and I don't see much value out of it.

note that in addition to BasicBlock.h, it's also commented in MachineBasicBlock.h :

/// Test whether this block is **potentially the target of an indirect branch.**
bool hasAddressTaken() const { return AddressTaken; }

Also, some present usages are far from what you described.
See the usages in RetpolineThunkInserter where CaptureSpec and CallTarget blocks are not even address-taken.

The usage of hasAddressTaken() in this revision is much more appropriate and legitimate.

efriedma added inline comments.Apr 29 2022, 1:54 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

Looked a bit more. It turns out there's an actual bug here: mixing up the different markings could confuse the AsmPrinter if we mix together the different bits. Posted D124697.

With that, the "MBBLabelUsed" is probably close enough to what you need, if we need something. I guess we don't really need to distinguish between the SEH uses and other target-specific uses.

I'm still not sure where, exactly, the MBB is being stored that necessitates this. If I look at an MIR dump, the only place the EHPad shows up is as a successor of the block that can throw an exception. And as far as I can tell, none of the side-tables in WinEHFuncInfo contain any references to MBBs.

tentzen added inline comments.Apr 29 2022, 11:06 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

The EH table is named $stateUnwindMap$, the 'state' and 'action' map table. It is stored in $cppxdata$ that is the primary EH table passed to _CxxFrameHandler3/4 by __ehhandler$**. Check the new test case windows-seh-EHa-cpp-Dtors01.ll.
LLVM code that emits it is in WinException.cpp:

// UnwindMapEntry {
//   int32_t ToState;
//   void  (*Action)();
// };
if (UnwindMapXData) {
  OS.emitLabel(UnwindMapXData);
  for (const CxxUnwindMapEntry &UME : FuncInfo.CxxUnwindMap) {
    MCSymbol *CleanupSym =
        getMCSymbolForMBB(Asm, UME.Cleanup.dyn_cast<MachineBasicBlock *>());
    AddComment("ToState");
    OS.emitInt32(UME.ToState);

    AddComment("Action");
    OS.emitValue(create32bitRef(CleanupSym), 4);
  }

note getMCSymbolForMBB() that refers to the ehcleanup block we were talking about here.

efriedma added inline comments.Apr 30 2022, 2:58 AM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I guess given that we have references to an MBB stored in FuncInfo, we can't really delete the block. So I guess this is right, sort of. But it would still be undefined behavior to jump to it, so I'm not sure this really solves whatever problem you ran into.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

I was more concerned about the other end of it... but looking into it, I guess FunctionLoweringInfo::set() is actually where we end up computing the mapping. So we need to ensure that the FuncInfo tables stay consistent through the entirety of codegen. Which seems like an extreme restriction, but I guess it's unlikely to matter that much in practice.

tentzen added inline comments.May 2 2022, 3:11 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

it's MBB.pred_empty(), so there is no edge jumping to it.
Without this change, an EH Cleanuppad block could be removed and EH runtime would not be able to dtor a live object during unwinding.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3000

the FuncInfo and its related tables is the existent framework that handles EH.
Yes, I'm sure it is reliable and consistent with IR or C++ EH today would not be working at all.
This patch is adding SEH support that is built on top of existent EH framework.

efriedma added inline comments.May 2 2022, 4:17 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

That's a contradiction: you're saying there's no edge jumping to it, but somehow the "EH runtime" is executing the code in the block.

In particular, even if you preserve the block, anything involving dominance relationships, like register allocation, is likely to destroy unreachable code. Maybe you can sort of get away with it in trivial cases.

tentzen added inline comments.May 2 2022, 8:42 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I meant no edge jumps into it through IR's control flow graph during compilation. EH runtime invokes it via an indirect branch during execution time.

It's the whole point of 'address-taken' attribute; it tells optimizations/transformations that although this block is unreached via IR control flow, its address is taken and runtime (or some other code) can invoke this block via an indirect branch.

Register-allocation, or and all other optimizations (like this BranchFolding) should check whether or not it's address-taken before optimizing it away.
If more cases are revealed, we should fix them. 'Retpoline' and other code are also relying on this flag to guard an unreachable block.

efriedma added inline comments.May 3 2022, 12:09 AM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I think you're misunderstanding what I mean by "control flow graph".

Every MachineBasicBlock has a "successor" list, and a "predecessor" list. These can be accessed by MachineBasicBlock::successors() and MachineBasicBlock::predecessors(). MachineBasicBlock::pred_empty() queries the predecessor list. This isn't directly tied to any instructions in the block itself; they're just std::vector<MachineBasicBlock*>. But they reflect the possible directions for control flow. For example, suppose you have an indirectbr. This gets lowered to an indirect jump instruction; none of the destinations are directly encoded into the basic block. However, the block has a successor list; the jump must jump to one of those successors, or the behavior is undefined. And the destination has a predecessor list; the block with the indirectbr must be in that list.

This is a bit different from LLVM IR, where every control flow edge is explicitly attached to a terminator instruction.

This successor list is essential to the way MachineFunction IR works. Before register allocation, the IR is in SSA form; computing where an SSA value is live requires an accurate control flow graph.

If the EH runtime is jumping to a MachineBasicBlock with no predecessors, that means there's a missing call to MachineBasicBlock::addSuccessor() somewhere.

tentzen added inline comments.May 3 2022, 1:57 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

For Switch jump table, yes, pred. & succ. lists are well maintained because they are real control-flow.

We are talking about EH unwinding flow here. EH runtime cannot be in the list. it's why we rely on address-taken flag to convey this scenario to optimizations.

efriedma added inline comments.May 3 2022, 2:21 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

We model EH unwinding flow similarly to any other control flow. Take the following example:

echo "struct A {~A(); }; void f(); void g() { A a; f(); }" | clang++ --target=x86_64-pc-windows-msvc -x c++ - -o - -S -mllvm -stop-before=greedy

This dumps the MIR as it appears during register allocation. If you look at the output, it contains:

bb.0.entry:
  successors: %bb.1(0x40000000), %bb.2(0x40000000)

bb.1 is the normal continuation, bb.2 is the cleanuppad. bb.2 isn't directly reachable; the only way to jump to the block is through the exception unwinder. But we still model the edge.

tentzen added inline comments.May 3 2022, 2:43 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Yes, we all know that. that is the whole point of this design.
For Asynch Exception, this design only tracks EH state at scope (or block) level, not at instruction level. As such, after seh-scope-intrinsic is lowered, the design relies on address-taken flag to guard ehcleanup pads.
Please review the design on the top.

efriedma added inline comments.May 3 2022, 3:35 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

The reason we agreed not to track the control flow per-instruction at the LLVM IR level is the difficulty of adding the control flow edges: it would require a completely different form of control flow modelling to attach an exception edge to an LLVM IR load instruction. So we use an approximation: controlflow uses the edge encoded on the nearest llvm.seh.try.begin().

We don't need to use the same tradeoff here, though. Control flow doesn't work the same way on MachineBasicBlocks as it does on LLVM IR. We should decide the best way to model the control flow based on the constructs available, independent of what we decided for LLVM IR.

As far as I can tell, there isn't any reason you can't just mark the appropriate edges explicitly.

Alternatively, you could lower llvm.seh.try.begin to a pseudo-instruction, and mark the edge from that instruction to the exception-handling block, the same way we do it on LLVM IR. The control flow in that case wouldn't be precisely correct, but it would be close enough that you can make it work. I don't see any reason to prefer that approach, though.

Depending purely on the address-taken flag to preserve a block with no predecessors isn't ever going to work well; it's incompatible with how the backend reasons about control flow and liveness.

tentzen added inline comments.May 3 2022, 6:21 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Good to know there is no disagreement about tracking at seh-scope level, not at instruction level.
Good to know you also don't think there is any reason to continue modeling EH flow at scope-level after LLVM IR Lowering.

I agree that address-taken mechanism is not perfect. but it's an existent mechanism in LLVM framework. As said earlier, it's used in many places today. IMO, it's robust and well-understood mechanism employed in several compiler code based (LLVM is 4th one I experienced). In compiler & code-gen space, when we see address-taken, we all know what the implication is.

This revision is about SEH. Any concern or opinion about an existent framework should be resolved in a separate patch.

efriedma requested changes to this revision.May 3 2022, 6:34 PM
This revision now requires changes to proceed.May 3 2022, 6:34 PM

I don't think the current patch makes sense, and I'm not sure how else to explain my concerns. Maybe a different perspective would be helpful; please start a thread on Discourse.

Hey Eli, I agree that address-taken is not perfect, but I’m not sure I have any better ideas right now. Maybe @rnk and @JosephTremoulet have thoughts?’

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Sorry for late reply (thank Aaron Smith for bringing this up to my attention).
Looks like there is a corner bug in Part-1 Clang FE patch (https://reviews.llvm.org/D80344), not directly related to this LLVM part. I will look into it and work on a separate patch shortly.
thank you for trying this feature in advance.

What is the current status of this fix?

Patuti added a subscriber: Patuti.May 11 2022, 12:49 PM

What is the current status of this fix?

I'm waiting for Eli's https://reviews.llvm.org/D124697/new/ to land in first so I can submit a revision that uses new AddressTaken flag as requested.
After this patch is landed and SEH HW feature is really completed, I'll work on that bug.
Thank you for being patient.

jyu2 added a subscriber: jyu2.Oct 17 2022, 2:52 PM
jyu2 added a comment.Oct 18 2022, 8:26 AM

Hi @tentzen,
Any updates on this patch? Would like to see it moving forward.
We have customer trying to use clang on window which depend on asynchronous exception.
Thanks!
Jennifer

tentzen updated this revision to Diff 469062.Oct 19 2022, 3:27 PM

Per Eli Friedman's feedback, replace MBB->hasAddressTaken() with new MBB->isMachineBlockAddressTaken()

jyu2 added a comment.Oct 24 2022, 3:39 PM

Hi,
Any update? I did run time test for a simple test, it works for me.

Thanks.
Jennifer

Reverse ping @tentzen. Is there any block issues for it moving on? I don't see serious from a quick review, just some nits.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1677

duplicated.

1685

Check for call instructions?

llvm/lib/CodeGen/AsmPrinter/WinException.cpp
766

Check it in test case?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2999–3000

The comment format seems uncommon, reformat it?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1302

It's more preferred to use .. E = MF->end(); MBBI != E; ..

1306

Any reason this has to be done in SelectionDAGISel. If we do it after ISel, we can check mayRaiseFPException for FP instructions too.

1315–1316

Is there any chance a terminator will raise exception?
If we don't emit begin/end labels, why adding them to EHInfo by line 1297?

1322–1324

You can use getFirstTerminator.

llvm/lib/CodeGen/WinEHPrepare.cpp
232

Didn't see its free. Would there be memory leak?

254–256

There are many function calls, would be better to simplify to

if (Fn && Fn->isIntrinsic()) {
  Intrinsic::ID IID = Fn->getIntrinsicID();
  if (IID == ...)
270

ditto.

llvm/lib/IR/BasicBlock.cpp
212

We'd better use machine instruction infos.

llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll
291

This is useless and misleading. Ditto for other tests.

Reverse ping @tentzen. Is there any block issues for it moving on? I don't see serious from a quick review, just some nits.

All issues had been addressed. Just need an approval.

LGTM with @pengfei 's comments addressed.

tentzen marked 11 inline comments as done.Nov 7 2022, 4:29 PM
tentzen added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1677

good eyes. thanks.

1685

SEH HW exception will be caught in callee. No need to inject a nop before the call

llvm/lib/CodeGen/AsmPrinter/WinException.cpp
766

yes, without this change, tests will fail

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1302

ok.

1306

It's convenient and a natural place to do this table. It's already at the end of DAGISel.
If we are still missing some features, I prefer a follow-up patch. This patch has been tested and proved robust.

1315–1316

No. terminator won't raise exception.
Good catch, will move them to the top of that block to avoid unused IPToState entry.

1322–1324

I think I did it once. the reason getFirstTerminator() not used is that there could be real instructions between 1st and 2nd terminator. it's being a while, don't remember exactly. I think it's better not changing it now. this is minor anyways.

llvm/lib/CodeGen/WinEHPrepare.cpp
254–256

it does not really matter as duplicated Fn->getIntrinsicID() will be CSEed away

270

ditto.

llvm/lib/IR/BasicBlock.cpp
212

this is the style copied from neighborhood code. do you see any problem or potential bug?

llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll
291

this is testing SEH related table and flag is put properly.
which part is misleading?

tentzen updated this revision to Diff 473832.Nov 7 2022, 5:00 PM
tentzen marked 10 inline comments as done.

update per review feedbacks

pengfei added inline comments.Nov 7 2022, 7:17 PM
llvm/lib/CodeGen/WinEHPrepare.cpp
270

Still not see where it been freed. Should we push the object instead?

llvm/lib/IR/BasicBlock.cpp
212

I'm concerned about the FP exceptions as explained above. Especially FP instructions are more target specific, e.g., given an integet to FP conversion, CVTDQ2PD doesn't raise any FP exception but CVTDQ2PS does.
But I'm fine to improve it in a follow up.

llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll
291

I mean the !2 = !{!"clang version 11.0.0 ...

tentzen marked 3 inline comments as done.Nov 8 2022, 4:01 PM
tentzen added inline comments.
llvm/lib/CodeGen/WinEHPrepare.cpp
270

Oh, my overlooked. done.

llvm/lib/IR/BasicBlock.cpp
212

ok, thanks.

llvm/test/CodeGen/X86/windows-seh-EHa-CppCatchDotDotDot.ll
291

I see. done.

tentzen updated this revision to Diff 474105.Nov 8 2022, 4:04 PM
tentzen marked 3 inline comments as done.

update per Pengfei's review feedbacks.

pengfei accepted this revision.Nov 8 2022, 4:29 PM

Thanks @tentzen ! This patch LGTM. Please wait a few days for other reviewers.

Yes, it's true. this patch is targeting X86_64 only, not including X86 yet.
X86 EH mechanism is different, it's dynamic State tracking, not static PC-table based as X64, AArch64, etc.

Yes, it's true. this patch is targeting X86_64 only, not including X86 yet.
X86 EH mechanism is different, it's dynamic State tracking, not static PC-table based as X64, AArch64, etc.

Got it, thank you!

Ping @tentzen @efriedma. LLVM 16.0 will be branched this month. Should we consider to land this patch to the release?

lebedev.ri resigned from this revision.Jan 12 2023, 5:33 PM

Was this change not committed? Why is this review still open?

efriedma accepted this revision.Jan 12 2023, 6:38 PM

Not merged yet.

This revision is now accepted and ready to land.Jan 12 2023, 6:38 PM
victork1996 added a comment.EditedJan 25 2023, 5:38 AM

Hi @tentzen :)
I compiled clang's main branch after applying these changes, and I think I found a bug.
I tried to compile the following code with the "-fasync-exceptions" flag and the compiler crashed :(
can you take a look?

void foo()
{
    try
    {}
    catch (...)
    {
        // The 'return' statement causes a crash, it compiles just fine without the 'return'.
        return;
    }
}

thanks :)

nickdesaulniers added inline comments.
llvm/lib/CodeGen/BranchFolding.cpp
1207

If the EH runtime is jumping to a MachineBasicBlock with no predecessors, that means there's a missing call to MachineBasicBlock::addSuccessor() somewhere.

I agree with @efriedma here. Even for indirectbr, we still correctly model successor and predecessor lists. Having those be imprecise or omit edges (whether direct or indirect) is asking for trouble.

Any updates on when this change can land? Really look forward to LLVM's support for SEH. My project got stuck with MVSC due to this missing capability.

FYI, I just got in touch with @tentzen. He is not working with this now, but he said the revert is just a mistake.
I suggest we reland it as is if no strong objections from other reviewers.
We have customers that looking forward to it as well. I'd like to work with community for the followups.

This revision was landed with ongoing or failed builds.Mar 28 2023, 6:01 PM
This revision was automatically updated to reflect the committed changes.

Hi @tentzen :)
I compiled clang's main branch after applying these changes, and I think I found a bug.
I tried to compile the following code with the "-fasync-exceptions" flag and the compiler crashed :(
can you take a look?

void foo()
{
    try
    {}
    catch (...)
    {
        // The 'return' statement causes a crash, it compiles just fine without the 'return'.
        return;
    }
}

thanks :)

I can fix the crash by D147165, but I don't have confidence whether the codegen is expected or not. Please take a look, thanks!

Hello,

I'm trying to move our code base from using ancient LLVM 11 to main branch LLVM. I experienced clang crashes with below messages while using an assert enabled LLVM build. The problem goes away if I remove /EHa from our cmake.

A single unwind edge may only enter one EH pad
  invoke void @llvm.seh.scope.end()
          to label %"??1?$_Compressed_pair@V?$allocator@D@std@@V?$_String_val@U?$_Simple_types@D@std@@@2@$00@std@@QEAA@XZ.exit6.i" unwind label %ehcleanup.i5.i, !dbg !4121
in function ??1CrashdumpCollecter@debugging@skyfall@@QEAA@XZ
fatal error: error in backend: Broken function found, compilation aborted!

The crash happened in

#6 0x00007ff76757bf80 `anonymous namespace'::VerifierLegacyPass::runOnFunction C:\dev\llvm-project\llvm\lib\IR\Verifier.cpp:6532:0

Hello,

I'm trying to move our code base from using ancient LLVM 11 to main branch LLVM. I experienced clang crashes with below messages while using an assert enabled LLVM build. The problem goes away if I remove /EHa from our cmake.

A single unwind edge may only enter one EH pad
  invoke void @llvm.seh.scope.end()
          to label %"??1?$_Compressed_pair@V?$allocator@D@std@@V?$_String_val@U?$_Simple_types@D@std@@@2@$00@std@@QEAA@XZ.exit6.i" unwind label %ehcleanup.i5.i, !dbg !4121
in function ??1CrashdumpCollecter@debugging@skyfall@@QEAA@XZ
fatal error: error in backend: Broken function found, compilation aborted!

The crash happened in

#6 0x00007ff76757bf80 `anonymous namespace'::VerifierLegacyPass::runOnFunction C:\dev\llvm-project\llvm\lib\IR\Verifier.cpp:6532:0

I observed same failure in our internal testing. I'm looking into it. In the meanwhile, I think you can remove /EHa as a workaround. The option was a dumb without this patch.