Page MenuHomePhabricator

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2
Needs RevisionPublic

Authored by tentzen on May 19 2021, 5:07 PM.

Details

Summary

This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:

  • First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary.
  • Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees).
  • Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs.

The impact to C++ code:

Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding process.

Design:

A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.

One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others.  If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.

Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad701522988e212495285dade8efac41a24d4).

Part-2 : LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

I think you're saying here that it's legal for a side exit to target a successor with any ancestor state. But also you're assigning each such successor the exact parent state of its least predecessor ("least" by comparing state numbers). How do you know that it shouldn't be the grandparent (or any other ancestor) of the least predecessor?

272

Oh, I didn't notice this when I wrote the comment above. I agree with clang-tidy on this one (for both of these functions).

llvm/lib/IR/BasicBlock.cpp
212

I didn't notice when commenting on this name earlier that this method is on the IR BasicBlock class. I don't know if there's a better term to use for this than "fault". Regardless, I think it would be good to make the comment on the declaration very explicit that this checks for loads/stores (which may dereference a null pointer) and calls/invokes (which may propagate exceptions), since I don't think we have a well-defined IR concept for this particular property of those instructions. (FWIW, The last time I went looking, the closest I could find was the logic in SimplifyCFG that decides how much code it can remove before an unconditional branch to unreachable, which was open-coded.)

216

Did you mean CallBase here? This will skip InvokeInsts (and CallBrInsts).

Thank Joseph's in-depth code review and many great suggestions. will work on it and reply back later..

tentzen updated this revision to Diff 386688.Nov 11 2021, 4:55 PM

Update per Joseph Tremoulet's feedback.

Joseph,
I greatly appreciate your thorough review and insightful feedbacks.
new revision is provided. also review it again.
thank you!

llvm/include/llvm/CodeGen/SelectionDAGISel.h
332

good point. done.

llvm/include/llvm/CodeGen/WinEHFuncInfo.h
128–132

yes makes sense. thanks.

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1305

as stated in last line comment, in that case the DIV will not be the first instruction in the EH region. we can skip Nop..

llvm/lib/CodeGen/BranchFolding.cpp
1207

Yes, the **Dtor one. it's how we found this bug. thanks.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

fixed. thanks.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1325–1327

good point. fixed in the new patch.

llvm/lib/CodeGen/WinEHPrepare.cpp
253–256

done. recursion is replaced by a worklist loop..

267–269

the new state only assigned to successor when it's smaller than successor's current state. See the first line in the while loop.
if ( ... && EHInfo.BlockToStateMap[BB] <= State) ..

272

done. recursion is replaced by a worklist loop..

llvm/lib/IR/BasicBlock.cpp
212

renamed it with getFirstMayFaultInst(). also added more comments in header. thanks!

216

good catch! done.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

Nit: variable name still indicates cleanup. I'd suggest EHPadMBB.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Yes I see that you're taking the state of the least connected predecessor. What I'm not following is how you know that the least connected predecessor is a sibling, rather than say a neice -- what if the sibling ends in unreached, for example? Something like:

B1: (state = 0
{ // new scope
B2: (state = 1)
  { // new scope
  B2: (state = 2)
     { // new scope
     B3: (state = 3)
        if (_) {
        B4:
            goto B7;
        }
        B5: (state = 3)
        __assume(0);
     } // exit scope
     B6: (state TBD)
  } // exit scope
 B7: (state TBD)
} // exit scope
B8: (state TBD)

And maybe optimization turned __assume(0) into unreachable and got rid of B6 entirely. So then B4 is the only predecessor of B7. Doesn't that mean we'll assign state 2 to B7? But B7 should have state 1. How do we know this doesn't happen?

Hi Joseph, all good points. see replies below.
thank you!

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2887–2888

good catch. will fix it. thanks.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Very good point. In this case, B7 actually is a side-entry that is led by a block of seh_scope_begin() intrinsic (see CpodeGen/CGStmt.cpp change in Part-1 patch). As such, the state will be reset via EHInfo.InvokeStateMap[] when real code blocks are proceeded.

dpaoliello added inline comments.Dec 6 2021, 3:46 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1321

Is this still required? LLVM is no longer adding 1 to the addresses in its SEH tables (although it does for its IP2State table).

tentzen added inline comments.Dec 9 2021, 6:54 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1321

Yes, this is required. -EHa is applied to C++ code too. Without this Nop, exception occurs on the instruction immediately following an EH_LABEL won't be caught due to current "+1+offset" mechanism.

This comment was removed by gmh5225.
gmh5225 added a comment.EditedMar 1 2022, 12:32 AM

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Sorry for late reply (thank Aaron Smith for bringing this up to my attention).
Looks like there is a corner bug in Part-1 Clang FE patch (https://reviews.llvm.org/D80344), not directly related to this LLVM part. I will look into it and work on a separate patch shortly.
thank you for trying this feature in advance.

Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2022, 4:30 PM
efriedma added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1316

If the issue here that the "end" label for a region can accidentally include the following instruction after the region? Would it make sense to explicitly insert a nop earlier, in that case?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

I assume this is trying to trick optimizations into avoiding optimizations on the cleanup block if the __try involves multiple basic blocks.

Have you considered trying to fix the control flow graph? I mean, I understand the difficulty of expressing the control flow at the IR level, given we don't have IR versions of "load" and "store" with an unwind edge, but those concerns don't really apply in the same way in a MachineFunction. You can mark the cleanup block as a successor of any block that can throw an exception, without explicitly marking up the control flow on each individual instruction. (This is how exceptions normally work.)

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1323

getBasicBlock() can return null, in general. Maybe since this is running during isel, that isn't an issue, though. Maybe worth adding a comment.

1341

If a load is folded into the terminator, it can fault... but I guess that's unlikely to come up in practice. The most likely scenario probably involves indirectbr. Most other indirect jumps probably aren't terminators. I guess a jump table could theoretically fault, but if you cared you could probably just disable those.

bowang added a subscriber: bowang.Apr 22 2022, 12:43 AM
tentzen added inline comments.Apr 22 2022, 1:39 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1316

I don't see the value of emitting it earlier. I think the closer to emission is safer. Or there is always a risk of being changed or reordered by other phases. Besides, there are a couple of similar cases of emitting Nop in that file (like functionPrfix). I think inserting the nop where the EH_Label is emitted is safest and most straight-forward.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

The Cleanup block is turned into a dtor funclet at the end. And the beginning address of that funclet will be recorded in EH table in .xdata section. So it's really address-taken by definition. If you verify this by checking Linker relocation records or the EH table in .asm file. IMO this flag should be set regardless of the SEH feature.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
1323

ok, will add a comment. thanks.

1341

Right, it's unlikely in practice. I don't think it's concern here.
thank you for pointing out.

efriedma added inline comments.Apr 22 2022, 12:52 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1316

Could you at least clarify the comment to clarify what aspect of the SEH region is problematic? We currently use EH_LABEL for both the beginning and end of a region, and it isn't clear which one this is meant to apply to.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

That's not what "address-taken" is supposed to mean.

It's supposed to be equivalent to BasicBlock::hasAddressTaken, which means that the block is the target of a blockaddress expression (used for indirectbr). Primarily, this restricts certain optimizations because in general, we can't split a critical edge from an indirectbr.

hasAddressTaken() is not a general "will the address of the block eventually be emitted somewhere in the assembly". That can happen for a variety of unrelated reasons, including exception handling, debug info, jump tables, and branch lowering.

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

I apologize for the ridiculous delay in responding, until just now I was under the mistaken impression that you hadn't replied.

I'm still confused by the logic here. Your comment states "Side exits can ONLY jump into parent scopes (lower state number)". My example had a jump to a grandparent, and you pointed out that its destination would be marked as a side entry. Are those the only two cases, would it be equally correct to say "Side exits can ONLY jump to either the immediate parent scope or a side-entry of another ancestor" ? Or is there a way to go straight to grandparent that doesn't get a side entry?

I think an equivalent question is: at the continue on line 300, could you assert EHInfo.BlockToStateMap[BB] == State || (block is side entry) ?

tentzen added inline comments.Apr 25 2022, 5:29 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1316

It's for both.
EH_LABEL marks the end of the previous region and the beginning of the following region. The Nop is added to prevent a potential-faulty-instruction from being included in the previous region and to ensure this potential faulty instruction still in the present region.
Yes, I will add more comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

I'm not sure about your interpretation. the comment of this flag from the header is:

/// Returns true if there are any uses of this basic block other than
/// direct branches, switches, etc. to it.
bool hasAddressTaken() const {
  return getBasicBlockBits().BlockAddressRefCount != 0;
}

it does include "switch" which I think refers to jump-table.
Besides, the dtor-funclet in EH table is invoked via indirectbr/indirect-call. Does not that also meet your interpretation?

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Sorry for the confusion. the comment "jump into parent scopes" here does not imply it's the immediate parent. it can be grandparents, etc as in your example. Will fix the comment.
Those two are actually refer to the same scenario. Jumping into a protected parant/grandparent's scope induces a warning, and under that situation the target label is marked as side-entry (JumpDiagnostics.cpp change in Part-1 patch).

efriedma added inline comments.Apr 27 2022, 3:15 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

We intentionally do not check for hasAddressTaken() here. Even if something refers to the address, it's not legal to branch to the block, so it can't contain any useful instructions. AsmPrinter::emitFunctionBody has some code to print a fake label if there's a reference to an erased block.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

"other than direct branches, switches, etc." means MachineBasicBlock terminators, exception pads, and jump tables. Anything which is *not* one of those, we mark up, because target-independent code can't analyze it. But in almost all cases, the above list covers basically everything except indirectbr until you get very late in the compile process.

If we do end up performing some transform that materializes the address of a block outside of the above cases, it could make sense to mark a block address-taken. (Some target-specific code does this. Not sure if that code is really doing it correctly, but none of those calls are relevant to this patch.) But we should not be preemptively marking basic block address-taken: we should wait until such a transform is actually performed.

For a dtor-funclet, we don't actually "take the address" of the funclet until the AsmPrinter, when we compute the exception tables. We could add an address-taken marking at that point... but nothing would check it that late in the pipeline.

Thanks for the explanations and your patience. This patch LGTM, modulo resolution of @efriedma's concerns (so I'll leave it to him to mark accepted).

llvm/lib/CodeGen/WinEHPrepare.cpp
267–269

Jumping into a protected [ancestor scope other than the immediate parent scope] ... [can't occur unless] the target label is marked as side-entry

That is the key point I was missing. Yes, I think it would be great to include that in the comment and/or as an assertion.

a protected parant/grandparent's

What happens if an unprotected block is only reachable from side-exits of protected regions? Do we end up protecting it, and is that okay? This too would be good to include in comment.

tentzen added inline comments.Apr 28 2022, 5:54 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Hmm, really? there are at least three places where hasAddressTaken() is used to guard optimizations in the same pass (BranchFolding.cpp). why is OptimizeBranches() special?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

This is exactly the case that a target-independent code (like BranchFolding.cpp) must be aware of it.
The dtor-funclet cleanup block is genuinely address-taken the moment it's created.
Yes, it's not emitted until AsmPrinter. but it does not mean the block is not address-taken until AsmPrinter.

efriedma added inline comments.Apr 28 2022, 8:33 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

The other places in the pass are checking hasAddressTaken() to try to figure out if they can rewrite the predecessors. Here, we can trivially rewrite all the predecessors: there aren't any.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

There's not really any point to arguing what the person who originally wrote the comment meant. Sorry, I shouldn't have tried to take the discussion in that direction.

But I still think this is confusing: there's multiple different use-cases smashed together into one boolean.

How about this: we split up isBlockAddressTarget(), isAsyncSEHTarget(), and maybe isMachineLabelUsed() for the target-specific stuff that messes with MachineBasicBlock labels in weird ways. And then as a transitional measure hasAddressTaken() checks all of them, maybe? Really, I expect most of the relevant places that would check hasAddressTaken(), or something like that.

Then we can go through later and refine various places to check exactly what they need. I suspect most places don't actually need all three.

tentzen added inline comments.Apr 29 2022, 1:55 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

I feel doing that is over-engineering and I don't see much value out of it.

note that in addition to BasicBlock.h, it's also commented in MachineBasicBlock.h :

/// Test whether this block is **potentially the target of an indirect branch.**
bool hasAddressTaken() const { return AddressTaken; }

Also, some present usages are far from what you described.
See the usages in RetpolineThunkInserter where CaptureSpec and CallTarget blocks are not even address-taken.

The usage of hasAddressTaken() in this revision is much more appropriate and legitimate.

efriedma added inline comments.Apr 29 2022, 1:54 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

Looked a bit more. It turns out there's an actual bug here: mixing up the different markings could confuse the AsmPrinter if we mix together the different bits. Posted D124697.

With that, the "MBBLabelUsed" is probably close enough to what you need, if we need something. I guess we don't really need to distinguish between the SEH uses and other target-specific uses.

I'm still not sure where, exactly, the MBB is being stored that necessitates this. If I look at an MIR dump, the only place the EHPad shows up is as a successor of the block that can throw an exception. And as far as I can tell, none of the side-tables in WinEHFuncInfo contain any references to MBBs.

tentzen added inline comments.Apr 29 2022, 11:06 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

The EH table is named $stateUnwindMap$, the 'state' and 'action' map table. It is stored in $cppxdata$ that is the primary EH table passed to _CxxFrameHandler3/4 by __ehhandler$**. Check the new test case windows-seh-EHa-cpp-Dtors01.ll.
LLVM code that emits it is in WinException.cpp:

// UnwindMapEntry {
//   int32_t ToState;
//   void  (*Action)();
// };
if (UnwindMapXData) {
  OS.emitLabel(UnwindMapXData);
  for (const CxxUnwindMapEntry &UME : FuncInfo.CxxUnwindMap) {
    MCSymbol *CleanupSym =
        getMCSymbolForMBB(Asm, UME.Cleanup.dyn_cast<MachineBasicBlock *>());
    AddComment("ToState");
    OS.emitInt32(UME.ToState);

    AddComment("Action");
    OS.emitValue(create32bitRef(CleanupSym), 4);
  }

note getMCSymbolForMBB() that refers to the ehcleanup block we were talking about here.

efriedma added inline comments.Apr 30 2022, 2:58 AM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I guess given that we have references to an MBB stored in FuncInfo, we can't really delete the block. So I guess this is right, sort of. But it would still be undefined behavior to jump to it, so I'm not sure this really solves whatever problem you ran into.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

I was more concerned about the other end of it... but looking into it, I guess FunctionLoweringInfo::set() is actually where we end up computing the mapping. So we need to ensure that the FuncInfo tables stay consistent through the entirety of codegen. Which seems like an extreme restriction, but I guess it's unlikely to matter that much in practice.

tentzen added inline comments.May 2 2022, 3:11 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

it's MBB.pred_empty(), so there is no edge jumping to it.
Without this change, an EH Cleanuppad block could be removed and EH runtime would not be able to dtor a live object during unwinding.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2916

the FuncInfo and its related tables is the existent framework that handles EH.
Yes, I'm sure it is reliable and consistent with IR or C++ EH today would not be working at all.
This patch is adding SEH support that is built on top of existent EH framework.

efriedma added inline comments.May 2 2022, 4:17 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

That's a contradiction: you're saying there's no edge jumping to it, but somehow the "EH runtime" is executing the code in the block.

In particular, even if you preserve the block, anything involving dominance relationships, like register allocation, is likely to destroy unreachable code. Maybe you can sort of get away with it in trivial cases.

tentzen added inline comments.May 2 2022, 8:42 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I meant no edge jumps into it through IR's control flow graph during compilation. EH runtime invokes it via an indirect branch during execution time.

It's the whole point of 'address-taken' attribute; it tells optimizations/transformations that although this block is unreached via IR control flow, its address is taken and runtime (or some other code) can invoke this block via an indirect branch.

Register-allocation, or and all other optimizations (like this BranchFolding) should check whether or not it's address-taken before optimizing it away.
If more cases are revealed, we should fix them. 'Retpoline' and other code are also relying on this flag to guard an unreachable block.

efriedma added inline comments.May 3 2022, 12:09 AM
llvm/lib/CodeGen/BranchFolding.cpp
1207

I think you're misunderstanding what I mean by "control flow graph".

Every MachineBasicBlock has a "successor" list, and a "predecessor" list. These can be accessed by MachineBasicBlock::successors() and MachineBasicBlock::predecessors(). MachineBasicBlock::pred_empty() queries the predecessor list. This isn't directly tied to any instructions in the block itself; they're just std::vector<MachineBasicBlock*>. But they reflect the possible directions for control flow. For example, suppose you have an indirectbr. This gets lowered to an indirect jump instruction; none of the destinations are directly encoded into the basic block. However, the block has a successor list; the jump must jump to one of those successors, or the behavior is undefined. And the destination has a predecessor list; the block with the indirectbr must be in that list.

This is a bit different from LLVM IR, where every control flow edge is explicitly attached to a terminator instruction.

This successor list is essential to the way MachineFunction IR works. Before register allocation, the IR is in SSA form; computing where an SSA value is live requires an accurate control flow graph.

If the EH runtime is jumping to a MachineBasicBlock with no predecessors, that means there's a missing call to MachineBasicBlock::addSuccessor() somewhere.

tentzen added inline comments.May 3 2022, 1:57 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

For Switch jump table, yes, pred. & succ. lists are well maintained because they are real control-flow.

We are talking about EH unwinding flow here. EH runtime cannot be in the list. it's why we rely on address-taken flag to convey this scenario to optimizations.

efriedma added inline comments.May 3 2022, 2:21 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

We model EH unwinding flow similarly to any other control flow. Take the following example:

echo "struct A {~A(); }; void f(); void g() { A a; f(); }" | clang++ --target=x86_64-pc-windows-msvc -x c++ - -o - -S -mllvm -stop-before=greedy

This dumps the MIR as it appears during register allocation. If you look at the output, it contains:

bb.0.entry:
  successors: %bb.1(0x40000000), %bb.2(0x40000000)

bb.1 is the normal continuation, bb.2 is the cleanuppad. bb.2 isn't directly reachable; the only way to jump to the block is through the exception unwinder. But we still model the edge.

tentzen added inline comments.May 3 2022, 2:43 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Yes, we all know that. that is the whole point of this design.
For Asynch Exception, this design only tracks EH state at scope (or block) level, not at instruction level. As such, after seh-scope-intrinsic is lowered, the design relies on address-taken flag to guard ehcleanup pads.
Please review the design on the top.

efriedma added inline comments.May 3 2022, 3:35 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

The reason we agreed not to track the control flow per-instruction at the LLVM IR level is the difficulty of adding the control flow edges: it would require a completely different form of control flow modelling to attach an exception edge to an LLVM IR load instruction. So we use an approximation: controlflow uses the edge encoded on the nearest llvm.seh.try.begin().

We don't need to use the same tradeoff here, though. Control flow doesn't work the same way on MachineBasicBlocks as it does on LLVM IR. We should decide the best way to model the control flow based on the constructs available, independent of what we decided for LLVM IR.

As far as I can tell, there isn't any reason you can't just mark the appropriate edges explicitly.

Alternatively, you could lower llvm.seh.try.begin to a pseudo-instruction, and mark the edge from that instruction to the exception-handling block, the same way we do it on LLVM IR. The control flow in that case wouldn't be precisely correct, but it would be close enough that you can make it work. I don't see any reason to prefer that approach, though.

Depending purely on the address-taken flag to preserve a block with no predecessors isn't ever going to work well; it's incompatible with how the backend reasons about control flow and liveness.

tentzen added inline comments.May 3 2022, 6:21 PM
llvm/lib/CodeGen/BranchFolding.cpp
1207

Good to know there is no disagreement about tracking at seh-scope level, not at instruction level.
Good to know you also don't think there is any reason to continue modeling EH flow at scope-level after LLVM IR Lowering.

I agree that address-taken mechanism is not perfect. but it's an existent mechanism in LLVM framework. As said earlier, it's used in many places today. IMO, it's robust and well-understood mechanism employed in several compiler code based (LLVM is 4th one I experienced). In compiler & code-gen space, when we see address-taken, we all know what the implication is.

This revision is about SEH. Any concern or opinion about an existent framework should be resolved in a separate patch.

efriedma requested changes to this revision.May 3 2022, 6:34 PM
This revision now requires changes to proceed.May 3 2022, 6:34 PM

I don't think the current patch makes sense, and I'm not sure how else to explain my concerns. Maybe a different perspective would be helpful; please start a thread on Discourse.

Hey Eli, I agree that address-taken is not perfect, but I’m not sure I have any better ideas right now. Maybe @rnk and @JosephTremoulet have thoughts?’

Code below will cause a crash.

 void
 parse()
{
    throw std::exception("throw--------");
    return;
}

 int
 main()
{
    try
    {
        parse();
    }
    catch (const std::exception &e)
    {
        printf("except=%s\n", e.what());
        // Crash when return in catch scope
        return -1;
    }

    system("pause");
    printf("end\n");
    return 0;
}

Seems that function 'EmitSehCppScopeEnd' in PopCleanupBlock can not find the insert point.
CGF.Builder.GetInsertBlock() or CGF.getInvokeDest() equals nullptr.

Sorry for late reply (thank Aaron Smith for bringing this up to my attention).
Looks like there is a corner bug in Part-1 Clang FE patch (https://reviews.llvm.org/D80344), not directly related to this LLVM part. I will look into it and work on a separate patch shortly.
thank you for trying this feature in advance.

What is the current status of this fix?

Patuti added a subscriber: Patuti.May 11 2022, 12:49 PM

What is the current status of this fix?

I'm waiting for Eli's https://reviews.llvm.org/D124697/new/ to land in first so I can submit a revision that uses new AddressTaken flag as requested.
After this patch is landed and SEH HW feature is really completed, I'll work on that bug.
Thank you for being patient.