This is an archive of the discontinued LLVM Phabricator instance.

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1
ClosedPublic

Authored by asmith on May 20 2020, 5:01 PM.

Details

Summary

This patch is the Part-1 (FE Clang) implementation of HW Exception handling.

This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. The plan will add AArch64 & ARM and X86 targets later.

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -feh-asynch, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE: : Without the -EHa or -feh-asynch, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules. First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary. Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs.

The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(…). As such, when that happens in the middle of an object’s life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph preciously. However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option.

One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed.
The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks.
Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest State triumphs others.
If some exits flow to unreachable, propagation on those paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed:

note: jump bypasses variable with a non-trivial destructor

In this case, this region is actually a MEME (multiple entry multiple exits). Our solution is to inject a eha_scope_begin() invoke in the side entry block to ensure a correct State.

Implementation:
Part-1: Clang implementation described below.

  • Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). _scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining.
  • Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary.
  • All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's acceptable as the amount of code directly under _try is very small.

Part-2: LLVM implementation described below.

  • For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also rely on the existing State tracking code (UnwindMap and InvokeStateMap).
  • For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done.
  • If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address.
  • The handler for catch(...) for -EHa must handle HW exception. So it is 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions).
  • Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
asmith added inline comments.May 22 2020, 7:41 AM
clang/lib/CodeGen/CGException.cpp
603

nit - extra space after //

1677

nit -- comment about closing brace

clang/lib/CodeGen/CodeGenModule.cpp
600

nit - braces around single statement

This should likely be at least 3 patches: llvm middle-end, llvm codegen, clang.
Langref changes missing for new intrinsics.
Please post all patches with full context (-U99999)

tentzen marked 3 inline comments as done.May 22 2020, 3:25 PM

This should likely be at least 3 patches: llvm middle-end, llvm codegen, clang.
Langref changes missing for new intrinsics.
Please post all patches with full context (-U99999)

OK, makes sense. I will separate it to at least two patches, one Clang and one LLVM CodeGen as there is not much change in middle-end & Opt.
thank you for pointing out missing the change in Langref.rst. I was thinking to update docs/ExceptionHandling.rst after this patch is accepted. Do you think this types of intrinsic should be described in Langref?
thanks,

tentzen updated this revision to Diff 265812.May 22 2020, 3:47 PM
tentzen retitled this revision from [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC option -EHa) to [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1.
tentzen edited the summary of this revision. (Show Details)
tentzen added a reviewer: lebedev.ri.

Per Roman Lebedev's feedback, divide the patch into Clang and LLVM.

lebedev.ri resigned from this revision.May 22 2020, 4:12 PM

This should likely be at least 3 patches: llvm middle-end, llvm codegen, clang.
Langref changes missing for new intrinsics.
Please post all patches with full context (-U99999)

I was thinking to update docs/ExceptionHandling.rst after this patch is accepted. Do you think this types of intrinsic should be described in Langref?
thanks,

It may be helpful (even for the reviewers) to first specify their behavior,
instead of writing that after-the-fact "backwardly" based on the implementation.

It may be helpful (even for the reviewers) to first specify their behavior,
instead of writing that after-the-fact "backwardly" based on the implementation.

For reviewers, the purpose of those intrinsic are described in Summary section:

  • Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). _scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining.
  • Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary.
tentzen updated this revision to Diff 265819.May 22 2020, 5:06 PM
tentzen removed a reviewer: lebedev.ri.

fixed tidy warnings

It may be helpful (even for the reviewers) to first specify their behavior,
instead of writing that after-the-fact "backwardly" based on the implementation.

For reviewers, the purpose of those intrinsic are described in Summary section:

Like the disscussion we just had in HWLoops patch, unless the behavior of new intrinsics/etc is stated in langref, they are unspecified.

  • Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). _scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining.
  • Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary.
tentzen updated this revision to Diff 266374.May 26 2020, 4:45 PM

update LangRef.rst for new intrinsics

Is there any feedback? thanks,

majnemer added inline comments.Jun 3 2020, 6:18 PM
clang/include/clang/AST/Stmt.h
1838

I think this should be isSideEntry to be consistent.

clang/lib/CodeGen/CGException.cpp
628–634

I think this logic should move into MicrosoftCXXABI::getCatchAllTypeInfo.

tentzen marked 3 inline comments as done.Jun 3 2020, 11:44 PM

thank you David. will update and submit a new patch shortly.

clang/lib/CodeGen/CGException.cpp
628–634

good point. thanks

tentzen updated this revision to Diff 268375.Jun 4 2020, 12:36 AM
tentzen marked an inline comment as done.
tentzen edited the summary of this revision. (Show Details)

update a couple of changes per David Majnemer's suggestion.

does it look good to you? or any more comment?
thanks

asmith accepted this revision.Sep 3 2020, 11:03 AM
This revision is now accepted and ready to land.Sep 3 2020, 11:03 AM
lebedev.ri requested changes to this revision.Sep 3 2020, 11:35 AM

I feel like this may be trying to do too many things at once.
Was there an RFC?
There are several patches here:

  1. langref
  2. llvm side of the patch (maybe should be in the previous patch)
  3. SideEntry changes - missing astdump changes, serialization/deserialization
  4. the rest of the owl
clang/lib/CodeGen/CGCleanup.cpp
785

Here and elsewhere, the comments start from Capital Letter

This revision now requires changes to proceed.Sep 3 2020, 11:35 AM

Yes there was an RFC and discussion and several requests for comments.
http://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html

Yes there was an RFC and discussion and several requests for comments.
http://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html

It is probably good to state so in the patch's description.
But as far as i can tell the disscussion died down without an explicit acceptance?

I've made some brief comments about the code, but I think I have much larger concerns here.

The first is whether LLVM really intends to satisfy the constraints necessary to make these exceptions work, which I don't think you've gotten clear consensus about at all. Unfortunately, llvm-dev is not an effective place to gather this kind of consensus. For a change of this magnitude, I think you need to be proactively gathering a group of interested parties to sort out whether this is something we should support, not just expecting people to reply to an RFC.

The second is that I think the concept of block-level tracking state / control flow is potentially really problematic for LLVM, and the entire design here seems contingent on it. Again, this is something we need to ensure consensus on, not something we can really just sign off on in code review.

The third is that I'm not really convinced that the way we handle cleanups in Clang today is sufficiently "atomic" in design to accommodate this. I need to think about it.

clang/include/clang/Basic/LangOptions.def
135

This is usually shortened as "async", but I'm not sure why it's being shortened at all. Also, hardware faults are synchronous. Also, it applies in all language modes.

clang/lib/CodeGen/CGCleanup.cpp
781

Readers of this code shouldn't have to remember what cl.exe command-line flags mean. Please call this something like RequiresSEHScopeEnd.

791

Please put braces around the outer if, and please be consistent about braces for the inner clauses.

1319

I think you should take the callee by value. And please rename this to not use "EHa".

clang/lib/CodeGen/CGDecl.cpp
2008 ↗(On Diff #268375)

This seems like a pretty random place to do this. Presumably this need to be tied to basically any scope with EH protection?

Hi, John, thank you for reviewing this patch and providing feedback.
regarding your comments:

1, In the RFC thread early, Reid K (the major contributor of Windows SEH/EH support) had agreed that "the value proposition is clear and large".
2, could you elaborate why do you think "block-level tracking state / control flow is potentially really problematic for LLVM.."? Could you be more specific or provide some examples? Remainder that this design also relies on existent EH framework to compute block state, and it has passed MSVC SEH suite that contains many complicated SEH cases.
3, I don't understand your concern "the way we handle cleanups in Clang today is sufficiently "atomic" in design to accommodate this". From Cleanup perspective, there is no difference between HW exception and non-HW exceptions. The key of handling HW exception is to catch exceptions on those faulty / memory instructions. Once it's caught the cleanup process is identical to non-HW exception. again, could you elaborate further?

Finally, regarding naming/option in the code, -EHa or "Asynchronous" is the terminology used in MSVC in Windows. Programmers who use/care this feature should know it well as it's the feature to match MSVC in Windows.
Thanks,

Hi, John, thank you for reviewing this patch and providing feedback.
regarding your comments:

1, In the RFC thread early, Reid K (the major contributor of Windows SEH/EH support) had agreed that "the value proposition is clear and large".

And I agree with him that the potential benefits are substantial. I'm just saying that one barely-trafficked RFC thread is not evidence of a community consensus.

2, could you elaborate why do you think "block-level tracking state / control flow is potentially really problematic for LLVM.."? Could you be more specific or provide some examples?

As I understand it, you are creating implicit edges within the function to the active SEH handler (potentially part of the same function) from arbitrary places within covered basic blocks, turning basic blocks from a single-entry single-(internal-)exit representation to a single-entry multiple-exit representation. That is a huge change. How do you expect LLVM transformations to preserve or merge this information across blocks? How do you expect transformations to not move code between blocks in problematic ways, or reorder code within blocks in ways that will suddenly be visible to the optimizer? These are the standard objections to supporting features like this, that they have basically unspecified behavior and appear to work by good intentions and happy thoughts.

Remainder that this design also relies on existent EH framework to compute block state, and it has passed MSVC SEH suite that contains many complicated SEH cases.

Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

3, I don't understand your concern "the way we handle cleanups in Clang today is sufficiently "atomic" in design to accommodate this". From Cleanup perspective, there is no difference between HW exception and non-HW exceptions. The key of handling HW exception is to catch exceptions on those faulty / memory instructions. Once it's caught the cleanup process is identical to non-HW exception. again, could you elaborate further?

There is a major difference between HW exceptions and non-HW exceptions, which is that non-HW exceptions are observed only at call sites. It is quite easy to write code in IRGen that works because the code emitted at a certain point isn't totally arbitrary.

Finally, regarding naming/option in the code, -EHa or "Asynchronous" is the terminology used in MSVC in Windows. Programmers who use/care this feature should know it well as it's the feature to match MSVC in Windows.

Yes, but it's not known well to all the compiler developers who are reading and maintaining the Clang codebase. I'm not saying you should rename the command-line flag, I'm asking that you not just say "EHa" in comments and identifiers.

Thank you for prompt reply again.

[rjmccall] And I agree with him that the potential benefits are substantial. I'm just saying that one barely-trafficked RFC thread is not evidence of a community consensus.

OK, thanks. it's good to know you are also supportive in adding this feature.

[rjmccall] As I understand it, you are creating implicit edges within the function to the active SEH handler (potentially part of the same function) from arbitrary places within covered basic blocks, turning basic blocks from a single-entry single-(internal-)exit representation to a single-entry multiple-exit representation. That is a huge change. How do you expect LLVM transformations to preserve or merge this information across blocks? How do you expect transformations to not move code between blocks in problematic ways, or reorder code within blocks in ways that will suddenly be visible to the optimizer? These are the standard objections to supporting features like this, that they have basically unspecified behavior and appear to work by good intentions and happy thoughts.

There is absolutely NO explicit edge added in this design. You misunderstood the term of SEME here. It refers to a SEH scope-region of many blocks that can have multi-blocks jumping out of this region, not multi-branches jumping out of a block. Since blocks in LLVM are already 'completely' connected by Control Flow (no lexical concept at all before layout), there is nothing we need to do to build this SEME region. For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region. As described earlier, it's a little sub-optimal, but given that the amount of code directly under a SEH scope is very small, the impact to overall code quality is minor. Volatile-ness is robust as it's one fundamental attribute that is expected to be honored by entire framework.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

[rjmccall] There is a major difference between HW exceptions and non-HW exceptions, which is that non-HW exceptions are observed only at call sites. It is quite easy to write code in IRGen that works because the code emitted at a certain point isn't totally arbitrary

still not completely sure what you are referring to. let me guess. Are you concerned that the Cleanup mechanism of a SEH region may not be setup properly if there is no call instruction in the region? This is actually a good question. The answer is that the presence of seh.try.begin() intrinsic will ensure that a Cleanup stack is established properly because there is an EH edge built for seh.try.begin() automatically. hope this answer your concern here.

[rjmccall] Yes, but it's not known well to all the compiler developers who are reading and maintaining the Clang codebase. I'm not saying you should rename the command-line flag, I'm asking that you not just say "EHa" in comments and identifiers.

Ok, no problem. more comments will be added.

Thank you for prompt reply again.

[rjmccall] And I agree with him that the potential benefits are substantial. I'm just saying that one barely-trafficked RFC thread is not evidence of a community consensus.

OK, thanks. it's good to know you are also supportive in adding this feature.

[rjmccall] As I understand it, you are creating implicit edges within the function to the active SEH handler (potentially part of the same function) from arbitrary places within covered basic blocks, turning basic blocks from a single-entry single-(internal-)exit representation to a single-entry multiple-exit representation. That is a huge change. How do you expect LLVM transformations to preserve or merge this information across blocks? How do you expect transformations to not move code between blocks in problematic ways, or reorder code within blocks in ways that will suddenly be visible to the optimizer? These are the standard objections to supporting features like this, that they have basically unspecified behavior and appear to work by good intentions and happy thoughts.

There is absolutely NO explicit edge added in this design.

I didn't say there was. I said that there was an *implicit* edge: from the memory access to the handler. It's precisely because that edge is implicit that it's problematic for analysis and optimization.

For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region.

When is this attribute applied? Because you can't just apply it at a single point; you also need to mark operations when they're copied/inlined into the region. We once had a similar attempt at handling exceptions with a setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, something frontend-centric) that had persistent problems because of poor modeling in the IR, which it feels like this is doomed to repeat.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously as changes go into LLVM? Because this sort of thing is extremely easy to break.

[rjmccall] There is a major difference between HW exceptions and non-HW exceptions, which is that non-HW exceptions are observed only at call sites. It is quite easy to write code in IRGen that works because the code emitted at a certain point isn't totally arbitrary

still not completely sure what you are referring to. let me guess. Are you concerned that the Cleanup mechanism of a SEH region may not be setup properly if there is no call instruction in the region? This is actually a good question. The answer is that the presence of seh.try.begin() intrinsic will ensure that a Cleanup stack is established properly because there is an EH edge built for seh.try.begin() automatically. hope this answer your concern here.

Does this happen whenever there are changes to the active cleanups, and not just when entering the scope? Since you're not modeling the control edge from the access to the handler, IRGen isn't naturally going to create a cleanup just because there's a memory access in a scope, and even if it did, the cleanup would appear to be unreachable.

That's really a major part of my concern, that the test suite might be passing because of some common structure to the tests which minimizes the impact of the poor modeling in IR. IR modeling inadequacies are not going to cause obvious problems if in practice the SEH scope is always entered in a function that doesn't do much inside the scope besides call a function, and then the trap happens with the callee. In fact, I would be a lot more confident in this feature if you actually forced the IR to model it that way, so that there's nothing but a by-fiat unoptimizable invoke within the scope. You could extract the function in the frontend and then, if necessary, inline it in a very late pass after any IR-level optimizations, although that would leave me still concerned about LLVM CodeGen's transformations.

But even doing that wouldn't help with code like the following, where the only operation within the scope of s2 is a memory access:

void test() {
  std::string s1 = <blah>;
  int *ip = get_invalid_ptr();
  std::string s2 = <blah>;
  *ip = 0;
}

[rjmccall] Yes, but it's not known well to all the compiler developers who are reading and maintaining the Clang codebase. I'm not saying you should rename the command-line flag, I'm asking that you not just say "EHa" in comments and identifiers.

Ok, no problem. more comments will be added.

Comments and identifiers. So, for example, the intrinsic should not be called eha_scope_begin.

Thank you for prompt reply again.

[rjmccall] And I agree with him that the potential benefits are substantial. I'm just saying that one barely-trafficked RFC thread is not evidence of a community consensus.

OK, thanks. it's good to know you are also supportive in adding this feature.

[rjmccall] As I understand it, you are creating implicit edges within the function to the active SEH handler (potentially part of the same function) from arbitrary places within covered basic blocks, turning basic blocks from a single-entry single-(internal-)exit representation to a single-entry multiple-exit representation. That is a huge change. How do you expect LLVM transformations to preserve or merge this information across blocks? How do you expect transformations to not move code between blocks in problematic ways, or reorder code within blocks in ways that will suddenly be visible to the optimizer? These are the standard objections to supporting features like this, that they have basically unspecified behavior and appear to work by good intentions and happy thoughts.

There is absolutely NO explicit edge added in this design.

I didn't say there was. I said that there was an *implicit* edge: from the memory access to the handler. It's precisely because that edge is implicit that it's problematic for analysis and optimization.

Oh, sorry I misread it.
For HW exceptions the 'implicit' edge is unavoidable unless an explicit approach (like previous iload/istore) is employed. iload approach was extensively discussed and evaluated when it's proposed. As far as I know, the concerns were that it could create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.
So this design applies an IMPLICT approach. I respectfully disagree that it's problematic because as long as volatile attribute is honored it's robust. please see the C & C++ SEH rule stated in patch description section.

For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region.

When is this attribute applied? Because you can't just apply it at a single point; you also need to mark operations when they're copied/inlined into the region. We once had a similar attempt at handling exceptions with a setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, something frontend-centric) that had persistent problems because of poor modeling in the IR, which it feels like this is doomed to repeat.

This is applied only once in FE. Consider it's like from the source code. So NO there is no other place we need to change. Per SEH semantic, the inline code is immune from volatile constraint.
I don't think this is the same as SJLJ story. I did once look into SJLJ problem when I found every single test in MSVC Setjmp suite was broken. it's because the implicit back edge from longjmp (or a callee with longjump) was not modeled properly. actually I have a simple idea that can fix this SJLJ problem robustly, but I've been clogged in this SEH task.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously as changes go into LLVM? Because this sort of thing is extremely easy to break.

Yes, as long as it's a bug/flaw in the design/implementation. New opt/analysis/tools that violates original basic volatile/cleanup/EH-framework exposed by SEH test cases is not included. Isn't this the policy of Clang/LLVM community?

[rjmccall] There is a major difference between HW exceptions and non-HW exceptions, which is that non-HW exceptions are observed only at call sites. It is quite easy to write code in IRGen that works because the code emitted at a certain point isn't totally arbitrary

still not completely sure what you are referring to. let me guess. Are you concerned that the Cleanup mechanism of a SEH region may not be setup properly if there is no call instruction in the region? This is actually a good question. The answer is that the presence of seh.try.begin() intrinsic will ensure that a Cleanup stack is established properly because there is an EH edge built for seh.try.begin() automatically. hope this answer your concern here.

Does this happen whenever there are changes to the active cleanups, and not just when entering the scope? Since you're not modeling the control edge from the access to the handler, IRGen isn't naturally going to create a cleanup just because there's a memory access in a scope, and even if it did, the cleanup would appear to be unreachable.

That's really a major part of my concern, that the test suite might be passing because of some common structure to the tests which minimizes the impact of the poor modeling in IR. IR modeling inadequacies are not going to cause obvious problems if in practice the SEH scope is always entered in a function that doesn't do much inside the scope besides call a function, and then the trap happens with the callee. In fact, I would be a lot more confident in this feature if you actually forced the IR to model it that way, so that there's nothing but a by-fiat unoptimizable invoke within the scope.

Yes, this is exactly what we are doing. There is a seh.try.begin() injected for every level of SEH scope. In addition to marking the beginning of SEH SEME region, it also guarantees a cleanup is setup. This seh.try.begin() intrinsic must be turned into an invoke (not a call) because SEH state is not zero.

You could extract the function in the frontend and then, if necessary, inline it in a very late pass after any IR-level optimizations, although that would leave me still concerned about LLVM CodeGen's transformations.

But even doing that wouldn't help with code like the following, where the only operation within the scope of s2 is a memory access:

void test() {
  std::string s1 = <blah>;
  int *ip = get_invalid_ptr();
  std::string s2 = <blah>;
  *ip = 0;
}

For C++ case under -EHa, seh.try.begin(), an EhaScopeBegin() is injected at the beginning of the scope if Dtor() cleanup is required.

[rjmccall] Yes, but it's not known well to all the compiler developers who are reading and maintaining the Clang codebase. I'm not saying you should rename the command-line flag, I'm asking that you not just say "EHa" in comments and identifiers.

Ok, no problem. more comments will be added.

Comments and identifiers. So, for example, the intrinsic should not be called eha_scope_begin.

There is absolutely NO explicit edge added in this design.

I didn't say there was. I said that there was an *implicit* edge: from the memory access to the handler. It's precisely because that edge is implicit that it's problematic for analysis and optimization.

Oh, sorry I misread it.
For HW exceptions the 'implicit' edge is unavoidable unless an explicit approach (like previous iload/istore) is employed. iload approach was extensively discussed and evaluated when it's proposed. As far as I know, the concerns were that it could create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.

Hmm. I suppose it's not that different from the general problem with setjmp/longjmp. I think you'll still have representation problems here, but I'll address them below; I concede that in principle marking regions with an intrinsic that instructions can't necessary be moved over is workable, and so you don't need to turn every faulting instruction into an invoke. You may need to mark the initial seh.try.begin with something like returns_twice or otherwise communicate that there's non-obvious control flow within that function.

So this design applies an IMPLICT approach. I respectfully disagree that it's problematic because as long as volatile attribute is honored it's robust. please see the C & C++ SEH rule stated in patch description section.

You also need to make sure that potentially-faulting instructions aren't moved *into* the _try region. I don't think that just making accesses with the _try volatile achieves that. Probably the easiest way to do this would be to outline the _try region through most of the LLVM pipeline and only inline it very late.

For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region.

When is this attribute applied? Because you can't just apply it at a single point; you also need to mark operations when they're copied/inlined into the region. We once had a similar attempt at handling exceptions with a setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, something frontend-centric) that had persistent problems because of poor modeling in the IR, which it feels like this is doomed to repeat.

This is applied only once in FE. Consider it's like from the source code. So NO there is no other place we need to change. Per SEH semantic, the inline code is immune from volatile constraint.

The SEH semantics say that faulting instructions not lexically within the _try can be re-ordered with each other, but it's not clear that it's okay to re-order them with instructions within the _try. If I had to guess, I would say that the expectation is that inlining is disabled within a _try scope.

I don't think this is the same as SJLJ story. I did once look into SJLJ problem when I found every single test in MSVC Setjmp suite was broken. it's because the implicit back edge from longjmp (or a callee with longjump) was not modeled properly. actually I have a simple idea that can fix this SJLJ problem robustly, but I've been clogged in this SEH task.

The issue I'm describing was somewhat different from the general problems of using setjmp. The C setjmp/longjmp library functions don't guarantee that stores to non-volatile objects will be visible after the longjmp; that is, they require user cooperation in order to get consistency. This creates a problem when trying to use setjmp/longjmp directly in IR to implement language exceptions: while you can relatively easily mark all the accesses within a lexical try as volatile in the frontend, non-volatile accesses can be introduced into the function (e.g. by inlining) that won't be volatile, and then they potentially get lost if an exception is thrown. A simple example of this is something like:

void assign_and_throw(int *ip) {
  *ip = 5;
  throw some_exception();
}

void test() {
  int i;
  try {
    i = 0;
    assign_and_throw(&i);
  } catch (...) {
    printf("%d\n", i);
  }
}

The frontend marks the assignment to i in test as volatile, so the compiler must emit it as a real store, and so it will be observed in the catch handler after the call to longjmp done within the throw operation. And if no inlining is done, the compiler has no choice but to emit the assignment to *ip in assign_and_throw as a real store as well. But if the compiler inlines the call to assign_and_throw, that store to *ip becomes a non-volatile store directly to i, and since there's no direct control-flow path from there to a location that can read it, that store appears to be dead and can be discarded. That is allowed under the rules for setjmp — if the user wants to ensure that this store is visible, they would have to declare i as a volatile int and ip as a volatile int * — but of course not under the rules for throw/catch. This is why frontends cannot naively lower things like EH to setjmp without any influence on the inliner.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously as changes go into LLVM? Because this sort of thing is extremely easy to break.

Yes, as long as it's a bug/flaw in the design/implementation. New opt/analysis/tools that violates original basic volatile/cleanup/EH-framework exposed by SEH test cases is not included. Isn't this the policy of Clang/LLVM community?

I think you're missing what I'm asking. If LLVM accepts this feature, it will become our collective responsibility as a project to keep it working. You have a large external correctness test suite for this feature. It does not sound like you intend to open-source that test suite; instead, you intend to extract a small number of unit tests from it and add those to the LLVM test suite. So I'm asking if you're at least going to have an external CI bot which will run the full test suite for this feature to ensure it hasn't been broken by the last day of commits. It does not seem reasonable to expect that the few unit tests you extract will themselves be sufficient to keep the feature tested.

For C++ case under -EHa, seh.try.begin(), an EhaScopeBegin() is injected at the beginning of the scope if Dtor() cleanup is required.

Okay, I think I understand how this is supposed to work, thanks.

@rjmccall wrote:
I think you're missing what I'm asking. If LLVM accepts this feature, it will become our collective responsibility as a project to keep it working. You have a large external correctness test suite for this feature. It does not sound like you intend to open-source that test suite; instead, you intend to extract a small number of unit tests from it and add those to the LLVM test suite. So I'm asking if you're at least going to have an external CI bot which will run the full test suite for this feature to ensure it hasn't been broken by the last day of commits. It does not seem reasonable to expect that the few unit tests you extract will themselves be sufficient to keep the feature tested.

Microsoft's SEH tests are open source and there was a suggestion in one of the threads on llvm-dev to run them on the buildbots. We can look into that as a follow on to this patch.

https://lists.llvm.org/pipermail/llvm-dev/2019-November/136695.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/140614.html

@rjmccall wrote:
I think you're missing what I'm asking. If LLVM accepts this feature, it will become our collective responsibility as a project to keep it working. You have a large external correctness test suite for this feature. It does not sound like you intend to open-source that test suite; instead, you intend to extract a small number of unit tests from it and add those to the LLVM test suite. So I'm asking if you're at least going to have an external CI bot which will run the full test suite for this feature to ensure it hasn't been broken by the last day of commits. It does not seem reasonable to expect that the few unit tests you extract will themselves be sufficient to keep the feature tested.

Microsoft's SEH tests are open source and there was a suggestion in one of the threads on llvm-dev to run them on the buildbots. We can look into that as a follow on to this patch.

https://lists.llvm.org/pipermail/llvm-dev/2019-November/136695.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/140614.html

That's great. I think that needs to be taken as more than just a suggestion, though; it's critical part of the testing story for this proposed feature.

There is absolutely NO explicit edge added in this design.

I didn't say there was. I said that there was an *implicit* edge: from the memory access to the handler. It's precisely because that edge is implicit that it's problematic for analysis and optimization.

Oh, sorry I misread it.
For HW exceptions the 'implicit' edge is unavoidable unless an explicit approach (like previous iload/istore) is employed. iload approach was extensively discussed and evaluated when it's proposed. As far as I know, the concerns were that it could create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.

Hmm. I suppose it's not that different from the general problem with setjmp/longjmp. I think you'll still have representation problems here, but I'll address them below; I concede that in principle marking regions with an intrinsic that instructions can't necessary be moved over is workable, and so you don't need to turn every faulting instruction into an invoke. You may need to mark the initial seh.try.begin with something like returns_twice or otherwise communicate that there's non-obvious control flow within that function.

OK, good to know this. thank you!
Other than blocking some opts, does this returns_twice attribute have some other implications?

So this design applies an IMPLICT approach. I respectfully disagree that it's problematic because as long as volatile attribute is honored it's robust. please see the C & C++ SEH rule stated in patch description section.

You also need to make sure that potentially-faulting instructions aren't moved *into* the _try region. I don't think that just making accesses with the _try volatile achieves that. Probably the easiest way to do this would be to outline the _try region through most of the LLVM pipeline and only inline it very late.

The intrinsic is set with unknown memory access. Plus returns_twice you just suggested, is not it sufficient to block potentially-faulting instructions from being moved across?

For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region.

When is this attribute applied? Because you can't just apply it at a single point; you also need to mark operations when they're copied/inlined into the region. We once had a similar attempt at handling exceptions with a setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, something frontend-centric) that had persistent problems because of poor modeling in the IR, which it feels like this is doomed to repeat.

This is applied only once in FE. Consider it's like from the source code. So NO there is no other place we need to change. Per SEH semantic, the inline code is immune from volatile constraint.

The SEH semantics say that faulting instructions not lexically within the _try can be re-ordered with each other, but it's not clear that it's okay to re-order them with instructions within the _try. If I had to guess, I would say that the expectation is that inlining is disabled within a _try scope.

Reordering is NOT okay for instructions 'directly' within a _try. 'directly' here means it's the code from source code originally, i.e., inlined code or the code in callees is not included. Disabling inline in general is good, but not necessary.

I don't think this is the same as SJLJ story. I did once look into SJLJ problem when I found every single test in MSVC Setjmp suite was broken. it's because the implicit back edge from longjmp (or a callee with longjump) was not modeled properly. actually I have a simple idea that can fix this SJLJ problem robustly, but I've been clogged in this SEH task.

The issue I'm describing was somewhat different from the general problems of using setjmp. The C setjmp/longjmp library functions don't guarantee that stores to non-volatile objects will be visible after the longjmp; that is, they require user cooperation in order to get consistency. This creates a problem when trying to use setjmp/longjmp directly in IR to implement language exceptions: while you can relatively easily mark all the accesses within a lexical try as volatile in the frontend, non-volatile accesses can be introduced into the function (e.g. by inlining) that won't be volatile, and then they potentially get lost if an exception is thrown. A simple example of this is something like:

void assign_and_throw(int *ip) {
  *ip = 5;
  throw some_exception();
}

void test() {
  int i;
  try {
    i = 0;
    assign_and_throw(&i);
  } catch (...) {
    printf("%d\n", i);
  }
}

The frontend marks the assignment to i in test as volatile, so the compiler must emit it as a real store, and so it will be observed in the catch handler after the call to longjmp done within the throw operation. And if no inlining is done, the compiler has no choice but to emit the assignment to *ip in assign_and_throw as a real store as well. But if the compiler inlines the call to assign_and_throw, that store to *ip becomes a non-volatile store directly to i, and since there's no direct control-flow path from there to a location that can read it, that store appears to be dead and can be discarded. That is allowed under the rules for setjmp — if the user wants to ensure that this store is visible, they would have to declare i as a volatile int and ip as a volatile int * — but of course not under the rules for throw/catch.

As said above, SEH semantic only applies to code "directly" under SEH scope for few obvious reasons (Ex, indirect callees or callees not in the same module wont be visible). if faulty instructions are allowed to be reordered inside a callee, those instructions are allowed to do so even after they are inlined into _try.
Volatile-ness here is primary for reordering constraint, not for data-flow. Per LLVM EH framework, a data-defined in _try will flow through explicit EH edge into the Handler. In this SEH design, there will be a seh.try.end() at every exit of SEH region that ensures data defined in _try scope flow to the use in Handler.

This is why frontends cannot naively lower things like EH to setjmp without any influence on the inliner.

why do you want to do that? LLVM EH offers a robust data-flow representation while SJLJ data-flow modeling is incomplete.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously as changes go into LLVM? Because this sort of thing is extremely easy to break.

Yes, as long as it's a bug/flaw in the design/implementation. New opt/analysis/tools that violates original basic volatile/cleanup/EH-framework exposed by SEH test cases is not included. Isn't this the policy of Clang/LLVM community?

I think you're missing what I'm asking. If LLVM accepts this feature, it will become our collective responsibility as a project to keep it working. You have a large external correctness test suite for this feature. It does not sound like you intend to open-source that test suite; instead, you intend to extract a small number of unit tests from it and add those to the LLVM test suite. So I'm asking if you're at least going to have an external CI bot which will run the full test suite for this feature to ensure it hasn't been broken by the last day of commits. It does not seem reasonable to expect that the few unit tests you extract will themselves be sufficient to keep the feature tested.

As Aaron replied, yes MS SEH (and EH) test suite are (or will be) open sourced.
BTW I have to point out that to pass the entire suite we need 2nd missing feature (Local-Unwind) in place too. As a heads up, a proposed Local-Unwind design and implementation can be seen here (https://github.com/tentzen/llvm-project ). We will bring it back to RFC once -EHa is accepted and landed.

For C++ case under -EHa, seh.try.begin(), an EhaScopeBegin() is injected at the beginning of the scope if Dtor() cleanup is required.

Okay, I think I understand how this is supposed to work, thanks.

There is absolutely NO explicit edge added in this design.

I didn't say there was. I said that there was an *implicit* edge: from the memory access to the handler. It's precisely because that edge is implicit that it's problematic for analysis and optimization.

Oh, sorry I misread it.
For HW exceptions the 'implicit' edge is unavoidable unless an explicit approach (like previous iload/istore) is employed. iload approach was extensively discussed and evaluated when it's proposed. As far as I know, the concerns were that it could create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.

Hmm. I suppose it's not that different from the general problem with setjmp/longjmp. I think you'll still have representation problems here, but I'll address them below; I concede that in principle marking regions with an intrinsic that instructions can't necessary be moved over is workable, and so you don't need to turn every faulting instruction into an invoke. You may need to mark the initial seh.try.begin with something like returns_twice or otherwise communicate that there's non-obvious control flow within that function.

OK, good to know this. thank you!
Other than blocking some opts, does this returns_twice attribute have some other implications?

So this design applies an IMPLICT approach. I respectfully disagree that it's problematic because as long as volatile attribute is honored it's robust. please see the C & C++ SEH rule stated in patch description section.

You also need to make sure that potentially-faulting instructions aren't moved *into* the _try region. I don't think that just making accesses with the _try volatile achieves that. Probably the easiest way to do this would be to outline the _try region through most of the LLVM pipeline and only inline it very late.

The intrinsic is set with unknown memory access. Plus returns_twice you just suggested, is not it sufficient to block potentially-faulting instructions from being moved across?

I've changed my mind; you've mostly got edges in the place you need, and you don't need returns_twice. But see the comment below about unreachable ends of scopes; I don't know how to handle that well.

For LLVM transformations/optimizations, the only constrain added is volatile attribute on memory operations within a SEH region.

When is this attribute applied? Because you can't just apply it at a single point; you also need to mark operations when they're copied/inlined into the region. We once had a similar attempt at handling exceptions with a setjmp-based ABI (not the "SJLJ" exceptions supported by the backends, something frontend-centric) that had persistent problems because of poor modeling in the IR, which it feels like this is doomed to repeat.

This is applied only once in FE. Consider it's like from the source code. So NO there is no other place we need to change. Per SEH semantic, the inline code is immune from volatile constraint.

The SEH semantics say that faulting instructions not lexically within the _try can be re-ordered with each other, but it's not clear that it's okay to re-order them with instructions within the _try. If I had to guess, I would say that the expectation is that inlining is disabled within a _try scope.

Reordering is NOT okay for instructions 'directly' within a _try. 'directly' here means it's the code from source code originally, i.e., inlined code or the code in callees is not included. Disabling inline in general is good, but not necessary.
As said above, SEH semantic only applies to code "directly" under SEH scope for few obvious reasons (Ex, indirect callees or callees not in the same module wont be visible). if faulty instructions are allowed to be reordered inside a callee, those instructions are allowed to do so even after they are inlined into _try.

I feel almost certain that this is wrong. Let me try to explain. It would be helpful to have a more precise statement of semantics than I can find online, but we can start from the basics.

"X and Y can be reordered" is how optimizer folks talk, but it's not generally how semantics are specified. As a general rule (ignoring sequence points and concurrency for simplicity), C and C++ require operations to be executed in a particular order on the abstract machine, and there are no rules explicitly sanctioning reordering. Instead, the "as if" rule gives the implementation broad leeway to do things differently from how they're specified on the abstract machine, as long as the difference cannot be observed by a valid program. Faults generally correspond to conditions that have undefined behavior under the language, and so the implementation may assume they do not happen in a valid program, and so the consequences of them happening cannot be observed by a valid program, and so faults can be ignored when deciding whether to reorder operations, which allows a lot of reordering.

_try should be understood as fully defining the behavior of faulting operations written within its scope so that they have the defined semantics of resuming execution in the _except clause. Because faults have the defined behavior of ending execution of the _try block, whether the fault occurred is observable, so the "as if" leeway stops applying. Thus, other operations with side-effects cannot be reordered with potentially faulty operations written within the _try, no more than they could be reordered with a break statement. That applies whether those other operations are written within the _try or not; it's just that potentially-faulty operations written within the _try must always be treated as having side-effects.

-EHa more generally should be understand as partially defining the behavior of faulting operations even if they are not written in a _try: if the operation faults, the behavior is defined so that scopes are unwound and control resumes at an _except clause if one is dynamically active, but this may be observed at an indeterminate point. It is hard to describe these semantics formally, but the intended rules for the implementation are pretty clear: potentially faulty operations outside of a _try can be reordered around each other or even moved to different scopes, as per normal optimization rules, but whenever those operations are executed, if they fault they must trigger an unwind and cause execution to resume at an _except clause if one is active.

So I think you cannot allow operations inlined from a call made within a _try to be reordered with operations written within the _try, or to happen outside the _try. Since LLVM doesn't promise not to reorder volatile and non-volatile stores to provably non-aliased locations, you cannot safely inline non-volatile stores within a _try. You can either disable inlining or mark the calls in some way that will cause the inliner to make any inlined stores volatile (and whatever else is necessary to ensure they will not be reordered).

Volatile-ness here is primary for reordering constraint, not for data-flow. Per LLVM EH framework, a data-defined in _try will flow through explicit EH edge into the Handler. In this SEH design, there will be a seh.try.end() at every exit of SEH region that ensures data defined in _try scope flow to the use in Handler.

Where is the "explicit EH edge into the handler" from a faulting instruction? It seems like there's an EH edge to the innermost cleanup from the last cleanup scope you entered, but that's not the same as from a faulting instruction, and the next site that does have a potential EH edge is the end of the scope, which you have no guarantee is actually reachable via the ordinary CFG. So I think you actually *are* relying on volatile for its data-flow guarantees.

This is why frontends cannot naively lower things like EH to setjmp without any influence on the inliner.

why do you want to do that? LLVM EH offers a robust data-flow representation while SJLJ data-flow modeling is incomplete.

We don't, it's a bad representation for EH. But the problems seem very analogous to what you're dealing with here.

[rjmccall] Does it pass under all combinations of optimization and code-generation settings? How do LLVM contributors ensure that this test suite continues to pass?

Yes. this is just the first patch. there will be second patch to land LLVM code-gen part in, then 3rd patch to add some test cases (Ex, xcpt4u.c from MSVC). Without option -EHa, the patch is a zero-impact change for now.

Do you expect to maintain a CI bot that runs the full test suite continuously as changes go into LLVM? Because this sort of thing is extremely easy to break.

Yes, as long as it's a bug/flaw in the design/implementation. New opt/analysis/tools that violates original basic volatile/cleanup/EH-framework exposed by SEH test cases is not included. Isn't this the policy of Clang/LLVM community?

I think you're missing what I'm asking. If LLVM accepts this feature, it will become our collective responsibility as a project to keep it working. You have a large external correctness test suite for this feature. It does not sound like you intend to open-source that test suite; instead, you intend to extract a small number of unit tests from it and add those to the LLVM test suite. So I'm asking if you're at least going to have an external CI bot which will run the full test suite for this feature to ensure it hasn't been broken by the last day of commits. It does not seem reasonable to expect that the few unit tests you extract will themselves be sufficient to keep the feature tested.

As Aaron replied, yes MS SEH (and EH) test suite are (or will be) open sourced.

Okay, great. We just need a firm commitment that those tests will be run continually on the main branches and during release qualification. If they're open-source, you may be able to add them to the extended LLVM test suite (not just the checked-in regression tests), which I think those things already happen for. Other people should be able to help you with the right process for that.

Reordering is NOT okay for instructions 'directly' within a _try. 'directly' here means it's the code from source code originally, i.e., inlined code or the code in callees is not included. Disabling inline in general is good, but not necessary.
As said above, SEH semantic only applies to code "directly" under SEH scope for few obvious reasons (Ex, indirect callees or callees not in the same module wont be visible). if faulty instructions are allowed to be reordered inside a callee, those instructions are allowed to do so even after they are inlined into _try.

I feel almost certain that this is wrong. Let me try to explain. It would be helpful to have a more precise statement of semantics than I can find online, but we can start from the basics.

"X and Y can be reordered" is how optimizer folks talk, but it's not generally how semantics are specified. As a general rule (ignoring sequence points and concurrency for simplicity), C and C++ require operations to be executed in a particular order on the abstract machine, and there are no rules explicitly sanctioning reordering. Instead, the "as if" rule gives the implementation broad leeway to do things differently from how they're specified on the abstract machine, as long as the difference cannot be observed by a valid program. Faults generally correspond to conditions that have undefined behavior under the language, and so the implementation may assume they do not happen in a valid program, and so the consequences of them happening cannot be observed by a valid program, and so faults can be ignored when deciding whether to reorder operations, which allows a lot of reordering.

_try should be understood as fully defining the behavior of faulting operations written within its scope so that they have the defined semantics of resuming execution in the _except clause. Because faults have the defined behavior of ending execution of the _try block, whether the fault occurred is observable, so the "as if" leeway stops applying. Thus, other operations with side-effects cannot be reordered with potentially faulty operations written within the _try, no more than they could be reordered with a break statement. That applies whether those other operations are written within the _try or not; it's just that potentially-faulty operations written within the _try must always be treated as having side-effects.

-EHa more generally should be understand as partially defining the behavior of faulting operations even if they are not written in a _try: if the operation faults, the behavior is defined so that scopes are unwound and control resumes at an _except clause if one is dynamically active, but this may be observed at an indeterminate point. It is hard to describe these semantics formally, but the intended rules for the implementation are pretty clear: potentially faulty operations outside of a _try can be reordered around each other or even moved to different scopes, as per normal optimization rules, but whenever those operations are executed, if they fault they must trigger an unwind and cause execution to resume at an _except clause if one is active.

So I think you cannot allow operations inlined from a call made within a _try to be reordered with operations written within the _try, or to happen outside the _try. Since LLVM doesn't promise not to reorder volatile and non-volatile stores to provably non-aliased locations, you cannot safely inline non-volatile stores within a _try. You can either disable inlining or mark the calls in some way that will cause the inliner to make any inlined stores volatile (and whatever else is necessary to ensure they will not be reordered).

Oh, yes you are right. I was talking about reordering between inlined faulty instructions is allowed. Reordering inlined instruction with any 'direct' volatile instruction should be prohibited. I overlooked LLVM doesn't promise not to reorder volatile and non-volatile. I will simply disable inlining into a _try scope. thanks!

Volatile-ness here is primary for reordering constraint, not for data-flow. Per LLVM EH framework, a data-defined in _try will flow through explicit EH edge into the Handler. In this SEH design, there will be a seh.try.end() at every exit of SEH region that ensures data defined in _try scope flow to the use in Handler.

Where is the "explicit EH edge into the handler" from a faulting instruction? It seems like there's an EH edge to the innermost cleanup from the last cleanup scope you entered, but that's not the same as from a faulting instruction, and the next site that does have a potential EH edge is the end of the scope, which you have no guarantee is actually reachable via the ordinary CFG. So I think you actually *are* relying on volatile for its data-flow guarantees.

Oh, right, I take it back. volatile-ness is used to assure 2nd (reordering) and 3rd (global-state) rules for C-code as described in head summary of this patch. I was thinking about abnormal exits (like https://reviews.llvm.org/D77936#change-6bDmAmTUlS0o) in _try-finally case where dataflow is guaranteed to reach _finally.

As Aaron replied, yes MS SEH (and EH) test suite are (or will be) open sourced.

Okay, great. We just need a firm commitment that those tests will be run continually on the main branches and during release qualification. If they're open-source, you may be able to add them to the extended LLVM test suite (not just the checked-in regression tests), which I think those things already happen for.

Yes, that is the plan after -EHa Part-II is done. but again, it won't be the entire suite without the support of Local-Unwind.

Okay. I think we're on the same page about representation now. If you can come up with a good replacement for "eha" in the intrinsic names, I think this is pretty much ready to go.

llvm/docs/LangRef.rst
11584

This is a very emission-centric description of the intrinsic. Maybe something like:

LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke``s, which normally correspond only to call sites.  To
support arbitrary faulting instructions, it must be possible to recover the current
EH scope for any instruction.  Turning every operation in LLVM that could fault
into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
large number of intrinsics, impede optimization of those operations, and make
compilation slower by introducing many extra basic blocks.  These intrinsics can
be used instead to mark the region protected by a cleanup, such as for a local
C++ object with a non-trivial destructor.  ``llvm.eha.scope.begin`` is used to mark
the start of the region; it is aways called with ``invoke``, with the unwind block
being the desired unwind destination for any potentially-throwing instructions
within the region.  `llvm.eha.scope.end` is used to mark when the scope ends
and the EH cleanup is no longer required (e.g. because the destructor is being
called).

So, about that — how does this pairing actually work? I think I understand how it's *meant* to work, by just scanning the structure of the CFG. But Clang doesn't promise to emit blocks with that kind of structure, and in practice block structure gets pretty complicated when you have multiple edges out of a region and they have to go to destinations at different cleanup depths. I don't really understand how your analysis looks through that sort of thing.

Example:

#include <string>
void test(int x, int &state) {
  state = 0;
  std::string a;
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    state = 3;
    if (x > 10) {
      state = 4;
      return;
    }
    state = 5;
    std::string c;
    state = 6;
  }
  state = 7;
}

IIRC, the way Clang will emit this is something like:

void test(int x, int &state) {
  int jumpdest;
  state = 0;
  std::string a;
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    state = 3;
    if (x > 10) {
      state = 4;
      jumpdest = 0;
      goto destroy_b;
    }
    state = 5;
    std::string c;
    state = 6;
    c.~std::string();
    jumpdest = 1;
  destroy_b:
    b.~std::string();
    switch (jumpdest) {
    case 0: goto destroy_a;
    case 1: goto fallthrough;
    }
  fallthrough:
    ;
  }
  state = 7;
destroy_a:
  a.~std::string();
}

The point being that I'm not sure the stack-like relationship between these cleanup scopes that's present in the source actually reliably survives into IR. That makes me concerned about the implicit pairing happening with these intrinsics.

Okay. I think we're on the same page about representation now. If you can come up with a good replacement for "eha" in the intrinsic names, I think this is pretty much ready to go.

how about seh.cppscope.begin() and seh.cppscope.end()? or any suggestion?

This is a very emission-centric description of the intrinsic. Maybe something like:

Very good writing for LangRef.rst:-). Will update it with your writing. Thanks!

As for pairing question, we simply utilizes existent Cleanup mechanism as you described in which abnormal exits and normal fall-through are merged in a 'cleanup' pad. What this implementation does is to inject a scope.begin right after a ctor and a scope.end right before dtor in cleanup-pad. Use the same example, it will be something like below:

void test(int x, int &state) {
  int jumpdest;
  state = 0;
  std::string a;  
  **seh.cppscope.begin();  // for object a**
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    **seh.cppscope.begin();  // for b**
    state = 3;
    if (x > 10) {
      state = 4;
      jumpdest = 0;
      goto destroy_b;
    }
    state = 5;
    std::string c;
    **seh.cppscope.begin();  // for c**
    state = 6;
    **seh.cppscope.end(); // for c dtor**
    c.~std::string();
    jumpdest = 1;
  destroy_b:
    **seh.cppscope.end();   // for b dtor**
    b.~std::string();
    switch (jumpdest) {
    case 0: goto destroy_a;
    case 1: goto fallthrough;
    }
  fallthrough:
    ;
  }
  state = 7;
destroy_a:
  **seh.cppscope.begin();  // for a dtor **
  a.~std::string();
}

Okay. I think we're on the same page about representation now. If you can come up with a good replacement for "eha" in the intrinsic names, I think this is pretty much ready to go.

how about seh.cppscope.begin() and seh.cppscope.end()? or any suggestion?

I like that better. Maybe even just seh.scope.begin? Or seh.cleanupscope.begin? Cleanups aren't *inherently* a C++ feature, it's just that C doesn't officially have them and MSVC doesn't really support any non-C++ languages that do.

As for pairing question, we simply utilizes existent Cleanup mechanism as you described in which abnormal exits and normal fall-through are merged in a 'cleanup' pad. What this implementation does is to inject a scope.begin right after a ctor and a scope.end right before dtor in cleanup-pad. Use the same example, it will be something like below:

void test(int x, int &state) {
  int jumpdest;
  state = 0;
  std::string a;  
  **seh.cppscope.begin();  // for object a**
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    **seh.cppscope.begin();  // for b**
    state = 3;
    if (x > 10) {
      state = 4;
      jumpdest = 0;
      goto destroy_b;
    }
    state = 5;
    std::string c;
    **seh.cppscope.begin();  // for c**
    state = 6;
    **seh.cppscope.end(); // for c dtor**
    c.~std::string();
    jumpdest = 1;
  destroy_b:
    **seh.cppscope.end();   // for b dtor**
    b.~std::string();
    switch (jumpdest) {
    case 0: goto destroy_a;
    case 1: goto fallthrough;
    }
  fallthrough:
    ;
  }
  state = 7;
destroy_a:
  **seh.cppscope.begin();  // for a dtor **
  a.~std::string();
}

Okay. I feel fairly convinced that this will work for most cleanup situations. You might need to do some IR-testing work in Clang to make sure Clang calls seh.scope.begin the right way when emitting conditional cleanups:

  • We need to call it at some dominating point rather than at the current IP when we push the cleanup.
  • We need to call it in the right order for the cleanups we enter.

Conditional cleanups come up when we need a cleanup in code that's not unconditionally run during a expression evaluation, like in the 2nd or 3rd operand of ?: or in the RHS of &&.

OK, great.
Summarize our discussion above, I will follow up with:

  • rename intrinsic eha.scope.begin() with seh.scope.begin().
  • update LangRef.rst with detailed explanation
  • not to allow inlining under SEH direct lexical scope
  • add conditional cleanup test case and make sure that eha.scope.begin() is emitted in the right place
  • add sideEntry label support in AST-dumper and de/serialization.

thank you for your thorough review and valuable feedbacks. I'm in the middle of something else. I will post another patch latter.

Sounds good to me, thanks!

When do you plan to enable this on x86?

tentzen updated this revision to Diff 306287.Nov 18 2020, 7:11 PM

per review feedback from John McCall and others, this update includes:

  • Rename intrinsic eha.scope.begin() with seh.scope.begin().
  • Update LangRef.rst with detailed explanation
  • Add conditional cleanup test case and move the emission of seh.scope.begin() into EHScopeStack::pushCleanup()
  • Add sideEntry label support in AST-dumper and de/serialization.

also add tests for Windows ABI temps' ctor and dtor process

With your latest patch update clang segfaults for me even when '-feh-asynch' is not passed.

tentzen updated this revision to Diff 306469.Nov 19 2020, 10:37 AM

Fixed the clang fault when built without '-feh-asynch' option.
EHStack's CGF field must be initialized even without -feh-asynch.

llvm/docs/LangRef.rst
11587

Typo: aways -> always

11591

Maybe add something here like

Any set of operations can then be confined to the region by reading their leaf inputs via volatile loads and writing their root outputs via volatile stores.

IIUC, for something like x = 1 / y, the volatile load of y and volatile store of x being things you can't move past these intrinsic calls is how you make sure the potentially-faulting divide stays in the right scope. IMO it's a bit confusing to talk about "arbitrary faulting instructions" being constrained without that extra bit.

tentzen added inline comments.Nov 19 2020, 12:03 PM
llvm/docs/LangRef.rst
11591

Very good point! will clarify as you suggested..

tentzen updated this revision to Diff 306510.Nov 19 2020, 12:54 PM

Per Joseph's feedback, further clarify the semantic of llvm.seh.try.begin and llvm.seh.try.end.

Do we need to consider FP exceptions in _try block?

clang/include/clang/Driver/Options.td
886

It's better to follow alphabetical for it and line 1531.

clang/lib/CodeGen/CGCleanup.cpp
1287

Move { to the same line with else and better to add curly brackets for PopCleanupBlock();

clang/test/CodeGen/windows-seh-EHa-TryInFinally.cpp
2

Should this be a C file? I saw LangRef says they are used for C function.

llvm/docs/LangRef.rst
11534

Keep the same length with above line.

11560

Keep the same length with above line.

tentzen marked 2 inline comments as done.Nov 19 2020, 10:17 PM

Do we need to consider FP exceptions in _try block?

Yes, FP exception is handled as long as FP exceptions are not disabled (Ex via _controlfp() runtime) and FP exception code is filtered & handled via ___except() statement (Ex, ___except(GetExceptionCode()==EXCEPTION_FLT_INEXACT_RESULT)).

clang/include/clang/Driver/Options.td
886

really? I see some others are not in alphabetical order. it's natural to right besides fcxx_exceptions.

clang/test/CodeGen/windows-seh-EHa-TryInFinally.cpp
2

yes it's a C-function that can be placed in .cpp file.

Does it really work for x86_64 architecture only? I have tried using this with x86 build and produced binaries looks like this should work on this platform as well, but I have not tried running produced executables to see the results yet tbh.

Does it really work for x86_64 architecture only? I have tried using this with x86 build and produced binaries looks like this should work on this platform as well, but I have not tried running produced executables to see the results yet tbh.

This is just part-1 (Clang) of SEH HW exception support. We need another patch for LLVM part to work for x86_64. After that we will need another two LLVM changes to enable it for X86 and Arm64.

Do we need to consider FP exceptions in _try block?

Yes, FP exception is handled as long as FP exceptions are not disabled (Ex via _controlfp() runtime) and FP exception code is filtered & handled via ___except() statement (Ex, ___except(GetExceptionCode()==EXCEPTION_FLT_INEXACT_RESULT)).

I see. If this is the case, you may need to assign FPE_Strict to _try block to preserve FP instructions' order.

Do we need to consider FP exceptions in _try block?

Yes, FP exception is handled as long as FP exceptions are not disabled (Ex via _controlfp() runtime) and FP exception code is filtered & handled via ___except() statement (Ex, ___except(GetExceptionCode()==EXCEPTION_FLT_INEXACT_RESULT)).

I see. If this is the case, you may need to assign FPE_Strict to _try block to preserve FP instructions' order.

Is FPE_Strict in Clang/LLVM equivalent to /fp:strict in VC++/Windows. -EHa does not automatically imply /fp:strict semantic in Windows. So for FP sensitive cases, users can explicitly specify -EHa and -FPE_Strict together.

Hi, Is there any more concerns, feedbacks or suggestions?
thanks,

This patch looks ready to land. Are there any other concerns or feedback?

I'm sorry that it's taking a while to find time to review the implementation; I'll try to get to it this week.

rjmccall added inline comments.Jan 19 2021, 11:07 AM
clang/lib/CodeGen/CGCleanup.cpp
1341

We generally prefer to get intrinsic functions with CGM.getIntrinsic.

clang/lib/CodeGen/CGException.cpp
465

Please remove the comment here. The option name should be sufficiently self-descriptive.

Anyway, I don't think this change is right, because we *do* still need to push a terminate scope: we need C++ exceptions to trigger a call to std::terminate. It's just that such scopes aren't fully terminal when async exceptions are enabled, because MSVC defines those exceptions as passing through noexcept and so on. (I assume that's true; can you link to documentation about it?)

554

Again, please try to refer to this in a more driver-agnostic way: "under async exceptions" rather than "when -EHa". But actually as mentioned above I think this is incorrect.

603

The comment here isn't explaining anything, it's just repeating what the code is doing. If you want a useful comment, you could explain why it's important to mark the scope.

1668

Please use dyn_cast for all of these.

1678

Volatilizing every block that's reachable from the try block seems like it makes a lot of assumptions about where branches within the try can reach. For example, a goto could certainly go to a block that's already been emitted, as could break or continue if the emitter just makes slightly different decisions about emission order. Please look at how FragileHazards (in the ObjC emission code) collects blocks in order to do its transforms — I think you can probably extract a reasonable common base out. Alternatively, I think you could handle this directly in the insertion callback (CodeGenFunction::InsertHelper) when we're in an appropriate try scope.

clang/lib/Driver/ToolChains/Clang.cpp
6584

For consistency with the existing options, please spell this option -fasync-exceptions, and please spell the corresponding LangOption AsyncExceptions.

clang/lib/Frontend/CompilerInvocation.cpp
2786 ↗(On Diff #306510)

You should emit an error if this is enabled on targets that are not in the appropriate Windows environment, since we don't (yet) support it there. I assume that's just the MSVC Windows environment and that this can't easily be supported on e.g. MinGW?

Does it have other target restrictions, like i386-only?

clang/lib/Sema/JumpDiagnostics.cpp
935

This doesn't seem like a reasonable assertion in the abstract. Even if we really only currently emit warnings with jumps to labels, that doesn't seem like something we should write code that relies on. And I'm sure this problem can come up with switch cases, unless that's structurally outlawed in some other way.

Also, you're making the correct setting of this flag dependent on whether we're emitting a warning vs. an error. Seems like we should be setting it regardless.

What conditions exactly do you want this flag set on? I would naturally assume it's specifically branches from a block outside the try, and you don't care about branches within the try? If the label is used in multiple places, do you need to be careful to only set the flag on those branches that come from outside the try?

Hi, John,
sorry for the delay. I'm still in the middle of something.
will context-switch a little bit latter. thanks,

tentzen marked an inline comment as not done.Feb 17 2021, 11:20 PM

thank you for the thorough review again. My answer for each comment below:

clang/lib/CodeGen/CGCleanup.cpp
1341

Does this really matter? there are more than 200 uses of CGM.CreateRuntimeFunction().

clang/lib/CodeGen/CGException.cpp
465

Yes, MSVC terminate() runtime bypasses HW exceptions to its caller.
Hmm, it's been a whole. I think I placed that code there because Clang's terminate runtime does not dispatch HW exception to caller when I tried a year ago. It issues an unhandled exception. I felt if a user explicitly specify -EHa, HW exception probably is more significant than C++ noexcept/nothrow directive.
Anyways, I can undo this code and let terminate-runtime handler it one way or the other.

554

will do.

603

Yes, will do. thanks.

1668

ok will fix them. thanks.

1678

A _try region is a Single Entry Multiple Exits regions. this code starts from Entry block and follows control-flow to reach all successors. Yes a block could have multi-predecessors. Note the 2nd line of this function: a visit flag is marked and checked

!V.insert(BB).second /* already visited */

As long as it follows control-flows, it does not matter what the lexical order is.

clang/lib/Driver/ToolChains/Clang.cpp
6584

OK will do.

clang/lib/Frontend/CompilerInvocation.cpp
2786 ↗(On Diff #306510)

are you sure it's better to emit an error? I think some users would like it being ignored on non-Window platforms so they can have a single option set across platforms.

clang/lib/Sema/JumpDiagnostics.cpp
935

A _try scope must be a (SEME) Single-Entry-Multi-Exits region. Branching into a _try is not allowed; it triggers an error. Clang handles it properly.
What we want to flag here is an branch (either initiated by a go-to/break/continue) from inner _try to outer _try.
This flag is set in this If-statement because that is exactly the place issue the warning we want to catch.

tentzen updated this revision to Diff 325966.Feb 23 2021, 7:30 PM

changed option name and a couple of minor changes per John McCall's comments.

asmith added a comment.Mar 9 2021, 2:20 PM

Hi John, how does this look now?

aganea added a subscriber: aganea.Apr 1 2021, 5:42 AM
tentzen updated this revision to Diff 334855.Apr 1 2021, 4:47 PM

Removed some files (mostly for Part 2 of this feature) that were accidentally put in last revision.

jansvoboda11 added inline comments.Apr 2 2021, 1:14 AM
clang/include/clang/Driver/Options.td
1291

Can you rebase and use the new option marshalling infrastructure instead?

https://clang.llvm.org/docs/InternalsManual.html#adding-new-command-line-option

tentzen updated this revision to Diff 335068.Apr 3 2021, 1:52 AM

rebase the patch

tentzen marked 13 inline comments as done.Apr 5 2021, 12:25 PM

hi, I believe I'd addressed all issues or concerns, and it's rebased to up-to-date source in new _main branch now. Does this look good to everyone? If I don't hear any objection in a couple of days, I will go ahead make this patch in. Again This is just part-1 of Windows SEH feature. Without new option specified, it's a zero-impact change.

clang/include/clang/Driver/Options.td
1291

SEH Async-excception option is implemented the same way as CXX-exception or ObjC-exception.

It would be good for @rjmccall / @rsmith / etc to actually finish reviewing this and accept it.
I would personally want to see the next patches - what changes are needed for llvm analysis, transforms?

asmith added a comment.Apr 6 2021, 9:49 PM

It would be good for @rjmccall / @rsmith / etc to actually finish reviewing this and accept it.
I would personally want to see the next patches - what changes are needed for llvm analysis, transforms?

All changes were provided a year ago and discussed on llvm-dev. I’m inclined to move forward unless any of the reviewers have additional comments???

asmith added a comment.Apr 6 2021, 9:52 PM

@rjmccall @rsmith Thanks for your help. Do you have any additional feedback before we commit?

could you clang-format the patch?

clang/lib/CodeGen/CGException.cpp
1666–1667
if (auto*LI = dyn_cast<...>(J)) {
  ...
}

and ditto below

llvm/docs/LangRef.rst
12277

prevent potential exceptions from being moved...

It would be good for @rjmccall / @rsmith / etc to actually finish reviewing this and accept it.
I would personally want to see the next patches - what changes are needed for llvm analysis, transforms?

All changes were provided a year ago and discussed on llvm-dev. I’m inclined to move forward unless any of the reviewers have additional comments???

Could you please add relevant links to the patch description?

tentzen updated this revision to Diff 336011.Apr 7 2021, 11:25 PM
tentzen edited the summary of this revision. (Show Details)
tentzen marked an inline comment as done.

clang-format fixes.

tentzen updated this revision to Diff 336033.Apr 8 2021, 1:51 AM

fixed command option typos in two test cases

Hi, (Last call for review!!)
Is there any more comments? This review has lasted for more than a year now. I believe I had addressed or answered all questions and concerns. Thank all reviewers' precious feedbacks.
For late comers, again
(1) Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

(2) John McCall had done a thorough and insightful review. please see details above.

(3) Finally, a prototype of entire Windows SEH implementation (including both HW exception handling and local unwind) is stored in https://github.com/tentzen/llvm-project. This prototype had been stressed by a Microsoft Windows internal SEH test suite.
Note this check-in is just part-1 of HW exception handling.

Could please someone accept it as a record?
thank you!

Hi,

(Last call for review!!)

Please refer to https://llvm.org/docs/CodeReview.html

Is there any more comments? This review has lasted for more than a year now. I believe I had addressed or answered all questions and concerns. Thank all reviewers' precious feedbacks.
For late comers, again
(1) Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

(2) John McCall had done a thorough and insightful review. please see details above.

(3) Finally, a prototype of entire Windows SEH implementation (including both HW exception handling and local unwind) is stored in https://github.com/tentzen/llvm-project. This prototype had been stressed by a Microsoft Windows internal SEH test suite.
Note this check-in is just part-1 of HW exception handling.

So can you actually post the other patches as reviews?

Could please someone accept it as a record?
thank you!

bool keepFramePointer(const MachineFunction &MF) const override;

It would be good for @rjmccall / @rsmith / etc to actually finish reviewing this and accept it.
I would personally want to see the next patches - what changes are needed for llvm analysis, transforms?

All changes were provided a year ago and discussed on llvm-dev. I’m inclined to move forward unless any of the reviewers have additional comments???

Could you please add relevant links to the patch description?

Hi,

Hi,

(Last call for review!!)

Please refer to https://llvm.org/docs/CodeReview.html

Is there any more comments? This review has lasted for more than a year now. I believe I had addressed or answered all questions and concerns. Thank all reviewers' precious feedbacks.
For late comers, again
(1) Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

(2) John McCall had done a thorough and insightful review. please see details above.

(3) Finally, a prototype of entire Windows SEH implementation (including both HW exception handling and local unwind) is stored in https://github.com/tentzen/llvm-project. This prototype had been stressed by a Microsoft Windows internal SEH test suite.
Note this check-in is just part-1 of HW exception handling.

So can you actually post the other patches as reviews?

Could please someone accept it as a record?
thank you!

Hi,

(Last call for review!!)

Please refer to https://llvm.org/docs/CodeReview.html

Is there any more comments? This review has lasted for more than a year now. I believe I had addressed or answered all questions and concerns. Thank all reviewers' precious feedbacks.
For late comers, again
(1) Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

(2) John McCall had done a thorough and insightful review. please see details above.

(3) Finally, a prototype of entire Windows SEH implementation (including both HW exception handling and local unwind) is stored in https://github.com/tentzen/llvm-project. This prototype had been stressed by a Microsoft Windows internal SEH test suite.
Note this check-in is just part-1 of HW exception handling.

So can you actually post the other patches as reviews?

Could please someone accept it as a record?
thank you!

Yes, I read that policy. and per policy, I waited for a week and pinged.

What other patches you were asking? LLVM HW EH Part-2 change or including Local Unwind?
I did at first put the whole design and implementation in RFC, but I was asked to separate Local Unwind and HW exception.
So I did and submitted a patch specific for HW exception, but was asked to further divide it.
So I did, and this is part-1 for Clang. After this patch is landed, yes I will work on next patch.

Meanwhile please refer to those RFC threads and Code/Wiki in https://github.com/tentzen/llvm-project for more details.

One step at a time, for now for this patch, is there any other feedback? thanks.

asmith accepted this revision.May 13 2021, 11:42 AM

Thanks everyone for the great feedback!
We'll move forward with landing this first patch and start the review of the next one.

Passed many CodeGen related test suites over the weekend.
It landed in..

commit 9ca9be098fedb14182c50c9dd700f3fa91c8d4c7 (HEAD -> main)
Author: Ten Tzen <tentzen@microsoft.com>
Date: Sun May 16 18:12:47 2021 -0700

[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1

Hello, this patch introduces a regression in one of our codebases. The compilation of several TUs fails with the following callstack:

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /media/project2/llvm-project/stage0/bin/clang-cl @__compile.rsp temp.i
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x00000000043c2633 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /media/project2/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:13
 #1 0x00000000043c08f0 llvm::sys::RunSignalHandlers() /media/project2/llvm-project/llvm/lib/Support/Signals.cpp:77:18
 #2 0x00000000043c1d7d llvm::sys::CleanupOnSignal(unsigned long) /media/project2/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #3 0x0000000004343206 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /media/project2/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:75:5
 #4 0x0000000004343206 CrashRecoverySignalHandler(int) /media/project2/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:388:51
 #5 0x00007f58152c4a90 __restore_rt sigaction.c:0:0
 #6 0x0000000003c5c11f llvm::ilist_sentinel<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void> >::empty() const /media/project2/llvm-project/llvm/include/llvm/ADT/ilist_node.h:248:36
 #7 0x0000000003c5c11f llvm::simple_ilist<llvm::Instruction>::empty() const /media/project2/llvm-project/llvm/include/llvm/ADT/simple_ilist.h:131:55
 #8 0x0000000003c5c11f llvm::BasicBlock::empty() const /media/project2/llvm-project/llvm/include/llvm/IR/BasicBlock.h:307:66
 #9 0x0000000003c5c11f llvm::BasicBlock::removePredecessor(llvm::BasicBlock*, bool) /media/project2/llvm-project/llvm/lib/IR/BasicBlock.cpp:328:7
#10 0x0000000004432005 markAliveBlocks(llvm::Function&, llvm::SmallPtrSetImpl<llvm::BasicBlock*>&, llvm::DomTreeUpdater*) /media/project2/llvm-project/llvm/lib/Transforms/Utils/Local.cpp:2290:15
#11 0x0000000004432005 llvm::removeUnreachableBlocks(llvm::Function&, llvm::DomTreeUpdater*, llvm::MemorySSAUpdater*) /media/project2/llvm-project/llvm/lib/Transforms/Utils/Local.cpp:2402:18
#12 0x00000000042e72c7 simplifyFunctionCFGImpl(llvm::Function&, llvm::TargetTransformInfo const&, llvm::DominatorTree*, llvm::SimplifyCFGOptions const&) /media/project2/llvm-project/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp:0:22
#13 0x00000000042e72c7 simplifyFunctionCFG(llvm::Function&, llvm::TargetTransformInfo const&, llvm::DominatorTree*, llvm::SimplifyCFGOptions const&) /media/project2/llvm-project/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp:275:18
#14 0x00000000042e70d1 llvm::SimplifyCFGPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /media/project2/llvm-project/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp:321:7
#15 0x0000000002e605bd llvm::detail::PassModel<llvm::Function, llvm::SimplifyCFGPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) BPFTargetMachine.cpp:0:0
#16 0x0000000003d54659 llvm::SmallPtrSet<void*, 2u>::operator=(llvm::SmallPtrSet<void*, 2u>&&) /media/project2/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:488:13
#17 0x0000000003d54659 llvm::PreservedAnalyses::operator=(llvm::PreservedAnalyses&&) /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:155:7
#18 0x0000000003d54659 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:509:16
#19 0x0000000002aa318d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) AMDGPUTargetMachine.cpp:0:0
#20 0x0000000003d5769e llvm::SmallPtrSet<void*, 2u>::operator=(llvm::SmallPtrSet<void*, 2u>&&) /media/project2/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:488:13
#21 0x0000000003d5769e llvm::PreservedAnalyses::operator=(llvm::PreservedAnalyses&&) /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:155:7
#22 0x0000000003d5769e llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /media/project2/llvm-project/llvm/lib/IR/PassManager.cpp:117:14
#23 0x0000000002aa2fed llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) AMDGPUTargetMachine.cpp:0:0
#24 0x0000000003d538f6 llvm::SmallPtrSet<void*, 2u>::operator=(llvm::SmallPtrSet<void*, 2u>&&) /media/project2/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:488:13
#25 0x0000000003d538f6 llvm::PreservedAnalyses::operator=(llvm::PreservedAnalyses&&) /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:155:7
#26 0x0000000003d538f6 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:509:16
#27 0x00000000045ba45e llvm::SmallPtrSetImplBase::isSmall() const /media/project2/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:194:33
#28 0x00000000045ba45e llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /media/project2/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:82:10
#29 0x00000000045ba45e llvm::PreservedAnalyses::~PreservedAnalyses() /media/project2/llvm-project/llvm/include/llvm/IR/PassManager.h:155:7
#30 0x00000000045ba45e (anonymous namespace)::EmitAssemblyHelper::EmitAssemblyWithNewPassManager(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) /media/project2/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1494:5
#31 0x00000000045b4868 std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >::~unique_ptr() /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/unique_ptr.h:360:12
#32 0x00000000045b4868 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) /media/project2/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1660:5
#33 0x0000000004bf8006 std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >::~unique_ptr() /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/unique_ptr.h:360:6
#34 0x0000000004bf8006 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /media/project2/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:334:7
#35 0x00000000058f8283 __gnu_cxx::__normal_iterator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >*, std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> > > > >::__normal_iterator(std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >* const&) /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/stl_iterator.h:979:20
#36 0x00000000058f8283 std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> > > >::begin() /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/stl_vector.h:812:16
#37 0x00000000058f8283 void clang::finalize<std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> > > > >(std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> >, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback> > > >&, clang::Sema const&) /media/project2/llvm-project/clang/include/clang/Sema/TemplateInstCallback.h:54:16
#38 0x00000000058f8283 clang::ParseAST(clang::Sema&, bool, bool) /media/project2/llvm-project/clang/lib/Parse/ParseAST.cpp:178:3
#39 0x0000000004b59658 clang::FrontendAction::Execute() /media/project2/llvm-project/clang/lib/Frontend/FrontendAction.cpp:955:10
#40 0x0000000004ad1d66 llvm::Error::getPtr() const /media/project2/llvm-project/llvm/include/llvm/Support/Error.h:277:12
#41 0x0000000004ad1d66 llvm::Error::operator bool() /media/project2/llvm-project/llvm/include/llvm/Support/Error.h:236:16
#42 0x0000000004ad1d66 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /media/project2/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:965:23
#43 0x0000000004bf3330 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /media/project2/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278:25
#44 0x00000000028477f1 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /media/project2/llvm-project/clang/tools/driver/cc1_main.cpp:246:15
#45 0x0000000002845c25 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /media/project2/llvm-project/clang/tools/driver/driver.cpp:338:12
#46 0x00000000049c3782 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1::operator()() const /media/project2/llvm-project/clang/lib/Driver/Job.cpp:404:30
#47 0x00000000049c3782 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) /media/project2/llvm-project/llvm/include/llvm/ADT/STLExtras.h:177:12
#48 0x0000000004342fd7 llvm::function_ref<void ()>::operator()() const /media/project2/llvm-project/llvm/include/llvm/ADT/STLExtras.h:0:12
#49 0x0000000004342fd7 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /media/project2/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:424:3
#50 0x00000000049c3229 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const /media/project2/llvm-project/clang/lib/Driver/Job.cpp:404:7
#51 0x0000000004996ce5 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const /media/project2/llvm-project/clang/lib/Driver/Compilation.cpp:196:15
#52 0x0000000004997177 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const /media/project2/llvm-project/clang/lib/Driver/Compilation.cpp:249:13
#53 0x00000000049aa87c llvm::SmallVectorBase<unsigned int>::empty() const /media/project2/llvm-project/llvm/include/llvm/ADT/SmallVector.h:73:47
#54 0x00000000049aa87c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) /media/project2/llvm-project/clang/lib/Driver/Driver.cpp:1538:23
#55 0x000000000284566d main /media/project2/llvm-project/clang/tools/driver/driver.cpp:510:21
#56 0x00007f5814d60082 __libc_start_main (/lib64/libc.so.6+0x27082)
#57 0x0000000002842d1e _start (/media/project2/llvm-project/stage0/bin/clang-cl+0x2842d1e)
clang-cl: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 797ad701522988e212495285dade8efac41a24d4)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: /media/project2/llvm-project/stage0/bin

I'll paste the reproducer as soon as c-reduce completes.

Hi, Alexandre
Thank you for reporting this . From the callstack it does not seem related to this patch.
Could you share the repro case @__compile.rsp temp.i with me?
Thanks,

--Ten

aganea added a comment.Jun 3 2021, 8:31 AM

Thank you for reporting this . From the callstack it does not seem related to this patch.

It was actually git bisect which pointed to me to this patch, it could be a side-effect of it.

Could you share the repro case @__compile.rsp temp.i with me?

Will surely do, I don't have a simple repro yet.

I'm also seeing a 3rd issue, different from the ones reported above, but it might all be the same thing:

clang-cl: /mnt/llvm-project/clang/lib/CodeGen/CGCleanup.cpp:1322: void EmitSehScope(clang::CodeGen::CodeGenFunction &, llvm::FunctionCallee &): Assertion `BB && InvokeDest' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /mnt/llvm-project/stage0/bin/clang-cl @/mnt/llvm-project/__test/__compile.rsp temp.i
1.      (edited).cpp:1025:1 <Spelling=(edited).h:7:49>: current parser token 'static_assert'
2.      (edited).cpp:56:1 <Spelling=(edited).h:20:143>: LLVM IR generation of declaration '(edited)'
3.      (edited).cpp:91:18: Generating code for declaration '(edited)'
 #0 0x0000000005903223 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/mnt/llvm-project/stage0/bin/clang-cl+0x5903223)
 #1 0x0000000005900f6e llvm::sys::RunSignalHandlers() (/mnt/llvm-project/stage0/bin/clang-cl+0x5900f6e)
 #2 0x00000000059025cd llvm::sys::CleanupOnSignal(unsigned long) (/mnt/llvm-project/stage0/bin/clang-cl+0x59025cd)
 #3 0x0000000005876bc3 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #4 0x0000000005876d2e CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #5 0x00007f88cc4d73c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
 #6 0x00007f88cbf8818b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4618b)
 #7 0x00007f88cbf67859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859)
 #8 0x00007f88cbf67729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729)
 #9 0x00007f88cbf78f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
#10 0x0000000005c91e05 EmitSehScope(clang::CodeGen::CodeGenFunction&, llvm::FunctionCallee&) CGCleanup.cpp:0:0
#11 0x0000000005c8c918 clang::CodeGen::EHScopeStack::pushCleanup(clang::CodeGen::CleanupKind, unsigned long) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c8c918)
#12 0x0000000005bf80ef void clang::CodeGen::CodeGenFunction::pushFullExprCleanup<clang::CodeGen::CodeGenFunction::CallLifetimeEnd, clang::CodeGen::Address, llvm::Value*>(clang::CodeGen::CleanupKind, clang::CodeGen::Address, llvm::Value*) CGCall.cpp:0:0
#13 0x0000000005cb937c clang::CodeGen::CodeGenFunction::EmitMaterializeTemporaryExpr(clang::MaterializeTemporaryExpr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb937c)
#14 0x0000000005cb76a3 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb76a3)
#15 0x0000000005cbba94 clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cbba94)
#16 0x0000000005bf0114 clang::CodeGen::CodeGenFunction::EmitCallArg(clang::CodeGen::CallArgList&, clang::Expr const*, clang::QualType) (/mnt/llvm-project/stage0/bin/clang-cl+0x5bf0114)
#17 0x0000000005bef273 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, clang::CodeGen::CodeGenFunction::PrototypeWrapper, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const> >, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/mnt/llvm-project/stage0/bin/clang-cl+0x5bef273)
#18 0x0000000005e1368d commonEmitCXXMemberOrOperatorCall(clang::CodeGen::CodeGenFunction&, clang::CXXMethodDecl const*, llvm::Value*, llvm::Value*, clang::QualType, clang::CallExpr const*, clang::CodeGen::CallArgList&, clang::CodeGen::CallArgList*) CGExprCXX.cpp:0:0
#19 0x0000000005e13383 clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorCall(clang::CXXMethodDecl const*, clang::CodeGen::CGCallee const&, clang::CodeGen::ReturnValueSlot, llvm::Value*, llvm::Value*, clang::QualType, clang::CallExpr const*, clang::CodeGen::CallArgList*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5e13383)
#20 0x0000000005e15afa clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5e15afa)
#21 0x0000000005e16330 clang::CodeGen::CodeGenFunction::EmitCXXOperatorMemberCallExpr(clang::CXXOperatorCallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot) (/mnt/llvm-project/stage0/bin/clang-cl+0x5e16330)
#22 0x0000000005cd51f7 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cd51f7)
#23 0x0000000005cc27e1 clang::CodeGen::CodeGenFunction::EmitCallExprLValue(clang::CallExpr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cc27e1)
#24 0x0000000005cb7507 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb7507)
#25 0x0000000005e14dc2 clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5e14dc2)
#26 0x0000000005e16330 clang::CodeGen::CodeGenFunction::EmitCXXOperatorMemberCallExpr(clang::CXXOperatorCallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot) (/mnt/llvm-project/stage0/bin/clang-cl+0x5e16330)
#27 0x0000000005cd51f7 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cd51f7)
#28 0x0000000005cc27e1 clang::CodeGen::CodeGenFunction::EmitCallExprLValue(clang::CallExpr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cc27e1)
#29 0x0000000005cb7507 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb7507)
#30 0x0000000005cb7769 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb7769)
#31 0x0000000005cb7201 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5cb7201)
#32 0x0000000005d8eae2 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/mnt/llvm-project/stage0/bin/clang-cl+0x5d8eae2)
#33 0x0000000005d9a260 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/mnt/llvm-project/stage0/bin/clang-cl+0x5d9a260)
#34 0x0000000005ca66b1 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5ca66b1)
#35 0x0000000005ca7292 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/mnt/llvm-project/stage0/bin/clang-cl+0x5ca7292)
#36 0x0000000005c370de clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c370de)
#37 0x0000000005c2f1cc clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c2f1cc)
#38 0x0000000005c3aac3 clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c3aac3)
#39 0x0000000005c3ee3b clang::CodeGen::CodeGenModule::EmitDeclContext(clang::DeclContext const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c3ee3b)
#40 0x0000000005c3ee3b clang::CodeGen::CodeGenModule::EmitDeclContext(clang::DeclContext const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c3ee3b)
#41 0x0000000005c3ee3b clang::CodeGen::CodeGenModule::EmitDeclContext(clang::DeclContext const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c3ee3b)
#42 0x0000000005c3ee3b clang::CodeGen::CodeGenModule::EmitDeclContext(clang::DeclContext const*) (/mnt/llvm-project/stage0/bin/clang-cl+0x5c3ee3b)
#43 0x0000000006305660 (anonymous namespace)::CodeGeneratorImpl::HandleTopLevelDecl(clang::DeclGroupRef) ModuleBuilder.cpp:0:0
#44 0x0000000006302ee6 clang::BackendConsumer::HandleTopLevelDecl(clang::DeclGroupRef) CodeGenAction.cpp:0:0
#45 0x00000000072c76d4 clang::ParseAST(clang::Sema&, bool, bool) (/mnt/llvm-project/stage0/bin/clang-cl+0x72c76d4)
#46 0x000000000624a353 clang::FrontendAction::Execute() (/mnt/llvm-project/stage0/bin/clang-cl+0x624a353)
#47 0x00000000061b8628 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/mnt/llvm-project/stage0/bin/clang-cl+0x61b8628)
#48 0x00000000062fd312 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/mnt/llvm-project/stage0/bin/clang-cl+0x62fd312)
#49 0x000000000373fc85 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/mnt/llvm-project/stage0/bin/clang-cl+0x373fc85)
#50 0x000000000373db1d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#51 0x0000000006053ec2 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) Job.cpp:0:0
#52 0x0000000005876ae1 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/mnt/llvm-project/stage0/bin/clang-cl+0x5876ae1)
#53 0x00000000060538c7 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/mnt/llvm-project/stage0/bin/clang-cl+0x60538c7)
#54 0x000000000601c764 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/mnt/llvm-project/stage0/bin/clang-cl+0x601c764)
#55 0x000000000601cca7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/mnt/llvm-project/stage0/bin/clang-cl+0x601cca7)
#56 0x00000000060363c8 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/mnt/llvm-project/stage0/bin/clang-cl+0x60363c8)
#57 0x000000000373d3a1 main (/mnt/llvm-project/stage0/bin/clang-cl+0x373d3a1)
#58 0x00007f88cbf690b3 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b3)
#59 0x000000000373a7be _start (/mnt/llvm-project/stage0/bin/clang-cl+0x373a7be)
clang-cl: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 797ad701522988e212495285dade8efac41a24d4)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: /mnt/llvm-project/stage0/bin

Actually adding -O2 to all the LIT tests in this patch will trigger the same assertion at CGCleanup.cpp:1326. The code's assuming there is always an invoke destination in the program, but in these cases the InvokeDest is null.

According to Microsoft's documentation, EHa means: async exceptions + C++ exceptions, with unwinding.
I am wondering if in:
static void EmitSehScope(CodeGenFunction &CGF,

                       llvm::FunctionCallee &SehCppScope) {
llvm::BasicBlock *InvokeDest = CGF.getInvokeDest();

..
}
invokeDest shouldn't be something like:
InvokeDest = CGF.getInvokeDest() || "does the function has attribute NoUnwind".
Not sure if we have all the data to test this at this point?

Hi, yes I can repro the crash with -fasync-exceptions plus -O2 option. Working on it now..
thank you for reporting this bug..
@aganea , this patch should be zero-impact without explicit option -fasync-exceptions. Are you also seeing a crash without this option?
thanks

aganea added a comment.Jun 3 2021, 1:38 PM

@aganea , this patch should be zero-impact without explicit option -fasync-exceptions. Are you also seeing a crash without this option?

I'm using /EHa which I expect translates to -fasync-exceptions. Removing /EHa doesn't cause the crash.

@aganea Oh, my mistake. did not mean to enable -fasync-exceptions under -EHa in this patch. will fix it shortly...

jansvoboda11 resigned from this revision.Aug 6 2021, 4:46 AM
bowang added a subscriber: bowang.Apr 14 2022, 12:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2022, 12:29 AM
Herald added a subscriber: MaskRay. · View Herald Transcript

Any updates on this patch? Would like to see it moving forward.

I was trying to use Google Breakpad with Clang. Breakpad is a crash handling library that generates a minidump at crash. The exception handler on Windows relies on SEH.

Breakpad works fine with MSVC. But when it is built with Clang, the exception code returned in exinfo>ExceptionRecord->ExceptionCode is incorrect. In my test program, an access to nullptr should trigger EXCEPTION_ACCESS_VIOLATION but what's returned is EXCEPTION_BREAKPOINT. I suspect it is due to Hardware Exception Handling in Clang.

I tried to add flag -fasync-exceptions but the problem stays the same.

I tried to add flag -fseh-exceptions but received an error error: invalid exception model 'seh' for target 'x86_64-pc-windows-msvc19.0.24245'

I'm not fully sure whether this patch can address my issue, but would appreciate if anyone could shed some light on this. Thanks in advance.

Any updates on this patch? Would like to see it moving forward.

I was trying to use Google Breakpad with Clang. Breakpad is a crash handling library that generates a minidump at crash. The exception handler on Windows relies on SEH.

Breakpad works fine with MSVC. But when it is built with Clang, the exception code returned in exinfo>ExceptionRecord->ExceptionCode is incorrect. In my test program, an access to nullptr should trigger EXCEPTION_ACCESS_VIOLATION but what's returned is EXCEPTION_BREAKPOINT. I suspect it is due to Hardware Exception Handling in Clang.

I tried to add flag -fasync-exceptions but the problem stays the same.

I tried to add flag -fseh-exceptions but received an error error: invalid exception model 'seh' for target 'x86_64-pc-windows-msvc19.0.24245'

I'm not fully sure whether this patch can address my issue, but would appreciate if anyone could shed some light on this. Thanks in advance.

Hi, this patch is just part-1 of Windows SEH feature support. Part-2 work had been sitting in https://reviews.llvm.org/D102817/new/ since last May 2021. It's been reviewed, and feedbacks had been addressed. but nobody has approved it. To handle Hardware Exception in Clang & LLVM, we both patches in. thanks.

Hi, this patch is just part-1 of Windows SEH feature support. Part-2 work had been sitting in https://reviews.llvm.org/D102817/new/ since last May 2021. It's been reviewed, and feedbacks had been addressed. but nobody has approved it. To handle Hardware Exception in Clang & LLVM, we both patches in. thanks.

Thanks for the feedback. This feature is really useful.

We are attempting to switch from MSVC to Clang company-wise. We resolved most issues but lastly got stuck by Clang not supporting SEH. I believe we are not an unique case running into this issue. I'd be very happy to see this feature get merged in.

Hi, this patch is just part-1 of Windows SEH feature support. Part-2 work had been sitting in https://reviews.llvm.org/D102817/new/ since last May 2021. It's been reviewed, and feedbacks had been addressed. but nobody has approved it. To handle Hardware Exception in Clang & LLVM, we both patches in. thanks.

Thanks for the feedback. This feature is really useful.

We are attempting to switch from MSVC to Clang company-wise. We resolved most issues but lastly got stuck by Clang not supporting SEH. I believe we are not an unique case running into this issue. I'd be very happy to see this feature get merged in.

Yes totally understood. but I had been blocked to move forward Part-2 patch https://reviews.llvm.org/D102817/new/ for almost one year. I cannot just land it in without an "approval".
thanks.

(Please mark this "closed" to reflect the fact that it was merged.)

clang/lib/Sema/JumpDiagnostics.cpp
935

I'm trying to follow this, and not really understanding. Take the following:

void f(int a);
void f(int ex, int lu, int lu2, int lu3) {
  __try {
    {
    int a = 1;
    foo:
      f(a);
    }
    __try {
        f(2);
        goto foo;
    } __except (ex){
    }
  } __except (ex){
  }
}

We emit an @llvm.seh.scope.begin() which... doesn't appear to do anything? I guess maybe that's just a bug, and you only intended to catch jumps into scopes involving a non-trivial destructor? Okay, let's try that:

struct A { ~A() noexcept; };
void f(int a);
void f(int ex, int lu, int lu2, int lu3) {
  __try {
    {
    A a;
    foo:
      f(1);
    }
    __try {
        f(2);
        goto foo;
    } __except (ex){
    }
  } __except (ex){
  }
}

Here we get... two @llvm.seh.scope.begin, and no @llvm.seh.scope.end?

Okay, maybe this is actually only meant to work with "try", not "_try". Let's see what happens:

struct A { ~A() noexcept; };
void f(int a);
void f(int ex, int lu, int lu2, int lu3) {
  try {
    f(0);
    {
    A a;
    foo:
      f(1);
    }
        f(2);
        goto foo;
  } catch (...){
  }
}

I get two llvm.seh.scope.begin in a row, followed by one llvm.seh.scope.end. Which still seems wrong.

I'm really not sure what you're hoping to accomplish here.

asmith commandeered this revision.Apr 21 2022, 11:23 PM
asmith edited reviewers, added: tentzen; removed: asmith.
asmith accepted this revision.Apr 21 2022, 11:28 PM
This revision is now accepted and ready to land.Apr 21 2022, 11:49 PM
asmith closed this revision.Apr 21 2022, 11:50 PM
Patuti added a subscriber: Patuti.May 11 2022, 12:49 PM