Diff Detail
Event Timeline
Updated the patch with the following changes:
- Emit llvm.x86.seh.recoverfp only for non-aarch64 targets. For aarch64 windows, the parent fp is always passed in x1. So we don't need a separate instrinsic to recover the fp.
- Enable SEH exceptions by default for ARM64 windows. If exceptions are not enabled then function which may raise an exception are by default marked as "nounwind".
- Removed ARM64 specific tests from exceptions-seh.c. I will push more comprehensive tests for ARM64 SEH soon.
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
471 ↗ | (On Diff #171994) | I'm suspicious about adding a new default for AArch64 specifically; why do we need this if x86-64 doesn't? |
What about the three stack pointer case of stack realignment plus a dynamic alloca? Typically this is the case where recoverfp is necessary.
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
471 ↗ | (On Diff #171994) | Yeah, -cc1 -fexceptions really means "should I emit destructor cleanups", and SEH doesn't require that at all. |
@rnk Sorry, I missed your comment earlier. I am not sure what scenario the three stack pointer case is. Could you please give me a test case for it?
I think you just need a local variable with an alignment greater than the ABI provides, something like:
struct Foo { Foo(); int x, y, z; }; int filter(int); void may_throw(); void seh_byval(int n) { Foo __declspec(align(32)) o; __try { may_throw(); } __except(filter(o.x)) { } }
We seem to be generating the same code as MSVC. Here's my test case:
struct Foo { void (*ptr)(); int x, y, z; }; int filter(int); void may_throw(); void seh_byval(int n) { struct Foo __declspec(align(32)) o; __try { may_throw(); } __except(filter(o.x)) { } }
MSVC:
|seh_byval$filt$0| PROC stp fp,lr,[sp,#-0x10]! mov fp,x1 |$LN5@seh_byval$| ldr x8,[fp,#0x10] ldr w0,[x8,#8] bl filter nop |$LN7@seh_byval$| ldp fp,lr,[sp],#0x10 ret
LLVM:
.seh_proc "?filt$0@0@seh_byval@@" ; %bb.0: ; %entry movz x8, #:abs_g1_s:.Lseh_byval$frame_escape_0 movk x8, #:abs_g0_nc:.Lseh_byval$frame_escape_0 add x8, x1, x8 ldr w0, [x8, #8] b filter
I can't compile the example you gave yet because I haven't applied your patches locally, but this is the "three stack pointer" case that I have in mind:
struct Foo { void (*ptr)(); int x, y, z; }; void escape(void *); void large_align(int a0, int a1, int a2, int a3, int a4, int a5, int a6, int a7, int stackarg) { struct Foo __declspec(align(32)) alignedlocal; alignedlocal.x = 42; int vla[a0]; escape(&alignedlocal); vla[0] = stackarg; escape(&vla[0]); }
This is LLVM's generated code:
"?large_align@@YAXHHHHHHHHH@Z": ; @"?large_align@@YAXHHHHHHHHH@Z" .seh_proc "?large_align@@YAXHHHHHHHHH@Z" ; %bb.0: ; %entry stp x21, x22, [sp, #-48]! ; 16-byte Folded Spill stp x19, x20, [sp, #16] ; 16-byte Folded Spill stp x29, x30, [sp, #32] ; 16-byte Folded Spill add x29, sp, #32 ; =32 sub x9, sp, #48 ; =48 and sp, x9, #0xffffffffffffffe0 mov x19, sp mov w8, #42 str w8, [x19, #8] mov w8, w0 ldr w22, [x29, #16] lsl x8, x8, #2 add x8, x8, #15 ; =15 lsr x15, x8, #4 mov x21, sp bl __chkstk mov x8, sp sub x20, x8, x15, lsl #4 mov sp, x20 add x0, x19, #0 ; =0 bl "?escape@@YAXPEAX@Z" mov x0, x20 str w22, [x20] bl "?escape@@YAXPEAX@Z" mov sp, x21 sub sp, x29, #32 ; =32 ldp x29, x30, [sp, #32] ; 16-byte Folded Reload ldp x19, x20, [sp, #16] ; 16-byte Folded Reload ldp x21, x22, [sp], #48 ; 16-byte Folded Reload ret
I see three pointers used to address the stack:
- sp: to address vla
- x19: to address locals, the so-called "base" pointer
- x29: to address parameters on the stack, looks like the traditional FP, points to FP+LR pair as well
At least for x86, the unwind info doesn't describe X19, it just describes X29, since that's what you need to restore CSRs and find the parent stack frame. We saw that the Windows EH runtime passes in some value based on the unwind info. For x86, it was just whatever EBP held, so recoverfp simply aligns that value forward to recover the base pointer (ESI) and then uses that to recover locals. For x64, the runtime passes in the value of RSP after the prologue ends, so we adjust it by the "parent frame offset" to get back the value we put in RBP. It looks like for x64 we never handled the case I'm asking you about, because this program doesn't print the right value:
#include <stdio.h> struct Foo { void (*ptr)(); int x, y, z; }; int filter(int n) { printf("o.x: %d\n", n); return 1; } void may_throw() { __builtin_trap(); } int main() { struct Foo __declspec(align(32)) o; o.x = 42; __try { may_throw(); } __except(filter(o.x)) { } }
I get "o.x: 0" instead of 42. I bet we can find something about that in bugzilla somewhere. =/
So, hopefully that explains the intended purpose of llvm.x86.seh.recoverfp, and why we might need to generalize it into something non-x86 specific. Maybe llvm.eh.recoverfp. Let me know if I can clarify anything else.
@rnk Thanks a lot for the clarification. Yes, I see o.x: 0 instead of 42. Supporting this case would mean implementing recoverfp as well as support generating the correct parent frame offset for arm64 windows. Do you think this can be done in a follow-up patch? So this patch and D53540 would add the basic support for SEH and we can go fix corner/more complex cases in follow-up patches. There are also several more comprehensive test cases in https://github.com/Microsoft/windows_seh_tests which we plan to address next.
Sure, there's no need to handle the case of highly aligned stack objects in the initial version. But I would recommend making incremental progress in a way that brings us closer to the desired final end state.
I'd recommend renaming llvm.x86.seh.recoverfp to something non-architecture specific, and then implementing it as a no-op in the AArch64 backend. It will simply return the FP it is given, with FIXME comments about handling highly-aligned locals in the future. That'll save us this target check in clang, which we don't need in the long run.
I have added a default lowering for llvm.eh.recoverfp in D53540. So this patch is no longer needed.