This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.
Details
Diff Detail
Event Timeline
How is the corresponding case handled for x86_64, where you afaik also need the original stack pointer?
lib/Target/AArch64/AArch64FastISel.cpp | ||
---|---|---|
3466 | Period at end of comments. |
I debugged MS ARM64 setjmp/longjmp functions and ARM64 EH runtime. If you looks like assembly output from CL for ARM64, it is using SP instead of FP.
Hmm, in my test it seems like it doesn't even use the SP as it was on on entry, it uses SP on entry + 16:
$ cat frame.c #include <setjmp.h> void other(char* buf); jmp_buf jb; void func(int a) { char buf[100]; setjmp(jb); other(buf); } $ cl -c frame.c -O2 Microsoft (R) C/C++ Optimizing Compiler Version 19.15.26726 for ARM64 Copyright (C) Microsoft Corporation. All rights reserved. frame.c $ dumpbin -nologo -disasm frame.obj Dump of file frame.obj File Type: COFF OBJECT func: 0000000000000000: F81F0FFE str lr,[sp,#-0x10]! 0000000000000004: 94000000 bl __security_push_cookie 0000000000000008: D10183FF sub sp,sp,#0x60 000000000000000C: 90000008 adrp x8,jb 0000000000000010: 91000100 add x0,x8,jb 0000000000000014: 910203E1 add x1,sp,#0x80 0000000000000018: 94000000 bl _setjmpex 000000000000001C: 910003E0 mov x0,sp 0000000000000020: 94000000 bl other 0000000000000024: 910183FF add sp,sp,#0x60 0000000000000028: 94000000 bl __security_pop_cookie 000000000000002C: F84107FE ldr lr,[sp],#0x10 0000000000000030: D65F03C0 ret
In this case, we first subtract sp by 0x10 and later by 0x60, but in the parameter to _setjmpex we pass sp + 0x80.
test/CodeGen/AArch64/sponentry.ll | ||
---|---|---|
2 | This test should test both with fastisel and dagisel, and should test cases both with and without fixed objects. |
Because __security_push_cookie has extra 0x10 subtraction on SP, if you removes buf[100], you will see no security_push_cookie is used. it will be sp on entry.
__security_push_cookie:
000000014000D680: D10043FF sub sp,sp,#0x10 000000014000D684: F0000471 adrp xip1,__scrt_native_dllmain_reason 000000014000D688: 91002231 add xip1,xip1,#8 000000014000D68C: F9400231 ldr xip1,[xip1] 000000014000D690: CB3163F1 sub xip1,sp,xip1 000000014000D694: F90007F1 str xip1,[sp,#8] 000000014000D698: D65F03C0 ret
__security_pop_cookie:
000000014000D69C: F0000471 adrp xip1,__scrt_native_dllmain_reason 000000014000D6A0: 91002231 add xip1,xip1,#8 000000014000D6A4: F94007F0 ldr xip0,[sp,#8] 000000014000D6A8: F9400231 ldr xip1,[xip1] 000000014000D6AC: CB3063F0 sub xip0,sp,xip0 000000014000D6B0: EB11021F cmp xip0,xip1 000000014000D6B4: 54000061 bne 000000014000D6C0 000000014000D6B8: 910043FF add sp,sp,#0x10 000000014000D6BC: D65F03C0 ret 000000014000D6C0: AA1003E0 mov x0,xip0
Except for the concern about the test, this looks sensible to me. As this adds a new intrinsic in the public IR, I'd prefer if someone else with a bit more experience with these things to approve it as well though - @efriedma?
test/CodeGen/AArch64/sponentry.ll | ||
---|---|---|
23 | Are you sure this actually triggers the fixed object codepath? (This looks a lot like the test example I posted earlier, and that wasn't at least designed to trigger fixed objects.) I think at least a function with varargs has fixed objects. |
test/CodeGen/AArch64/sponentry.ll | ||
---|---|---|
23 | Indeed, I tested this, and both functions show numFixed = 0. So is the other codepath actually tested with some other code, or still untested? |
All new target-independent intrinsics need to be documented in LangRef.
Thinking about it a bit, this seems like a reasonable intrinsic to support across all (or at least most) targets; the meaning is well-defined, and it can't be computed from any of the other frame-related intrinsics, so I think it makes sense to support cross-platform.
That said, maybe instead of adding a new intrinsic, we should rename llvm.addressofreturnaddress? It's the same thing on x86, and addressofreturnaddress doesn't really make sense on other targets anyway.
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
5186–5190 | What invariant are you relying on for fixed objects here? This seems like it could be fragile. Stack protectors and exception handling schemes often introduce new fixed objects. You might want to scan for the first or last parameter or CSR or whatever it is that you're really trying to be relative to. |
That said, maybe instead of adding a new intrinsic, we should rename llvm.addressofreturnaddress? It's the same thing on x86, and addressofreturnaddress doesn't really make sense on other targets anyway.
Nevermind, ignore this; apparently we actually need a separate llvm.addressofreturnaddress() to return the address of the return address on non-x86 targets, to implement the MSVC _AddressOfReturnAddress().
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
5186–5190 | You can also get fixed objects by just having a function with too many arguments to fit in registers. (Please add a testcase.) There isn't really any existing fixed object with the right offset, in general. But I think you can unconditionally compute the correct offset with something like DAG.getFrameIndex(MFI.CreateFixedObject(1, 0, true), VT). Or I guess you could use a pseudo-instruction, and expand it after frame lowering, but that doesn't seem obviously better. |
lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
5184 | This might become unnecessary. But if it doesn't, please use setFrameAddressIsTaken to force a frame pointer to be emitted, instead of checking whether frame pointer elimination is enabled. | |
5189 | Do you need the "if" statement here? CreateFixedObject should work whether or not there are other fixed objects, I think. |
Period at end of comments.