This is an archive of the discontinued LLVM Phabricator instance.

[COFF, ARM64] Implement Intrinsic.sponentry for AArch64
ClosedPublic

Authored by yinma on Oct 24 2018, 2:41 PM.

Details

Summary

This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.

Diff Detail

Event Timeline

yinma created this revision.Oct 24 2018, 2:41 PM
yinma added a reviewer: TomTan.

How is the corresponding case handled for x86_64, where you afaik also need the original stack pointer?

mgrang retitled this revision from Implement Intrinsic.sponentry for AArch64 to [COFF, ARM64] Implement Intrinsic.sponentry for AArch64.Oct 24 2018, 3:00 PM
mgrang added inline comments.Oct 24 2018, 3:01 PM
lib/Target/AArch64/AArch64FastISel.cpp
3466

Period at end of comments.

yinma added a comment.Oct 24 2018, 3:32 PM

I debugged MS ARM64 setjmp/longjmp functions and ARM64 EH runtime. If you looks like assembly output from CL for ARM64, it is using SP instead of FP.

I debugged MS ARM64 setjmp/longjmp functions and ARM64 EH runtime. If you looks like assembly output from CL for ARM64, it is using SP instead of FP.

Hmm, in my test it seems like it doesn't even use the SP as it was on on entry, it uses SP on entry + 16:

$ cat frame.c 
#include <setjmp.h>
void other(char* buf);
jmp_buf jb;
void func(int a) {
    char buf[100];
    setjmp(jb);
    other(buf);
}
$ cl -c frame.c -O2
Microsoft (R) C/C++ Optimizing Compiler Version 19.15.26726 for ARM64
Copyright (C) Microsoft Corporation.  All rights reserved.

frame.c
$ dumpbin -nologo -disasm frame.obj

Dump of file frame.obj

File Type: COFF OBJECT

func:
  0000000000000000: F81F0FFE  str         lr,[sp,#-0x10]!
  0000000000000004: 94000000  bl          __security_push_cookie
  0000000000000008: D10183FF  sub         sp,sp,#0x60
  000000000000000C: 90000008  adrp        x8,jb
  0000000000000010: 91000100  add         x0,x8,jb
  0000000000000014: 910203E1  add         x1,sp,#0x80
  0000000000000018: 94000000  bl          _setjmpex
  000000000000001C: 910003E0  mov         x0,sp
  0000000000000020: 94000000  bl          other
  0000000000000024: 910183FF  add         sp,sp,#0x60
  0000000000000028: 94000000  bl          __security_pop_cookie
  000000000000002C: F84107FE  ldr         lr,[sp],#0x10
  0000000000000030: D65F03C0  ret

In this case, we first subtract sp by 0x10 and later by 0x60, but in the parameter to _setjmpex we pass sp + 0x80.

test/CodeGen/AArch64/sponentry.ll
2

This test should test both with fastisel and dagisel, and should test cases both with and without fixed objects.

Because __security_push_cookie has extra 0x10 subtraction on SP, if you removes buf[100], you will see no security_push_cookie is used. it will be sp on entry.

__security_push_cookie:

000000014000D680: D10043FF  sub         sp,sp,#0x10
000000014000D684: F0000471  adrp        xip1,__scrt_native_dllmain_reason
000000014000D688: 91002231  add         xip1,xip1,#8
000000014000D68C: F9400231  ldr         xip1,[xip1]
000000014000D690: CB3163F1  sub         xip1,sp,xip1
000000014000D694: F90007F1  str         xip1,[sp,#8]
000000014000D698: D65F03C0  ret

__security_pop_cookie:

000000014000D69C: F0000471  adrp        xip1,__scrt_native_dllmain_reason
000000014000D6A0: 91002231  add         xip1,xip1,#8
000000014000D6A4: F94007F0  ldr         xip0,[sp,#8]
000000014000D6A8: F9400231  ldr         xip1,[xip1]
000000014000D6AC: CB3063F0  sub         xip0,sp,xip0
000000014000D6B0: EB11021F  cmp         xip0,xip1
000000014000D6B4: 54000061  bne         000000014000D6C0
000000014000D6B8: 910043FF  add         sp,sp,#0x10
000000014000D6BC: D65F03C0  ret
000000014000D6C0: AA1003E0  mov         x0,xip0

Because __security_push_cookie has extra 0x10 subtraction on SP, if you removes buf[100], you will see no security_push_cookie is used. it will be sp on entry.

Ah, I see. Ok, then I guess this aspect makes sense.

yinma updated this revision to Diff 171207.Oct 25 2018, 3:37 PM

Please review again

yinma marked 2 inline comments as done.Oct 25 2018, 3:37 PM

Except for the concern about the test, this looks sensible to me. As this adds a new intrinsic in the public IR, I'd prefer if someone else with a bit more experience with these things to approve it as well though - @efriedma?

test/CodeGen/AArch64/sponentry.ll
23

Are you sure this actually triggers the fixed object codepath? (This looks a lot like the test example I posted earlier, and that wasn't at least designed to trigger fixed objects.) I think at least a function with varargs has fixed objects.

mstorsjo added inline comments.Oct 26 2018, 1:02 PM
test/CodeGen/AArch64/sponentry.ll
23

Indeed, I tested this, and both functions show numFixed = 0. So is the other codepath actually tested with some other code, or still untested?

All new target-independent intrinsics need to be documented in LangRef.

Thinking about it a bit, this seems like a reasonable intrinsic to support across all (or at least most) targets; the meaning is well-defined, and it can't be computed from any of the other frame-related intrinsics, so I think it makes sense to support cross-platform.

That said, maybe instead of adding a new intrinsic, we should rename llvm.addressofreturnaddress? It's the same thing on x86, and addressofreturnaddress doesn't really make sense on other targets anyway.

rnk added inline comments.Oct 26 2018, 2:59 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
5186–5190

What invariant are you relying on for fixed objects here? This seems like it could be fragile. Stack protectors and exception handling schemes often introduce new fixed objects. You might want to scan for the first or last parameter or CSR or whatever it is that you're really trying to be relative to.

yinma updated this revision to Diff 171357.Oct 26 2018, 3:01 PM

add var_arg test case

yinma marked an inline comment as done.Oct 26 2018, 3:01 PM

That said, maybe instead of adding a new intrinsic, we should rename llvm.addressofreturnaddress? It's the same thing on x86, and addressofreturnaddress doesn't really make sense on other targets anyway.

Nevermind, ignore this; apparently we actually need a separate llvm.addressofreturnaddress() to return the address of the return address on non-x86 targets, to implement the MSVC _AddressOfReturnAddress().

efriedma added inline comments.Oct 29 2018, 2:54 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
5186–5190

You can also get fixed objects by just having a function with too many arguments to fit in registers. (Please add a testcase.)

There isn't really any existing fixed object with the right offset, in general. But I think you can unconditionally compute the correct offset with something like DAG.getFrameIndex(MFI.CreateFixedObject(1, 0, true), VT). Or I guess you could use a pseudo-instruction, and expand it after frame lowering, but that doesn't seem obviously better.

yinma updated this revision to Diff 171580.Oct 29 2018, 2:55 PM
yinma updated this revision to Diff 172004.Oct 31 2018, 1:34 PM
efriedma added inline comments.Oct 31 2018, 2:14 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
5184

This might become unnecessary. But if it doesn't, please use setFrameAddressIsTaken to force a frame pointer to be emitted, instead of checking whether frame pointer elimination is enabled.

5189

Do you need the "if" statement here? CreateFixedObject should work whether or not there are other fixed objects, I think.

yinma updated this revision to Diff 172026.Oct 31 2018, 2:51 PM
yinma updated this revision to Diff 172031.Oct 31 2018, 3:11 PM
This revision is now accepted and ready to land.Oct 31 2018, 3:15 PM
This revision was automatically updated to reflect the committed changes.