[AArch64, compiler-rt] Implement trampoline intrinsics

Authored by iamlouk on May 4 2023, 6:52 AM.



The llvm.init.trampoline intrinsic is custom-lowered to a call to
__trampoline_setup during AArch64 instruction selection. Because these
trampolines require an executable stack, a fatal error is reported
on Android and AArch64 Darwin platforms instead.

The function __trampoline_setup is implemented in compiler-rt
for the AArch64 architecture.

This patch is inspired by the implementation of the same
intrinsics on the PowerPC architecture.

Event Timeline

iamlouk created this revision. May 4 2023, 6:52 AM
Herald added a project: Restricted Project. May 4 2023, 6:52 AM
iamlouk requested review of this revision. May 4 2023, 6:52 AM
Herald added projects: Restricted Project, Restricted Project. May 4 2023, 6:52 AM
Herald added subscribers: llvm-commits, Restricted Project.
compnerd added inline comments. May 6 2023, 9:48 AM

Does this apply to Windows? I think that the condition here is incorrect.


This will emit this on Windows as well, is this supposed to work there?

iamlouk added inline comments. May 7 2023, 11:37 PM

The x86 trampoline intrinsic (which inlines this logic) does work on Windows. Sadly, I could not find out from the online documentation I looked through whether Windows on ARM still allows an executable stack, and I have no way to test it myself. I will search some more, but if I do not find a clear answer, I will change this condition, and the one where the call is emitted, to exclude Windows in the next differential.


Same as above, where the trampoline setup function is implemented: I will change this to exclude Windows in the next differential I upload.

iamlouk updated this revision to Diff 522076. May 15 2023, 12:53 AM

Rebase and Windows on AArch64 treatment

The trampoline intrinsic is no longer handled for Windows on AArch64.

iamlouk marked 2 inline comments as done. May 15 2023, 12:55 AM
compnerd added inline comments. Jun 9 2023, 11:01 AM

Is this intrinsic available in GCC? When was it added there? This could be an ABI break otherwise. I think we should check the runtime that we are going to link against before lowering this.


Could you add negative checks for the other platforms please?

iamlouk added inline comments. Jun 28 2023, 2:15 AM

No, GCC/libgcc does not have this for AArch64. GCC actually writes a trampoline template into the text section, then copies it onto the stack of the function that calls GCC's equivalent of init.trampoline, and then creates instructions that store the address of the function and the nest parameter into the copied template.
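
To make the template approach described above concrete, here is a minimal C sketch of what such a trampoline could look like once it has been copied and patched. It only fills a byte buffer and never executes it. The layout (two PC-relative literal loads, an indirect branch, one padding word, then two 8-byte data slots), the register choices x17 and x18, and all function names are assumptions made for illustration; they are not taken from the patch or copied from the GCC sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4 instruction words (16 bytes) followed by
 * two 8-byte data slots (function address, nest value). */
enum { TRAMP_CODE_WORDS = 4, TRAMP_SIZE = 32 };

/* Encode `ldr Xt, <literal>` (64-bit load from a PC-relative address).
 * byte_offset is relative to the instruction itself and must be a
 * multiple of 4. */
static uint32_t encode_ldr_literal_x(unsigned rt, int32_t byte_offset) {
    assert(rt < 32 && byte_offset % 4 == 0);
    uint32_t imm19 = ((uint32_t)(byte_offset / 4)) & 0x7FFFFu;
    return 0x58000000u | (imm19 << 5) | rt;
}

/* Encode `br Xn` (indirect branch to the address held in Xn). */
static uint32_t encode_br(unsigned rn) {
    assert(rn < 32);
    return 0xD61F0000u | ((uint32_t)rn << 5);
}

/* Write the sketch trampoline into buf. The code part is the constant
 * "template"; only the two data slots depend on the arguments.
 * Assumes a little-endian host, matching AArch64 instruction order. */
static void fill_trampoline_sketch(unsigned char buf[TRAMP_SIZE],
                                   uint64_t func_addr,
                                   uint64_t nest_value) {
    const uint32_t code[TRAMP_CODE_WORDS] = {
        encode_ldr_literal_x(17, 16), /* offset 0: x17 = word at offset 16 */
        encode_ldr_literal_x(18, 20), /* offset 4: x18 = word at offset 24 */
        encode_br(17),                /* offset 8: jump to the real function */
        0xD503201Fu,                  /* offset 12: nop, keeps data 8-byte aligned */
    };
    memcpy(buf, code, sizeof code);
    memcpy(buf + 16, &func_addr, sizeof func_addr);
    memcpy(buf + 24, &nest_value, sizeof nest_value);
}
```

Because the function address and nest value are loaded from data slots next to the code, this form needs no 64-bit immediates. A real trampoline would additionally have to live in executable memory and have the instruction cache synchronized (for example with `__builtin___clear_cache`) before being called; the sketch leaves both out.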

On x86, for example, inlining the trampoline creation is very simple because there is an instruction for loading a 64-bit immediate, but the lack thereof makes this tricky on AArch64.
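
As a rough illustration of that difference: AArch64 instructions are a fixed 32 bits wide, so an arbitrary 64-bit constant is typically built from one MOVZ and up to three MOVK instructions, each carrying 16 bits. The C sketch below encodes such a four-instruction sequence and decodes it again as a sanity check. The helper names are hypothetical, and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Encode `movz Xd, #imm16, lsl #(16*hw)`: set Xd to imm16 shifted
 * left by 16*hw bits, zeroing all other bits. */
static uint32_t encode_movz_x(unsigned rd, uint16_t imm16, unsigned hw) {
    assert(rd < 32 && hw < 4);
    return 0xD2800000u | ((uint32_t)hw << 21) | ((uint32_t)imm16 << 5) | rd;
}

/* Encode `movk Xd, #imm16, lsl #(16*hw)`: replace one 16-bit slice
 * of Xd and keep the other bits. */
static uint32_t encode_movk_x(unsigned rd, uint16_t imm16, unsigned hw) {
    assert(rd < 32 && hw < 4);
    return 0xF2800000u | ((uint32_t)hw << 21) | ((uint32_t)imm16 << 5) | rd;
}

/* Emit four instructions that together load `value` into Xd.
 * (No attempt is made to skip slices that are zero.) */
static void encode_load_imm64(unsigned rd, uint64_t value, uint32_t out[4]) {
    out[0] = encode_movz_x(rd, (uint16_t)(value & 0xFFFFu), 0);
    for (unsigned hw = 1; hw < 4; ++hw) {
        out[hw] = encode_movk_x(rd, (uint16_t)((value >> (16 * hw)) & 0xFFFFu), hw);
    }
}

/* Recover the constant from a sequence produced by encode_load_imm64,
 * reading the hw and imm16 fields of each instruction word. */
static uint64_t decode_imm64(const uint32_t insns[4]) {
    uint64_t value = 0;
    for (unsigned i = 0; i < 4; ++i) {
        unsigned hw = (insns[i] >> 21) & 0x3u;
        uint64_t imm16 = (insns[i] >> 5) & 0xFFFFu;
        value |= imm16 << (16 * hw);
    }
    return value;
}
```

Inlined trampoline creation on AArch64 would have to patch the immediate fields of several such instructions (or use literal loads instead), whereas on x86-64 a single instruction can carry the whole 64-bit constant.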

I can try a version that is closer to what GCC does, if that is preferred. That would remove the call to compiler-rt (and the function there), but make the lowering of the intrinsic a lot more complex.

It is maybe worth pointing out that clang does not actually implement nested functions, and will never create these intrinsics, so I don't know what ABI would break here.

(Sorry for the very late reply)

compnerd added inline comments. Jun 30 2023, 8:15 PM

That makes this even less valuable - GCC will not create the call, clang will not create the function. I think that doing what GCC does here is the correct thing to do.

iamlouk abandoned this revision. Jul 5 2023, 11:34 AM

Thank you for your feedback, compnerd. Changing the implementation to mimic GCC's way of handling its equivalent of this intrinsic requires a completely new approach though, so I am abandoning this revision.