This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Generate SEH info for PAC instructions
ClosedPublic

Authored by mstorsjo on Oct 3 2022, 1:53 PM.

Event Timeline

mstorsjo created this revision. Oct 3 2022, 1:53 PM
Herald added a project: Restricted Project. Oct 3 2022, 1:53 PM
mstorsjo requested review of this revision. Oct 3 2022, 1:53 PM

For the following example:

int f(void g(), int a) { g(); return a; }

If I compile with the following command:

cl /c a.c /d2guardsignret

llvm-readobj gives me the following:

File: a.obj
Format: COFF-ARM64
Arch: aarch64
AddressSize: 64bit
UnwindInformation [
  RuntimeFunction {
    Function: f (0x0)
    ExceptionRecord: $unwind$f (0x0)
    ExceptionData {
      FunctionLength: 40
      Version: 0
      ExceptionData: No
      EpiloguePacked: Yes
      EpilogueOffset: 0
      ByteCodeLength: 8
      Prologue [
        0xd600              ; stp x19, lr, [sp, #0]
        0x01                ; sub sp, #16
        0xfc                ; Bad opcode!
        0xe4                ; end
      ]
    }
  }
]

So apparently there is, in fact, a way to encode this, using the undocumented opcode 0xfc. Why this isn't documented, I have no idea.
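To make the dump above easier to follow, here is a minimal sketch of a decoder for just the four unwind codes that appear in it. The bit layouts for `alloc_s`, `save_lrpair` and `end` follow Microsoft's ARM64 exception handling documentation as I understand it. The name `pac_sign_lr` for 0xfc is the one I believe was later documented, so treat it as an assumption in the context of this thread. The function name `decodeUnwindCodes` is hypothetical and is not part of LLVM.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes only the four ARM64 unwind codes seen in the dump above.
// Any other byte is reported as "unknown" and stops decoding, because
// this sketch does not know how long that opcode is.
std::vector<std::string> decodeUnwindCodes(const std::vector<uint8_t> &bytes) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < bytes.size();) {
    uint8_t b = bytes[i];
    if ((b & 0xE0) == 0x00) {
      // alloc_s: 000xxxxx, allocates x * 16 bytes of stack.
      out.push_back("alloc_s " + std::to_string((b & 0x1F) * 16));
      i += 1;
    } else if ((b & 0xFE) == 0xD6 && i + 1 < bytes.size()) {
      // save_lrpair: 1101011x'xxzzzzzz, saves <x(19 + 2*X), lr> at [sp + Z*8].
      unsigned x = ((b & 0x01u) << 2) | (bytes[i + 1] >> 6);
      unsigned z = bytes[i + 1] & 0x3Fu;
      out.push_back("save_lrpair x" + std::to_string(19 + 2 * x) + ", " +
                    std::to_string(z * 8));
      i += 2;
    } else if (b == 0xFC) {
      // The opcode llvm-readobj printed as "Bad opcode!" (name assumed).
      out.push_back("pac_sign_lr");
      i += 1;
    } else if (b == 0xE4) {
      // end: terminates the unwind code sequence.
      out.push_back("end");
      break;
    } else {
      out.push_back("unknown");
      break;
    }
  }
  return out;
}
```

Feeding it the bytes from the dump (`0xd6, 0x00, 0x01, 0xfc, 0xe4`) yields one entry per line of the `Prologue [...]` listing, with 0xfc taking the place of the "Bad opcode!" line.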


Mapping BTI instructions to no-ops seems fine; I can't imagine any other encoding makes sense, even if Microsoft does implement it at some point.

Oh, interesting. I guess we can try to add that one then, to cover this case.

I tried this out - apparently this option has been around since some time in MSVC 2019 at least. And apparently it's also possible to invoke this with /guard:signret. This generates the pacibsp instruction - I wonder if there's a way to make it generate paciasp instead - which I presume would require a separate unwinding opcode to distinguish from pacibsp?

However, curiously, I don't get the output you got, if I compile with the same command - I have to build with /O2 for it to produce this unwind info. Without /O2, I get packed unwind, with CR=2 (which is defined as reserved) - which apparently is a way to handle signed return addresses in packed unwind info too.
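For readers unfamiliar with the packed form mentioned here, the following sketch pulls the fields out of a 32-bit packed unwind word. The field positions (Flag, FunctionLength, RegF, RegI, H, CR, FrameSize) are my reading of the ARM64 packed unwind data layout in Microsoft's documentation; the struct and function names are hypothetical, and the sketch deliberately does not interpret what CR=2 means beyond extracting it.

```cpp
#include <cstdint>

// Fields of an ARM64 packed unwind data word (layout assumed from the
// public ARM64 exception handling documentation).
struct PackedUnwind {
  unsigned flag;           // bits 0-1
  unsigned functionLength; // bits 2-12, stored in units of 4 bytes
  unsigned regF;           // bits 13-15
  unsigned regI;           // bits 16-19
  unsigned h;              // bit 20
  unsigned cr;             // bits 21-22; 2 was listed as reserved at the time
  unsigned frameSize;      // bits 23-31, stored in units of 16 bytes
};

PackedUnwind decodePackedUnwind(uint32_t w) {
  PackedUnwind p;
  p.flag = w & 0x3u;
  p.functionLength = ((w >> 2) & 0x7FFu) * 4; // convert to bytes
  p.regF = (w >> 13) & 0x7u;
  p.regI = (w >> 16) & 0xFu;
  p.h = (w >> 20) & 0x1u;
  p.cr = (w >> 21) & 0x3u;
  p.frameSize = ((w >> 23) & 0x1FFu) * 16;    // convert to bytes
  return p;
}
```

As a made-up illustration (not a value taken from MSVC output), the word `0x00C10029` decodes to Flag=1, a 40-byte function, RegI=1, CR=2 and a 16-byte frame.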

I filed https://github.com/MicrosoftDocs/cpp-docs/pull/4202 as an initial attempt to document this (and hopefully poke the right people to document it publicly).

Sorry, I think I confused the commands when I posted.

mstorsjo updated this revision to Diff 465421. Oct 5 2022, 8:48 AM

Split out BTI to a separate patch, generating the new opcode for PAC.

mstorsjo retitled this revision from "RFC: [AArch64] Add SEH_Nop for PAC/BTI instructions in prologues/epilogues" to "[AArch64] Generate SEH info for PAC instructions". Oct 5 2022, 8:49 AM
mstorsjo edited the summary of this revision.
mstorsjo updated this revision to Diff 466191. Oct 7 2022, 2:48 PM

Updated with the real name of the opcode.

efriedma added inline comments. Oct 11 2022, 10:52 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp, line 1883

Technically we could support this; we can represent it as "end_c; pac_sign_return_address; end". But we can leave that as a followup.

Probably worth adding a testcase for this, in any case.

mstorsjo updated this revision to Diff 467084. Oct 12 2022, 4:20 AM

Added a testcase which builds with "target-features"="+v8.3a", to test that we avoid generating "retab" at the moment.

efriedma accepted this revision. Oct 12 2022, 9:57 AM

LGTM

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp, line 1021

The original whitespace is right here. (This whole switch is over-indented.)

This revision is now accepted and ready to land. Oct 12 2022, 9:57 AM
This revision was landed with ongoing or failed builds. Oct 12 2022, 12:22 PM
This revision was automatically updated to reflect the committed changes.