This is an archive of the discontinued LLVM Phabricator instance.


Differential D156716

[AArch64][PAC] Check authenticated LR value during tail call
ClosedPublic

Authored by atrosinenko on Jul 31 2023, 9:48 AM.

Details

Reviewers

ab
kristof.beyls
apazos
pcc
psmith
t.p.northover

Commits

rG1d2b558265bd: [AArch64][PAC] Check authenticated LR value during tail call

Summary

When performing a tail call, check the value of the LR register after
authentication to prevent the callee from signing and spilling an
untrusted value. This commit implements a few variants of the check;
more can be added later.

If it is safe to assume that executable pages are always readable,
LR can be checked just by dereferencing the LR value via LDR.

As an alternative, LR can be checked as follows:

  ; lowered AUT* instruction
  ; <some variant of check that LR contains a valid address>
  b.cond break_block
ret_block:
  ; lowered TCRETURN
break_block:
  brk 0xc471

As the existing methods either break compatibility with execute-only
memory mappings or can degrade performance, they are disabled by
default and can be explicitly enabled with a command line option.

Individual subtargets can opt in to using one of the available methods
by updating AArch64FrameLowering::getAuthenticatedLRCheckMethod().
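
The selection policy described in the summary (off by default, subtarget opt-in, command line override) might be modeled roughly as follows. This is a standalone sketch, not the code from the patch: `SubtargetModel` and `selectCheckMethod` are hypothetical names, and the enumerator names other than `DummyLoad` are assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical mirror of the check variants discussed in this review.
// Only DummyLoad is named in the discussion; the other names are assumed.
enum class AuthCheckMethod { None, DummyLoad, HighBitsNoTBI, XPACHint };

// Hypothetical stand-in for a subtarget description (not AArch64Subtarget).
struct SubtargetModel {
  // Subtargets opt in explicitly; the default is to emit no check at all.
  AuthCheckMethod PreferredMethod = AuthCheckMethod::None;
};

// An explicit command line choice (modeled as an optional) wins over the
// subtarget preference; otherwise the subtarget preference is used.
AuthCheckMethod
selectCheckMethod(const SubtargetModel &ST,
                  std::optional<AuthCheckMethod> CommandLineOverride) {
  if (CommandLineOverride)
    return *CommandLineOverride;
  return ST.PreferredMethod;
}

// Of the variants above, only the dummy load reads from the code page,
// so only it conflicts with execute-only mappings.
bool breaksExecuteOnly(AuthCheckMethod M) {
  return M == AuthCheckMethod::DummyLoad;
}
```

With this shape, a platform that enforces execute-only memory would simply never return `DummyLoad` from its subtarget preference.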

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

atrosinenko created this revision. Jul 31 2023, 9:48 AM

Herald added subscribers: JDevlieghere, hiraditya. Jul 31 2023, 9:48 AM

Herald added a project: Restricted Project. Jul 31 2023, 9:48 AM

This patch is inspired by the commit https://github.com/ahmedbougacha/llvm-project/commit/58cf59b84ca4e7930a640480fd5ad1ea194864f5 (and uses the same immediate operand for BRK instruction) but adds the checks during epilogue insertion instead of asm printing.

Herald added a project: Restricted Project. Jul 31 2023, 9:54 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B249257: Diff 545719. Jul 31 2023, 11:53 AM

atrosinenko added a child revision: D156785: [AArch64][PAC] Skip checking LR during tail call if FPAC is enabled. Aug 1 2023, 4:01 AM

kristof.beyls added inline comments. Aug 4 2023, 1:25 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	My understanding is that using a load instruction to check the LR can only be done when code segments are not "execute-only". The commit comment on https://github.com/llvm/llvm-project/commit/a932cd409b861582902211690b497cafc774bee6 suggests that at least LLD assumes that code generated for AArch64 targets is always compatible with execute-only segments. Therefore, I think that defaulting to checking-by-loading-lr is probably the wrong thing to do. It seems to me it would be better to default the other way and only allow checking-by-loading-lr when there is a guarantee that the target environment will not enforce "execute-only" code.
1917	This is just nitpicking: Would it be useful to have an isTailCallReturn function somewhere, and insert an assert(!isTailCallReturn(TI->getOpcode())) here? Given how a range of tail call pseudo opcodes have been added recently, it might be likely that a few more could get added in the future, in which case this switch statement needs to be adapted. I'm just always very cautious when doing a switch on a set of specific opcodes as opcodes tend to evolve over time and such switch statements might be hard to maintain correctly. That's why I tend to prefer having an assert in the default statement that hopefully catches when that happens. FWIW, https://github.com/llvm/llvm-project/blob/2df05cd01c17f3ef720e554dc7cde43df27e5224/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp#L275 already has a computation of an "IsTailCall" in this file (albeit that it doesn't consider TCRETURNriALL to be a tail call - is that an indication of an instance of the issue I described above?).
1925	I think this comment would be easier to read if instead of just saying "Turn it into:", it would also clearly indicate the intended effect. For example: "To avoid generating a signing oracle, generate a code sequence that explicitly checks if the authentication passed or failed, as follows." With just having written those extra few words now, I think that this extra checking may not be needed if the target core implements FEAT_FPAC (and maybe FEAT_FPACCOMBINE?). If so, maybe a FIXME here would be good to not generate the extra checking code sequences when FEAT_FPAC is implemented by the targeted core?
llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
12	nitpick: I think I'd prefer `[[AUTIASP]]` rather than `[[AUT]]` as the macro name as that makes it clearer on first read exactly which authenticate operation is expected here. I do like the use of the macro though to hide away the difference between `hint #29` and `autiasp`.
20	nitpick: similarly, I think I'd prefer `[[XPACLRI]]`

Address the review comments.

Updated the patch.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	Updated. Though, I wonder whether using the long snippet by default can introduce a noticeable performance regression. Meanwhile, is it permitted to remove a dead load in the machine pipeline (and should I mark it as volatile somehow)?
1917	Added an `AArch64InstrInfo::isTailCallReturnInst(MachineInstr &)` function and redirected the existing code in `getArgumentStackToRestore()` to it. As far as I understand, the new `TCRETURNriALL` is only used by the machine outliner and it seems to be used interchangeably with the existing `TCRETURNdi` instruction in https://github.com/llvm/llvm-project/blob/feafc2df43545e61a0ba67253284ecbabfd2ba09/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L8076C24-L8076C24. cc: @olista01
1925	Fixed. There exist separate review items D156784 and D156785 on taking FEAT_FPAC into account. I wonder whether particular CPU models should list `+fpac` as supported or whether it should only be requested explicitly by the user: unlike many other subtarget features, FPAC doesn't make executable code explicitly fail on an unsupported CPU but silently makes it a bit less secure. So, the case "I have CPU X that implements all the instructions that are supported by Y (but not FEAT_FPAC)" may go unnoticed.
llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
12	Fixed
20	Fixed
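
The defensive pattern discussed for line 1917 (a single predicate for "is this a tail call return", plus an assert in the default case of any switch that handles only some of those opcodes) could look roughly like this. It is a standalone sketch: the `Opcode` enum is a small illustrative subset rather than the backend's generated opcode list, and `isIndirectTailCall` is a hypothetical caller invented for the example.

```cpp
#include <cassert>

// Illustrative subset of opcodes; the real backend has many more, and this
// enum is an assumption made only to keep the example self-contained.
enum class Opcode { RET, BR, TCRETURNdi, TCRETURNri, TCRETURNriBTI, TCRETURNriALL };

// Single source of truth for "this pseudo is a tail call return".
bool isTailCallReturnInst(Opcode Opc) {
  switch (Opc) {
  case Opcode::TCRETURNdi:
  case Opcode::TCRETURNri:
  case Opcode::TCRETURNriBTI:
  case Opcode::TCRETURNriALL:
    return true;
  default:
    return false;
  }
}

// Hypothetical caller that switches over specific opcodes. The assert in the
// default case fires (in builds with assertions) if a new tail call pseudo is
// added to isTailCallReturnInst but forgotten here.
bool isIndirectTailCall(Opcode Opc) {
  switch (Opc) {
  case Opcode::TCRETURNdi:
    return false;
  case Opcode::TCRETURNri:
  case Opcode::TCRETURNriBTI:
  case Opcode::TCRETURNriALL:
    return true;
  default:
    assert(!isTailCallReturnInst(Opc) && "unhandled tail call opcode");
    return false;
  }
}
```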

Harbormaster completed remote builds in B250333: Diff 547228. Aug 4 2023, 9:31 AM

kristof.beyls added inline comments. Aug 8 2023, 6:09 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	Yeah, I would expect the long snippet could result in a noticeable performance regression. As discussed in last Monday's Pointer Authentication sync-up call, most current AArch64-based systems do not enforce execute-only. But that will probably change at some point in the future. Maybe it would be best to make the code generation dependent on whether the target platform enforces execute-only? That information would probably need to be stored in some kind of TargetInformation class - I'm not sure if that information is currently stored anywhere. @ab in that meeting also said that there was a 3rd option - checking the values of bits (I think, I don't fully remember) 55 and 56. Would that be a good option to implement/use by default? It seems to me that at least in principle, later optimization passes are allowed to remove load instructions that they can prove have no side-effects. It may be prudent indeed to mark the load as volatile - or use any other mechanism to indicate that the load does have a side-effect and shouldn't be removed by later optimizations.
1925	Good question! I don't have a clear answer, I'm afraid. I'd be interested to hear other people's opinions on this.

atrosinenko marked 2 inline comments as not done. Aug 10 2023, 7:12 AM

atrosinenko added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	Considering other possible options, IIUC something like this fragment is assumed. I glanced through Optimization Guides for a few CPU cores implementing FEAT_PAuth and it looks like `XPAC` instructions are usually faster than other PAuth-related instructions (say, latency is 2, throughput is 1 compared to 5/1 for `PAC` and `AUT*`) - this is somewhat expected as XPAC just clears a range of bits. On the other hand, EOR is still much more efficient. Thus, it is probably worth implementing yet another option, but I doubt it should be the default because it relies on the particular TBI setting in effect while XPAC "just works" taking into account current settings, as far as I understand.
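
To illustrate why an EOR-based check depends on the TBI setting, here is a small software model. It rests on an assumption about the architecture that is not stated in this thread: with TBI disabled and without FEAT_FPAC, a failed `AUT*` is taken to leave a non-zero error code in bits 62:61 of the result, so those two bits differ, whereas in a canonical address they are equal. The function names are hypothetical, and the exact bits used by the checker in the patch may differ.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if bits 62 and 61 of Ptr differ. XOR-ing the value with itself
// shifted left by one puts (bit62 ^ bit61) into bit 62 of the result.
bool highBitsDiffer(uint64_t Ptr) {
  const uint64_t X = Ptr ^ (Ptr << 1);
  return ((X >> 62) & 1) != 0;
}

// Assumed model of a failed authentication with TBI disabled: bits 62:61 of
// an otherwise canonical address are replaced with a two-bit error code.
uint64_t modelFailedAuth(uint64_t Canonical, unsigned ErrorCode) {
  const uint64_t Mask = uint64_t{3} << 61;
  return (Canonical & ~Mask) | (uint64_t{ErrorCode & 3u} << 61);
}
```

Under this model the check costs one EOR and one test-and-branch, but with TBI enabled bits 62:61 belong to the ignored top byte and may legitimately hold any tag value, which is the reason given above for not making such a check the default.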
1925	I think in D156784 I could just add a TODO for now (to make hasFPAC available for use), and FeatureFPAC may be added to the relevant CPU cores by a later patch, if needed.

Rebased onto current main branch, will upload a few fixes shortly.

Harbormaster completed remote builds in B251971: Diff 549409. Aug 11 2023, 10:32 AM

Refactoring.

Reworked the patch

replaced "use a fast checker or not" boolean argument with multiple choices
created a separate source file and put checkAuthenticatedRegister() function there as it is likely to be later used by other parts of the codegen
exposed a tail-call-specific AArch64FrameLowering::checkAuthenticatedLR() similarly to signLR() and authenticateLR() so it can be used by outliner callbacks later
marked a dummy load instruction as volatile similar to Hexagon::PS_crash from HexagonInstrInfo.cpp
added the third implementation of authenticated address checker, similar to this code. Of course, more checkers can be implemented, but I think it would be better to leave this patch as a common implementation + a few "proofs of concept" and add more checkers via later patches, if needed.

No more changes remain planned on this patch, so I expect it to be ready for further review.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	I reverted this option to using a fast checker by default. As far as I understand, there is no explicit function telling whether a particular subtarget expects execute-only-compatible code, as such code is expected to "just work" on AArch64. On the other hand, IIUC the support for execute-only mappings was recently dropped at least on Linux because of interference with the Privileged Access Never feature. For now, I just marked this option with a TODO because I expect DummyLoad to be much faster, and even if someone wants to use it in an execute-only environment, the issue is unlikely to go unnoticed.

Harbormaster completed remote builds in B252648: Diff 550338. Aug 15 2023, 9:12 AM

kristof.beyls added inline comments. Aug 18 2023, 1:12 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
261–265	I've done some further investigations and it seems that execute-only enforcement is most likely to get enabled in the not-too-distant future for some very popular AArch64 platforms. Therefore, having something that breaks the execute-only property enabled by default across all AArch64 platforms looks like a no-go. This would break most programs on those platforms. It seems that this patch also changes code generation for pac-ret (i.e. where only return addresses are protected by pointer authentication)? There are a number of AArch64 platforms that already ship with pac-ret. Enabling this for pac-ret moves the performance vs code-size vs security hardening trade-off for those platforms for their default code generation. Therefore, it seems to me that a lot of benchmarking would be needed to measure the code size and performance impact of this change before landing it for pac-ret. With the above 2 observations in mind, I think that: using the AuthCheckMethod::DummyLoad cannot be the default across all AArch64 platforms, as it will break most code on AArch64 platforms that plan to enable execute-only; and there is a strong indication that execute-only will be enforced on some of the most popular platforms. Protecting against this authentication oracle for pac-ret code generation could only be done by doing substantial amounts of benchmarking to help make a decision on whether this is a worthwhile performance vs code-size vs security hardening trade-off. I'd recommend to: (a) not change code generation for pac-ret at all in this patch. (b) change the default for AuthenticatedLRCheckMethod to something that does not break execute-only. Ultimately, it seems to me that each platform will have to decide where its default should be in the performance vs code size vs hardening trade-off. 
With very few software platforms currently choosing to pay the cost for fine-grain control-flow integrity, it seems to me that a default of not hardening against this authentication oracle may be the least bad option available.

atrosinenko added a parent revision: D159357: [AArch64] Move PAuth codegen down the machine pipeline. Sep 1 2023, 12:38 PM

Updated the patch based on the changes from D159357
Made the check opt-in for subtargets, "none" by default, with a command line option to override
Fixed use after free in createCheckMemOperand function (the same way it is done in Hexagon)

Harbormaster completed remote builds in B256322: Diff 555465. Sep 1 2023, 1:48 PM

atrosinenko mentioned this in D159357: [AArch64] Move PAuth codegen down the machine pipeline. Sep 18 2023, 8:06 AM

atrosinenko edited the summary of this revision. Sep 19 2023, 3:41 AM

Updated after D159357 was changed.

Harbormaster completed remote builds in B257410: Diff 557048. Sep 19 2023, 9:36 AM

atrosinenko removed a parent revision: D159357: [AArch64] Move PAuth codegen down the machine pipeline. Sep 25 2023, 3:36 AM

Rebased and updated the patch a bit more:

in getOutliningCandidateInfo, adjusted SequenceSize variable before its first use
updated code comments
skipped checking LR if Shadow Call Stack is enabled: the LR value just before the TCRETURN* instruction is not the one produced by AUT* anyway (and we cannot check right after AUT* because we cannot be sure which register is usable as a temporary)

Ping.

Here is a summary of the contents of this patch:

implemented a standalone llvm::AArch64PAuth::checkAuthenticatedRegister utility function to emit one of a number of checks in case a pointer is AUT'ed and not immediately used for memory access
- placed this function into a sub-namespace instead of making it a static class member, so I don't have to put otherwise irrelevant AArch64PointerAuth class definition to header file
- note that the checks that are inserted by checkAuthenticatedRegister function are not specific to tail calls (but some of the checks may have restrictions - such as requiring AuthenticatedReg == LR because XPACLRI is encoded as HINT while generic XPAC* instructions require FEAT_PAUTH)
hooked it to AArch64PointerAuth class via checkAuthenticatedLR method dedicated to hardening tail calls
in machine outliner, update the costs computed by AArch64InstrInfo::getOutliningCandidateInfo method on a best-effort basis:
- in MachineOutlinerTailCall outlining mode, we need to insert checks in each caller of OUTLINED_FUNCTION
- in MachineOutlinerThunk mode, at most a single extra check is inserted in OUTLINED_FUNCTION itself
- other modes do not introduce new tail calls, so let's just try to account for the checks that would possibly be inserted later into the original candidates by the AArch64PointerAuth pass (including in the two modes above)
factored out isTailCallReturnInst and needsShadowCallStackPrologueEpilogue utility functions
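
The outliner accounting in the summary above can be sketched as a small cost model. This is an illustration under stated assumptions, not the logic of `getOutliningCandidateInfo`: all names are hypothetical, the 20-byte figure for the XPAC-based check is taken from the five extra instructions (mov, xpaclri, cmp, b.ne, brk) visible in the disassembly posted later in this thread, and the 4-byte figure assumes the dummy load is a single LDR.

```cpp
#include <cassert>

// Hypothetical names; only a subset of the check variants is modeled.
enum class AuthCheckMethod { None, DummyLoad, XPACHint };
enum class OutlinerMode { TailCall, Thunk, Other };

// Extra code size of one LR check, in bytes (4 bytes per instruction).
unsigned bytesToCheckLR(AuthCheckMethod M) {
  switch (M) {
  case AuthCheckMethod::None:
    return 0;
  case AuthCheckMethod::DummyLoad:
    return 4; // assumed: a single LDR through LR
  case AuthCheckMethod::XPACHint:
    return 20; // mov, xpaclri, cmp, b.ne, brk
  }
  return 0; // not reached for valid enumerators
}

struct ExtraCheckCost {
  unsigned PerCallSiteBytes;      // added to every caller of the outlined code
  unsigned OutlinedFunctionBytes; // added once, inside the outlined function
};

// Tail-call outlining creates a new tail call in each caller; thunk outlining
// creates at most one inside the outlined function; other modes create none.
ExtraCheckCost extraCheckCost(OutlinerMode Mode, AuthCheckMethod M) {
  const unsigned Check = bytesToCheckLR(M);
  switch (Mode) {
  case OutlinerMode::TailCall:
    return {Check, 0};
  case OutlinerMode::Thunk:
    return {0, Check};
  case OutlinerMode::Other:
    return {0, 0};
  }
  return {0, 0};
}

unsigned totalExtraBytes(OutlinerMode Mode, AuthCheckMethod M,
                         unsigned NumCallSites) {
  const ExtraCheckCost C = extraCheckCost(Mode, M);
  return C.PerCallSiteBytes * NumCallSites + C.OutlinedFunctionBytes;
}
```

The model deliberately ignores checks that the pointer authentication pass would insert into the original candidates regardless of outlining, which the summary says are accounted for only on a best-effort basis.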

Harbormaster completed remote builds in B257601: Diff 557353. Sep 26 2023, 6:35 AM

Thank you for the update!
I haven't looked at AArch64PointerAuth.cpp and the regression test in detail yet.
Most of the other parts of this patch look fine to me, apart from the 2 nitpicks and the one probably bigger issue with getInstSizeInBytes() - see inline comments.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8316	I had to spend quite a bit of time to understand the logic here. I think, if possible, it would be easier to understand the logic if SequenceSize was updated much closer to where it is initially computed on line 8133. Ideally, getInstSizeInBytes(MI) would return the correct number of bytes. Actually, looking at the documentation of getInstSizeInBytes(MI), at https://github.com/llvm/llvm-project/blob/c4e2fcff788025415b523486efdbdac4f2b08c1e/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L83, it states that getInstSizeInBytes is guaranteed to return the maximum number of bytes. So on a tail call return that might produce lots of bytes, it should return the maximum number. If it does not produce the maximum number, I seem to remember that that can lead to hard-to-debug compilation failures where the code size estimation for computing whether a constant island is within range could go wrong. (I might be misremembering, it could be something different than a constant island going out of range). So, I think that getInstSizeInBytes(MI) needs to be adapted so it returns the correct maximum size in bytes for a tail call. When that is implemented, the change on this line in this patch may no longer be needed? Assuming I remember all of the above correctly, it may be good to add a regression test that verifies that getInstSizeInBytes is calculated correctly for large tail calls. https://reviews.llvm.org/D22870 is a patch that fixed a similar issue on another instruction. The test case that was added there could serve as inspiration.
llvm/lib/Target/AArch64/AArch64PointerAuth.h
89	nit pick - feel free to disagree. IIUC, all methods are pre-v8.3 compatible. Is it then worthwhile to call pre-v8.3 compatibility out in the help text? I think I'd remove the "(pre-v8.3 compatible)" part.
95	s/succeedes/succeeds/?

Updated the patch, thank you.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8316	On one hand, updating `getInstSizeInBytes(MI)` should fix not only the computation of `SequenceSize` but other possible callers of getInstSizeInBytes as well. On the other hand, inserting checker code before TCRETURN* instructions seems like just yet another code transformation (and TCRETURN* instructions are actually four bytes long after `AArch64PointerAuth` pass processes the function). Moreover, in `getInstSizeInBytes(MI)`, there is handling for several special cases and a comment stating // FIXME: We currently only handle pseudoinstructions that don't get expanded // before the assembly printer. Thus, I would rather keep updating `SequenceSize` ad-hoc, but move the update right after `NumBytesToCheckLRInTCEpilogue` is computed.
llvm/lib/Target/AArch64/AArch64PointerAuth.h
89	There will probably be checkers that are not v8.2-compatible in the future - for example, if we want to check an arbitrary register using XPAC. Though, I agree that it is better to drop the "(pre-v8.3 compatible)" part here: it would be better to add a more comprehensible comment like "(requires FEAT_PAUTH)" to those checkers, I think.
95	Fixed.

Addressed the comments so far.

Harbormaster completed remote builds in B257722: Diff 557536. Oct 2 2023, 2:11 PM

kristof.beyls added inline comments. Oct 4 2023, 2:22 AM

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
8316	Thanks for pointing to that FIXME - indeed, it seems like as long as the expansion happens early enough, there should be no issue.
llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
147	I think this class could use a comment. Would something like the following be correct? // AddressCheckPseudoSourceValue represents the memory access inserted by // the AuthCheckMethod::DummyLoad method to trigger a run-time error if the // authentication of a pointer failed.
160	Seeing this is a static variable made me think: would it be possible, in a single run of any program using the LLVM libraries (such as clang, flang, rust, opt, ...), for MF.getTarget() to return different results within the same execution of that program? If that could occur, then for some executions of this function, the `CheckPSV` variable could be initialized with the wrong `TargetMachine` reference. I cannot immediately think of an example, but I am also not sure. I think it would be more prudent to not have this as a static variable.
191	I'm wondering why any/all of the machineinsts created in this function need to have the FrameDestroy flag set? Do you know?
334	The convention in the LLVM code base is to use "FIXME" rather than "TODO".

Address the review comments.

atrosinenko added inline comments. Oct 6 2023, 7:11 AM

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
147	Added your comment, thank you.
160	I used the same approach as it is implemented in Hexagon backend: HexagonInstrInfo.cpp. For now TargetMachine is only used by `PseudoSourceValue` constructor to call `TM.getAddressSpaceForPseudoSourceKind(Kind)`. I agree that using static instance looks suspicious, but I guess allocating new instance from heap on each check is quite wasteful.
334	Updated

Harbormaster completed remote builds in B257779: Diff 557629. Oct 6 2023, 7:17 AM

atrosinenko added inline comments. Oct 6 2023, 7:23 AM

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
191	Setting MachineInstr::FrameDestroy unconditionally in checkAuthenticatedRegister is definitely a mistake, thank you. Maybe I have to pass MI flags as an argument, but it seems that it works as-is, at least for generating DWARF debug info (see Epilogue Begin marker).

Added an explicit assertion that WinCFI is not requested as I don't yet emit any SEH opcodes.

$ cat /tmp/tail-call.c 
int caller_indirect(int *n, int (fptr)(int*)) {
  asm volatile ("" ::: "lr");
  *n = 42;
  return fptr(n);
}
$ ./bin/clang -O1 -target aarch64-linux-gnu /tmp/tail-call.c -c -o /tmp/tail-call.o -mbranch-protection=pac-ret -mllvm -aarch64-authenticated-lr-check-method=xpac-hint -g
$ dwarfdump -l /tmp/tail-call.o && llvm-objdump -d /tmp/tail-call.o 

.debug_line: line number info for a single cu
Source lines (from CU-DIE at .debug_info offset 0x0000000c):

            NS new statement, BB new basic block, ET end of text sequence
            PE prologue end, EB epilogue begin
            IS=val ISA number, DI=val discriminator value
<pc>        [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
0x00000000  [   1, 0] NS uri: "/tmp/tail-call.c"
0x0000000c  [   2, 3] NS PE
0x0000000c  [   3, 6] NS
0x00000010  [   4,10] NS ET EB
0x00000030  [   4,10] NS ET


/tmp/tail-call.o:       file format elf64-littleaarch64

Disassembly of section .text:

0000000000000000 <caller_indirect>:
       0: d503233f      paciasp
       4: f81f0ffe      str     x30, [sp, #-0x10]!
       8: 52800548      mov     w8, #0x2a
       c: b9000008      str     w8, [x0]
      10: f84107fe      ldr     x30, [sp], #0x10
      14: d50323bf      autiasp
      18: aa1e03f0      mov     x16, x30
      1c: d50320ff      xpaclri
      20: eb1e021f      cmp     x16, x30
      24: 54000041      b.ne    0x2c <caller_indirect+0x2c>
      28: d61f0020      br      x1
      2c: d4388e20      brk     #0xc471
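
The `mov x16, x30` / `xpaclri` / `cmp x16, x30` sequence in the listing above compares LR with its own PAC-stripped copy. A simplified software model of that comparison is sketched below. It is an approximation under stated assumptions: real hardware derives the PAC field from the translation control registers, whereas here the virtual address width `VaBits` and the TBI setting are passed in explicitly, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of stripping a PAC: every bit of the PAC field is replaced
// with a copy of bit 55, which selects the upper or lower address range.
// With TBI enabled, the top byte (bits 63:56) is left untouched.
uint64_t stripPAC(uint64_t Ptr, unsigned VaBits, bool TBI) {
  assert(VaBits >= 1 && VaBits <= 55 && "model assumes bit 55 is not an address bit");
  const uint64_t LowMask = (uint64_t{1} << VaBits) - 1; // plain address bits
  const uint64_t TopByte = uint64_t{0xff} << 56;
  uint64_t ExtMask = ~LowMask; // bits VaBits..63
  if (TBI)
    ExtMask &= ~TopByte; // keep the tag byte as-is
  const bool UpperRange = ((Ptr >> 55) & 1) != 0;
  return UpperRange ? (Ptr | ExtMask) : (Ptr & ~ExtMask);
}

// The check passes when stripping changes nothing, i.e. the register already
// holds a plain address with no leftover PAC or error bits.
bool passesXPACCheck(uint64_t LR, unsigned VaBits, bool TBI) {
  return stripPAC(LR, VaBits, TBI) == LR;
}
```

In this model a tagged pointer passes when TBI is on and fails when it is off, which matches the remark earlier in the thread that XPAC takes the current settings into account without the compiler having to know them.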

kristof.beyls added inline comments. Oct 9 2023, 1:58 AM

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
160	I guess this would be allocating new instances on the stack rather than the heap? Anyway, I think it would be better to not rely on a static variable where we know this is likely to trigger a hard-to-track-down bug at some point in the future, for some use cases. If the cost would be too high, I guess it might be possible to "cache" CheckPSV once per MachineFunction being processed? That may require finding an object that is constructed once per MachineFunction and put that cached CheckPSV in there. Not sure if there is such an obvious object at the moment and if so, how much refactoring would be needed to make this possible. Anyway, I guess this doesn't get called that often (only once per tail call) so presumably it isn't worthwhile to make the code a lot more complicated for the small (would it even be measurable?) compile time gain.

Moved AddressCheckPseudoSourceValue to AArch64SubtargetInfo.

atrosinenko added inline comments. Oct 9 2023, 9:36 AM

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
160	Unfortunately, MachinePointerInfo stores a pointer to CheckPSV, so allocating it on stack is not possible. Turned out, each MachineFunction has PseudoSourceValueManager, but it is not subclassed per-target, so caching target-specific PSV would require some refactoring. On the other hand, AddressCheckPseudoSourceValue has quite generic semantics, so moved it to AArch64SubtargetInfo.
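
The concern about the static variable on line 160 can be reproduced in miniature. In the sketch below, all types are hypothetical stand-ins rather than LLVM classes: a function-local static is initialized from whichever target machine is seen first and silently reused for every later one, while an object owned by a per-target holder has a stable address (so a pointer to it can be stored) and always matches its owner.

```cpp
#include <cassert>

// Hypothetical stand-ins for TargetMachine and PseudoSourceValue.
struct TargetMachineModel {
  unsigned AddressSpace;
};

struct PseudoSourceValueModel {
  explicit PseudoSourceValueModel(const TargetMachineModel &TM)
      : AddressSpace(TM.AddressSpace) {}
  unsigned AddressSpace;
};

// Pattern under discussion: the static is constructed on the first call only,
// so later calls with a different target machine get a stale object.
const PseudoSourceValueModel &getCheckPSVStatic(const TargetMachineModel &TM) {
  static PseudoSourceValueModel PSV(TM);
  return PSV;
}

// Alternative: a per-target owner holds the object. Its address stays valid
// for the lifetime of the owner, and its contents match the owner's target.
class SubtargetModel {
public:
  explicit SubtargetModel(const TargetMachineModel &TM) : CheckPSV(TM) {}
  const PseudoSourceValueModel *getCheckPSV() const { return &CheckPSV; }

private:
  PseudoSourceValueModel CheckPSV;
};
```

Calling `getCheckPSVStatic` first with one target machine and then with another returns the same object both times, which is the hard-to-track-down failure mode described in the review.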

Harbormaster completed remote builds in B257791: Diff 557649. Oct 9 2023, 10:27 AM

I think that with the last change, all my comments have been addressed.
Please do react if I accidentally missed a comment that needs further work.
Assuming indeed all comments have been addressed: LGTM!

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
160	Thank you, that looks like a neat solution.

This revision is now accepted and ready to land. Oct 10 2023, 5:13 AM

atrosinenko mentioned this in D156784: [AArch64][PAC] Declare FPAC subtarget feature. Oct 11 2023, 5:39 AM

Added FIXME mentioning FEAT_FPAC to AArch64PointerAuth::checkAuthenticatedLR.

Thank you for the review. I read through the comments - they are already addressed except maybe for a few FPAC-related notes, so I added a FIXME to checkAuthenticatedLR function and left a note in the discussion of D156784.

I will rebase and retest this patch and land it shortly if everything works.

Harbormaster completed remote builds in B257813: Diff 557682. Oct 11 2023, 6:57 AM

Closed by commit rG1d2b558265bd: [AArch64][PAC] Check authenticated LR value during tail call (authored by atrosinenko). Oct 11 2023, 7:39 AM

This revision was automatically updated to reflect the committed changes.

atrosinenko added a commit: rG1d2b558265bd: [AArch64][PAC] Check authenticated LR value during tail call.

chill added a subscriber: chill. Oct 14 2023, 5:01 AM

chill added inline comments.

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
256	nit: The check for `FEAT_FPAC` perhaps can be done in `getAuthenticatedLRCheckMethod` (and possibly return `None`), so the logic of deciding whether to emit code or not is kept in one place.

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp	30 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.h	3 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp	32 lines
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h	2 lines
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp	22 lines
llvm/lib/Target/AArch64/AArch64PointerAuth.h	116 lines
llvm/lib/Target/AArch64/AArch64PointerAuth.cpp	180 lines
llvm/lib/Target/AArch64/AArch64Subtarget.h	27 lines
llvm/lib/Target/AArch64/AArch64Subtarget.cpp	32 lines
llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll	121 lines

Diff 557683

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

static cl::opt<bool> OrderFrameObjects("aarch64-order-frame-objects",
                                       cl::desc("sort stack allocations"),
                                       cl::init(true), cl::Hidden);

cl::opt<bool> EnableHomogeneousPrologEpilog(
    "homogeneous-prolog-epilog", cl::Hidden,
    cl::desc("Emit homogeneous prologue and epilogue for the size "
             "optimization (default = off)"));

STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");

/// Returns how much of the incoming argument stack area (in bytes) we should
/// clean up in an epilogue. For the C calling convention this will be 0, for
/// guaranteed tail call conventions it can be positive (a normal return or a
		kristof.beylsUnsubmitted Not Done Reply Inline Actions My understanding is that using a load instruction to check the LR can only be done when code segments are not "execute-only". The commit comment on https://github.com/llvm/llvm-project/commit/a932cd409b861582902211690b497cafc774bee6 suggests that at least LLD assumes that code generated for AArch64 targets is always compatible with execute-only segments. Therefore, I think that defaulting to checking-by-loading-lr is probably the wrong thing to do. It seems to me it would be better to default the other way and only allow checking-by-loading-lr when there is a guarantee that the target environment will not enforce "execute-only" code. kristof.beyls: My understanding is that using a load instruction to check the LR can only be done when code…
		atrosinenkoAuthorUnsubmitted Not Done Reply Inline Actions Updated. Though, I wonder if using the long snippet by default can introduce noticeable performance regression or not. Meanwhile, is it permitted to remove a dead load in the machine pipeline (and should I mark it as volatile somehow). atrosinenko: Updated. Though, I wonder if using the long snippet by default can introduce noticeable…
		kristof.beylsUnsubmitted Not Done Reply Inline Actions Yeah, I would expect the long snippet could result in a noticeable performance regression. As discussed in last Monday's Pointer Authentication sync-up call, most current AArch64-based systems do not enforce execute-only. But that will probably change at some point in the future. Maybe it would be best to make the code generation dependent on whether the target platform enforces execute-only? That information would probably need to be stored in some kind of TargetInformation class - I'm not sure if that information is currently stored anywhere. @ab in that meeting also said that there was a 3rd option - checking the values of bits (I think, I don't fully remember) 55 and 56. Would that be a good option to implement/use by default? It seems to me that at least in principle, later optimization passes are allowed to remove load instructions that they can prove have no side-effects. It may be prudent indeed to mark the load as volatile - or use any other mechanism to indicate that the load does have a side-effect and shouldn't be removed by later optimizations. kristof.beyls: Yeah, I would expect the long snippet could result in a noticeable performance regression. As…
		atrosinenkoAuthorUnsubmitted Not Done Reply Inline Actions Considering other possible options, IIUC something like this fragment is assumed. I glanced through Optimization Guides for a few CPU cores implementing FEAT_PAuth and it looks like `XPAC` instructions are usually faster than other PAuth-related instructions (say, latency is 2, throughput is 1 compared to 5/1 for `PAC` and `AUT`) - this is somewhat expected as XPAC just clears a range of bits. On the other hand, EOR is still much more efficient. Thus, it is probably worth implementing yet another option, but I doubt it should be the default because it relies on the particular TBI setting in effect while XPAC "just works" taking into account current settings, as far as I understand. atrosinenko:* Considering other possible options, IIUC something like [this fragment](https://github.
		atrosinenkoAuthorUnsubmitted Not Done Reply Inline Actions I reverted this option to using a fast checker by default. As far as I understand, there are no explicit function telling whether this particular subtarget expects execute-only-compatible code as it is expected to "just work" on AArch64. On the other hand, IIUC the support for execute-only mappings was recently dropped at least on Linux because of interference with Privilege Access Never feature. At now, I just marked this option with a TODO because I expect DummyLoad to be much faster and even if someone want to use it in an execute-only environment, the issue is at least unlikely to be unnoticed. atrosinenko: I reverted this option to using a fast checker by default. As far as I understand, there are no…
kristof.beyls: I've done some further investigations and it seems that execute-only enforcement is most likely to get enabled in the not-too-distant future for some very popular AArch64 platforms. Therefore, having something that breaks the execute-only property enabled by default across all AArch64 platforms looks like a no-go. This would break most programs on those platforms.

It seems that this patch also changes code generation for pac-ret (i.e. where only return addresses are protected by pointer authentication)? There are a number of AArch64 platforms that already ship with pac-ret. Enabling this for pac-ret moves the performance vs code-size vs security hardening trade-off for those platforms for their default code generation. Therefore, it seems to me that a lot of benchmarking would be needed to measure the code size and performance impact of this change before landing it for pac-ret.

With the above 2 observations in mind, I think that:
- Using AuthCheckMethod::DummyLoad cannot be the default across all AArch64 platforms, as it will break most code on AArch64 platforms that plan to enable execute-only; and there is a strong indication that execute-only will be enforced on some of the most popular platforms.
- Protecting against this authentication oracle for pac-ret code generation could only be done by doing substantial amounts of benchmarking to help make a decision on whether this is a worthwhile performance vs code-size vs security hardening trade-off.

I'd recommend to:
(a) not change code generation for pac-ret at all in this patch.
(b) change the default for AuthenticatedLRCheckMethod to something that does not break execute-only.

Ultimately, it seems to me that each platform will have to decide where its default should be in the performance vs code size vs hardening trade-off. With very few software platforms currently choosing to pay the cost for fine-grain control-flow integrity, it seems to me that a default of not hardening against this authentication oracle may be the least bad option available.
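The per-platform default argued for above could look roughly like the following. This is a minimal sketch, not the patch's code: `LRCheckMethod`, `PlatformPolicy`, and `chooseLRCheckMethod` are hypothetical names, and only `DummyLoad` corresponds to a method named in the review. `ExecuteOnlySafe` stands in for any check that does not read from code pages.

```cpp
// Hypothetical enumeration for illustration only.
enum class LRCheckMethod { None, DummyLoad, ExecuteOnlySafe };

// Hypothetical description of what a platform has opted in to.
struct PlatformPolicy {
  bool HardenTailCalls;     // platform accepts the cost of the extra check
  bool EnforcesExecuteOnly; // code pages may not be readable
};

// Picks a check method: no hardening unless the platform asks for it, and
// never a load from the return address when execute-only is enforced.
LRCheckMethod chooseLRCheckMethod(const PlatformPolicy &P) {
  if (!P.HardenTailCalls)
    return LRCheckMethod::None;
  if (P.EnforcesExecuteOnly)
    return LRCheckMethod::ExecuteOnlySafe;
  return LRCheckMethod::DummyLoad;
}
```

Putting the "no hardening" case first reflects recommendation (b): a platform that has not opted in gets unchanged code generation, and a dereference of LR is only ever chosen where code pages are known to be readable.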
/// tail call to a function that uses less stack space for arguments) or		/// tail call to a function that uses less stack space for arguments) or
/// negative (for a tail call to a function that needs more stack space than us		/// negative (for a tail call to a function that needs more stack space than us
/// for arguments).		/// for arguments).
static int64_t getArgumentStackToRestore(MachineFunction &MF,		static int64_t getArgumentStackToRestore(MachineFunction &MF,
MachineBasicBlock &MBB) {		MachineBasicBlock &MBB) {
MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();		MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
bool IsTailCallReturn = false;
if (MBB.end() != MBBI) {
unsigned RetOpcode = MBBI->getOpcode();
IsTailCallReturn = RetOpcode == AArch64::TCRETURNdi ||
RetOpcode == AArch64::TCRETURNri ||
RetOpcode == AArch64::TCRETURNriBTI;
}
AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();		AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
		bool IsTailCallReturn = (MBB.end() != MBBI)
		? AArch64InstrInfo::isTailCallReturnInst(*MBBI)
		: false;

int64_t ArgumentPopSize = 0;		int64_t ArgumentPopSize = 0;
if (IsTailCallReturn) {		if (IsTailCallReturn) {
MachineOperand &StackAdjust = MBBI->getOperand(1);		MachineOperand &StackAdjust = MBBI->getOperand(1);

// For a tail-call in a callee-pops-arguments environment, some or all of		// For a tail-call in a callee-pops-arguments environment, some or all of
// the stack may actually be in use for the call's arguments, this is		// the stack may actually be in use for the call's arguments, this is
// calculated during LowerCall and consumed here...		// calculated during LowerCall and consumed here...
ArgumentPopSize = StackAdjust.getImm();		ArgumentPopSize = StackAdjust.getImm();
} else {		} else {
// ... otherwise the amount to pop is all of the argument space,		// ... otherwise the amount to pop is all of the argument space,
// conveniently stored in the MachineFunctionInfo by		// conveniently stored in the MachineFunctionInfo by
// LowerFormalArguments. This will, of course, be zero for the C calling		// LowerFormalArguments. This will, of course, be zero for the C calling
// convention.		// convention.
ArgumentPopSize = AFI->getArgumentStackToRestore();		ArgumentPopSize = AFI->getArgumentStackToRestore();
}		}

return ArgumentPopSize;		return ArgumentPopSize;
}		}

static bool produceCompactUnwindFrame(MachineFunction &MF);		static bool produceCompactUnwindFrame(MachineFunction &MF);
static bool needsWinCFI(const MachineFunction &MF);		static bool needsWinCFI(const MachineFunction &MF);
static StackOffset getSVEStackSize(const MachineFunction &MF);		static StackOffset getSVEStackSize(const MachineFunction &MF);
static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF);

/// Returns true if a homogeneous prolog or epilog code can be emitted		/// Returns true if a homogeneous prolog or epilog code can be emitted
/// for the size optimization. If possible, a frame helper call is injected.		/// for the size optimization. If possible, a frame helper call is injected.
/// When Exit block is given, this check is for epilog.		/// When Exit block is given, this check is for epilog.
bool AArch64FrameLowering::homogeneousPrologEpilog(		bool AArch64FrameLowering::homogeneousPrologEpilog(
MachineFunction &MF, MachineBasicBlock *Exit) const {		MachineFunction &MF, MachineBasicBlock *Exit) const {
if (!MF.getFunction().hasMinSize())		if (!MF.getFunction().hasMinSize())
return false;		return false;
[... 300 lines not shown ...]	void AArch64FrameLowering::resetCFIToInitialState(

// Flip the RA sign state.		// Flip the RA sign state.
if (MFI.shouldSignReturnAddress(MF)) {		if (MFI.shouldSignReturnAddress(MF)) {
CFIIndex = MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));		CFIIndex = MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);		BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
}		}

// Shadow call stack uses X18, reset it.		// Shadow call stack uses X18, reset it.
if (needsShadowCallStackPrologueEpilogue(MF))		if (MFI.needsShadowCallStackPrologueEpilogue(MF))
insertCFISameValue(CFIDesc, MF, MBB, InsertPt,		insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
TRI.getDwarfRegNum(AArch64::X18, true));		TRI.getDwarfRegNum(AArch64::X18, true));

// Emit .cfi_same_value for callee-saved registers.		// Emit .cfi_same_value for callee-saved registers.
const std::vector<CalleeSavedInfo> &CSI =		const std::vector<CalleeSavedInfo> &CSI =
MF.getFrameInfo().getCalleeSavedInfo();		MF.getFrameInfo().getCalleeSavedInfo();
for (const auto &Info : CSI) {		for (const auto &Info : CSI) {
unsigned Reg = Info.getReg();		unsigned Reg = Info.getReg();
[... 656 lines not shown ...]	static bool IsSVECalleeSave(MachineBasicBlock::iterator I) {
case AArch64::STR_PXI:		case AArch64::STR_PXI:
case AArch64::LDR_ZXI:		case AArch64::LDR_ZXI:
case AArch64::LDR_PXI:		case AArch64::LDR_PXI:
return I->getFlag(MachineInstr::FrameSetup) ||		return I->getFlag(MachineInstr::FrameSetup) ||
I->getFlag(MachineInstr::FrameDestroy);		I->getFlag(MachineInstr::FrameDestroy);
}		}
}		}

static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF) {
if (!(llvm::any_of(
MF.getFrameInfo().getCalleeSavedInfo(),
[](const auto &Info) { return Info.getReg() == AArch64::LR; }) &&
MF.getFunction().hasFnAttribute(Attribute::ShadowCallStack)))
return false;

if (!MF.getSubtarget<AArch64Subtarget>().isXRegisterReserved(18))
report_fatal_error("Must reserve x18 to use shadow call stack");

return true;
}

static void emitShadowCallStackPrologue(const TargetInstrInfo &TII,		static void emitShadowCallStackPrologue(const TargetInstrInfo &TII,
MachineFunction &MF,		MachineFunction &MF,
MachineBasicBlock &MBB,		MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
const DebugLoc &DL, bool NeedsWinCFI,		const DebugLoc &DL, bool NeedsWinCFI,
bool NeedsUnwindInfo) {		bool NeedsUnwindInfo) {
// Shadow call stack prolog: str x30, [x18], #8		// Shadow call stack prolog: str x30, [x18], #8
BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXpost))		BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXpost))
[... 95 lines not shown ...]	void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
// assume that's false and set it to true in the case that there's a redzone.		// assume that's false and set it to true in the case that there's a redzone.
AFI->setHasRedZone(false);		AFI->setHasRedZone(false);

// Debug location must be unknown since the first debug location is used		// Debug location must be unknown since the first debug location is used
// to determine the end of the prologue.		// to determine the end of the prologue.
DebugLoc DL;		DebugLoc DL;

const auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();		const auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();
if (needsShadowCallStackPrologueEpilogue(MF))		if (MFnI.needsShadowCallStackPrologueEpilogue(MF))
emitShadowCallStackPrologue(*TII, MF, MBB, MBBI, DL, NeedsWinCFI,		emitShadowCallStackPrologue(*TII, MF, MBB, MBBI, DL, NeedsWinCFI,
MFnI.needsDwarfUnwindInfo(MF));		MFnI.needsDwarfUnwindInfo(MF));

if (MFnI.shouldSignReturnAddress(MF)) {		if (MFnI.shouldSignReturnAddress(MF)) {
BuildMI(MBB, MBBI, DL, TII->get(AArch64::PAUTH_PROLOGUE))		BuildMI(MBB, MBBI, DL, TII->get(AArch64::PAUTH_PROLOGUE))
.setMIFlag(MachineInstr::FrameSetup);		.setMIFlag(MachineInstr::FrameSetup);
if (NeedsWinCFI)		if (NeedsWinCFI)
HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR		HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR
[... 501 lines not shown ...]	void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
DebugLoc DL;		DebugLoc DL;
bool NeedsWinCFI = needsWinCFI(MF);		bool NeedsWinCFI = needsWinCFI(MF);
bool EmitCFI = AFI->needsAsyncDwarfUnwindInfo(MF);		bool EmitCFI = AFI->needsAsyncDwarfUnwindInfo(MF);
bool HasWinCFI = false;		bool HasWinCFI = false;
bool IsFunclet = false;		bool IsFunclet = false;

if (MBB.end() != MBBI) {		if (MBB.end() != MBBI) {
DL = MBBI->getDebugLoc();		DL = MBBI->getDebugLoc();
IsFunclet = isFuncletReturnInstr(*MBBI);		IsFunclet = isFuncletReturnInstr(*MBBI);
kristof.beyls: This is just nitpicking: would it be useful to have an isTailCallReturn function somewhere, and insert an assert(!isTailCallReturn(TI->getOpcode())) here?

Given how a range of tail call pseudo opcodes have been added recently, it is likely that a few more could get added in the future, in which case this switch statement needs to be adapted. I'm just always very cautious when doing a switch on a set of specific opcodes, as opcodes tend to evolve over time and such switch statements might be hard to maintain correctly. That's why I tend to prefer having an assert in the default case that hopefully catches when that happens.

FWIW, https://github.com/llvm/llvm-project/blob/2df05cd01c17f3ef720e554dc7cde43df27e5224/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp#L275 already has a computation of an "IsTailCall" in this file (albeit that it doesn't consider TCRETURNriALL to be a tail call; is that an instance of the issue I described above?).
atrosinenko: Added an `AArch64InstrInfo::isTailCallReturnInst(MachineInstr &)` function and redirected the existing code in `getArgumentStackToRestore()` to it.

As far as I understand, the new `TCRETURNriALL` is only used by the machine outliner, and it seems to be used interchangeably with the existing `TCRETURNdi` instruction in https://github.com/llvm/llvm-project/blob/feafc2df43545e61a0ba67253284ecbabfd2ba09/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L8076C24-L8076C24.

cc: @olista01
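The pattern agreed on in this exchange, a single opcode predicate whose default case asserts when it sees something that looks like an unlisted tail call, can be shown in isolation. The sketch below is self-contained and does not use LLVM types: `Opcode`, `Instr`, and `isTailCallReturn` are hypothetical stand-ins for the target opcodes, `MachineInstr`, and the new helper.

```cpp
#include <cassert>

// Hypothetical opcode set standing in for the real target opcodes.
enum class Opcode { Add, Ret, TailCallDirect, TailCallIndirect };

// Hypothetical minimal instruction model.
struct Instr {
  Opcode Op;
  bool IsCall;
  bool IsReturn;
};

// Returns true for the known tail call opcodes. The default case asserts
// that no unlisted instruction is both a call and a return, which would
// suggest a new tail call opcode was added without updating this switch.
bool isTailCallReturn(const Instr &I) {
  switch (I.Op) {
  default:
    assert(!(I.IsCall && I.IsReturn) && "unlisted tail call opcode?");
    return false;
  case Opcode::TailCallDirect:
  case Opcode::TailCallIndirect:
    return true;
  }
}
```

Keeping the list in one function means that every caller picks up a newly added opcode at once, and in builds with assertions enabled a forgotten opcode fails loudly instead of being silently misclassified.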
}		}

MachineBasicBlock::iterator EpilogStartI = MBB.end();		MachineBasicBlock::iterator EpilogStartI = MBB.end();

auto FinishingTouches = make_scope_exit([&]() {		auto FinishingTouches = make_scope_exit([&]() {
if (AFI->shouldSignReturnAddress(MF)) {		if (AFI->shouldSignReturnAddress(MF)) {
BuildMI(MBB, MBB.getFirstTerminator(), DL,		BuildMI(MBB, MBB.getFirstTerminator(), DL,
TII->get(AArch64::PAUTH_EPILOGUE))		TII->get(AArch64::PAUTH_EPILOGUE))
kristof.beyls: I think this comment would be easier to read if, instead of just saying "Turn it into:", it also clearly indicated the intended effect. For example: "To avoid generating a signing oracle, generate a code sequence that explicitly checks if the authentication passed or failed, as follows."

Having just written those extra few words, I think that this extra checking may not be needed if the target core implements FEAT_FPAC (and maybe FEAT_FPACCOMBINE?). If so, maybe a FIXME here would be good, to not generate the extra checking code sequences when FEAT_FPAC is implemented by the targeted core?

atrosinenko: Fixed. There exist separate review items D156784 and D156785 on taking FEAT_FPAC into account.

I wonder whether particular CPU models should mention `+fpac` as supported, or whether it should only be requested explicitly by the user. Unlike with many other subtarget features, code relying on FPAC doesn't explicitly fail on an unsupported CPU but silently becomes a bit less secure. So, the case "I have CPU X that implements all the instructions that are supported by Y (but not FEAT_FPAC)" may technically go unnoticed.

kristof.beyls: Good question! I don't have a clear answer, I'm afraid. I'd be interested to hear other people's opinions on this.

atrosinenko: I think that in D156784 I could just add a TODO for now (to make hasFPAC available for usage), and FeatureFPAC may be added to relevant CPU cores by a later patch, if needed.
.setMIFlag(MachineInstr::FrameDestroy);		.setMIFlag(MachineInstr::FrameDestroy);
if (NeedsWinCFI)		if (NeedsWinCFI)
HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR		HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR
}		}
if (needsShadowCallStackPrologueEpilogue(MF))		if (AFI->needsShadowCallStackPrologueEpilogue(MF))
emitShadowCallStackEpilogue(*TII, MF, MBB, MBB.getFirstTerminator(), DL);		emitShadowCallStackEpilogue(*TII, MF, MBB, MBB.getFirstTerminator(), DL);
if (EmitCFI)		if (EmitCFI)
emitCalleeSavedGPRRestores(MBB, MBB.getFirstTerminator());		emitCalleeSavedGPRRestores(MBB, MBB.getFirstTerminator());
if (HasWinCFI) {		if (HasWinCFI) {
BuildMI(MBB, MBB.getFirstTerminator(), DL,		BuildMI(MBB, MBB.getFirstTerminator(), DL,
TII->get(AArch64::SEH_EpilogEnd))		TII->get(AArch64::SEH_EpilogEnd))
.setMIFlag(MachineInstr::FrameDestroy);		.setMIFlag(MachineInstr::FrameDestroy);
if (!MF.hasWinCFI())		if (!MF.hasWinCFI())
[... 2,056 lines not shown ...]

llvm/lib/Target/AArch64/AArch64InstrInfo.h

[... 123 lines not shown ...]	public:
static bool hasBTISemantics(const MachineInstr &MI);		static bool hasBTISemantics(const MachineInstr &MI);

/// Returns the index for the immediate for a given instruction.		/// Returns the index for the immediate for a given instruction.
static unsigned getLoadStoreImmIdx(unsigned Opc);		static unsigned getLoadStoreImmIdx(unsigned Opc);

/// Return true if pairing the given load or store may be paired with another.		/// Return true if pairing the given load or store may be paired with another.
static bool isPairableLdStInst(const MachineInstr &MI);		static bool isPairableLdStInst(const MachineInstr &MI);

		/// Returns true if MI is one of the TCRETURN* instructions.
		static bool isTailCallReturnInst(const MachineInstr &MI);

/// Return the opcode that set flags when possible. The caller is		/// Return the opcode that set flags when possible. The caller is
/// responsible for ensuring the opc has a flag setting equivalent.		/// responsible for ensuring the opc has a flag setting equivalent.
static unsigned convertToFlagSettingOpc(unsigned Opc);		static unsigned convertToFlagSettingOpc(unsigned Opc);

/// Return true if this is a load/store that can be potentially paired/merged.		/// Return true if this is a load/store that can be potentially paired/merged.
bool isCandidateToMergeOrPair(const MachineInstr &MI) const;		bool isCandidateToMergeOrPair(const MachineInstr &MI) const;

/// Hint that pairing the given load or store is unprofitable.		/// Hint that pairing the given load or store is unprofitable.
[... 515 lines not shown ...]

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

//===- AArch64InstrInfo.cpp - AArch64 Instruction Information -------------===//		//===- AArch64InstrInfo.cpp - AArch64 Instruction Information -------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file contains the AArch64 implementation of the TargetInstrInfo class.		// This file contains the AArch64 implementation of the TargetInstrInfo class.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "AArch64ExpandImm.h"		#include "AArch64ExpandImm.h"
#include "AArch64InstrInfo.h"		#include "AArch64InstrInfo.h"
#include "AArch64FrameLowering.h"		#include "AArch64FrameLowering.h"
#include "AArch64MachineFunctionInfo.h"		#include "AArch64MachineFunctionInfo.h"
		#include "AArch64PointerAuth.h"
#include "AArch64Subtarget.h"		#include "AArch64Subtarget.h"
#include "MCTargetDesc/AArch64AddressingModes.h"		#include "MCTargetDesc/AArch64AddressingModes.h"
#include "Utils/AArch64BaseInfo.h"		#include "Utils/AArch64BaseInfo.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineCombinerPattern.h"		#include "llvm/CodeGen/MachineCombinerPattern.h"
[... 2,460 lines not shown ...]	bool AArch64InstrInfo::isPairableLdStInst(const MachineInstr &MI) {
case AArch64::LDURXi:		case AArch64::LDURXi:
case AArch64::LDRXpre:		case AArch64::LDRXpre:
case AArch64::LDURSWi:		case AArch64::LDURSWi:
case AArch64::LDRSWpre:		case AArch64::LDRSWpre:
return true;		return true;
}		}
}		}

		bool AArch64InstrInfo::isTailCallReturnInst(const MachineInstr &MI) {
		switch (MI.getOpcode()) {
		default:
		assert((!MI.isCall() || !MI.isReturn()) &&
		"Unexpected instruction - was a new tail call opcode introduced?");
		return false;
		case AArch64::TCRETURNdi:
		case AArch64::TCRETURNri:
		case AArch64::TCRETURNriBTI:
		case AArch64::TCRETURNriALL:
		return true;
		}
		}

unsigned AArch64InstrInfo::convertToFlagSettingOpc(unsigned Opc) {		unsigned AArch64InstrInfo::convertToFlagSettingOpc(unsigned Opc) {
switch (Opc) {		switch (Opc) {
default:		default:
llvm_unreachable("Opcode has no flag setting equivalent!");		llvm_unreachable("Opcode has no flag setting equivalent!");
// 32-bit cases:		// 32-bit cases:
case AArch64::ADDWri:		case AArch64::ADDWri:
return AArch64::ADDSWri;		return AArch64::ADDSWri;
case AArch64::ADDWrr:		case AArch64::ADDWrr:
[... 5,711 lines not shown ...]	AArch64InstrInfo::getOutliningCandidateInfo(
// not certainly true that the outlined function will have to sign its return		// not certainly true that the outlined function will have to sign its return
// address but this decision is made later, when the decision to outline		// address but this decision is made later, when the decision to outline
// has already been made.		// has already been made.
// The same holds for the number of additional instructions we need: On		// The same holds for the number of additional instructions we need: On
// v8.3a RET can be replaced by RETAA/RETAB and no AUT instruction is		// v8.3a RET can be replaced by RETAA/RETAB and no AUT instruction is
// necessary. However, at this point we don't know if the outlined function		// necessary. However, at this point we don't know if the outlined function
// will have a RET instruction so we assume the worst.		// will have a RET instruction so we assume the worst.
const TargetRegisterInfo &TRI = getRegisterInfo();		const TargetRegisterInfo &TRI = getRegisterInfo();
		// Performing a tail call may require extra checks when PAuth is enabled.
		// If PAuth is disabled, set it to zero for uniformity.
		unsigned NumBytesToCheckLRInTCEpilogue = 0;
if (FirstCand.getMF()		if (FirstCand.getMF()
->getInfo<AArch64FunctionInfo>()		->getInfo<AArch64FunctionInfo>()
->shouldSignReturnAddress(true)) {		->shouldSignReturnAddress(true)) {
// One PAC and one AUT instructions		// One PAC and one AUT instructions
NumBytesToCreateFrame += 8;		NumBytesToCreateFrame += 8;

		// PAuth is enabled - set extra tail call cost, if any.
		auto LRCheckMethod = Subtarget.getAuthenticatedLRCheckMethod();
		NumBytesToCheckLRInTCEpilogue =
		AArch64PAuth::getCheckerSizeInBytes(LRCheckMethod);
		// Checking the authenticated LR value may significantly impact
		// SequenceSize, so account for it for more precise results.
		if (isTailCallReturnInst(*RepeatedSequenceLocs[0].back()))
		SequenceSize += NumBytesToCheckLRInTCEpilogue;

// We have to check if sp modifying instructions would get outlined.		// We have to check if sp modifying instructions would get outlined.
// If so we only allow outlining if sp is unchanged overall, so matching		// If so we only allow outlining if sp is unchanged overall, so matching
// sub and add instructions are okay to outline, all other sp modifications		// sub and add instructions are okay to outline, all other sp modifications
// are not		// are not
auto hasIllegalSPModification = [&TRI](outliner::Candidate &C) {		auto hasIllegalSPModification = [&TRI](outliner::Candidate &C) {
int SPValue = 0;		int SPValue = 0;
MachineBasicBlock::iterator MBBI = C.front();		MachineBasicBlock::iterator MBBI = C.front();
for (;;) {		for (;;) {
[... 47 lines not shown ...]	if (FirstCand.getMF()
// If the sequence doesn't have enough candidates left, then we're done.		// If the sequence doesn't have enough candidates left, then we're done.
if (RepeatedSequenceLocs.size() < 2)		if (RepeatedSequenceLocs.size() < 2)
return std::nullopt;		return std::nullopt;
}		}

// Properties about candidate MBBs that hold for all of them.		// Properties about candidate MBBs that hold for all of them.
unsigned FlagsSetInAll = 0xF;		unsigned FlagsSetInAll = 0xF;

// Compute liveness information for each candidate, and set FlagsSetInAll.		// Compute liveness information for each candidate, and set FlagsSetInAll.
kristof.beyls: I had to spend quite a bit of time to understand the logic here. I think, if possible, it would be easier to understand the logic if SequenceSize was updated much closer to where it is initially computed on line 8133. Ideally, getInstSizeInBytes(MI) would return the correct number of bytes.

Actually, looking at the documentation of getInstSizeInBytes(MI) at https://github.com/llvm/llvm-project/blob/c4e2fcff788025415b523486efdbdac4f2b08c1e/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L83, it states that getInstSizeInBytes is guaranteed to return the maximum number of bytes. So on a tail call return that might produce lots of bytes, it should return the maximum number. If it does not, I seem to remember that this can lead to hard-to-debug compilation failures where the code size estimation for computing whether a constant island is within range could go wrong. (I might be misremembering; it could be something other than a constant island going out of range.)

So, I think that getInstSizeInBytes(MI) needs to be adapted so that it returns the correct maximum size in bytes for a tail call. When that is implemented, the change on this line in this patch may no longer be needed?

Assuming I remember all of the above correctly, it may be good to add a regression test that verifies that getInstSizeInBytes is calculated correctly for large tail calls. https://reviews.llvm.org/D22870 is a patch that fixed a similar issue on another instruction. The test case that was added there could serve as inspiration.

atrosinenko: On one hand, updating `getInstSizeInBytes(MI)` should fix not only the computation of `SequenceSize` but other possible callers of getInstSizeInBytes as well. On the other hand, inserting checker code before TCRETURN* instructions seems like just yet another code transformation (and TCRETURN* instructions are actually four bytes long after the `AArch64PointerAuth` pass processes the function). Moreover, in `getInstSizeInBytes(MI)`, there is handling for several special cases and a comment stating

  // FIXME: We currently only handle pseudoinstructions that don't get expanded
  //        before the assembly printer.

Thus, I would rather keep updating `SequenceSize` ad hoc, but move the update right after `NumBytesToCheckLRInTCEpilogue` is computed.

kristof.beyls: Thanks for pointing to that FIXME. Indeed, it seems that as long as the expansion happens early enough, there should be no issue.
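The ad hoc accounting settled on here can be illustrated with a small stand-alone model. It is a sketch only: `Inst`, `estimateSequenceSize`, and `CheckerBytes` are hypothetical names, and the real outliner works on `MachineInstr` sequences rather than this simplified struct. The point is that the checker size is added once, and only when the candidate sequence ends in a tail call return.

```cpp
#include <vector>

// Hypothetical, simplified instruction model for illustration.
struct Inst {
  unsigned SizeInBytes;
  bool IsTailCallReturn;
};

// Sums the per-instruction sizes and, if the sequence ends in a tail call
// return, adds the size of the LR-checking code that will be inserted
// before it by a later pass.
unsigned estimateSequenceSize(const std::vector<Inst> &Seq,
                              unsigned CheckerBytes) {
  unsigned Size = 0;
  for (const Inst &I : Seq)
    Size += I.SizeInBytes;
  if (!Seq.empty() && Seq.back().IsTailCallReturn)
    Size += CheckerBytes;
  return Size;
}
```

Doing the adjustment right next to the size computation, rather than inside a general-purpose instruction-size query, keeps the query describing the instruction as it currently exists while still giving the cost model a closer estimate of the final code size.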
for (outliner::Candidate &C : RepeatedSequenceLocs)		for (outliner::Candidate &C : RepeatedSequenceLocs)
FlagsSetInAll &= C.Flags;		FlagsSetInAll &= C.Flags;

unsigned LastInstrOpcode = RepeatedSequenceLocs[0].back()->getOpcode();		unsigned LastInstrOpcode = RepeatedSequenceLocs[0].back()->getOpcode();

// Helper lambda which sets call information for every candidate.		// Helper lambda which sets call information for every candidate.
auto SetCandidateCallInfo =		auto SetCandidateCallInfo =
[&RepeatedSequenceLocs](unsigned CallID, unsigned NumBytesForCall) {		[&RepeatedSequenceLocs](unsigned CallID, unsigned NumBytesForCall) {
[... 90 lines not shown ...]	AArch64InstrInfo::getOutliningCandidateInfo(
bool AllStackInstrsSafe = std::all_of(		bool AllStackInstrsSafe = std::all_of(
FirstCand.front(), std::next(FirstCand.back()), IsSafeToFixup);		FirstCand.front(), std::next(FirstCand.back()), IsSafeToFixup);

// If the last instruction in any candidate is a terminator, then we should		// If the last instruction in any candidate is a terminator, then we should
// tail call all of the candidates.		// tail call all of the candidates.
if (RepeatedSequenceLocs[0].back()->isTerminator()) {		if (RepeatedSequenceLocs[0].back()->isTerminator()) {
FrameID = MachineOutlinerTailCall;		FrameID = MachineOutlinerTailCall;
NumBytesToCreateFrame = 0;		NumBytesToCreateFrame = 0;
SetCandidateCallInfo(MachineOutlinerTailCall, 4);		unsigned NumBytesForCall = 4 + NumBytesToCheckLRInTCEpilogue;
		SetCandidateCallInfo(MachineOutlinerTailCall, NumBytesForCall);
}		}

else if (LastInstrOpcode == AArch64::BL ||		else if (LastInstrOpcode == AArch64::BL ||
((LastInstrOpcode == AArch64::BLR ||		((LastInstrOpcode == AArch64::BLR ||
LastInstrOpcode == AArch64::BLRNoIP) &&		LastInstrOpcode == AArch64::BLRNoIP) &&
!HasBTI)) {		!HasBTI)) {
// FIXME: Do we need to check if the code after this uses the value of LR?		// FIXME: Do we need to check if the code after this uses the value of LR?
FrameID = MachineOutlinerThunk;		FrameID = MachineOutlinerThunk;
NumBytesToCreateFrame = 0;		NumBytesToCreateFrame = NumBytesToCheckLRInTCEpilogue;
SetCandidateCallInfo(MachineOutlinerThunk, 4);		SetCandidateCallInfo(MachineOutlinerThunk, 4);
}		}

else {		else {
// We need to decide how to emit calls + frames. We can always emit the same		// We need to decide how to emit calls + frames. We can always emit the same
// frame if we don't need to save to the stack. If we have to save to the		// frame if we don't need to save to the stack. If we have to save to the
// stack, then we need a different frame.		// stack, then we need a different frame.
unsigned NumBytesNoStackCalls = 0;		unsigned NumBytesNoStackCalls = 0;
[... 1,024 lines not shown ...]

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h

[... 425 lines not shown ...]	#endif
}		}
void setCalleeSaveBaseToFrameRecordOffset(int Offset) {		void setCalleeSaveBaseToFrameRecordOffset(int Offset) {
CalleeSaveBaseToFrameRecordOffset = Offset;		CalleeSaveBaseToFrameRecordOffset = Offset;
}		}

bool shouldSignReturnAddress(const MachineFunction &MF) const;		bool shouldSignReturnAddress(const MachineFunction &MF) const;
bool shouldSignReturnAddress(bool SpillsLR) const;		bool shouldSignReturnAddress(bool SpillsLR) const;

		bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF) const;

bool shouldSignWithBKey() const { return SignWithBKey; }		bool shouldSignWithBKey() const { return SignWithBKey; }
bool isMTETagged() const { return IsMTETagged; }		bool isMTETagged() const { return IsMTETagged; }

bool branchTargetEnforcement() const { return BranchTargetEnforcement; }		bool branchTargetEnforcement() const { return BranchTargetEnforcement; }

void setHasSwiftAsyncContext(bool HasContext) {		void setHasSwiftAsyncContext(bool HasContext) {
HasSwiftAsyncContext = HasContext;		HasSwiftAsyncContext = HasContext;
}		}
[... 45 lines not shown ...]

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp

	[... 116 lines not shown ...]
	bool AArch64FunctionInfo::shouldSignReturnAddress(bool SpillsLR) const {			bool AArch64FunctionInfo::shouldSignReturnAddress(bool SpillsLR) const {
	if (!SignReturnAddress)			if (!SignReturnAddress)
	return false;			return false;
	if (SignReturnAddressAll)			if (SignReturnAddressAll)
	return true;			return true;
	return SpillsLR;			return SpillsLR;
	}			}

				static bool isLRSpilled(const MachineFunction &MF) {
				return llvm::any_of(
				MF.getFrameInfo().getCalleeSavedInfo(),
				[](const auto &Info) { return Info.getReg() == AArch64::LR; });
				}

	bool AArch64FunctionInfo::shouldSignReturnAddress(			bool AArch64FunctionInfo::shouldSignReturnAddress(
	const MachineFunction &MF) const {			const MachineFunction &MF) const {
	return shouldSignReturnAddress(llvm::any_of(			return shouldSignReturnAddress(isLRSpilled(MF));
	MF.getFrameInfo().getCalleeSavedInfo(),			}
	[](const auto &Info) { return Info.getReg() == AArch64::LR; }));
				bool AArch64FunctionInfo::needsShadowCallStackPrologueEpilogue(
				MachineFunction &MF) const {
				if (!(isLRSpilled(MF) &&
				MF.getFunction().hasFnAttribute(Attribute::ShadowCallStack)))
				return false;

				if (!MF.getSubtarget<AArch64Subtarget>().isXRegisterReserved(18))
				report_fatal_error("Must reserve x18 to use shadow call stack");

				return true;
	}			}
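
The change above moves the LR-spill test into `isLRSpilled` and reuses it from both predicates. A minimal standalone sketch of that decision logic follows. It assumes simplified stand-in types: `SavedReg`, `FnState`, and the `LR` constant are hypothetical and not LLVM APIs, and a thrown exception stands in for `report_fatal_error`.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for the LLVM types used in the diff.
constexpr unsigned LR = 30;            // x30, the link register
struct SavedReg { unsigned Reg; };     // models one CalleeSavedInfo entry

struct FnState {
  std::vector<SavedReg> CalleeSaved;   // models getCalleeSavedInfo()
  bool HasShadowCallStackAttr = false; // models Attribute::ShadowCallStack
  bool X18Reserved = false;            // models isXRegisterReserved(18)
};

// True if LR is among the callee-saved registers recorded for the function.
inline bool isLRSpilled(const FnState &F) {
  return std::any_of(F.CalleeSaved.begin(), F.CalleeSaved.end(),
                     [](const SavedReg &S) { return S.Reg == LR; });
}

// Mirrors the shape of needsShadowCallStackPrologueEpilogue: both the LR
// spill and the function attribute are required, and x18 must be reserved.
inline bool needsShadowCallStack(const FnState &F) {
  if (!(isLRSpilled(F) && F.HasShadowCallStackAttr))
    return false;
  if (!F.X18Reserved)
    throw std::runtime_error("Must reserve x18 to use shadow call stack");
  return true;
}
```

In this model, a function that spills only x19 never needs the shadow call stack sequence, regardless of the attribute, because the early return fires before the x18 check.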

	bool AArch64FunctionInfo::needsDwarfUnwindInfo(			bool AArch64FunctionInfo::needsDwarfUnwindInfo(
	const MachineFunction &MF) const {			const MachineFunction &MF) const {
	if (!NeedsDwarfUnwindInfo)			if (!NeedsDwarfUnwindInfo)
	NeedsDwarfUnwindInfo = MF.needsFrameMoves() &&			NeedsDwarfUnwindInfo = MF.needsFrameMoves() &&
	!MF.getTarget().getMCAsmInfo()->usesWindowsCFI();			!MF.getTarget().getMCAsmInfo()->usesWindowsCFI();

	[...]

llvm/lib/Target/AArch64/AArch64PointerAuth.h

This file was added.

				//===-- AArch64PointerAuth.h -- Harden code using PAuth ---------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64POINTERAUTH_H
				#define LLVM_LIB_TARGET_AARCH64_AARCH64POINTERAUTH_H

				#include "llvm/CodeGen/MachineBasicBlock.h"
				#include "llvm/CodeGen/Register.h"

				namespace llvm {
				namespace AArch64PAuth {

				/// Variants of check performed on an authenticated pointer.
				///
				/// In cases such as authenticating the LR value when performing a tail call
				/// or when re-signing a signed pointer with a different signing schema,
				/// a failed authentication may not generate an exception on its own and may
				/// create an authentication or signing oracle if not checked explicitly.
				///
				/// A number of check methods modify control flow in a similar way by
				/// rewriting the code
				///
				/// ```
				/// <authenticate LR>
				/// <more instructions>
				/// ```
				///
				/// as follows:
				///
				/// ```
				/// <authenticate LR>
				/// <method-specific checker>
				/// ret_block:
				/// <more instructions>
				/// ...
				///
				/// break_block:
				/// brk <code>
				/// ```
				enum class AuthCheckMethod {
				/// Do not check the value at all
				None,
				/// Perform a load to a temporary register
				DummyLoad,
				/// Check by comparing bits 62 and 61 of the authenticated address.
				///
				/// This method modifies control flow and inserts the following checker:
				///
				/// ```
				/// eor Xtmp, Xn, Xn, lsl #1
				/// tbnz Xtmp, #62, break_block
				/// ```
				HighBitsNoTBI,
				/// Check by comparing the authenticated value with an XPAC-ed one without
				/// using PAuth instructions not encoded as HINT. Can only be applied to LR.
				///
				/// This method modifies control flow and inserts the following checker:
				///
				/// ```
				/// mov Xtmp, LR
				/// xpaclri ; encoded as "hint #7"
				/// ; Note: at this point, the LR register contains the address as if
				/// ; the authentication succeeded and the temporary register contains the
				/// ; real result of authentication.
				/// cmp Xtmp, LR
				/// b.ne break_block
				/// ```
				XPACHint,
				};

				#define AUTH_CHECK_METHOD_CL_VALUES_COMMON \
				clEnumValN(AArch64PAuth::AuthCheckMethod::None, "none", \
				"Do not check authenticated address"), \
				clEnumValN(AArch64PAuth::AuthCheckMethod::DummyLoad, "load", \
				"Perform dummy load from authenticated address"), \
				clEnumValN(AArch64PAuth::AuthCheckMethod::HighBitsNoTBI, \
				"high-bits-notbi", \
				"Compare bits 62 and 61 of address (TBI should be disabled)")

				#define AUTH_CHECK_METHOD_CL_VALUES_LR \
				AUTH_CHECK_METHOD_CL_VALUES_COMMON, \
				clEnumValN(AArch64PAuth::AuthCheckMethod::XPACHint, "xpac-hint", \
				"Compare with the result of XPACLRI")

				kristof.beyls: nit pick - feel free to disagree. IIUC, all methods are pre-v8.3 compatible. Is it then worthwhile to call pre-v8.3 compatibility out in the help text? I think I'd remove the "(pre-v8.3 compatible)" part.
				atrosinenko (author): There will probably be non v8.2-compatible checkers in the future - for example, if we want to check an arbitrary register using XPAC. Though, I agree that it is better to drop the "(pre-v8.3 compatible)" part here: it would be better to add more comprehensible comment like "(requires FEAT_PAUTH)" to those checkers, I think.
				/// Explicitly checks that pointer authentication succeeded.
				///
				/// Assuming AuthenticatedReg contains a value returned by one of the AUT*
				/// instructions, check the value using Method just before the instruction
				/// pointed to by MBBI. If the check succeeds, execution proceeds to the
				/// instruction pointed to by MBBI, otherwise a CPU exception is generated.
				kristof.beyls: s/succeedes/succeeds/?
				atrosinenko (author): Fixed.
				///
				/// Some of the methods may need to know if the pointer was authenticated
				/// using an I-key or D-key and which register can be used as temporary.
				/// If an explicit BRK instruction is used to generate an exception, BrkImm
				/// specifies its immediate operand.
				///
				/// \returns The machine basic block containing the code that is executed
				/// after the check succeeds.
				MachineBasicBlock &checkAuthenticatedRegister(MachineBasicBlock::iterator MBBI,
				AuthCheckMethod Method,
				Register AuthenticatedReg,
				Register TmpReg, bool UseIKey,
				unsigned BrkImm);

				/// Returns the number of bytes added by checkAuthenticatedRegister.
				unsigned getCheckerSizeInBytes(AuthCheckMethod Method);

				} // end namespace AArch64PAuth
				} // end namespace llvm

				#endif
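
The `HighBitsNoTBI` checker documented in the header can be modeled in plain C++. The sketch below only mirrors the `eor`/`tbnz` bit arithmetic from the doc comment. The function name `highBitsCheckFails` and the sample addresses are illustrative assumptions, not part of the patch. It relies on the premise implied by the option's help text: with TBI disabled, bits 62 and 61 of a valid address are equal.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the HighBitsNoTBI checker. Returns true when the
// emitted code would branch to break_block, i.e. when bits 62 and 61 of
// the authenticated value differ.
constexpr bool highBitsCheckFails(uint64_t Authenticated) {
  // eor Xtmp, Xn, Xn, lsl #1
  // Bit i of Xtmp is (bit i of Xn) XOR (bit i-1 of Xn).
  uint64_t Tmp = Authenticated ^ (Authenticated << 1);
  // tbnz Xtmp, #62, break_block
  return ((Tmp >> 62) & 1) != 0;
}

// Sample values (assumed for illustration): all upper bits are equal in
// the first two, so the check passes.
static_assert(!highBitsCheckFails(0x00007fff12345678ULL), "low address");
static_assert(!highBitsCheckFails(0xffff800012345678ULL), "high address");
// Bits 62 and 61 differ here, so the check branches to break_block.
static_assert(highBitsCheckFails(0x20007fff12345678ULL), "bit 61 set only");
static_assert(highBitsCheckFails(0x40007fff12345678ULL), "bit 62 set only");
```

A single shifted XOR followed by one test-bit branch keeps the checker at two instructions plus the shared `brk`, which is consistent with the code shown in the enum's documentation.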

llvm/lib/Target/AArch64/AArch64PointerAuth.cpp

//===-- AArch64PointerAuth.cpp -- Harden code using PAuth ------------------==//		//===-- AArch64PointerAuth.cpp -- Harden code using PAuth ------------------==//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "AArch64PointerAuth.h"

#include "AArch64.h"		#include "AArch64.h"
		#include "AArch64InstrInfo.h"
#include "AArch64MachineFunctionInfo.h"		#include "AArch64MachineFunctionInfo.h"
#include "AArch64Subtarget.h"		#include "AArch64Subtarget.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineModuleInfo.h"		#include "llvm/CodeGen/MachineModuleInfo.h"

using namespace llvm;		using namespace llvm;
		using namespace llvm::AArch64PAuth;

#define AARCH64_POINTER_AUTH_NAME "AArch64 Pointer Authentication"		#define AARCH64_POINTER_AUTH_NAME "AArch64 Pointer Authentication"

namespace {		namespace {

class AArch64PointerAuth : public MachineFunctionPass {		class AArch64PointerAuth : public MachineFunctionPass {
public:		public:
static char ID;		static char ID;

AArch64PointerAuth() : MachineFunctionPass(ID) {}		AArch64PointerAuth() : MachineFunctionPass(ID) {}

bool runOnMachineFunction(MachineFunction &MF) override;		bool runOnMachineFunction(MachineFunction &MF) override;

StringRef getPassName() const override { return AARCH64_POINTER_AUTH_NAME; }		StringRef getPassName() const override { return AARCH64_POINTER_AUTH_NAME; }

private:		private:
		/// An immediate operand passed to BRK instruction, if it is ever emitted.
		const unsigned BrkOperand = 0xc471;

const AArch64Subtarget *Subtarget = nullptr;		const AArch64Subtarget *Subtarget = nullptr;
const AArch64InstrInfo *TII = nullptr;		const AArch64InstrInfo *TII = nullptr;
		const AArch64RegisterInfo *TRI = nullptr;

void signLR(MachineFunction &MF, MachineBasicBlock::iterator MBBI) const;		void signLR(MachineFunction &MF, MachineBasicBlock::iterator MBBI) const;

void authenticateLR(MachineFunction &MF,		void authenticateLR(MachineFunction &MF,
MachineBasicBlock::iterator MBBI) const;		MachineBasicBlock::iterator MBBI) const;

		bool checkAuthenticatedLR(MachineBasicBlock::iterator TI) const;
};		};

} // end anonymous namespace		} // end anonymous namespace

INITIALIZE_PASS(AArch64PointerAuth, "aarch64-ptrauth",		INITIALIZE_PASS(AArch64PointerAuth, "aarch64-ptrauth",
AARCH64_POINTER_AUTH_NAME, false, false)		AARCH64_POINTER_AUTH_NAME, false, false)

FunctionPass *llvm::createAArch64PointerAuthPass() {		FunctionPass *llvm::createAArch64PointerAuthPass() {
[...]	if (Subtarget->hasPAuth() && TerminatorIsCombinable && !NeedsWinCFI &&
}		}
if (NeedsWinCFI) {		if (NeedsWinCFI) {
BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PACSignLR))		BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PACSignLR))
.setMIFlag(MachineInstr::FrameDestroy);		.setMIFlag(MachineInstr::FrameDestroy);
}		}
}		}
}		}

		namespace {

		// Mark dummy LDR instruction as volatile to prevent removing it as dead code.
		kristof.beyls: I think this class could use a comment. Would something like the following be correct?
		    // AddressCheckPseudoSourceValue represents the memory access inserted by
		    // the AuthCheckMethod::DummyLoad method to trigger a run-time error if the
		    // authentication of a pointer failed.
		atrosinenko (author): Added your comment, thank you.
		MachineMemOperand *createCheckMemOperand(MachineFunction &MF,
		const AArch64Subtarget &Subtarget) {
		MachinePointerInfo PointerInfo(Subtarget.getAddressCheckPSV());
		auto MOVolatileLoad =
		MachineMemOperand::MOLoad | MachineMemOperand::MOVolatile;

		return MF.getMachineMemOperand(PointerInfo, MOVolatileLoad, 4, Align(4));
		}

		} // namespace

		MachineBasicBlock &llvm::AArch64PAuth::checkAuthenticatedRegister(
		MachineBasicBlock::iterator MBBI, AuthCheckMethod Method,
		kristof.beyls: Seeing this is a static variable made me think: would it be possible in a single run of any program using the LLVM libraries (such as clang, flang, rust, opt, ...) for MF.getTarget() to potentially return a different result sometimes within the same execution of that program? If that could occur, then for some executions of this function, the `CheckPSV` variable could be initialized with the wrong `TargetMachine` reference. I cannot immediately think of an example, but I am also not sure. I think it would be more prudent to not have this as a static variable.
		atrosinenko (author): I used the same approach as it is implemented in Hexagon backend: HexagonInstrInfo.cpp. For now TargetMachine is only used by `PseudoSourceValue` constructor to call `TM.getAddressSpaceForPseudoSourceKind(Kind)`. I agree that using static instance looks suspicious, but I guess allocating new instance from heap on each check is quite wasteful.
		kristof.beyls: I guess this would be allocating new instances on the stack rather than the heap? Anyway, I think it would be better to not rely on a static variable where we know this is likely to trigger a hard-to-track-down bug at some point in the future, for some use cases. If the cost would be too high, I guess it might be possible to "cache" CheckPSV once per MachineFunction being processed? That may require finding an object that is constructed once per MachineFunction and put that cached CheckPSV in there. Not sure if there is such an obvious object at the moment and if so, how much refactoring would be needed to make this possible. Anyway, I guess this doesn't get called that often (only once per tail call) so presumably it isn't worthwhile to make the code a lot more complicated for the small (would it even be measurable?) compile time gain.
		atrosinenko (author): Unfortunately, MachinePointerInfo stores a pointer to CheckPSV, so allocating it on stack is not possible. Turned out, each MachineFunction has PseudoSourceValueManager, but it is not subclassed per-target, so caching target-specific PSV would require some refactoring. On the other hand, AddressCheckPseudoSourceValue has quite generic semantics, so moved it to AArch64SubtargetInfo.
		kristof.beyls: Thank you, that looks like a neat solution.
		Register AuthenticatedReg, Register TmpReg, bool UseIKey, unsigned BrkImm) {

		MachineBasicBlock &MBB = *MBBI->getParent();
		MachineFunction &MF = *MBB.getParent();
		const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
		const AArch64InstrInfo *TII = Subtarget.getInstrInfo();
		DebugLoc DL = MBBI->getDebugLoc();

		// First, handle the methods not requiring creating extra MBBs.
		switch (Method) {
		default:
		break;
		case AuthCheckMethod::None:
		return MBB;
		case AuthCheckMethod::DummyLoad:
		BuildMI(MBB, MBBI, DL, TII->get(AArch64::LDRWui), getWRegFromXReg(TmpReg))
		.addReg(AArch64::LR)
		.addImm(0)
		.addMemOperand(createCheckMemOperand(MF, Subtarget));
		return MBB;
		}

		// Control flow has to be changed, so arrange new MBBs.

		// For now, at least an AUT* instruction is expected before MBBI
		assert(MBBI != MBB.begin() &&
		"Cannot insert the check at the very beginning of MBB");
		// The block to insert check into.
		MachineBasicBlock *CheckBlock = &MBB;
		// The remaining part of the original MBB that is executed on success.
		MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI));
		kristof.beyls: I'm wondering why any/all of the machineinsts created in this function need to have the FrameDestroy flag set? Do you know?
		atrosinenko (author): Setting `MachineInstr::FrameDestroy` unconditionally in `checkAuthenticatedRegister` is definitely a mistake, thank you. Maybe I have to pass MI flags as an argument, but it seems that it works as-is, at least for generating DWARF debug info (see Epilogue Begin marker). Added an explicit assertion that WinCFI is not requested as I don't yet emit any SEH opcodes.
		    $ cat /tmp/tail-call.c
		    int caller_indirect(int n, int (fptr)(int)) {
		      asm volatile ("" ::: "lr");
		      n = 42;
		      return fptr(n);
		    }
		    $ ./bin/clang -O1 -target aarch64-linux-gnu /tmp/tail-call.c -c -o /tmp/tail-call.o -mbranch-protection=pac-ret -mllvm -aarch64-authenticated-lr-check-method=xpac-hint -g
		    $ dwarfdump -l /tmp/tail-call.o && llvm-objdump -d /tmp/tail-call.o
		    .debug_line: line number info for a single cu
		    Source lines (from CU-DIE at .debug_info offset 0x0000000c):
		    NS new statement, BB new basic block, ET end of text sequence
		    PE prologue end, EB epilogue begin
		    IS=val ISA number, DI=val discriminator value
		    <pc> [lno,col] NS BB ET PE EB IS= DI= uri: "filepath"
		    0x00000000 [ 1, 0] NS uri: "/tmp/tail-call.c"
		    0x0000000c [ 2, 3] NS PE
		    0x0000000c [ 3, 6] NS
		    0x00000010 [ 4,10] NS ET EB
		    0x00000030 [ 4,10] NS ET
		    /tmp/tail-call.o: file format elf64-littleaarch64
		    Disassembly of section .text:
		    0000000000000000 <caller_indirect>:
		     0: d503233f paciasp
		     4: f81f0ffe str x30, [sp, #-0x10]!
		     8: 52800548 mov w8, #0x2a
		     c: b9000008 str w8, [x0]
		    10: f84107fe ldr x30, [sp], #0x10
		    14: d50323bf autiasp
		    18: aa1e03f0 mov x16, x30
		    1c: d50320ff xpaclri
		    20: eb1e021f cmp x16, x30
		    24: 54000041 b.ne 0x2c <caller_indirect+0x2c>
		    28: d61f0020 br x1
		    2c: d4388e20 brk #0xc471

		// The block that explicitly generates a break-point exception on failure.
		MachineBasicBlock *BreakBlock =
		MF.CreateMachineBasicBlock(MBB.getBasicBlock());
		MF.push_back(BreakBlock);
		MBB.splitSuccessor(SuccessBlock, BreakBlock);

		assert(CheckBlock->getFallThrough() == SuccessBlock);
		BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm);

		switch (Method) {
		case AuthCheckMethod::None:
		case AuthCheckMethod::DummyLoad:
		llvm_unreachable("Should be handled above");
		case AuthCheckMethod::HighBitsNoTBI:
		BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg)
		.addReg(AuthenticatedReg)
		.addReg(AuthenticatedReg)
		.addImm(1);
		BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX))
		.addReg(TmpReg)
		.addImm(62)
		.addMBB(BreakBlock);
		return *SuccessBlock;
		case AuthCheckMethod::XPACHint:
		assert(AuthenticatedReg == AArch64::LR &&
		"XPACHint mode is only compatible with checking the LR register");
		assert(UseIKey && "XPACHint mode is only compatible with I-keys");
		BuildMI(CheckBlock, DL, TII->get(AArch64::ORRXrs), TmpReg)
		.addReg(AArch64::XZR)
		.addReg(AArch64::LR)
		.addImm(0);
		BuildMI(CheckBlock, DL, TII->get(AArch64::XPACLRI));
		BuildMI(CheckBlock, DL, TII->get(AArch64::SUBSXrs), AArch64::XZR)
		.addReg(TmpReg)
		.addReg(AArch64::LR)
		.addImm(0);
		BuildMI(CheckBlock, DL, TII->get(AArch64::Bcc))
		.addImm(AArch64CC::NE)
		.addMBB(BreakBlock);
		return *SuccessBlock;
		}
		}

		unsigned llvm::AArch64PAuth::getCheckerSizeInBytes(AuthCheckMethod Method) {
		switch (Method) {
		case AuthCheckMethod::None:
		return 0;
		case AuthCheckMethod::DummyLoad:
		return 4;
		case AuthCheckMethod::HighBitsNoTBI:
		return 12;
		case AuthCheckMethod::XPACHint:
		return 20;
		}
		}
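
One way to read the constants returned by `getCheckerSizeInBytes` is as an instruction count multiplied by the fixed 4-byte AArch64 instruction size, with the `brk` in `break_block` included. The sketch below spells out that arithmetic. The local `AuthCheckMethod` enum is a stand-in for the one in the patch, and `checkerInstrCount` is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>

// Local stand-in for AArch64PAuth::AuthCheckMethod, for illustration only.
enum class AuthCheckMethod { None, DummyLoad, HighBitsNoTBI, XPACHint };

// Every AArch64 (A64) instruction is encoded in 4 bytes.
constexpr unsigned InstrSizeInBytes = 4;

// Hypothetical helper: instructions added by each checker, counting the
// `brk` placed in break_block for the methods that change control flow.
constexpr unsigned checkerInstrCount(AuthCheckMethod Method) {
  switch (Method) {
  case AuthCheckMethod::None:
    return 0;
  case AuthCheckMethod::DummyLoad:
    return 1; // ldr
  case AuthCheckMethod::HighBitsNoTBI:
    return 3; // eor, tbnz, brk
  case AuthCheckMethod::XPACHint:
    return 5; // mov, xpaclri, cmp, b.ne, brk
  }
  return 0; // not reached for valid enumerators
}

constexpr unsigned checkerSizeInBytes(AuthCheckMethod Method) {
  return checkerInstrCount(Method) * InstrSizeInBytes;
}

// These match the values returned by getCheckerSizeInBytes in the diff.
static_assert(checkerSizeInBytes(AuthCheckMethod::None) == 0, "none");
static_assert(checkerSizeInBytes(AuthCheckMethod::DummyLoad) == 4, "load");
static_assert(checkerSizeInBytes(AuthCheckMethod::HighBitsNoTBI) == 12, "bits");
static_assert(checkerSizeInBytes(AuthCheckMethod::XPACHint) == 20, "xpac");
```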

		bool AArch64PointerAuth::checkAuthenticatedLR(
		MachineBasicBlock::iterator TI) const {
		AuthCheckMethod Method = Subtarget->getAuthenticatedLRCheckMethod();

		if (Method == AuthCheckMethod::None)
		return false;

		// FIXME If FEAT_FPAC is implemented by the CPU, this check can be skipped.
		chill: nit: The check for `FEAT_FPAC` perhaps can be done in `getAuthenticatedLRCheckMethod` (and possibly return `None`), so the logic of deciding whether to emit code or not is kept in one place.

		assert(!TI->getMF()->hasWinCFI() && "WinCFI is not yet supported");

		// The following code may create a signing oracle:
		//
		// <authenticate LR>
		// TCRETURN ; the callee may sign and spill the LR in its prologue
		//
		// To avoid generating a signing oracle, check the authenticated value
		// before possibly re-signing it in the callee, as follows:
		//
		// <authenticate LR>
		// <check if LR contains a valid address>
		// b.<cond> break_block
		// ret_block:
		// TCRETURN
		// break_block:
		// brk <BrkOperand>
		//
		// or just
		//
		// <authenticate LR>
		// ldr tmp, [lr]
		// TCRETURN

		// TmpReg is chosen assuming X16 and X17 are dead after TI.
		assert(AArch64InstrInfo::isTailCallReturnInst(*TI) &&
		"Tail call is expected");
		Register TmpReg =
		TI->readsRegister(AArch64::X16, TRI) ? AArch64::X17 : AArch64::X16;
		assert(!TI->readsRegister(TmpReg, TRI) &&
		"More than a single register is used by TCRETURN");

		checkAuthenticatedRegister(TI, Method, AArch64::LR, TmpReg, /*UseIKey=*/true,
		BrkOperand);

		return true;
		}
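
The temporary-register choice in `checkAuthenticatedLR` can be illustrated in isolation: prefer X16, fall back to X17 when the tail call reads X16, and assert that the chosen register is not read. This is a minimal sketch under simplified assumptions. The `Reg` enum and `pickTmpReg` are hypothetical, and a plain vector of registers stands in for querying `TI->readsRegister`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical register names, for illustration only.
enum class Reg { X0, X1, X16, X17 };

// Picks X16 unless the tail-call instruction reads it, in which case X17
// is used. As in the diff, at most one of the two may be read.
inline Reg pickTmpReg(const std::vector<Reg> &ReadByTailCall) {
  auto Reads = [&](Reg R) {
    return std::find(ReadByTailCall.begin(), ReadByTailCall.end(), R) !=
           ReadByTailCall.end();
  };
  Reg Tmp = Reads(Reg::X16) ? Reg::X17 : Reg::X16;
  assert(!Reads(Tmp) && "More than a single register is used by the call");
  return Tmp;
}
```

For an indirect tail call through X16, this model returns X17. For a direct tail call, or an indirect one through any other register, it returns X16.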

bool AArch64PointerAuth::runOnMachineFunction(MachineFunction &MF) {		bool AArch64PointerAuth::runOnMachineFunction(MachineFunction &MF) {
if (!MF.getInfo<AArch64FunctionInfo>()->shouldSignReturnAddress(true))		const auto *MFnI = MF.getInfo<AArch64FunctionInfo>();
		if (!MFnI->shouldSignReturnAddress(true))
return false;		return false;

Subtarget = &MF.getSubtarget<AArch64Subtarget>();		Subtarget = &MF.getSubtarget<AArch64Subtarget>();
TII = Subtarget->getInstrInfo();		TII = Subtarget->getInstrInfo();
		TRI = Subtarget->getRegisterInfo();

SmallVector<MachineBasicBlock::iterator> DeletedInstrs;		SmallVector<MachineBasicBlock::iterator> DeletedInstrs;
		SmallVector<MachineBasicBlock::iterator> TailCallInstrs;

bool Modified = false;		bool Modified = false;
		bool HasAuthenticationInstrs = false;

for (auto &MBB : MF) {		for (auto &MBB : MF) {
for (auto &MI : MBB) {		for (auto &MI : MBB) {
auto It = MI.getIterator();		auto It = MI.getIterator();
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
default:		default:
// do nothing		if (AArch64InstrInfo::isTailCallReturnInst(MI))
		TailCallInstrs.push_back(It);
break;		break;
case AArch64::PAUTH_PROLOGUE:		case AArch64::PAUTH_PROLOGUE:
signLR(MF, It);		signLR(MF, It);
DeletedInstrs.push_back(It);		DeletedInstrs.push_back(It);
Modified = true;		Modified = true;
break;		break;
case AArch64::PAUTH_EPILOGUE:		case AArch64::PAUTH_EPILOGUE:
authenticateLR(MF, It);		authenticateLR(MF, It);
DeletedInstrs.push_back(It);		DeletedInstrs.push_back(It);
Modified = true;		Modified = true;
		HasAuthenticationInstrs = true;
break;		break;
}		}
}		}
}		}

		// FIXME Do we need to emit any PAuth-related epilogue code at all
		kristof.beyls: The convention in the LLVM code base is to use "FIXME" rather than "TODO".
		atrosinenko (author): Updated
		// when SCS is enabled?
		if (HasAuthenticationInstrs &&
		!MFnI->needsShadowCallStackPrologueEpilogue(MF)) {
		for (auto TailCall : TailCallInstrs)
		Modified |= checkAuthenticatedLR(TailCall);
		}

for (auto MI : DeletedInstrs)		for (auto MI : DeletedInstrs)
MI->eraseFromParent();		MI->eraseFromParent();

return Modified;		return Modified;
}		}

llvm/lib/Target/AArch64/AArch64Subtarget.h

[...]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64SUBTARGET_H		#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64SUBTARGET_H
#define LLVM_LIB_TARGET_AARCH64_AARCH64SUBTARGET_H		#define LLVM_LIB_TARGET_AARCH64_AARCH64SUBTARGET_H

#include "AArch64FrameLowering.h"		#include "AArch64FrameLowering.h"
#include "AArch64ISelLowering.h"		#include "AArch64ISelLowering.h"
#include "AArch64InstrInfo.h"		#include "AArch64InstrInfo.h"
		#include "AArch64PointerAuth.h"
#include "AArch64RegisterInfo.h"		#include "AArch64RegisterInfo.h"
#include "AArch64SelectionDAGInfo.h"		#include "AArch64SelectionDAGInfo.h"
#include "llvm/CodeGen/GlobalISel/CallLowering.h"		#include "llvm/CodeGen/GlobalISel/CallLowering.h"
#include "llvm/CodeGen/GlobalISel/InlineAsmLowering.h"		#include "llvm/CodeGen/GlobalISel/InlineAsmLowering.h"
#include "llvm/CodeGen/GlobalISel/InstructionSelector.h"		#include "llvm/CodeGen/GlobalISel/InstructionSelector.h"
#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"		#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
#include "llvm/CodeGen/RegisterBankInfo.h"		#include "llvm/CodeGen/RegisterBankInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"		#include "llvm/CodeGen/TargetSubtargetInfo.h"
[...]	const char* getChkStkName() const {
return "__chkstk";		return "__chkstk";
}		}

const char* getSecurityCheckCookieName() const {		const char* getSecurityCheckCookieName() const {
if (isWindowsArm64EC())		if (isWindowsArm64EC())
return "__security_check_cookie_arm64ec";		return "__security_check_cookie_arm64ec";
return "__security_check_cookie";		return "__security_check_cookie";
}		}

		/// Choose a method of checking LR before performing a tail call.
		AArch64PAuth::AuthCheckMethod getAuthenticatedLRCheckMethod() const;

		const PseudoSourceValue *getAddressCheckPSV() const {
		return AddressCheckPSV.get();
		}

		private:
		/// Pseudo value representing memory load performed to check an address.
		///
		/// This load operation is solely used for its side-effects: if the address
		/// is not mapped (or not readable), it triggers CPU exception, otherwise
		/// execution proceeds and the value is not used.
		class AddressCheckPseudoSourceValue : public PseudoSourceValue {
		public:
		AddressCheckPseudoSourceValue(const TargetMachine &TM)
		: PseudoSourceValue(TargetCustom, TM) {}

		bool isConstant(const MachineFrameInfo *) const override { return false; }
		bool isAliased(const MachineFrameInfo *) const override { return true; }
		bool mayAlias(const MachineFrameInfo *) const override { return true; }
		void printCustom(raw_ostream &OS) const override { OS << "AddressCheck"; }
		};

		std::unique_ptr<AddressCheckPseudoSourceValue> AddressCheckPSV;
};		};
} // End llvm namespace		} // End llvm namespace

#endif		#endif

llvm/lib/Target/AArch64/AArch64Subtarget.cpp

[...]	ReservedRegsForRA("reserve-regs-for-regalloc", cl::desc("Reserve physical "
cl::CommaSeparated, cl::Hidden);		cl::CommaSeparated, cl::Hidden);

static cl::opt<bool> ForceStreamingCompatibleSVE(		static cl::opt<bool> ForceStreamingCompatibleSVE(
"force-streaming-compatible-sve",		"force-streaming-compatible-sve",
cl::desc(		cl::desc(
"Force the use of streaming-compatible SVE code for all functions"),		"Force the use of streaming-compatible SVE code for all functions"),
cl::Hidden);		cl::Hidden);

		static cl::opt<AArch64PAuth::AuthCheckMethod>
		AuthenticatedLRCheckMethod("aarch64-authenticated-lr-check-method",
		cl::Hidden,
		cl::desc("Override the variant of check applied "
		"to authenticated LR during tail call"),
		cl::values(AUTH_CHECK_METHOD_CL_VALUES_LR));

unsigned AArch64Subtarget::getVectorInsertExtractBaseCost() const {		unsigned AArch64Subtarget::getVectorInsertExtractBaseCost() const {
if (OverrideVectorInsertExtractBaseCost.getNumOccurrences() > 0)		if (OverrideVectorInsertExtractBaseCost.getNumOccurrences() > 0)
return OverrideVectorInsertExtractBaseCost;		return OverrideVectorInsertExtractBaseCost;
return VectorInsertExtractBaseCost;		return VectorInsertExtractBaseCost;
}		}

AArch64Subtarget &AArch64Subtarget::initializeSubtargetDependencies(		AArch64Subtarget &AArch64Subtarget::initializeSubtargetDependencies(
StringRef FS, StringRef CPUString, StringRef TuneCPUString) {		StringRef FS, StringRef CPUString, StringRef TuneCPUString) {
[...]	if (ReservedRegNames.count(TRI->getName(AArch64::X0 + i)))
ReserveXRegisterForRA.set(i);		ReserveXRegisterForRA.set(i);
}		}
// X30 is named LR, so we can't use TRI->getName to check X30.		// X30 is named LR, so we can't use TRI->getName to check X30.
if (ReservedRegNames.count("X30") || ReservedRegNames.count("LR"))		if (ReservedRegNames.count("X30") || ReservedRegNames.count("LR"))
ReserveXRegisterForRA.set(30);		ReserveXRegisterForRA.set(30);
// X29 is named FP, so we can't use TRI->getName to check X29.		// X29 is named FP, so we can't use TRI->getName to check X29.
if (ReservedRegNames.count("X29") || ReservedRegNames.count("FP"))		if (ReservedRegNames.count("X29") || ReservedRegNames.count("FP"))
ReserveXRegisterForRA.set(29);		ReserveXRegisterForRA.set(29);

		AddressCheckPSV.reset(new AddressCheckPseudoSourceValue(TM));
}		}

const CallLowering *AArch64Subtarget::getCallLowering() const {		const CallLowering *AArch64Subtarget::getCallLowering() const {
return CallLoweringInfo.get();		return CallLoweringInfo.get();
}		}

const InlineAsmLowering *AArch64Subtarget::getInlineAsmLowering() const {		const InlineAsmLowering *AArch64Subtarget::getInlineAsmLowering() const {
return InlineAsmLoweringInfo.get();		return InlineAsmLoweringInfo.get();
[...]	bool AArch64Subtarget::isNeonAvailable() const {
return hasNEON() && !isStreaming() && !isStreamingCompatible();		return hasNEON() && !isStreaming() && !isStreamingCompatible();
}		}

bool AArch64Subtarget::isSVEAvailable() const{		bool AArch64Subtarget::isSVEAvailable() const{
// FIXME: Also return false if FEAT_FA64 is set, but we can't do this yet		// FIXME: Also return false if FEAT_FA64 is set, but we can't do this yet
// as we don't yet support the feature in LLVM.		// as we don't yet support the feature in LLVM.
return hasSVE() && !isStreaming() && !isStreamingCompatible();		return hasSVE() && !isStreaming() && !isStreamingCompatible();
}		}

		// If return address signing is enabled, tail calls are emitted as follows:
		//
		// ```
		// <authenticate LR>
		// <check LR>
		// TCRETURN ; the callee may sign and spill the LR in its prologue
		// ```
		//
		// LR may require explicit checking because if FEAT_FPAC is not implemented
		// and LR was tampered with, then `<authenticate LR>` will not generate an
		// exception on its own. Later, if the callee spills the signed LR value and
		// neither FEAT_PAuth2 nor FEAT_EPAC are implemented, the valid PAC replaces
		// the higher bits of LR thus hiding the authentication failure.
		AArch64PAuth::AuthCheckMethod
		AArch64Subtarget::getAuthenticatedLRCheckMethod() const {
		if (AuthenticatedLRCheckMethod.getNumOccurrences())
		return AuthenticatedLRCheckMethod;

		// For now, use None by default because checks may introduce an unexpected
		// performance regression or incompatibility with execute-only mappings.
		return AArch64PAuth::AuthCheckMethod::None;
		}
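
The selection logic in `getAuthenticatedLRCheckMethod` follows a common pattern: an explicit command-line value wins, otherwise a conservative default is used. Below is a minimal sketch of that pattern. It assumes a `std::optional` in place of the `cl::opt` occurrence check, and the enum and `chooseCheckMethod` are hypothetical stand-ins rather than the patch's code.

```cpp
#include <cassert>
#include <optional>

// Local stand-in for AArch64PAuth::AuthCheckMethod, for illustration only.
enum class AuthCheckMethod { None, DummyLoad, HighBitsNoTBI, XPACHint };

// An empty optional models "the flag was not passed on the command line",
// which the diff detects via getNumOccurrences().
inline AuthCheckMethod
chooseCheckMethod(std::optional<AuthCheckMethod> CommandLineOverride) {
  if (CommandLineOverride)
    return *CommandLineOverride;
  // Default: no check, to avoid unexpected performance regressions or
  // incompatibility with execute-only mappings.
  return AuthCheckMethod::None;
}
```

Keeping "not specified" distinct from an explicit `none` leaves room for a subtarget to pick its own default later without overriding a user's explicit request.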

llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll

This file was added.

				; RUN: llc -mtriple=aarch64 -asm-verbose=0 < %s | FileCheck -DAUTIASP="hint #29" --check-prefixes=COMMON %s
				; RUN: llc -mtriple=aarch64 -asm-verbose=0 -aarch64-authenticated-lr-check-method=load < %s | FileCheck -DAUTIASP="hint #29" --check-prefixes=COMMON,LDR %s
				; RUN: llc -mtriple=aarch64 -asm-verbose=0 -aarch64-authenticated-lr-check-method=high-bits-notbi < %s | FileCheck -DAUTIASP="hint #29" --check-prefixes=COMMON,BITS-NOTBI,BRK %s
				; RUN: llc -mtriple=aarch64 -asm-verbose=0 -aarch64-authenticated-lr-check-method=xpac-hint < %s | FileCheck -DAUTIASP="hint #29" -DXPACLRI="hint #7" --check-prefixes=COMMON,XPAC,BRK %s
				; RUN: llc -mtriple=aarch64 -asm-verbose=0 -aarch64-authenticated-lr-check-method=xpac-hint -mattr=v8.3a < %s | FileCheck -DAUTIASP="autiasp" -DXPACLRI="xpaclri" --check-prefixes=COMMON,XPAC,BRK %s

				define i32 @tailcall_direct() "sign-return-address"="non-leaf" {
				; COMMON-LABEL: tailcall_direct:
				; COMMON: str x30, [sp, #-16]!
				; COMMON: ldr x30, [sp], #16
				;
				; COMMON-NEXT: [[AUTIASP]]
				kristof.beyls: nitpick: I think I'd prefer `[[AUTIASP]]` rather than `[[AUT]]` as the macro name as that makes it clearer on first read exactly which authenticate operation is expected here. I do like the use of the macro though to hide away the difference between `hint #29` and `autiasp`.
				atrosinenko (Author): Fixed
				;
				; LDR-NEXT: ldr w16, [x30]
				;
				; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1
				; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
				;
				; XPAC-NEXT: mov x16, x30
				; XPAC-NEXT: [[XPACLRI]]
				kristof.beyls: nitpick: similarly, I think I'd prefer `[[XPACLRI]]`
				atrosinenko (Author): Fixed
				; XPAC-NEXT: cmp x16, x30
				; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
				;
				; COMMON-NEXT: b callee
				; BRK-NEXT: .[[FAIL]]:
				; BRK-NEXT: brk #0xc471
				tail call void asm sideeffect "", "~{lr}"()
				%call = tail call i32 @callee()
				ret i32 %call
				}

				define i32 @tailcall_indirect(ptr %fptr) "sign-return-address"="non-leaf" {
				; COMMON-LABEL: tailcall_indirect:
				; COMMON: str x30, [sp, #-16]!
				; COMMON: ldr x30, [sp], #16
				;
				; COMMON-NEXT: [[AUTIASP]]
				;
				; LDR-NEXT: ldr w16, [x30]
				;
				; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1
				; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
				;
				; XPAC-NEXT: mov x16, x30
				; XPAC-NEXT: [[XPACLRI]]
				; XPAC-NEXT: cmp x16, x30
				; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
				;
				; COMMON-NEXT: br x0
				; BRK-NEXT: .[[FAIL]]:
				; BRK-NEXT: brk #0xc471
				tail call void asm sideeffect "", "~{lr}"()
				%call = tail call i32 %fptr()
				ret i32 %call
				}

				define i32 @tailcall_direct_noframe() "sign-return-address"="non-leaf" {
				; COMMON-LABEL: tailcall_direct_noframe:
				; COMMON-NEXT: .cfi_startproc
				; COMMON-NEXT: b callee
				%call = tail call i32 @callee()
				ret i32 %call
				}

				define i32 @tailcall_indirect_noframe(ptr %fptr) "sign-return-address"="non-leaf" {
				; COMMON-LABEL: tailcall_indirect_noframe:
				; COMMON-NEXT: .cfi_startproc
				; COMMON-NEXT: br x0
				%call = tail call i32 %fptr()
				ret i32 %call
				}

				define i32 @tailcall_direct_noframe_sign_all() "sign-return-address"="all" {
				; COMMON-LABEL: tailcall_direct_noframe_sign_all:
				; COMMON-NOT: str{{.*}}x30
				; COMMON-NOT: ldr{{.*}}x30
				;
				; COMMON: [[AUTIASP]]
				;
				; LDR-NEXT: ldr w16, [x30]
				;
				; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1
				; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
				;
				; XPAC-NEXT: mov x16, x30
				; XPAC-NEXT: [[XPACLRI]]
				; XPAC-NEXT: cmp x16, x30
				; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
				;
				; COMMON-NEXT: b callee
				; BRK-NEXT: .[[FAIL]]:
				; BRK-NEXT: brk #0xc471
				%call = tail call i32 @callee()
				ret i32 %call
				}

				define i32 @tailcall_indirect_noframe_sign_all(ptr %fptr) "sign-return-address"="all" {
				; COMMON-LABEL: tailcall_indirect_noframe_sign_all:
				; COMMON-NOT: str{{.*}}x30
				; COMMON-NOT: ldr{{.*}}x30
				;
				; COMMON: [[AUTIASP]]
				;
				; LDR-NEXT: ldr w16, [x30]
				;
				; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1
				; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
				;
				; XPAC-NEXT: mov x16, x30
				; XPAC-NEXT: [[XPACLRI]]
				; XPAC-NEXT: cmp x16, x30
				; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
				;
				; COMMON-NEXT: br x0
				; BRK-NEXT: .[[FAIL]]:
				; BRK-NEXT: brk #0xc471
				%call = tail call i32 %fptr()
				ret i32 %call
				}

				declare i32 @callee()
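For readers unfamiliar with the `XPAC`-prefixed checks in the test above (`mov x16, x30`, `xpaclri`, `cmp x16, x30`, `b.ne <fail>`), the following is a rough C++ model of the idea: strip the PAC field from LR and compare the result with the unstripped value. It is a sketch, not code from this revision. The names `stripPacModel` and `xpacCheckPasses` are hypothetical, and the pointer layout is an assumption chosen for illustration: 48-bit virtual addresses with top-byte-ignore disabled, so that bits 63:48 are treated as the PAC field and bit 55 selects the address range.

```cpp
#include <cstdint>

// ASSUMED layout: 48-bit virtual addresses, top-byte-ignore disabled.
// Stripping replaces bits 63:48 with copies of bit 55, which yields the
// canonical form of the address.
constexpr std::uint64_t stripPacModel(std::uint64_t ptr) {
  constexpr std::uint64_t PacMask = 0xffff000000000000ULL;
  const bool upperHalf = ((ptr >> 55) & 1u) != 0;
  return upperHalf ? (ptr | PacMask) : (ptr & ~PacMask);
}

// Models: mov x16, x30 ; xpaclri ; cmp x16, x30 ; b.ne <fail>
// The check passes only if stripping leaves LR unchanged, i.e. LR was
// already a canonical address after authentication.
constexpr bool xpacCheckPasses(std::uint64_t lr) {
  return stripPacModel(lr) == lr;
}
```

In this model, any leftover non-address bits in the upper part of LR make the comparison fail, which corresponds to the branch to the `brk #0xc471` block in the expected output.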


Revision Contents

Diff 557683
