diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16229,6 +16229,13 @@
 ``llvm.eh.`` prefix), are described in the `LLVM Exception Handling
 <ExceptionHandling.html>`_ document.
 
+Pointer Authentication Intrinsics
+---------------------------------
+
+The LLVM pointer authentication intrinsics (which all start with
+``llvm.ptrauth.`` prefix), are described in the `Pointer Authentication
+<PointerAuth.html>`_ document.
+
 .. _int_trampoline:
 
 Trampoline Intrinsics
diff --git a/llvm/docs/PointerAuth.md b/llvm/docs/PointerAuth.md
new file mode 100644
--- /dev/null
+++ b/llvm/docs/PointerAuth.md
@@ -0,0 +1,301 @@
+# Pointer Authentication
+
+## Introduction
+
+Pointer Authentication is a mechanism by which certain pointers are signed,
+are modified to embed that signature in their unused bits, and are
+authenticated (have their signature checked) when used, to prevent pointers
+of unknown origin from being injected into a process.
+
+To enforce Control Flow Integrity (CFI), this is mostly used for all code
+pointers (function pointers, vtable entries, ...), but certain data pointers
+specified by the ABI (vtable pointer, ...) are also authenticated.
+
+Additionally, with clang extensions, users can specify that a given pointer
+be signed/authenticated.
+
+At the IR level, it is represented using:
+
+* a [set of intrinsics](#intrinsics) (to sign/authenticate pointers)
+
+It is implemented by the [AArch64 target](#aarch64-support), using the
+[ARMv8.3 Pointer Authentication Code](#armv8-3-pointer-authentication-code)
+instructions, to support the Darwin arm64e ABI.
+
+
+## Concepts
+
+### Operations
+
+Pointer Authentication is based on three fundamental operations:
+
+#### Sign
+* compute a cryptographic signature of a given pointer value
+* embed it within the value
+* return the signed value
+
+#### Auth
+* compute a cryptographic signature of a given value
+* compare it against the embedded signature
+* remove the embedded signature
+* return the raw, unauthenticated, value
+
+#### Strip
+* remove the embedded signature
+* return the unauthenticated value
+
+
+### Diversity
+
+To prevent any signed pointer from being used instead of any other signed
+pointer, the signatures are diversified, using additional inputs:
+
+* a key: one of a small, fixed set. The value of the key itself is not
+  directly accessible, but is referenced by ptrauth operations via an
+  identifier.
+
+* salt, or extra diversity data: additional data mixed in with the value and
+  used by the ptrauth operations.
+  A concrete value is called a "discriminator", and, in the special case where
+  the diversity data is a pointer to the storage location of the signed value,
+  the value is said to be "address-discriminated".
+  Additionally, an arbitrary small integer can be blended into an address
+  discriminator to produce a blended address discriminator.
+
+Keys are not necessarily interchangeable, and keys can be specified to be
+incompatible with certain kinds of pointers (e.g., code vs data keys/pointers).
+Which keys are appropriate for a given kind of pointer is defined by the
+target implementation.
+
+## LLVM IR Representation
+
+### Intrinsics
+
+These intrinsics are provided by LLVM to expose pointer authentication
+operations.
+
+
+#### '``llvm.ptrauth.sign``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.sign(i64 <value>, i32 <key>, i64 <extra data>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.sign``' intrinsic signs an unauthenticated pointer.
+
+
+##### Arguments:
+
+The ``value`` argument is the unauthenticated (raw) pointer value to be signed.
+The ``key`` argument is the identifier of the key to be used to generate the
+signed value.
+The ``extra data`` argument is the additional diversity data to be used as a
+discriminator.
+
+##### Semantics:
+
+The '``llvm.ptrauth.sign``' intrinsic implements the `sign`_ operation.
+It returns a signed value.
+
+If ``value`` is already a signed value, the behavior is undefined.
+
+If ``value`` is not a pointer value for which ``key`` is appropriate, the
+behavior is undefined.
+
+
+#### '``llvm.ptrauth.auth``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.auth(i64 <value>, i32 <key>, i64 <extra data>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.auth``' intrinsic authenticates a signed pointer.
+
+##### Arguments:
+
+The ``value`` argument is the signed pointer value to be authenticated.
+The ``key`` argument is the identifier of the key that was used to generate
+the signed value.
+The ``extra data`` argument is the additional diversity data to be used as a
+discriminator.
+
+##### Semantics:
+
+The '``llvm.ptrauth.auth``' intrinsic implements the `auth`_ operation.
+It returns a raw, unauthenticated value.
+If ``value`` does not have a correct signature for ``key`` and ``extra data``,
+the behavior is undefined.
+
+
+#### '``llvm.ptrauth.strip``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.strip(i64 <value>, i32 <key>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.strip``' intrinsic strips the embedded signature out of a
+possibly-signed pointer.
+
+
+##### Arguments:
+
+The ``value`` argument is the signed pointer value to be stripped.
+The ``key`` argument is the identifier of the key that was used to generate
+the signed value.
+
+##### Semantics:
+
+The '``llvm.ptrauth.strip``' intrinsic implements the `strip`_ operation.
+It returns an unauthenticated value. It does **not** check that the
+signature is valid.
+
+If ``value`` is an unauthenticated pointer value, it is returned as-is,
+provided the ``key`` is appropriate for the pointer.
+
+If ``value`` is not a pointer value for which ``key`` is appropriate, the
+behavior is undefined.
+
+If ``value`` is a signed pointer value, but ``key`` does not identify the
+same ``key`` that was used to generate ``value``, the behavior is undefined.
+
+
+#### '``llvm.ptrauth.resign``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.resign(i64 <value>,
+                                 i32 <old key>, i64 <old extra data>,
+                                 i32 <new key>, i64 <new extra data>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.resign``' intrinsic re-signs a signed pointer using
+a different key and diversity data.
+
+##### Arguments:
+
+The ``value`` argument is the signed pointer value to be authenticated.
+The ``old key`` argument is the identifier of the key that was used to generate
+the signed value.
+The ``old extra data`` argument is the additional diversity data to be used as a
+discriminator in the auth operation.
+The ``new key`` argument is the identifier of the key to use to generate the
+resigned value.
+The ``new extra data`` argument is the additional diversity data to be used as a
+discriminator in the sign operation.
+
+##### Semantics:
+
+The '``llvm.ptrauth.resign``' intrinsic performs a combined `auth`_ and `sign`_
+operation, without exposing the intermediate unauthenticated pointer.
+It returns a signed value.
+If ``value`` does not have a correct signature for ``old key`` and
+``old extra data``, the returned value is an invalid, poison pointer.
+
+#### '``llvm.ptrauth.sign_generic``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.sign_generic(i64 <value>, i64 <extra data>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.sign_generic``' intrinsic computes a generic signature of
+arbitrary data.
+
+##### Arguments:
+
+The ``value`` argument is the arbitrary data value to be signed.
+The ``extra data`` argument is the additional diversity data to be used as a
+discriminator.
+
+##### Semantics:
+
+The '``llvm.ptrauth.sign_generic``' intrinsic computes the signature of a given
+combination of value and additional diversity data.
+
+It returns a full signature value (as opposed to a signed pointer value, with
+an embedded signature).
+
+As opposed to [``llvm.ptrauth.sign``](#llvm-ptrauth-sign), it does not interpret
+``value`` as a pointer value. Instead, it is an arbitrary data value.
+
+
+#### '``llvm.ptrauth.blend``'
+
+##### Syntax:
+
+```llvm
+declare i64 @llvm.ptrauth.blend(i64 <address discriminator>, i64 <integer discriminator>)
+```
+
+##### Overview:
+
+The '``llvm.ptrauth.blend``' intrinsic blends a pointer address discriminator
+with a small integer discriminator to produce a new discriminator.
+
+##### Arguments:
+
+The ``address discriminator`` argument is a pointer.
+The ``integer discriminator`` argument is a small integer.
+
+##### Semantics:
+
+The '``llvm.ptrauth.blend``' intrinsic combines a small integer discriminator
+with a pointer address discriminator, in a way that is specified by the target
+implementation.
+
+
+## AArch64 Support
+
+AArch64 is currently the only target with full support of the pointer
+authentication primitives, based on ARMv8.3 instructions.
+
+### ARMv8.3 Pointer Authentication Code
+
+ARMv8.3 is an ISA extension that includes Pointer Authentication Code (PAC)
+instructions.
+
+#### Keys
+
+5 keys are supported by ARMv8.3.
+
+Of those, 4 keys are interchangeably usable to specify the key used in IR
+constructs:
+* ``ASIA``/``ASIB`` are instruction keys (encoded as respectively 0 and 1).
+* ``ASDA``/``ASDB`` are data keys (encoded as respectively 2 and 3).
+
+``ASGA`` is a special key that cannot be explicitly specified, and is only ever
+used implicitly, to implement the
+[``llvm.ptrauth.sign_generic``](#llvm-ptrauth-sign-generic) intrinsic.
+
+#### Instructions
+
+The IR [Intrinsics](#intrinsics) described above map onto these
+instructions as such:
+* [``llvm.ptrauth.sign``](#llvm-ptrauth-sign): ``PAC{I,D}{A,B}{Z,SP,}``
+* [``llvm.ptrauth.auth``](#llvm-ptrauth-auth): ``AUT{I,D}{A,B}{Z,SP,}``
+* [``llvm.ptrauth.strip``](#llvm-ptrauth-strip): ``XPAC{I,D}``
+* [``llvm.ptrauth.blend``](#llvm-ptrauth-blend): The semantics of the
+  blend operation are, in effect, specified by the ABI. arm64e specifies it as
+  a ``MOVK`` into the high 16-bits.
+* [``llvm.ptrauth.sign_generic``](#llvm-ptrauth-sign-generic): ``PACGA``
+* [``llvm.ptrauth.resign``](#llvm-ptrauth-resign): ``AUT*+PAC*``. These are
+  represented as a single pseudo-instruction in the backend to guarantee that
+  the intermediate unauthenticated value is not spilled and attackable.
diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst
--- a/llvm/docs/Reference.rst
+++ b/llvm/docs/Reference.rst
@@ -35,6 +35,7 @@
    OptBisect
    ORCv2
    PDB/index
+   PointerAuth
    ScudoHardenedAllocator
    MemTagSanitizer
    Security
@@ -215,3 +216,7 @@
 
 :doc:`YamlIO`
    A reference guide for using LLVM's YAML I/O library.
+
+:doc:`PointerAuth`
+   A description of pointer authentication, its LLVM IR representation, and its
+   support in the backend.
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1608,6 +1608,61 @@
 //===---------- Intrinsics to query properties of scalable vectors --------===//
 def int_vscale : DefaultAttrsIntrinsic<[llvm_anyint_ty], [], [IntrNoMem]>;
 
+
+//===----------------- Pointer Authentication Intrinsics ------------------===//
+//
+
+// Sign an unauthenticated pointer using the specified key and discriminator,
+// passed in that order.
+// Returns the first argument, with some known bits replaced with a signature.
+def int_ptrauth_sign : Intrinsic<[llvm_i64_ty],
+                                 [llvm_i64_ty, llvm_i32_ty, llvm_i64_ty],
+                                 [IntrNoMem, ImmArg<ArgIndex<1>>]>;
+
+// Authenticate a signed pointer, using the specified key and discriminator.
+// Returns the first argument, with the signature bits removed.
+// The signature must be valid.
+def int_ptrauth_auth : Intrinsic<[llvm_i64_ty],
+                                 [llvm_i64_ty, llvm_i32_ty, llvm_i64_ty],
+                                 [IntrNoMem,ImmArg<ArgIndex<1>>]>;
+
+// Authenticate a signed pointer and resign it.
+// The second (key) and third (discriminator) arguments specify the signing
+// schema used for authenticating.
+// The fourth and fifth arguments specify the schema used for signing.
+// The signature must be valid.
+// This is a combined form of @llvm.ptrauth.sign and @llvm.ptrauth.auth, with
+// an additional integrity guarantee on the intermediate value.
+def int_ptrauth_resign : Intrinsic<[llvm_i64_ty],
+                                   [llvm_i64_ty, llvm_i32_ty, llvm_i64_ty,
+                                    llvm_i32_ty, llvm_i64_ty],
+                                   [IntrNoMem, ImmArg<ArgIndex<1>>,
+                                    ImmArg<ArgIndex<3>>]>;
+
+// Strip the embedded signature out of a signed pointer.
+// The second argument specifies the key.
+// This behaves like @llvm.ptrauth.auth, but doesn't require the signature to
+// be valid.
+def int_ptrauth_strip : Intrinsic<[llvm_i64_ty],
+                                  [llvm_i64_ty, llvm_i32_ty],
+                                  [IntrNoMem, ImmArg<ArgIndex<1>>]>;
+
+// Blend a small integer discriminator with an address discriminator, producing
+// a new discriminator value.
+def int_ptrauth_blend : Intrinsic<[llvm_i64_ty],
+                                  [llvm_i64_ty, llvm_i64_ty],
+                                  [IntrNoMem]>;
+
+// Compute the signature of a value, using a given discriminator.
+// This differs from @llvm.ptrauth.sign in that it doesn't embed the computed
+// signature in the pointer, but instead returns the signature as a value.
+// That allows it to be used to sign non-pointer data: in that sense, it is
+// generic. There is no generic @llvm.ptrauth.auth: instead, the signature
+// can be computed using @llvm.ptrauth.sign_generic, and compared with icmp.
+def int_ptrauth_sign_generic : Intrinsic<[llvm_i64_ty],
+                                         [llvm_i64_ty, llvm_i64_ty],
+                                         [IntrNoMem]>;
+
 //===----------------------------------------------------------------------===//
 
 //===----------------------------------------------------------------------===//