Branch Target Identification (BTI) and Pointer Authentication (PAC) are architecture features introduced in Armv8.5-A and Armv8.3-A respectively. The new instructions have been added in the hint space so that binaries can take advantage of the support where it exists yet still run on older hardware.
The impact of each feature is:
- BTI: For executable pages that have been guarded, all indirect branches must have a destination that is a BTI instruction of the appropriate type. For the static linker, this means that PLT entries must have a "BTI c" as the first instruction in the sequence. BTI is an all-or-nothing property for a link unit: any indirect branch that does not land on a valid destination will cause a Branch Target Exception.
- PAC: The dynamic loader encodes with PACIA the address of the destination that the PLT entry will load from the .plt.got, placing the result in a subset of the top bits that are not used by valid virtual addresses. The PLT entry may authenticate these top bits using the AUTIA instruction before branching to the destination. Use of PAC in PLT sequences is a contract between the dynamic loader and the static linker; it is independent of whether the relocatable objects use PAC.
BTI and PAC are independent features that can be combined, so there are four possible PLT sequences:
- Standard with no BTI or PAC
- BTI PLT with "BTI c" as first instruction.
- PAC PLT with "AUTIA1716" before the indirect branch to X17.
- BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the first indirect branch to X17.
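The four variants above can be sketched as instruction sequences. This is an illustration only, not the code in this patch: it assumes the conventional AArch64 PLT entry (adrp/ldr/add/br x17), leaves the address immediates as zero, and omits any padding to a fixed entry size. `buildPltEntry` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <vector>

// A64 instruction words. BTI c and AUTIA1716 are encoded in the hint space,
// so they execute as NOPs on hardware without the features.
constexpr uint32_t BtiC = 0xd503245f;      // bti c
constexpr uint32_t Autia1716 = 0xd503219f; // autia1716
constexpr uint32_t BrX17 = 0xd61f0220;     // br x17
// Address-forming instructions with zero immediates; a linker would patch in
// the page and offset of the .plt.got entry (assumed conventional sequence).
constexpr uint32_t AdrpX16 = 0x90000010; // adrp x16, Page(&.plt.got[n])
constexpr uint32_t LdrX17 = 0xf9400211;  // ldr  x17, [x16, Offset(&.plt.got[n])]
constexpr uint32_t AddX16 = 0x91000210;  // add  x16, x16, Offset(&.plt.got[n])

// Hypothetical helper: returns the unpadded instruction words of one entry.
std::vector<uint32_t> buildPltEntry(bool bti, bool pac) {
  std::vector<uint32_t> insts;
  if (bti)
    insts.push_back(BtiC); // must be the first instruction of the entry
  insts.push_back(AdrpX16);
  insts.push_back(LdrX17);
  insts.push_back(AddX16);
  if (pac)
    insts.push_back(Autia1716); // authenticate x17 before branching to it
  insts.push_back(BrX17);
  return insts;
}
```

In this sketch the BTIPAC entry is two instructions longer than the standard one, which is why the PLT size depends on the variant chosen.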
The use of BTI and PAC in relocatable object files is encoded by feature bits in the .note.gnu.property section in a similar way to Intel CET. There is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND and two target feature bits defined:
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI
  - All executable sections are compatible with BTI.
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC
  - All executable sections have return address signing enabled.
Due to the properties of FEATURE_1_AND the static linker can tell when all input relocatable objects have the BTI and PAC feature bits set. The static linker uses this to enable the appropriate PLT sequence.
- Neither -> standard PLT
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT
- Both properties -> BTIPAC PLT
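The selection above can be sketched as a bitwise AND over the per-object feature words followed by a four-way switch. The bit values follow the AArch64 ELF ABI; `PltKind`, `andFeatures` and `selectPlt` are hypothetical names for illustration, not the names used in the patch.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_BTI = 1U << 0;
constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_PAC = 1U << 1;

enum class PltKind { Standard, Bti, Pac, BtiPac };

// AND the feature words: a bit survives only if every input object sets it.
uint32_t andFeatures(const std::vector<uint32_t> &perObject) {
  if (perObject.empty())
    return 0; // assumption: with no inputs, no feature is enabled
  uint32_t result = ~0U;
  for (uint32_t features : perObject)
    result &= features;
  return result;
}

// Map the combined feature word to one of the four PLT variants.
PltKind selectPlt(uint32_t features) {
  bool bti = features & GNU_PROPERTY_AARCH64_FEATURE_1_BTI;
  bool pac = features & GNU_PROPERTY_AARCH64_FEATURE_1_PAC;
  if (bti && pac)
    return PltKind::BtiPac;
  if (bti)
    return PltKind::Bti;
  if (pac)
    return PltKind::Pac;
  return PltKind::Standard;
}
```

A single object without the BTI bit is enough to clear it in the result, which matches the all-or-nothing nature of BTI for a link unit.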
In addition to the .note.gnu.property section there are two new command line options:
--force-bti : Act as if all relocatable inputs had GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object that does not.
--pac-plt : Act as if all relocatable inputs had GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader and static linker no warning is given if it is not present in an input.
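A minimal sketch of how the two options could adjust the combined feature word, assuming each input has already been reduced to a single feature word. `InputObject`, `Options`, `effectiveFeatures` and the warning text are hypothetical; only the behaviour (warn per object for --force-bti, no warning for --pac-plt) comes from the description above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_BTI = 1U << 0;
constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_PAC = 1U << 1;

struct InputObject {
  std::string name;
  uint32_t features; // bits taken from the object's .note.gnu.property
};

struct Options {
  bool forceBti = false; // --force-bti
  bool pacPlt = false;   // --pac-plt
};

uint32_t effectiveFeatures(const std::vector<InputObject> &objects,
                           const Options &opts,
                           std::vector<std::string> &warnings) {
  uint32_t result = objects.empty() ? 0 : ~0U;
  for (const InputObject &obj : objects) {
    uint32_t features = obj.features;
    if (opts.forceBti && !(features & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)) {
      // Warn once per offending object, then treat it as BTI compatible.
      warnings.push_back(obj.name + ": --force-bti: object lacks the BTI property");
      features |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI;
    }
    result &= features;
  }
  // PAC in the PLT is a loader/linker contract, so it is forced silently.
  if (opts.pacPlt)
    result |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC;
  // Keep only the two bits this sketch knows about.
  return result & (GNU_PROPERTY_AARCH64_FEATURE_1_BTI |
                   GNU_PROPERTY_AARCH64_FEATURE_1_PAC);
}
```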
Two processor-specific dynamic tags are used to communicate that a non-standard PLT sequence is being used: DT_AARCH64_BTI_PLT and DT_AARCH64_PAC_PLT.
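As an illustration, the tags to emit can be derived from the final feature word. The tag names and values below are spelled as in the AArch64 ELF ABI (DT_LOPROC + 1 and DT_LOPROC + 3); `pltDynamicTags` is a hypothetical helper, and emitting both tags for the combined BTIPAC sequence is an assumption of this sketch.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_BTI = 1U << 0;
constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_PAC = 1U << 1;

// Processor-specific dynamic tags from the AArch64 ELF ABI.
constexpr uint64_t DT_AARCH64_BTI_PLT = 0x70000001;
constexpr uint64_t DT_AARCH64_PAC_PLT = 0x70000003;

// Returns the dynamic tags that advertise the PLT sequence in use.
std::vector<uint64_t> pltDynamicTags(uint32_t features) {
  std::vector<uint64_t> tags;
  if (features & GNU_PROPERTY_AARCH64_FEATURE_1_BTI)
    tags.push_back(DT_AARCH64_BTI_PLT);
  if (features & GNU_PROPERTY_AARCH64_FEATURE_1_PAC)
    tags.push_back(DT_AARCH64_PAC_PLT);
  return tags;
}
```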
Implementation Notes:
Depends on llvm revisions for the properties, tags, llvm-readobj and llvm-objdump support D62595 D62596 D62598
Depends on lld revision D59780 for the .note.gnu.property implementation. I can remove the Intel CET code if it makes this easier to review?
For the changes to .note.gnu.property:
- There were some test cases in ld.bfd where a single relocatable object file could have more than one FEATURE_1_AND property in the .note.gnu.property section. The result for the relocatable object file was to set the bit for the object if it was set in any of the FEATURE_1_AND properties. I've altered LLD to handle multiple FEATURE_1_AND properties in the same file and added some test cases for that.
- There may be some merit in not merging the FEATURE_1_AND output into AndFeatures; instead we'd have separate X86Features and AArch64Features. This would make some of the if (Config->EMachine) checks go away but would mean more changes to the note parsing code.
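The handling of several FEATURE_1_AND properties in one file can be sketched as an OR within the file (the AND then happens across files). This assumes the note has already been decoded into (type, value) pairs; `Property` and `featuresForObject` are hypothetical names and the real note parsing code is not shown.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_AND = 0xc0000000;

// Hypothetical decoded entry from a .note.gnu.property section.
struct Property {
  uint32_t type;
  uint32_t value;
};

// A bit is set for the object if any FEATURE_1_AND property in it sets it.
uint32_t featuresForObject(const std::vector<Property> &properties) {
  uint32_t features = 0;
  for (const Property &p : properties)
    if (p.type == GNU_PROPERTY_AARCH64_FEATURE_1_AND)
      features |= p.value;
  return features;
}
```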
For the command line options:
- They have been implemented to match the ld.bfd options ([PATCH, BFD, LD, AArch64, 0/4] Add support for AArch64 BTI and PAC in the linker). Note that the command line options changed over the patch review cycle.
- The --force-bti option is intended for projects that may end up with assembler files without .note.gnu.property sections. A warning was chosen rather than an error to ease the initial porting effort for a project.
- There is scope to implement a --require-bti option like --require-cet that gives an error rather than warning.
For the PLT sequences:
- ld.bfd does not put "bti c" on the plt[N] entries when --pie is used. This is because ld.bfd can guarantee that the PLT address won't escape. However, there is one small case in LLD, where an ifunc has its address taken using a non-got reference, in which the PLT address can escape. There is potential for optimisation there, but at the moment we have to choose the PLT size early so it isn't easy to apply.
- The properties, features and dynamic tags are documented in: https://developer.arm.com/docs/ihi0056/latest/elf-for-the-arm-64-bit-architecture-aarch64-abi-2019q1-documentation
- The BTI and AUTIA instruction encoding is documented in https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf
- There is some documentation available on PAC in the general case https://events.static.linuxfound.org/sites/events/files/slides/slides_23.pdf
- BTI and PAC have some support in QEMU 4.0: https://www.qemu.org/2019/04/24/qemu-4-0-0/
They are used only twice. We inline all other instructions. Maybe we should inline them and remove these variables?