This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] PAC/BTI code generation for LLVM generated functions
ClosedPublic

Authored by chill on Aug 10 2020, 6:59 AM.

Details

Summary

PAC/BTI-related codegen in the AArch64 backend is controlled by a set
of LLVM IR function attributes, added to the function by Clang, based
on command-line options and GCC-style function attributes. However,
functions, generated in the LLVM middle end (for example,
asan.module.ctor or __llvm_gcov_write_out) do not get any attributes
and the backend incorrectly does not do any PAC/BTI code generation.

This patch record the default state of PAC/BTI codegen in a set of
LLVM IR module-level attributes, based on command-line options:

  • "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET.
  • "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions, which spill LR.
  • "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value mean use A-key.

This set of attributes are always added for AArch64 targets (as
opposed, for example, to interpreting a missing attribute as having a
value 0) in order to be able to check for conflicts when combining
module attributed during LTO.

Module-level attributes are overridden by function level attributes.
All the decision making about whether to not to generate PAC and/or
BTI code is factored out into AArch64FunctionInfo, there shouldn't be
any places left, other than AArch64FunctionInfo, which directly
examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which
is/will-be handled by a separate patch.

Diff Detail

Event Timeline

chill created this revision.Aug 10 2020, 6:59 AM
chill requested review of this revision.Aug 10 2020, 6:59 AM
chill added a reviewer: tellenbach.

Module-level attributes are overridden by function level attributes.

This sounds like the way to go. It looks like that's what llvm/test/CodeGen/AArch64/pacbti-llvm-generated-funcs-2.ll is doing. Is there a test for no module attributes + function level attribute enabling pac/bti that tests that just that function gets the proper handling?

Hi @chill, thanks for this patch, looks like the correct way to handle this.

tellenbach added inline comments.Aug 11 2020, 6:25 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
5846

Is this a typo?

5856

Same.

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
149

The key type could in fact also be made an enum just like you did for the type of return address signing but no strong opinion here.

llvm/test/CodeGen/AArch64/pacbti-llvm-generated-funcs-1.ll
26

Do you need all of these attributes for the test case?

chill updated this revision to Diff 285046.Aug 12 2020, 5:10 AM
chill marked 4 inline comments as done.Aug 12 2020, 5:16 AM

Is there a test for no module attributes + function level attribute enabling pac/bti that tests that just that function gets the proper handling?

Yes, virtually every PAC/BTI test so far. I also added in the last update tests when module flags are created and when they aren't.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
5846

D'oh, bloody copy/paste. Good catch, thanks!

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
149

If I would be bothered to change it, I'd rather go in the opposite direction, replace the SignReturnAddress by two booleans. That'll remove the need for a typedef and make the code less verbose.

chill marked 2 inline comments as done.Aug 19 2020, 3:49 AM

Ping?

Does this conflict with D80791? Other than that it lgtm.

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
149

Yep, makes sense. Anyway, the current version is fine for me.

chill added a comment.Aug 24 2020, 9:19 AM

Does this conflict with D80791?

It does, we're using different module attributes.

Does this conflict with D80791?

It does, we're using different module attributes.

Okay, thanks. Please feel free to ping me again as soon as we know which version we want to use.

danielkiss added inline comments.Aug 29 2020, 5:38 AM
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
151–153

A way is needed to turn it to false even, when to the rule below turns it on. see D81251.
A setter function would be fine I think. If you agree I can add it when rebasing the D81251.

chill planned changes to this revision.Sep 1 2020, 4:47 AM

In addition to the disabling of BTI in D81251, there's an issue that we explicitly disable BTI via branch-protection=none, the attribute would just be missing and we'll pick up the module attributes, which is not what we want.

IMHO the right way to handle this is to add a boolean value to branch-target-enforcement: if the attribute is absent default to module flags, if it's present - do as the attribute value says.

I support the idea, once I proposed back then ( see D75181 )

chill updated this revision to Diff 289420.Sep 2 2020, 6:03 AM

Emit function-level LLVM IR attributes only when there's a GCC-style function attribute and be explicit with enabling/disabling.
Fixed an error where "none" in deferred to module attributes, instead of overriding them.

This revision is now accepted and ready to land.Sep 16 2020, 3:51 AM

The only question I have about module level attributes is what happens for LTO (which we use for Linux kernels used in Android)? Is there any conflict linking together objects from two different translation units with two different flags? Maybe this should fail hard; but it would be cool to alert developers at compile time, rather than have an awful bug to debug at runtime. Just food for thought; I don't know enough about LTO to know if it already has machinery to handle module level conflicts or not. Maybe @tejohnson or @pcc knows?

The only question I have about module level attributes is what happens for LTO (which we use for Linux kernels used in Android)? Is there any conflict linking together objects from two different translation units with two different flags? Maybe this should fail hard; but it would be cool to alert developers at compile time, rather than have an awful bug to debug at runtime. Just food for thought; I don't know enough about LTO to know if it already has machinery to handle module level conflicts or not. Maybe @tejohnson or @pcc knows?

I haven't looked at this patch or what it does in any detail, but any module level flag needs to specify its behavior when merging modules (for *LTO). The addModuleFlag calls added in this patch specify Override, which means:

/// Uses the specified value, regardless of the behavior or value of the
/// other module. If both modules specify **Override**, but the values
/// differ, an error will be emitted.
Override = 4,

So if it is specified in one module but not the other, the specified module flag will apply to the merged module. If it is specified in both modules (assuming both have the Override type, which it seems like the would), it will be an error if the values differ.

chill planned changes to this revision.Sep 17 2020, 1:30 AM

Thank you for your comments. I'd like to have the same behaviour with or without LTO, which is apparently not the case with this patch.

chill added a comment.Sep 18 2020, 3:25 AM

It looks like the only value that makes sense is Error - any other policy (existing or not) would potentially lead to meaningfully different code generated with or without LTO.

I think error is fine because realistically linking modules with different BTI settings will result probably a binary that crash runtime anyway if BTI is on.
otherwise turning BTI on just of a subset of object is useless.

Thanks @tejohnson for the feedback. Error sounds fine. The only other question I have is that a common pattern in the Linux kernel for progressive support of the ARM ISA extensions is to build isolated translation units that either use those features unconditionally but wont call into those translation units at runtime if the extensions are not present, or the kernel will live patch itself (ie. "alternatives" patching). Maybe @mrutland has thoughts; with BTI be runtime-patchable or otherwise enable-able at runtime. ie. one kernel image that runs on hardware both with and without BTI ISA extensions?

If the plan is to have isolated TUs built with BTI, then the "Error" policy sounds like this would be incompatible; but I don't think the point of BTI is to have some small portion of the code built with it. IIRC, doesn't BTI uses the no-op encoding space? So I guess one image could run on hardware regardless of actual BTI hardware support. So it wouldn't be patched in at runtime either. In that case, via kernel config I guess we'd build everything with BTI, and have build time errors for translation units that accidentally dropped KBUILD_CFLAGS, which happens a lot, but is fixable in kernel's build system sources.

The final piece of food for thought is ensuring that we can LTO C code built with BTI can link with out of line asm; Clang usually needs the assembler directives in place in the source being assembled to enable architectural extensions. If "Error" policy is used, can we still LTO C and asm sources? I assume we "should" be able to, but it might save trouble down the road to test that locally. Particularly, having C code that calls asm routines and the visibility of those routines during LTO has given us trouble in the past, when enabling CFI (control flow integrity; which I believe has some overlap with BTI but some subtle differences as well; I wouldn't call them perfect substitutes).

chill updated this revision to Diff 294029.Sep 24 2020, 5:38 AM
chill edited the summary of this revision. (Show Details)

LTO related fixes - user Error combining behaviour, always emit the attributes with values 0 or 1
so they are present and their values can be checked for being the same during.

This revision is now accepted and ready to land.Sep 24 2020, 5:38 AM
chill marked an inline comment as done.Sep 24 2020, 5:59 AM

Thanks @tejohnson for the feedback. Error sounds fine. The only other question I have is that a common pattern in the Linux kernel for progressive support of the ARM ISA extensions is to build isolated translation units that either use those features unconditionally but wont call into those translation units at runtime if the extensions are not present, or the kernel will live patch itself (ie. "alternatives" patching). Maybe @mrutland has thoughts; with BTI be runtime-patchable or otherwise enable-able at runtime. ie. one kernel image that runs on hardware both with and without BTI ISA extensions?

If the plan is to have isolated TUs built with BTI, then the "Error" policy sounds like this would be incompatible; but I don't think the point of BTI is to have some small portion of the code built with it. IIRC, doesn't BTI uses the no-op encoding space? So I guess one image could run on hardware regardless of actual BTI hardware support. So it wouldn't be patched in at runtime either. In that case, via kernel config I guess we'd build everything with BTI, and have build time errors for translation units that accidentally dropped KBUILD_CFLAGS, which happens a lot, but is fixable in kernel's build system sources.

The final piece of food for thought is ensuring that we can LTO C code built with BTI can link with out of line asm; Clang usually needs the assembler directives in place in the source being assembled to enable architectural extensions. If "Error" policy is used, can we still LTO C and asm sources? I assume we "should" be able to, but it might save trouble down the road to test that locally. Particularly, having C code that calls asm routines and the visibility of those routines during LTO has given us trouble in the past, when enabling CFI (control flow integrity; which I believe has some overlap with BTI but some subtle differences as well; I wouldn't call them perfect substitutes).

Yes, PAC/BTI can be done entirely in NOP-space, controlled by -march=... - if the architecture revision does not have an instruction, a NOP-space equivalent is used.
For the C/ASM link I couldn't see any issues with my basic testing with lld.

chill requested review of this revision.Sep 24 2020, 6:00 AM
danielkiss accepted this revision.Sep 24 2020, 6:32 AM

LGTM

@nickdesaulniers

one kernel image that runs on hardware both with and without BTI ISA extensions?

In case of BTI one image could run on both HW due to BTI is in the no-op space.

In that case, via kernel config I guess we'd build everything with BTI, and have build time errors for translation units that accidentally dropped KBUILD_CFLAGS, which happens a lot, but is fixable in kernel's build system sources.

actually when compiling with BTI the "-Wl,-fatal-warnings, -Wl,-z,force-bti" could be added to the linker flags to catch these things, I will send a patch to the kernel mailing list.

This revision is now accepted and ready to land.Sep 24 2020, 6:32 AM
nickdesaulniers accepted this revision.Sep 24 2020, 12:16 PM
nickdesaulniers added a subscriber: momchil.velikov.

Thank you @momchil.velikov for considering the case for LTO, and testing C+asm linkage.

actually when compiling with BTI the "-Wl,-fatal-warnings, -Wl,-z,force-bti" could be added to the linker flags to catch these things, I will send a patch to the kernel mailing list.

SGTM, I'd be happy to help test it.

clang/lib/CodeGen/CodeGenModule.cpp
593–594

I was wondering if Arch.isAArch64() would be more concise, but curiously it does not contain aarch64_32 ArchType.

Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2020, 4:01 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
pcc added a comment.Mar 15 2022, 11:37 AM

It looks like the only value that makes sense is Error - any other policy (existing or not) would potentially lead to meaningfully different code generated with or without LTO.

It should have been (the currently non-existent) "Min" -- with "Error" we get different behavior at link time with and without LTO, as a mixture of BTI and non-BTI objects ought to disable BTI in the final linked object, instead of producing an error. The code generation should be controlled by the function attribute so you ought to get the same code generated with and without LTO.

Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2022, 11:37 AM
clang/test/CodeGen/aarch64-sign-return-address.c