Page MenuHomePhabricator

Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0
ClosedPublic

Authored by MaskRay on Jan 20 2020, 4:42 PM.

Details

Summary

Similar to the function attribute prefix (prefix data),
"patchable-function-prefix" inserts data (M NOPs) before the function
entry label.

-fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry)
will look like:

  .type	foo,@function
.Ltmp0:               # @foo
  nop
foo:
.Lfunc_begin0:
  # optional `bti c` (AArch64 Branch Target Identification) or
  # `endbr64` (Intel Indirect Branch Tracking)
  nop

  .section  __patchable_function_entries,"awo",@progbits,get,unique,0
  .p2align  3
  .quad .Ltmp0

-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable
placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):

(a)         (b)

func:       func:
.Ltmp0:     bti c
  bti c     .Ltmp0:
  nop       nop

(a) needs no additional code. If the consensus is to go for (b), we will
need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .

Diff Detail

Unit TestsFailed

TimeTest
60 msClang.Driver::Unknown Unit Message ("")
Script: -- : 'RUN: at line 1'; env CC_PRINT_OPTIONS=1 CC_PRINT_OPTIONS_FILE=/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/tools/clang/test/Driver/Output/cc-print-options.c.tmp.log /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/clang -no-canonical-prefixes -S -o /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/tools/clang/test/Driver/Output/cc-print-options.c.tmp.s /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/clang/test/Driver/cc-print-options.c

Event Timeline

MaskRay created this revision.Jan 20 2020, 4:42 PM

-fpatchable-function-entry=N,M (where M>0) + -mbranch-protection=bti is quite clear (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html).

For -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch, if you have preference for placement (a) or (b), please make a comment:) and/or reply at https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html
If you are not subscribed to the gcc-patches mailing list (like me), you will have to download the raw email file Other format: [Raw text] and reply...

(b) will require separate patches.

Unit tests: pass. 62041 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

-fpatchable-function-entry=N,M (where M>0) + -mbranch-protection=bti is quite clear (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html).

For -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch, if you have preference for placement (a) or (b), please make a comment:) and/or reply at https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html
If you are not subscribed to the gcc-patches mailing list (like me), you will have to download the raw email file Other format: [Raw text] and reply...

(b) will require separate patches.

FWIW it looks like GCC have gone with (b) https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01348.html , FWIW my sentiments tend towards the argument given in https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01322.html for option b.

nickdesaulniers marked an inline comment as done.Jan 21 2020, 10:18 AM

Thanks for the patch!

llvm/lib/IR/Verifier.cpp
1860

Why do we make a copy of the StringRef? (Sorry I didn't notice in code review of the below test case previously). I assume this has something to do with avoiding "consuming" an attribute?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1301

I think it would be better to use TargetInstrInfo::getNoop() via MF->getSubtarget().getInstrInfo()->getNoop(Noop);. Then you could make this a non-virtual function call in the base AsmPrinter class. In fact, emitting nops is useful and could be a method on the base AsmPrinter, and that could be reused between AsmPrinter::EmitFunctionHeader() and AsmPrinter::EmitFunctionBody().

llvm/test/CodeGen/AArch64/patchable-function-entry-bti.ll
56

Ditto about .Lfunc_end2 placement (see below comment).

llvm/test/CodeGen/AArch64/patchable-function-entry.ll
83

Can you please include where .Lfunc_end5 should be in a CHECK, so that reviewers can verify that .size is set correctly?

84

What is this NOFSECT checking? That there's no entry for f1? Why?

92
95

Please add a CHECK for prefix: after the nop CHECK.

MaskRay updated this revision to Diff 239386.Jan 21 2020, 11:24 AM
MaskRay marked 9 inline comments as done.

Address review comments.
Fix a bug when "patchable-function-entry" is 0 while "patchable-function-prefix" > 0

llvm/lib/IR/Verifier.cpp
1860

My fault. StringRef::getAsInteger does not consume the prefix in case of a parse error.

Updated.

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1301

TIL getNoop. I was looking for such an operation bug did not find one. This is much more elegant! Added AsmPrinter::emitNops.

Unfortunately we cannot retire LowerPATCHABLE_FUNCTION_ENTER.

AsmPrinter (assembly printer/object file emitter) does the following:

  1. Emit data before the function entry
  2. Emit the function entry label and the label for __patchable_function_entries
  3. Emit the function body (which may have optional bti c or endbr64)

Due to bti c (AArch64) and endbr64 (x86), we cannot make the handling of the function attribute "patchable-function-entry" generic.
(-fpatchable-function-entry=, -mbranch-protection= (AArch64 BTI) and -fcf-protection= (Intel Indirect Branch Tracking) are implemented as passes which can prepend instructions to the function body.)

llvm/test/CodeGen/AArch64/patchable-function-entry.ll
84

Added a comment above.

+;; Without -function-sections, f2 is in the same text section as f1.
+;; They share the __patchable_function_entries section.
+;; With -function-sections, f1 and f2 are in different text sections.
+;; Use separate __patchable_function_entries.
MaskRay updated this revision to Diff 239389.Jan 21 2020, 11:29 AM

Improve tests a bit

Unit tests: fail. 62080 tests passed, 1 failed and 784 were skipped.

failed: Clang.Driver/cc-print-options.c

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: fail. 62080 tests passed, 1 failed and 784 were skipped.

failed: Clang.Driver/cc-print-options.c

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

llvm/include/llvm/CodeGen/AsmPrinter.h
456

unsigned N

Can we mark this method const?

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
2806

guard or assert that N > 0

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1301

Looks like we can still modify AsmPrinter::EmitFunctionBody() and ARMAsmPrinter::EmitSled() to now make use of AsmPrinter::emitNops() in this change.

MaskRay updated this revision to Diff 239430.Jan 21 2020, 1:56 PM
MaskRay marked an inline comment as done.

emitNops(int N) -> emitNops(unsigned N);

llvm/include/llvm/CodeGen/AsmPrinter.h
456

We can't. void AsmPrinter::EmitToStreamer(MCStreamer &S, const MCInst &Inst) { is not const.

Unit tests: pass. 62081 tests passed, 0 failed and 784 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1301

bump

MaskRay updated this revision to Diff 239626.Jan 22 2020, 9:32 AM

Use emitNops() to replace two NOP emitting sites.

MaskRay marked 4 inline comments as done.Jan 22 2020, 9:33 AM
MaskRay added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
2806

Should N==0 be allowed? It does no harm.

Unit tests: pass. 62111 tests passed, 0 failed and 808 were skipped.

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

I think these "patchable function entry" sequences are effectively binary interface, i.e. ABI, as other binary processing tools that use these patchable function entries will make assumptions about the structure of these sequences.
Therefore, I think it's important that clang and gcc do produce the same sequences, otherwise we're causing de-facto ABI incompatibility.

MaskRay added a comment.EditedJan 22 2020, 11:22 AM

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

I think these "patchable function entry" sequences are effectively binary interface, i.e. ABI, as other binary processing tools that use these patchable function entries will make assumptions about the structure of these sequences.
Therefore, I think it's important that clang and gcc do produce the same sequences, otherwise we're causing de-facto ABI incompatibility.

Not considering AArch64 BTI, we don't have a compatibility issue. I shall also mention that nobody on their side considers endbr64 yet.

For AArch64 BTI users, due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 , the Linux kernel may adopt a better configure-time test than a simple if $(cc-option,-fpatchable-function-entry=2). It should detect behavior differences across different versions of GCC and Clang. External users should be aware of these differences. I think we have some implementation leeway here.

Eventually the two compilers should be compatible, but I don't want to say that we should give up and blindly follow their decision;-) I wish people can evaluate the pros and cons not only from implementation complexity and status quo. I also think we should give x86 folks a chance to chime in. GCC 10 is at a very late stage, so they don't want to change the .Lpatch placement.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

I've left a comment in D72222 but thought it would be worth drawing attention to it here as D72222 is quite old and easily missed, CI has found errors in 3 linux files for an allmodconfig build on a clang 10. The details are in the comment in D72222, which isn't at fault itself but it is the first that enables -fpatchable-functions to be selected by the kernel make system.

nickdesaulniers accepted this revision.Jan 23 2020, 9:51 AM
nickdesaulniers added a subscriber: hans.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

I've left a comment in D72222 but thought it would be worth drawing attention to it here as D72222 is quite old and easily missed, CI has found errors in 3 linux files for an allmodconfig build on a clang 10. The details are in the comment in D72222, which isn't at fault itself but it is the first that enables -fpatchable-functions to be selected by the kernel make system.

linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/?h=next-20200123 (ie, kernel staging tree) test for support in arch/arm64/Kconfig via seeing if the compiler recognizes -fpatchable-function-entry=2 (if so, tries to use it).

Failing asserts should be fixed first before this series, then this series should be rebased on top.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

Did it land before the branch, or was it cherry-picked over by @hans ? I worry that that might make feature detection in the Linux kernel for this feature difficult. We (kernel, @mrutland) may need to at least keep in mind that there may be separate checks for cc-option for -fpatchable-function-entry=M vs -fpatchable-function-entry=M,N.

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

I think these "patchable function entry" sequences are effectively binary interface, i.e. ABI, as other binary processing tools that use these patchable function entries will make assumptions about the structure of these sequences.
Therefore, I think it's important that clang and gcc do produce the same sequences, otherwise we're causing de-facto ABI incompatibility.

I agree 100%; if there's no rush to get this into clang-10, then I'd rather take the series as is and iterate on fixes. Or maybe a final patch on top in this series as a final fixup to patchable function entry and BTI interwork would be satisfactory before landing the whole set? Either way, I consider this patch reviewed, so I'm signing off on it, but I would appreciate it @MaskRay if you did *NOT* land the series without @kristof.beyls ' explicit sign off and response to this suggestion (and with it rebased on fixes for the failed assertions @peter.smith cites above). I don't feel strongly about it though, so I won't be offended if anyone strongly disagrees.

This revision is now accepted and ready to land.Jan 23 2020, 9:51 AM

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

I've left a comment in D72222 but thought it would be worth drawing attention to it here as D72222 is quite old and easily missed, CI has found errors in 3 linux files for an allmodconfig build on a clang 10. The details are in the comment in D72222, which isn't at fault itself but it is the first that enables -fpatchable-functions to be selected by the kernel make system.

linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/?h=next-20200123 (ie, kernel staging tree) test for support in arch/arm64/Kconfig via seeing if the compiler recognizes -fpatchable-function-entry=2 (if so, tries to use it).

Failing asserts should be fixed first before this series, then this series should be rebased on top.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

Did it land before the branch, or was it cherry-picked over by @hans ? I worry that that might make feature detection in the Linux kernel for this feature difficult. We (kernel, @mrutland) may need to at least keep in mind that there may be separate checks for cc-option for -fpatchable-function-entry=M vs -fpatchable-function-entry=M,N.

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

I think these "patchable function entry" sequences are effectively binary interface, i.e. ABI, as other binary processing tools that use these patchable function entries will make assumptions about the structure of these sequences.
Therefore, I think it's important that clang and gcc do produce the same sequences, otherwise we're causing de-facto ABI incompatibility.

I agree 100%; if there's no rush to get this into clang-10, then I'd rather take the series as is and iterate on fixes. Or maybe a final patch on top in this series as a final fixup to patchable function entry and BTI interwork would be satisfactory before landing the whole set? Either way, I consider this patch reviewed, so I'm signing off on it, but I would appreciate it @MaskRay if you did *NOT* land the series without @kristof.beyls ' explicit sign off and response to this suggestion (and with it rebased on fixes for the failed assertions @peter.smith cites above). I don't feel strongly about it though, so I won't be offended if anyone strongly disagrees.

The driver change -fpatchable-function-entry=N,0 is in release/10.x . D73070 (this), D73071 and D73072 are not.

I wish we don't make -fpatchable-function-entry= in clang 10.0.0, then there will no no problem at all.

(For this patch series, I don't see how they can make the situation worse. BTI c's interaction with -mbranch-protection=bti is an existing problem in GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424)
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01334.html ) Current release/10.x will do: nop; nop; bti c , which is considered incorrect. I fixed this in rG9a24488cb67a90f889529987275c5e411ce01dda (not in release/10.x)

hans added a comment.Jan 23 2020, 11:56 AM

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

I've left a comment in D72222 but thought it would be worth drawing attention to it here as D72222 is quite old and easily missed, CI has found errors in 3 linux files for an allmodconfig build on a clang 10. The details are in the comment in D72222, which isn't at fault itself but it is the first that enables -fpatchable-functions to be selected by the kernel make system.

linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/?h=next-20200123 (ie, kernel staging tree) test for support in arch/arm64/Kconfig via seeing if the compiler recognizes -fpatchable-function-entry=2 (if so, tries to use it).

Failing asserts should be fixed first before this series, then this series should be rebased on top.

-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.

Did it land before the branch, or was it cherry-picked over by @hans ? I worry that that might make feature detection in the Linux kernel for this feature difficult. We (kernel, @mrutland) may need to at least keep in mind that there may be separate checks for cc-option for -fpatchable-function-entry=M vs -fpatchable-function-entry=M,N.

I don't mind landing this without aarch64 bti interplay matching GCC precisely; I think it can be done in a follow up. I just request we don't try to land this in the clang-10 release unless that is also resolved. Otherwise clang-11.

I think these "patchable function entry" sequences are effectively binary interface, i.e. ABI, as other binary processing tools that use these patchable function entries will make assumptions about the structure of these sequences.
Therefore, I think it's important that clang and gcc do produce the same sequences, otherwise we're causing de-facto ABI incompatibility.

I agree 100%; if there's no rush to get this into clang-10, then I'd rather take the series as is and iterate on fixes. Or maybe a final patch on top in this series as a final fixup to patchable function entry and BTI interwork would be satisfactory before landing the whole set? Either way, I consider this patch reviewed, so I'm signing off on it, but I would appreciate it @MaskRay if you did *NOT* land the series without @kristof.beyls ' explicit sign off and response to this suggestion (and with it rebased on fixes for the failed assertions @peter.smith cites above). I don't feel strongly about it though, so I won't be offended if anyone strongly disagrees.

The driver change -fpatchable-function-entry=N,0 is in release/10.x . D73070 (this), D73071 and D73072 are not.

I wish we don't make -fpatchable-function-entry= in clang 10.0.0, then there will no no problem at all.

(For this patch series, I don't see how they can make the situation worse. BTI c's interaction with -mbranch-protection=bti is an existing problem in GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424)
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01334.html ) Current release/10.x will do: nop; nop; bti c , which is considered incorrect. I fixed this in rG9a24488cb67a90f889529987275c5e411ce01dda (not in release/10.x)

I haven't followed this discussion, but it sounds like there are changes that should be pushed to 10.x. Can you please summarize which ones?

This revision was automatically updated to reflect the committed changes.