In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Unit tests: pass. 61253 tests passed, 0 failed and 736 were skipped.
clang-tidy: fail. Please fix clang-tidy findings.
clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
clang/include/clang/Basic/DiagnosticDriverKinds.td | ||
---|---|---|
77 | We already have err_drv_argument_not_allowed_with, which can be used instead of adding this. | |
clang/include/clang/Driver/Options.td | ||
1706 | Names of options with an equals sign end in _EQ. | |
clang/test/Driver/fpatchable-function-entry.c | ||
12 | I think we should have a more descriptive message here, maybe something like: error: the second parameter of '-fpatchable-function-entry=' must be '0' or omitted |
Unit tests: pass. 61312 tests passed, 0 failed and 736 were skipped.
clang-tidy: fail. Please fix clang-tidy findings.
clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
@ostannard @nickdesaulniers @peter.smith Unfortunately we will have to get buy-in from GCC before committing to do an incompatible design. A promise that they will implement a new approach to overcome the disadvantages and an agreement on the new option name is probably sufficient for us to move forward.
A list of deficiency:
- sh_addralign={4,8} (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00271.html)
- --gc-sections/COMDAT does not work
- Using a PC-relative relocation type instead of a symbolic relocation type (R_AARCH64_ABS64, R_X86_64_64, R_PPC64_ADDR64) avoids SHF_WRITE, RELRO, and dynamic relocations (24 bytes*number of functions, not negligible).
- Incompatible with BTI c. nop;nop;nop;bti c;nop nop probably works. I hope my CodeGen change does not make such interop harder.
- On i386/x86-64 they should use multi-byte NOPs.
Other issues can be fixed in a backward-compatible manner, but the relocation type issue cannot. If you know the Linux kernel contacts, please make them aware of https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html and the design we'd like to do in D72215.
Some targets don't have a PC relative relocation of pointer size, e.g. R_MIPS_PC64 (MIPS only has R_MIPS_PC32, which is a GNU extension, sigh https://sourceware.org/git/?p=binutils-gdb.git;a=commit;f=include/elf/mips.h;h=092dcd755dcdcf664b25a7011fd15957f124c29f). If we really need a PC-relative variant, we can use a new syntax -fpatchable-function-entry=2,0,pcrel with the existing option name. The new syntax should not add too much burden on Clang's side if GCC/Linux developers do want to save dynamic relocations.
Let me commit the patch series with symbolic relocation types first.
Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .
I reproduced with
make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig
You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)
If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.
This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)
@peter.smith @nickdesaulniers What do you think?
If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.
This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)@peter.smith @nickdesaulniers What do you think?
Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.
If you have cycles to debug the assertions quickly, maybe it's a simple fix that doesn't require a revert?
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)
Maybe, but I would prefer to avoid such autoconf like detection. cc-option in the kernel is relatively simple and hardened at this point.
@peter.smith @nickdesaulniers What do you think?
Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.
I should have been clearer, apologies; we're not blocked by the assertion failures, the CI won't halt and should pick up other build failures.
But if trunk is broken, shouldn't it be reverted there first, and then we can merge the revert to 10.x (and then trunk can be fixed eventually)?
I will need to read your other qemu command lines to learn how to debug.
The problem may be that the Linux kernel expects a different behavior from Clang's. We don't have to be hurry. If we decide to revert it in release/10.x, we can do that just a few days before 10.0.0 is released. (That was why I separated these patches...) I am more concerned of whether the GCC side will eventually fix the flaws (https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html).
(I really dislike how the feature was developed on the GCC side. Yet another Linux-kernel specific GCC option when there are already 4 existing options for the same feature)
I don't think the commit is to be blamed. The availability of the driver option -fpatchable-function-entry= enables CONFIG_DYNAMIC_FTRACE_WITH_REGS and CONFIG_LIVEPATCH, which were not tested before. There could be other issues in the code path.
% rg fpatchable-function-entry arch/arm64/Kconfig 147: if $(cc-option,-fpatchable-function-entry=2) arch/arm64/Makefile 100: CC_FLAGS_FTRACE := -fpatchable-function-entry=2
If we can explicitly disable the options in the CI, that will be very nice.
It definitely won't be this commit, the assert fail looks like it is from the patchable-function attribute pass. The CI just runs allyesconfig and allmodconfig and tells us if there are any new regressions, it won't report this one again so it is unlikely to be worth masking it out.
Assert failure:
clang-10: /work/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:139: llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>::operator*() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = false, IsConst = false]: Assertion `!NodePtr->isKnownSentinel()' failed. Stack dump: 0. Program arguments: /work/llvm-project/build/buildclang/bin/clang-10 -cc1 -triple aarch64-unknown-linux-gnu -S -disable-free -main-file-name slab_common.c -mrelocation-model static -mthread-model posix -fno-delete-null-pointer-checks -mllvm -warn-stack-size=2048 -mframe-pointer=non-leaf -relaxed-aliasing -mdisable-tail-calls -fmath-errno -fno-rounding-math -masm-verbose -no-integrated-as -mconstructor-aliases -target-cpu generic -target-feature -fp-armv8 -target-feature -crypto -target-feature -neon -target-feature -sha2 -target-feature -aes -target-abi aapcs -mllvm -aarch64-enable-global-merge=false -fallow-half-arguments-and-returns -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -nostdsysteminc -nobuiltininc -resource-dir /work/llvm-project/build/buildclang/lib/clang/10.0.0 -dependency-file mm/.slab_common.o.d -MT mm/slab_common.o -sys-header-deps -isystem /work/clangmaster/lib/clang/10.0.0/include -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -I ./arch/arm64/include -I ./arch/arm64/include/generated -I ./include -I ./arch/arm64/include/uapi -I ./arch/arm64/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -D __KERNEL__ -D CC_USING_PATCHABLE_FUNCTION_ENTRY -D KASAN_SHADOW_SCALE_SHIFT=3 -D CONFIG_AS_LSE=1 -D CONFIG_CC_HAS_K_CONSTRAINT=1 -D KASAN_SHADOW_SCALE_SHIFT=3 -D KBUILD_BASENAME="slab_common" -D KBUILD_MODNAME="slab_common" -fmacro-prefix-map=./= -O2 -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -Werror=unknown-warning-option -Wno-address-of-packed-member -Wno-format-invalid-specifier -Wno-gnu -Wno-tautological-compare -Wno-unused-const-variable -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -std=gnu89 -fno-dwarf-directory-asm -fdebug-compilation-dir /work/linux/linux -ferror-limit 19 -fmessage-length 0 -fpatchable-function-entry=2 -fwrapv -stack-protector 2 -ftrivial-auto-var-init=pattern -fno-builtin -fno-signed-char -fwchar-type=short -fno-signed-wchar -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fno-common -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o /tmp/slab_common-7e60cc.s -x c mm/slab_common.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'mm/slab_common.c'. 4. Running pass 'Implement the 'patchable-function' attribute' on function '@rcu_lock_acquire'
Maybe I'm reading too much into it, but please always have respect for the competition, as a representative of the company and LLVM community you represent, as well as my friend. It should be clear why that's important and why I won't tolerate anything less. If you still disagree, I'm happy to talk more about it privately.
Can you enumerate the 4 existing options? If we do already have an existing tool in our toolbox, it would be good to consider their strengths and limitations before building a new tool.
Apologies if my word felt strong. I meant I felt sad that GCC developed another option in this area, along with -mfentry, -mhotpatch (SystemZ specific), -mnop-mcount (x86, SystemZ, -fno-pie specific), -mrecord-mcount (used to retire scripts/recordmcount.{pl,c}). The GCC implementation has several issues, as I listed in the link I pasted above. (I fixed a GCC bug in this area.) I also feel making the dependent Linux features enabled by default when -fpatchable-function-entry= is recognized by the compiler was not sufficient. It should have more sophisticated check.
The idea of this option is great. The intention was to unify the existing options. However, two major flaws (comdat (C++) / --gc-sections) make it impractical for non-Linux-kernel applications. I noticed we recently add support for -mnop-mcount. I haven't made enough investigation, but I suspect we don't have to do that and can use a general option instead.
@hans To address the CI problem (https://reviews.llvm.org/D72222#1836728) reported by @peter.smith, we just need to delete the driver option in the release/10.x branch. (I need a -E preproccessed file to investigate.)
I think we don't need to hurry. We can simply revert this patch (D72222) a few days before 10.0.0.
clang+lld built at HEAD + Linux at head
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/Release/bin/clang LD=~/llvm/Release/bin/ld.lld O=/tmp/arm64 -j 30 distclean allmodconfig make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/Release/bin/clang LD=~/llvm/Release/bin/ld.lld O=/tmp/arm64 -j 30 all ... OBJCOPY arch/arm64/boot/Image GZIP arch/arm64/boot/Image.gz make[1]: Leaving directory '/tmp/arm64'
@peter.smith The build was smooth. Do I need other options to reproduce?
@hans To address the CI problem (https://reviews.llvm.org/D72222#1836728) reported by @peter.smith, we just need to delete the driver option in the release/10.x branch. (I need a -E preproccessed file to investigate.)
I think we don't need to hurry. We can simply revert this patch (D72222) a few days before 10.0.0.
Waiting until a few days before the release is not a good idea. If we know there is a problem, I'd like it to be fixed on the branch as soon as possible.
I was able to reproduce:
$ git clone --depth 1 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git $ cd linux-next $ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 allyesconfig $ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 mm/kasan/quarantine.o
The exact compiler invocation (via tacking on V=1 to the above make command is:
$ clang -Wp,-MD,mm/kasan/.quarantine.o.d -nostdinc -isystem /android0/llvm-project/llvm/build/lib/clang/11.0.0/include -I./arch/arm64/include -I./arch/arm64/include/generated -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DCC_USING_PATCHABLE_FUNCTION_ENTRY -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/usr/bin/ --gcc-toolchain=/usr -no-integrated-as -Werror=unknown-warning-option -mgeneral-regs-only -DCONFIG_CC_HAS_K_CONSTRAINT=1 -fno-asynchronous-unwind-tables -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-address-of-packed-member -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -ftrivial-auto-var-init=pattern -fpatchable-function-entry=2 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=./= -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -fno-builtin -DKBUILD_MODFILE='"mm/kasan/quarantine"' -DKBUILD_BASENAME='"quarantine"' -DKBUILD_MODNAME='"quarantine"' -c -o mm/kasan/quarantine.o mm/kasan/quarantine.c
creduce spits out:
// $ clang -O2 quarantine.i __attribute__((patchable_function_entry(0))) a() { b(({ c: &&c; })); }
This is a useful summary of similar tools, thank you for the survey. Indeed, many of these seem to be used to support tracing in the Linux kernel.
Thanks. Looks like the following is sufficient to reproduce. (I incorrectly thought an empty entry MachineBasicBlock was impossible.) I take the opportunity to add test coverage (testing debug-location: D73301)
define void @foo() #0 { entry: unreachable } attributes #0 = { "patchable-function-entry"="2" }
A -DLLVM_ENABLE_ASSERTIONS=on build is required to trigger the assertion failure. My make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/ReleaseAssert/bin/clang LD=~/llvm/ReleaseAssert/bin/ld.lld O=/tmp/arm64 allmodconfig all -j 30 build succeeded. Now I will be in favor of pushing the bugfix to release/10.x .
Not clear about -fpatchable-function-entry=N,M where M>0 (D73070, D73071, D73072). For completeness, I'd like them to be included in release/10.x so we will not have a clang 10 that does not work with M>0.
For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.
note that the gcc fix will be in the gcc-10 release (it got accepted),
and soon backported to gcc-9 (the first release with bti support).
(only aarch64 bti was fixed, that's what affected linux)
the gcc fix keeps bti outside of the patch area in the M=0 case,
which makes the current linux *runtime* patching code simpler.
the choice was made because this is important if there is a mix
of bti and non-bti functions which can happen if
- compiler detects not_called_directly(f) and omits bti or
- individual attributes on functions turn on/off bti
with the llvm patch the *runtime* code would have to detect
which patch area has bti and which not to skip over. (fine for
future M>0 usage, but current M=0 case should not be complex,
i.e. linux will have to disable bti+ftrace on llvm if this patch is
accepted until it has the complex runtime code. and the patch
may make static detection of the compiler behaviour harder as
it fails differently than old gcc-9).
i did not look deeply into llvm but i'd expect some hook for
printing the function label, which can print the bti before
the patch area based on some global flag (an ugly hack, but
allows consistent behaviour across the compilers)
i hope this makes sense.
For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.
Speaking as the person implementing the Linux side of things, I think that will be a problem. Kernel-side we want consistency across compilers.
For Linux we were planning to do a very simple build-time test to rule out compilers with the broken behaviour. We would expect functional compilers to have a consistent layout for a given combination of options, and we would consider this layout to be ABI.
The compile time check would compile a trivial test file, e.g.
void function(void) { }
... with flags:
-fpatchable-function-entry=2 -mbranch-protection=bti
... and if the function entry point is a NOP, mark that compiler as broken. In practice, this will mean that the kernel build system will disable BTI support for those broken compilers.
Trying to support different layout approaches within Linux will add a fair amount of unnecessary complexity which we cannot contain in one place, and makes it more likely that support for either case will be broken in future.
For the M=0 case, I would prefer to avoid runtime detection unless really necessary.
For the M!=0 case, which I believe Linux will need to use in the near future, I realise that a runtime check will be necessary to detect the absence of a BTI. Regardless, consistent behaviour across compilers will make this significantly simpler and less error-prone.
Thanks,
Mark.
When -mbranch-protection=bti is used, due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 (existing GCC releases have the issue), Linux/arch/arm64 has to make more checks to detect the combination.
I changed nop; nop; bti c to bti c; nop; nop with 2 commits not included in LLVM release/10.x (a72d15e37c5e066f597f13a8ba60aff214ac992d and 9a24488cb67a90f889529987275c5e411ce01dda). They make Clang's M!=0 placement (.Lpatch; nop; func: bti c; nop) consistent with GCC.
Edit: cherry-picked to release/10.x
For the M=0 case, Clang does (func: .Lpatch; bti c; nop; nop), which is inconsistent with GCC (func: bti c; .Lpatch: nop; nop). I'll be happy to try updating the placement of .Lpatch if future M>0 usage does not obsolete M=0 usage (a proof that the proposed GCC layout is indeed superior, not just for simplicity for a not-so-common configuration), otherwise I would feel the clang side is just making changes to appease GCC/Linux compatibility, which would make me sad. (Again: this is our ~5th time adding an option specific to Linux kernel, with a good intention to make it general, but in reality it doesn't https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html) I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.
As covered in D72222#1838988 the plan is to detect broken compilers at build-time and reject the combination of options. The Linux patching code will not handle varied behaviours at runtime.
I changed nop; nop; bti c to bti c; nop; nop with 2 commits not included in LLVM release/10.x (a72d15e37c5e066f597f13a8ba60aff214ac992d and 9a24488cb67a90f889529987275c5e411ce01dda). They make Clang's M!=0 placement (.Lpatch; nop; func: bti c; nop) consistent with GCC.
Edit: cherry-picked to release/10.x
That's great; thanks!
For the M=0 case, Clang does (func: .Lpatch; bti c; nop; nop), which is inconsistent with GCC (func: bti c; .Lpatch: nop; nop). I'll be happy to try updating the placement of .Lpatch if future M>0 usage does not obsolete M=0 usage (a proof that the proposed GCC layout is indeed superior, not just for simplicity for a not-so-common configuration), otherwise I would feel the clang side is just making changes to appease GCC/Linux compatibility, which would make me sad.
I do not expect future M>0 usage to obsolete M=0 usage, and these will be used by distinct configurations of Linux with distinct code paths. I expect that we will need M>0 in future to handle very large kernel images and/or live-patching where we may need to place PLTs or other metadata close to a function. I expect that other cases will continue to use M=0.
I can appreciate the frustration here, but I do think that this is an ABI issue if the compilers diverge in behaviour here. Especially as future additions to the architecture or PCS could complicate matters, and what could seem benign divergence now could turn out to be unreconcilable.
Aligning with the GCC M=0 case here will mean that patching code that works with GCC should just work without modification for clang, and I do think that's a better position to be in generally.
(Again: this is our ~5th time adding an option specific to Linux kernel, with a good intention to make it general, but in reality it doesn't https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html) I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.
That's a good point w.r.t. x86; I will get in touch with the people working on ftrace and live-patching there
Thanks,
Mark.
I spoke with Steven Rostedt (ftrace maintainer), and Josh Poimboeuf (livepatching maintainer) in the OFTC/#linux-rt IRC channel. Today x86 uses -mfentry, and they have no reason to move to -fpatchable-function-entry so long as the ENDBR32/ENDBR64 instructions compose nicely with -mfentry.
Given that, I don't think x86 kernel folk care either way.
On the GCC side I was under the impression that x86 would go the same way as arm64, per https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424#c4
There's a GCC ticket for x86: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492
Thanks,
Mark.
Just to follow up: D73680 was cherry-picked to 10.x. Does that mean all issues are resolved here, and the kernel folks are happy, or is there more work expected that might affect the release?
@hans Thanks for following up!
All known AArch64 -fpatchable-function-entry= issues are resolved after D73680.
D73760 is an x86 counterpart of D73680. It changed the placement of the label .Lpatch0, which has been agreed to be superior.
hjl is working on the GCC side issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492. I don't know whether that fix will be included in GCC 10.
From the discussions, what we did in D73760 is a consensus and GCC will eventually do it as well. D73760 will be nice to be cherry picked if there is not much trouble, so we won't have a major release with unsatisfactory label placement if Linux x86 developers ever want to adopt -fpatchable-function-entry=.
D73760 will be nice to be cherry picked if there is not much trouble, so we won't have a major release with unsatisfactory label placement if Linux x86 developers ever want to adopt -fpatchable-function-entry=.
Cherry-picked as 4c96b369a074e93a0be536dd795d3f245ef6f18b. Thanks!
We already have err_drv_argument_not_allowed_with, which can be used instead of adding this.