Page MenuHomePhabricator

[Driver][CodeGen] Add -fpatchable-function-entry=N[,0]
ClosedPublic

Authored by MaskRay on Jan 4 2020, 9:06 PM.

Details

Summary

In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.

Diff Detail

Event Timeline

MaskRay created this revision.Jan 4 2020, 9:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2020, 9:06 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
MaskRay updated this revision to Diff 236222.Jan 4 2020, 9:18 PM

Forgot to add test/Driver/fpatchable-function-entry.c

Unit tests: pass. 61253 tests passed, 0 failed and 736 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

ostannard added inline comments.
clang/include/clang/Basic/DiagnosticDriverKinds.td
77

We already have err_drv_argument_not_allowed_with, which can be used instead of adding this.

clang/include/clang/Driver/Options.td
1706

Names of options with an equals sign end in _EQ.

clang/test/Driver/fpatchable-function-entry.c
12

I think we should have a more descriptive message here, maybe something like:

error: the second parameter of '-fpatchable-function-entry=' must be '0' or omitted
MaskRay updated this revision to Diff 236730.Jan 7 2020, 5:51 PM
MaskRay marked 2 inline comments as done.
MaskRay retitled this revision from [Driver][CodeGen] Add -fpatchable-function-entry=N[,M] to [Driver][CodeGen] Add -fpatchable-function-entry=N[,0].
MaskRay added reviewers: ostannard, peter.smith.
MaskRay removed a subscriber: ostannard.

Address review comments

MaskRay marked an inline comment as done.Jan 7 2020, 6:13 PM

Unit tests: pass. 61312 tests passed, 0 failed and 736 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

This revision is now accepted and ready to land.Jan 8 2020, 1:26 AM
MaskRay planned changes to this revision.EditedJan 8 2020, 10:54 AM

@ostannard @nickdesaulniers @peter.smith Unfortunately we will have to get buy-in from GCC before committing to do an incompatible design. A promise that they will implement a new approach to overcome the disadvantages and an agreement on the new option name is probably sufficient for us to move forward.

A list of deficiency:

  • sh_addralign={4,8} (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00271.html)
  • --gc-sections/COMDAT does not work
  • Using a PC-relative relocation type instead of a symbolic relocation type (R_AARCH64_ABS64, R_X86_64_64, R_PPC64_ADDR64) avoids SHF_WRITE, RELRO, and dynamic relocations (24 bytes*number of functions, not negligible).
  • Incompatible with BTI c. nop;nop;nop;bti c;nop nop probably works. I hope my CodeGen change does not make such interop harder.
  • On i386/x86-64 they should use multi-byte NOPs.

Other issues can be fixed in a backward-compatible manner, but the relocation type issue cannot. If you know the Linux kernel contacts, please make them aware of https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html and the design we'd like to do in D72215.

MaskRay accepted this revision.EditedJan 9 2020, 9:56 PM

Some targets don't have a PC relative relocation of pointer size, e.g. R_MIPS_PC64 (MIPS only has R_MIPS_PC32, which is a GNU extension, sigh https://sourceware.org/git/?p=binutils-gdb.git;a=commit;f=include/elf/mips.h;h=092dcd755dcdcf664b25a7011fd15957f124c29f). If we really need a PC-relative variant, we can use a new syntax -fpatchable-function-entry=2,0,pcrel with the existing option name. The new syntax should not add too much burden on Clang's side if GCC/Linux developers do want to save dynamic relocations.

Let me commit the patch series with symbolic relocation types first.

MaskRay requested review of this revision.Jan 9 2020, 11:14 PM
This revision is now accepted and ready to land.Jan 9 2020, 11:14 PM
This revision was automatically updated to reflect the committed changes.

Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .

I reproduced with

make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig

You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

MaskRay added a subscriber: hans.Jan 23 2020, 9:53 AM

Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .

I reproduced with

make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig

You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

@peter.smith @nickdesaulniers What do you think?

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

@peter.smith @nickdesaulniers What do you think?

Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.

Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .

I reproduced with

make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig

You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.

If you have cycles to debug the assertions quickly, maybe it's a simple fix that doesn't require a revert?

(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

Maybe, but I would prefer to avoid such autoconf like detection. cc-option in the kernel is relatively simple and hardened at this point.

@peter.smith @nickdesaulniers What do you think?

Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.

I should have been clearer, apologies; we're not blocked by the assertion failures, the CI won't halt and should pick up other build failures.

hans added a comment.Jan 23 2020, 10:25 AM

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

@peter.smith @nickdesaulniers What do you think?

Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.

But if trunk is broken, shouldn't it be reverted there first, and then we can merge the revert to 10.x (and then trunk can be fixed eventually)?

MaskRay added a comment.EditedJan 23 2020, 10:31 AM

Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .

I reproduced with

make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig

You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.

If you have cycles to debug the assertions quickly, maybe it's a simple fix that doesn't require a revert?

(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

Maybe, but I would prefer to avoid such autoconf like detection. cc-option in the kernel is relatively simple and hardened at this point.

I will need to read your other qemu command lines to learn how to debug.

The problem may be that the Linux kernel expects a different behavior from Clang's. We don't have to be hurry. If we decide to revert it in release/10.x, we can do that just a few days before 10.0.0 is released. (That was why I separated these patches...) I am more concerned of whether the GCC side will eventually fix the flaws (https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html).
(I really dislike how the feature was developed on the GCC side. Yet another Linux-kernel specific GCC option when there are already 4 existing options for the same feature)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

@peter.smith @nickdesaulniers What do you think?

Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.

But if trunk is broken, shouldn't it be reverted there first, and then we can merge the revert to 10.x (and then trunk can be fixed eventually)?

I don't think the commit is to be blamed. The availability of the driver option -fpatchable-function-entry= enables CONFIG_DYNAMIC_FTRACE_WITH_REGS and CONFIG_LIVEPATCH, which were not tested before. There could be other issues in the code path.

% rg fpatchable-function-entry
arch/arm64/Kconfig
147:            if $(cc-option,-fpatchable-function-entry=2)

arch/arm64/Makefile
100:  CC_FLAGS_FTRACE := -fpatchable-function-entry=2

If we can explicitly disable the options in the CI, that will be very nice.

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)

@peter.smith @nickdesaulniers What do you think?

Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.

But if trunk is broken, shouldn't it be reverted there first, and then we can merge the revert to 10.x (and then trunk can be fixed eventually)?

I don't think the commit is to be blamed. The availability of the driver option -fpatchable-function-entry= enables CONFIG_DYNAMIC_FTRACE_WITH_REGS and CONFIG_LIVEPATCH, which were not tested before. There could be other issues in the code path.

% rg fpatchable-function-entry
arch/arm64/Kconfig
147:            if $(cc-option,-fpatchable-function-entry=2)

arch/arm64/Makefile
100:  CC_FLAGS_FTRACE := -fpatchable-function-entry=2

If we can explicitly disable the options in the CI, that will be very nice.

It definitely won't be this commit, the assert fail looks like it is from the patchable-function attribute pass. The CI just runs allyesconfig and allmodconfig and tells us if there are any new regressions, it won't report this one again so it is unlikely to be worth masking it out.

Assert failure:

clang-10: /work/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:139: llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>::operator*() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = false, IsConst = false]: Assertion `!NodePtr->isKnownSentinel()' failed.
Stack dump:
0.	Program arguments: /work/llvm-project/build/buildclang/bin/clang-10 -cc1 -triple aarch64-unknown-linux-gnu -S -disable-free -main-file-name slab_common.c -mrelocation-model static -mthread-model posix -fno-delete-null-pointer-checks -mllvm -warn-stack-size=2048 -mframe-pointer=non-leaf -relaxed-aliasing -mdisable-tail-calls -fmath-errno -fno-rounding-math -masm-verbose -no-integrated-as -mconstructor-aliases -target-cpu generic -target-feature -fp-armv8 -target-feature -crypto -target-feature -neon -target-feature -sha2 -target-feature -aes -target-abi aapcs -mllvm -aarch64-enable-global-merge=false -fallow-half-arguments-and-returns -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -nostdsysteminc -nobuiltininc -resource-dir /work/llvm-project/build/buildclang/lib/clang/10.0.0 -dependency-file mm/.slab_common.o.d -MT mm/slab_common.o -sys-header-deps -isystem /work/clangmaster/lib/clang/10.0.0/include -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -I ./arch/arm64/include -I ./arch/arm64/include/generated -I ./include -I ./arch/arm64/include/uapi -I ./arch/arm64/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -D __KERNEL__ -D CC_USING_PATCHABLE_FUNCTION_ENTRY -D KASAN_SHADOW_SCALE_SHIFT=3 -D CONFIG_AS_LSE=1 -D CONFIG_CC_HAS_K_CONSTRAINT=1 -D KASAN_SHADOW_SCALE_SHIFT=3 -D KBUILD_BASENAME="slab_common" -D KBUILD_MODNAME="slab_common" -fmacro-prefix-map=./= -O2 -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -Werror=unknown-warning-option -Wno-address-of-packed-member -Wno-format-invalid-specifier -Wno-gnu -Wno-tautological-compare -Wno-unused-const-variable -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length -std=gnu89 -fno-dwarf-directory-asm -fdebug-compilation-dir /work/linux/linux -ferror-limit 19 -fmessage-length 0 -fpatchable-function-entry=2 -fwrapv -stack-protector 2 -ftrivial-auto-var-init=pattern -fno-builtin -fno-signed-char -fwchar-type=short -fno-signed-wchar -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fno-common -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o /tmp/slab_common-7e60cc.s -x c mm/slab_common.c 
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'mm/slab_common.c'.
4.	Running pass 'Implement the 'patchable-function' attribute' on function '@rcu_lock_acquire'

(I really dislike how the feature was developed on the GCC side. Yet another Linux-kernel specific GCC option when there are already 4 existing options for the same feature)

Maybe I'm reading too much into it, but please always have respect for the competition, as a representative of the company and LLVM community you represent, as well as my friend. It should be clear why that's important and why I won't tolerate anything less. If you still disagree, I'm happy to talk more about it privately.

Can you enumerate the 4 existing options? If we do already have an existing tool in our toolbox, it would be good to consider their strengths and limitations before building a new tool.

(I really dislike how the feature was developed on the GCC side. Yet another Linux-kernel specific GCC option when there are already 4 existing options for the same feature)

Maybe I'm reading too much into it, but please always have respect for the competition, as a representative of the company and LLVM community you represent, as well as my friend. It should be clear why that's important and why I won't tolerate anything less. If you still disagree, I'm happy to talk more about it privately.

Can you enumerate the 4 existing options? If we do already have an existing tool in our toolbox, it would be good to consider their strengths and limitations before building a new tool.

Apologies if my word felt strong. I meant I felt sad that GCC developed another option in this area, along with -mfentry, -mhotpatch (SystemZ specific), -mnop-mcount (x86, SystemZ, -fno-pie specific), -mrecord-mcount (used to retire scripts/recordmcount.{pl,c}). The GCC implementation has several issues, as I listed in the link I pasted above. (I fixed a GCC bug in this area.) I also feel making the dependent Linux features enabled by default when -fpatchable-function-entry= is recognized by the compiler was not sufficient. It should have more sophisticated check.

The idea of this option is great. The intention was to unify the existing options. However, two major flaws (comdat (C++) / --gc-sections) make it impractical for non-Linux-kernel applications. I noticed we recently add support for -mnop-mcount. I haven't made enough investigation, but I suspect we don't have to do that and can use a general option instead.

@hans To address the CI problem (https://reviews.llvm.org/D72222#1836728) reported by @peter.smith, we just need to delete the driver option in the release/10.x branch. (I need a -E preproccessed file to investigate.)
I think we don't need to hurry. We can simply revert this patch (D72222) a few days before 10.0.0.

Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .

I reproduced with

make CC=/path/to/clang/install/clang ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc allmodconfig

You can get the log files for the build, which is from clang-10
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_kernel/llvm-release-aarch64-mainline-allmodconfig (4: update: llvm-linux: 19676)

If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.

clang+lld built at HEAD + Linux at head

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/Release/bin/clang LD=~/llvm/Release/bin/ld.lld O=/tmp/arm64 -j 30 distclean allmodconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/Release/bin/clang LD=~/llvm/Release/bin/ld.lld O=/tmp/arm64 -j 30 all
...
  OBJCOPY arch/arm64/boot/Image
  GZIP    arch/arm64/boot/Image.gz
make[1]: Leaving directory '/tmp/arm64'

@peter.smith The build was smooth. Do I need other options to reproduce?

hans added a comment.Jan 23 2020, 2:45 PM

@hans To address the CI problem (https://reviews.llvm.org/D72222#1836728) reported by @peter.smith, we just need to delete the driver option in the release/10.x branch. (I need a -E preproccessed file to investigate.)
I think we don't need to hurry. We can simply revert this patch (D72222) a few days before 10.0.0.

Waiting until a few days before the release is not a good idea. If we know there is a problem, I'd like it to be fixed on the branch as soon as possible.

@peter.smith The build was smooth. Do I need other options to reproduce?

I was able to reproduce:

$ git clone --depth 1 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
$ cd linux-next
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 allyesconfig
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 mm/kasan/quarantine.o

The exact compiler invocation (via tacking on V=1 to the above make command is:

$ clang -Wp,-MD,mm/kasan/.quarantine.o.d  -nostdinc -isystem /android0/llvm-project/llvm/build/lib/clang/11.0.0/include -I./arch/arm64/include -I./arch/arm64/include/generated  -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DCC_USING_PATCHABLE_FUNCTION_ENTRY -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/usr/bin/ --gcc-toolchain=/usr -no-integrated-as -Werror=unknown-warning-option -mgeneral-regs-only -DCONFIG_CC_HAS_K_CONSTRAINT=1 -fno-asynchronous-unwind-tables -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-address-of-packed-member -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -ftrivial-auto-var-init=pattern -fpatchable-function-entry=2 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=./= -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length   -fno-builtin      -DKBUILD_MODFILE='"mm/kasan/quarantine"' -DKBUILD_BASENAME='"quarantine"' -DKBUILD_MODNAME='"quarantine"' -c -o mm/kasan/quarantine.o mm/kasan/quarantine.c

creduce spits out:

// $ clang -O2 quarantine.i
__attribute__((patchable_function_entry(0))) a() {
  b(({
    c:
      &&c;
  }));
}

along with -mfentry, -mhotpatch (SystemZ specific), -mnop-mcount (x86, SystemZ, -fno-pie specific), -mrecord-mcount (used to retire scripts/recordmcount.{pl,c}).

This is a useful summary of similar tools, thank you for the survey. Indeed, many of these seem to be used to support tracing in the Linux kernel.

MaskRay added a comment.EditedJan 23 2020, 4:44 PM

@peter.smith The build was smooth. Do I need other options to reproduce?

I was able to reproduce:

$ git clone --depth 1 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
$ cd linux-next
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 allyesconfig
$ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make CC=clang -j71 mm/kasan/quarantine.o

The exact compiler invocation (via tacking on V=1 to the above make command is:

$ clang -Wp,-MD,mm/kasan/.quarantine.o.d  -nostdinc -isystem /android0/llvm-project/llvm/build/lib/clang/11.0.0/include -I./arch/arm64/include -I./arch/arm64/include/generated  -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DCC_USING_PATCHABLE_FUNCTION_ENTRY -DKASAN_SHADOW_SCALE_SHIFT=3 -Qunused-arguments -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 --target=aarch64-linux-gnu --prefix=/usr/bin/ --gcc-toolchain=/usr -no-integrated-as -Werror=unknown-warning-option -mgeneral-regs-only -DCONFIG_CC_HAS_K_CONSTRAINT=1 -fno-asynchronous-unwind-tables -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-address-of-packed-member -O2 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -ftrivial-auto-var-init=pattern -fpatchable-function-entry=2 -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=date-time -Werror=incompatible-pointer-types -fmacro-prefix-map=./= -Wno-initializer-overrides -Wno-format -Wno-sign-compare -Wno-format-zero-length   -fno-builtin      -DKBUILD_MODFILE='"mm/kasan/quarantine"' -DKBUILD_BASENAME='"quarantine"' -DKBUILD_MODNAME='"quarantine"' -c -o mm/kasan/quarantine.o mm/kasan/quarantine.c

creduce spits out:

// $ clang -O2 quarantine.i
__attribute__((patchable_function_entry(0))) a() {
  b(({
    c:
      &&c;
  }));
}

Thanks. Looks like the following is sufficient to reproduce. (I incorrectly thought an empty entry MachineBasicBlock was impossible.) I take the opportunity to add test coverage (testing debug-location: D73301)

define void @foo() #0 {
entry:
  unreachable
}

attributes #0 = { "patchable-function-entry"="2" }

A -DLLVM_ENABLE_ASSERTIONS=on build is required to trigger the assertion failure. My make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/ReleaseAssert/bin/clang LD=~/llvm/ReleaseAssert/bin/ld.lld O=/tmp/arm64 allmodconfig all -j 30 build succeeded. Now I will be in favor of pushing the bugfix to release/10.x .

Not clear about -fpatchable-function-entry=N,M where M>0 (D73070, D73071, D73072). For completeness, I'd like them to be included in release/10.x so we will not have a clang 10 that does not work with M>0.

For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.

nsz added a subscriber: nsz.Jan 24 2020, 2:05 AM

A -DLLVM_ENABLE_ASSERTIONS=on build is required to trigger the assertion failure. My make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTGCC=gcc CC=~/llvm/ReleaseAssert/bin/clang LD=~/llvm/ReleaseAssert/bin/ld.lld O=/tmp/arm64 allmodconfig all -j 30 build succeeded. Now I will be in favor of pushing the bugfix to release/10.x .

Not clear about -fpatchable-function-entry=N,M where M>0 (D73070, D73071, D73072). For completeness, I'd like them to be included in release/10.x so we will not have a clang 10 that does not work with M>0.

For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.

note that the gcc fix will be in the gcc-10 release (it got accepted),
and soon backported to gcc-9 (the first release with bti support).
(only aarch64 bti was fixed, that's what affected linux)

the gcc fix keeps bti outside of the patch area in the M=0 case,
which makes the current linux *runtime* patching code simpler.

the choice was made because this is important if there is a mix
of bti and non-bti functions which can happen if

  1. compiler detects not_called_directly(f) and omits bti or
  2. individual attributes on functions turn on/off bti

with the llvm patch the *runtime* code would have to detect
which patch area has bti and which not to skip over. (fine for
future M>0 usage, but current M=0 case should not be complex,
i.e. linux will have to disable bti+ftrace on llvm if this patch is
accepted until it has the complex runtime code. and the patch
may make static detection of the compiler behaviour harder as
it fails differently than old gcc-9).

i did not look deeply into llvm but i'd expect some hook for
printing the function label, which can print the bti before
the patch area based on some global flag (an ugly hack, but
allows consistent behaviour across the compilers)

i hope this makes sense.

For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.

Speaking as the person implementing the Linux side of things, I think that will be a problem. Kernel-side we want consistency across compilers.

For Linux we were planning to do a very simple build-time test to rule out compilers with the broken behaviour. We would expect functional compilers to have a consistent layout for a given combination of options, and we would consider this layout to be ABI.

The compile time check would compile a trivial test file, e.g.

void function(void) { }

... with flags:

-fpatchable-function-entry=2 -mbranch-protection=bti

... and if the function entry point is a NOP, mark that compiler as broken. In practice, this will mean that the kernel build system will disable BTI support for those broken compilers.

Trying to support different layout approaches within Linux will add a fair amount of unnecessary complexity which we cannot contain in one place, and makes it more likely that support for either case will be broken in future.

For the M=0 case, I would prefer to avoid runtime detection unless really necessary.

For the M!=0 case, which I believe Linux will need to use in the near future, I realise that a runtime check will be necessary to detect the absence of a BTI. Regardless, consistent behaviour across compilers will make this significantly simpler and less error-prone.

Thanks,
Mark.

MaskRay added a comment.EditedJan 24 2020, 10:19 AM

For BTI c issue, GCC has several releases that do not work with -mbranch-protection=bti. The Linux kernel has to develop some mechanism to detect the undesirable placement of bti c, if there are -mbranch-protection=bti users. So I don't think that inconsistency in clang 10.0.0 with GCC will be a problem.

Speaking as the person implementing the Linux side of things, I think that will be a problem. Kernel-side we want consistency across compilers.

For Linux we were planning to do a very simple build-time test to rule out compilers with the broken behaviour. We would expect functional compilers to have a consistent layout for a given combination of options, and we would consider this layout to be ABI.

The compile time check would compile a trivial test file, e.g.

void function(void) { }

... with flags:

-fpatchable-function-entry=2 -mbranch-protection=bti

... and if the function entry point is a NOP, mark that compiler as broken. In practice, this will mean that the kernel build system will disable BTI support for those broken compilers.

Trying to support different layout approaches within Linux will add a fair amount of unnecessary complexity which we cannot contain in one place, and makes it more likely that support for either case will be broken in future.

For the M=0 case, I would prefer to avoid runtime detection unless really necessary.

For the M!=0 case, which I believe Linux will need to use in the near future, I realise that a runtime check will be necessary to detect the absence of a BTI. Regardless, consistent behaviour across compilers will make this significantly simpler and less error-prone.

Thanks,
Mark.

When -mbranch-protection=bti is used, due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 (existing GCC releases have the issue), Linux/arch/arm64 has to make more checks to detect the combination.

I changed nop; nop; bti c to bti c; nop; nop with 2 commits not included in LLVM release/10.x (a72d15e37c5e066f597f13a8ba60aff214ac992d and 9a24488cb67a90f889529987275c5e411ce01dda). They make Clang's M!=0 placement (.Lpatch; nop; func: bti c; nop) consistent with GCC.
Edit: cherry-picked to release/10.x

For the M=0 case, Clang does (func: .Lpatch; bti c; nop; nop), which is inconsistent with GCC (func: bti c; .Lpatch: nop; nop). I'll be happy to try updating the placement of .Lpatch if future M>0 usage does not obsolete M=0 usage (a proof that the proposed GCC layout is indeed superior, not just for simplicity for a not-so-common configuration), otherwise I would feel the clang side is just making changes to appease GCC/Linux compatibility, which would make me sad. (Again: this is our ~5th time adding an option specific to Linux kernel, with a good intention to make it general, but in reality it doesn't https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html) I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.

When -mbranch-protection=bti is used, due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 (existing GCC releases have the issue), Linux/arch/arm64 has to make more checks to detect the combination.

As covered in D72222#1838988 the plan is to detect broken compilers at build-time and reject the combination of options. The Linux patching code will not handle varied behaviours at runtime.

I changed nop; nop; bti c to bti c; nop; nop with 2 commits not included in LLVM release/10.x (a72d15e37c5e066f597f13a8ba60aff214ac992d and 9a24488cb67a90f889529987275c5e411ce01dda). They make Clang's M!=0 placement (.Lpatch; nop; func: bti c; nop) consistent with GCC.
Edit: cherry-picked to release/10.x

That's great; thanks!

For the M=0 case, Clang does (func: .Lpatch; bti c; nop; nop), which is inconsistent with GCC (func: bti c; .Lpatch: nop; nop). I'll be happy to try updating the placement of .Lpatch if future M>0 usage does not obsolete M=0 usage (a proof that the proposed GCC layout is indeed superior, not just for simplicity for a not-so-common configuration), otherwise I would feel the clang side is just making changes to appease GCC/Linux compatibility, which would make me sad.

I do not expect future M>0 usage to obsolete M=0 usage, and these will be used by distinct configurations of Linux with distinct code paths. I expect that we will need M>0 in future to handle very large kernel images and/or live-patching where we may need to place PLTs or other metadata close to a function. I expect that other cases will continue to use M=0.

I can appreciate the frustration here, but I do think that this is an ABI issue if the compilers diverge in behaviour here. Especially as future additions to the architecture or PCS could complicate matters, and what could seem benign divergence now could turn out to be unreconcilable.

Aligning with the GCC M=0 case here will mean that patching code that works with GCC should just work without modification for clang, and I do think that's a better position to be in generally.

(Again: this is our ~5th time adding an option specific to Linux kernel, with a good intention to make it general, but in reality it doesn't https://gcc.gnu.org/ml/gcc/2020-01/msg00067.html) I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.

That's a good point w.r.t. x86; I will get in touch with the people working on ftrace and live-patching there

Thanks,
Mark.

I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.

That's a good point w.r.t. x86; I will get in touch with the people working on ftrace and live-patching there

I spoke with Steven Rostedt (ftrace maintainer), and Josh Poimboeuf (livepatching maintainer) in the OFTC/#linux-rt IRC channel. Today x86 uses -mfentry, and they have no reason to move to -fpatchable-function-entry so long as the ENDBR32/ENDBR64 instructions compose nicely with -mfentry.

Given that, I don't think x86 kernel folk care either way.

On the GCC side I was under the impression that x86 would go the same way as arm64, per https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424#c4

There's a GCC ticket for x86: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492

Thanks,
Mark.

I shall also mention that we are essentially making decisions for x86 people's endbr32/endbr64. I hope you can engage in x86 maintainers. Clang's current layout is recorded at D73071.

That's a good point w.r.t. x86; I will get in touch with the people working on ftrace and live-patching there

I spoke with Steven Rostedt (ftrace maintainer), and Josh Poimboeuf (livepatching maintainer) in the OFTC/#linux-rt IRC channel. Today x86 uses -mfentry, and they have no reason to move to -fpatchable-function-entry so long as the ENDBR32/ENDBR64 instructions compose nicely with -mfentry.

Given that, I don't think x86 kernel folk care either way.

On the GCC side I was under the impression that x86 would go the same way as arm64, per https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424#c4

There's a GCC ticket for x86: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492

Thanks,
Mark.

I created D73680 to place the patch label after BTI.

@hans Is there still time to cherry pick the patch to release/10.x? See above, Linux developers really want the Clang release to have compatible behavior with GCC.

hans added a comment.Jan 30 2020, 5:55 AM

I created D73680 to place the patch label after BTI.

@hans Is there still time to cherry pick the patch to release/10.x? See above, Linux developers really want the Clang release to have compatible behavior with GCC.

Yes, there is still time. Just let me know which commits to cherry-pick.

hans added a comment.Feb 4 2020, 2:35 AM

I created D73680 to place the patch label after BTI.

@hans Is there still time to cherry pick the patch to release/10.x? See above, Linux developers really want the Clang release to have compatible behavior with GCC.

Yes, there is still time. Just let me know which commits to cherry-pick.

Just to follow up: D73680 was cherry-picked to 10.x. Does that mean all issues are resolved here, and the kernel folks are happy, or is there more work expected that might affect the release?

I created D73680 to place the patch label after BTI.

@hans Is there still time to cherry pick the patch to release/10.x? See above, Linux developers really want the Clang release to have compatible behavior with GCC.

Yes, there is still time. Just let me know which commits to cherry-pick.

Just to follow up: D73680 was cherry-picked to 10.x. Does that mean all issues are resolved here, and the kernel folks are happy, or is there more work expected that might affect the release?

@hans Thanks for following up!

All known AArch64 -fpatchable-function-entry= issues are resolved after D73680.
D73760 is an x86 counterpart of D73680. It changed the placement of the label .Lpatch0, which has been agreed to be superior.

hjl is working on the GCC side issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492. I don't know whether that fix will be included in GCC 10.
From the discussions, what we did in D73760 is a consensus and GCC will eventually do it as well. D73760 will be nice to be cherry picked if there is not much trouble, so we won't have a major release with unsatisfactory label placement if Linux x86 developers ever want to adopt -fpatchable-function-entry=.

hans added a comment.Feb 5 2020, 6:19 AM

D73760 will be nice to be cherry picked if there is not much trouble, so we won't have a major release with unsatisfactory label placement if Linux x86 developers ever want to adopt -fpatchable-function-entry=.

Cherry-picked as 4c96b369a074e93a0be536dd795d3f245ef6f18b. Thanks!