[AArch64] FMV support and necessary target features dependencies.
This is Function Multi Versioning (FMV) implementation for AArch64 target in
accordance with Beta Arm C Language Extensions specification
https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning
It supports new "target_version" function attribute and extends existing
"target_clones" one. Also missing dependencies for target features were added.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
| clang/docs/ClangCommandLineReference.rst | ||
|---|---|---|
| 1086 ↗ | (On Diff #436971) | This file is auto-generated. Don't touch it. | 
| clang/lib/CodeGen/CGBuiltin.cpp | ||
| 1184–1186 | This can be committed separately. | |
| clang/test/CodeGen/attr-target-version.c | ||
| 76 | Remove //. and // <empty>. Note: some editors like neovim supports { } to navigate among paragraphs. Adding // just adds inconvenience. | |
I'm concerned as to the design of this addition, I don't particularly appreciate the reasons for making 'target_clones' different, nor the purpose for adding a new attribute instead of using 'target' for what seems like exactly that? IF the new spelling is THAT necessary, we perhaps don't need a whole new attribute for it either.
| clang/include/clang/AST/ASTContext.h | ||
|---|---|---|
| 3119 | It is concerning that this differs from the above. | |
| clang/include/clang/AST/Decl.h | ||
| 1895 | I find myself immediately wondering why this isn't just 'Target'. What semantically are we trying to change? | |
| clang/include/clang/Basic/Attr.td | ||
| 2749 | would probably just call this getName ? | |
| clang/include/clang/Basic/AttrDocs.td | ||
| 2380 | This sounds absurdly like 'target', and I think we should instead investigate why enabling THAT for ARM isn't sufficient. | |
| 2403 | This seems like an unnecessary change, we should do what we can to make this work as much like the non-ARM version as possible. | |
| clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
| 11576 | I'm not clear as to what this means? | |
| clang/include/clang/Sema/Sema.h | ||
| 4663 | This further concerns me that we're changing behavior this much... | |
Thank you for fair concern, "target_clones" for AArch64 has different format, semantic, e.g. "default" is not required. Therefore it diverges with X86 in these parts. "target" attribute has been already used and supported on AArch64 in a different sense, like target("arm"), target("dotprod"), target("branch-protection=bti"). The intention of creating new "target_version" attribute is not to overlap with that. It also has different format, mangling and semantic, e.g. treating function without attribute as "default", and option to disable attribute droping function multi versions. Do these explanations dispel your concern?
| clang/docs/ClangCommandLineReference.rst | ||
|---|---|---|
| 1086 ↗ | (On Diff #436971) | It looked out of sync with options td files: +.. option:: -gen-reproducer=<arg>, -fno-crash-diagnostics (equivalent to -gen-reproducer=off) + +Emit reproducer on (option: off, crash (default), error, always) + +.. option:: -print-diagnostic-options, --print-diagnostic-options + +Print all of Clang's warning options + +.. option:: -fdriver-only + +Only run the driver. + ... | 
| clang/include/clang/AST/ASTContext.h | ||
|---|---|---|
| 3119 | target_version supports features only, that is similar to ParsedTargetAttr { std::vector<std::string> Features; }. | |
| clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
| 11576 | It gives a warning if target_clones attributes contains features which have no impact on code generation ( no supported yet ) and ignored. They has "<NS>" OPTION in llvm/include/llvm/Support/AArch64TargetParser.def  // expected-warning@+1 {{version list contains no code impact entries}}
void __attribute__((target_clones("sha1+pmull"))) warn2(void);
// expected-warning@+1 {{version list contains no code impact entries}}
int __attribute__((target_clones("rng", "fhm+dpb+sha1", "default"))) redecl4(void) { return 1; } | |
| clang/docs/ClangCommandLineReference.rst | ||
|---|---|---|
| 1086 ↗ | (On Diff #436971) | But it needs to be manually autogenerated.. | 
| clang/docs/ClangCommandLineReference.rst | ||
|---|---|---|
| 1086 ↗ | (On Diff #436971) | Done https://reviews.llvm.org/D128116, thanks to @MaskRay | 
Is it *necessary* that it diverges like this? (Is there some published standards document you're trying to conform to, is there an implementation difficulty with not diverging, something else?)
"target" attribute has been already used and supported on AArch64 in a different sense, like target("arm"), target("dotprod"), target("branch-protection=bti"). The intention of creating new "target_version" attribute is not to overlap with that. It also has different format, mangling and semantic, e.g. treating function without attribute as "default", and option to disable attribute droping function multi versions. Do these explanations dispel your concern?
Do I understand correctly that AArch64 was using an attribute named target which does not behave like the attribute with the same name in GCC and Clang on other architectures, and now you'd like to introduce a new attribute which does behave like target but under a different name? If I have that correct, I don't think that's reasonable -- there's far too much possibility for confusion with that situation already, and adding a new attribute only increases the confusion. I'm not certain where the technical debt came from, but we shouldn't increase the burden on users here; I think target and target_clones should be made to work consistently across architectures if at all possible.
In ACLE there are a few rules/behaviour documented around what should be the behaviour of the "default" for example. Making for example "default" optional hopefully makes the adation of multi versioning seamless as possible. Compilers won't support from day one these attributes therefore the goal was to make the whole addition of a multi versioned function as less distributive as possible while the code is still compiled with older compilers too.
// this is the original code
// mandating __attribute__ ((target ("default"))) would not work with the today's compilers
void foo(){}
// new backward compatible code
#ifdef __ARM_FEATURE_FUNCTION_MULTI_VERSIONING
__attribute__ ((target_version("fancy_feature")))
void foo(){}
#endifif the "default" is not mandated here, it felt not right to mandate it for the target_clones either.
"target" attribute has been already used and supported on AArch64 in a different sense, like target("arm"), target("dotprod"), target("branch-protection=bti"). The intention of creating new "target_version" attribute is not to overlap with that. It also has different format, mangling and semantic, e.g. treating function without attribute as "default", and option to disable attribute droping function multi versions. Do these explanations dispel your concern?
Do I understand correctly that AArch64 was using an attribute named target which does not behave like the attribute with the same name in GCC and Clang on other architectures, and now you'd like to introduce a new attribute which does behave like target but under a different name? If I have that correct, I don't think that's reasonable -- there's far too much possibility for confusion with that situation already, and adding a new attribute only increases the confusion. I'm not certain where the technical debt came from, but we shouldn't increase the burden on users here; I think target and target_clones should be made to work consistently across architectures if at all possible.
Your understanding is correct. target attribute has two usage model. One is just redefine the to be used codegen options, this is used already widely for Arm and AArch64. The other use of the target attribute is the multi versioning and the rational for the target_version attribute is the easier distinction between the two usage mode, also not to break any code out there by changing the behaviour of an attribute.
I don't think differentiating the uses here is a good idea. I think it would have been a GREAT idea about 10 years ago, but that ship has already sailed once GCC started using it that way however. We should be keeping the current behavior, otherwise we're going to have a horrible mix of target/target_version working inconsistently between platforms.
That largely is my concern as well. The existing behavior of target is just that -- the existing behavior. I think deviating from that existing behavior will be confusing in practice. Adding additional attributes doesn't improve that confusion because users then have to know to decide between two very similar attributes, which means they then need to educate themselves on the differences between them. If we're going to push them towards the documentation to correctly use the attribute anyway, that's basically the same situation they're in today with the confusing dual behavior of target.
| clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
|---|---|---|
| 11576 | Perhaps it would be better to warn with something like "version list contains entries that don't impact code generation". I think that would be more clear. | |
| clang/include/clang/Basic/TargetInfo.h | ||
| 1345 | Similarly to the warning string, I think a name like featureAffectsCodeGen(...) would be more clear in its use. | |
| clang/lib/Basic/Targets/AArch64.cpp | ||
| 821 | Slight nit, but braces aren't needed here. | |
| clang/lib/CodeGen/CodeGenFunction.cpp | ||
| 2582 | Parentheses not needed. | |
| 2699 | I'm not 100% sure on the differences between assert(..) and LLvm_unreachable(...), but perhaps that would be cleaner than an assertion on false. | |
| 2705 | Nit on the capital letters for every word. Only "No" requires capitalisation. | |
| clang/lib/Sema/SemaDecl.cpp | ||
| 10842 | Candidate doesn't need capitalisation either. | |
| clang/lib/Sema/SemaDeclAttr.cpp | ||
| 3522 | I think HasCodeImpact would be more explanatory. | |
We started with the ACLE to be sure the platforms and compilers will implement the same for Arm targets so make the developers life easier with a consistent behaviour on Arm platforms. Users of the attributes anyway need to aware of the architecture differences. Like -mtune is different between Arm/AArch64 and x86.
I have some hope others may see the benefits of the new semantics (e.g. make the "default" optional for target_clones ) will be picked up by other architectures.
And that's been frustrating for years too and would have been nice had aarch64 not followed it :)
That said, my desire here is source compatibility between open source compilers - I think that we need at least source compatibility with gcc and then we can go from there.
| llvm/lib/Target/AArch64/AArch64.td | ||
|---|---|---|
| 80 | What is this for? | |
| llvm/lib/Target/AArch64/AArch64.td | ||
|---|---|---|
| 80 | FMV is a target feature which is enabled by default and can be disabled (-mno-fmv ). Accoding to ACLE spec "FMV may be disabled in compile time by a compiler flag. In this case the default version shall be used." | |
Please find the thread on the GCC mailing list here: https://gcc.gnu.org/pipermail/gcc/2022-July/239134.html 
Feedback there sounds positive to me.
All received feedback added to this PR.
https://github.com/ARM-software/acle/pull/211
@ilinpv Could you please update the patch accordingly?
(I'm on vacation until September, please expect slow responses.)
| llvm/include/llvm/Support/AArch64TargetParser.def | ||
|---|---|---|
| 188 ↗ | (On Diff #438130) | The compiler already has names for these, used in the existing -march=... options and target(...) attributes. It would be better to not have to invent a new set of names that is slightly different. It would be good if this could be combined with the table above too. Hopefully it is just an expansion to the values already present. | 
Addressing comments and ACLE specification updates.
sme-f64 -> sme-f64f64 and sme-i64 -> sme-i16i64 features are expected to be renamed in specification.
Rebase on top of trunk. Addressed comments. Fixed initFeatureMap issue resulting in some fmv features not enabled on target, use it and AARCH64_ARCH_EXT_NAME for target attribute features dependencies as well.
| compiler-rt/lib/builtins/cpu_model.c | ||
|---|---|---|
| 1322 | I reply on __aarch64_cpu_features as global is guaranteed to be initialised to 0. | |
I can only comment on the target features part of the patch - I've been hoping it would add something similar and looks very useful in its own right.
| clang/lib/Basic/Targets/AArch64.cpp | ||
|---|---|---|
| 63 | There can be an 8.9 / 9.4 now. | |
| 994–1016 | fmv features -> dependant features. | |
| clang/test/CodeGen/aarch64-targetattr.c | ||
| 93–109 | These updates look good, thanks for the improvements! | |
| llvm/include/llvm/Support/AArch64TargetParser.def | ||
| 107 ↗ | (On Diff #479203) | This table could maybe be formatted a little more nicely. LLVM's clang format style would usually align the ( and the next line. Perhaps putting the last integer before the string can help too. AARCH64_ARCH_EXT_NAME("invalid",       AArch64::AEK_INVALID,     {},              {},             \
                      MAX, 0, "")We could alternatively just clang-format the whole thing. | 
| 115 ↗ | (On Diff #479203) | Should RDM be dependant on +jsconv,+complxnum? | 
| llvm/include/llvm/Support/AArch64TargetParser.def | ||
|---|---|---|
| 115 ↗ | (On Diff #479203) | I suppose RDM implies simd (+neon) which implies +jsconv,+complxnum | 
| llvm/include/llvm/Support/AArch64TargetParser.def | ||
|---|---|---|
| 115 ↗ | (On Diff #479203) | jsconv and complxnum are 8.3 features, are they not? Or do they somehow mean something different here? It would seem strange for neon to depend on them. | 
| llvm/include/llvm/Support/AArch64TargetParser.def | ||
|---|---|---|
| 115 ↗ | (On Diff #479203) | Right, last implication is true from 8.3, and wrong before that, thanks for spotting that! I will fix that in other places too. FEAT_JSCVT implements the functionality identified by 0b0001. In Armv8.0, Armv8.1, and Armv8.2, the only permitted value is 0b0000. From Armv8.3, if Advanced SIMD or Floating-point is implemented, the only permitted value is 0b0001. From Armv8.3, if Advanced SIMD or Floating-point is not implemented, the only permitted value is 0b0000. | 
just small comment, thanks @ilinpv
LGTM, just let others to chime in.
| clang/lib/Basic/Targets/AArch64.cpp | ||
|---|---|---|
| 657 | wondering if we could use FEAT_MAX (or similar) here to avoid error as add features later. maybe a comment in AArch64TargetParser.def is the solution. | |
| 752–771 | Please note the initialisation moved, see D139622.  | |
| 968–969 | maybe we need do disable SVE too here? | |
Hi, we're seeing a build failure in Fuchsia's Clang CI. We're seeing this on all of our builders: arm64 & x64 linux, mac and windows
FAILED: CMakeFiles/clang_rt.builtins-aarch64.dir/cpu_model.c.o 
/b/s/w/ir/x/w/recipe_cleanup/cxx-rbevgw5lbzc/reclient-cxx-wrapper.sh /b/s/w/ir/x/w/staging/llvm_build/./bin/clang --target=aarch64-unknown-linux-gnu --sysroot=/b/s/w/ir/x/w/cipd/linux -DHAS_ASM_LSE -DVISIBILITY_HIDDEN  --target=aarch64-unknown-linux-gnu -O2 -g -DNDEBUG -DCOMPILER_RT_HAS_FLOAT16 -std=c11 -fPIC -fno-builtin -fvisibility=hidden -fomit-frame-pointer -MD -MT CMakeFiles/clang_rt.builtins-aarch64.dir/cpu_model.c.o -MF CMakeFiles/clang_rt.builtins-aarch64.dir/cpu_model.c.o.d -o CMakeFiles/clang_rt.builtins-aarch64.dir/cpu_model.c.o -c /b/s/w/ir/x/w/llvm-llvm-project/compiler-rt/lib/builtins/cpu_model.c
../../../../llvm-llvm-project/compiler-rt/lib/builtins/cpu_model.c:1233:15: error: use of undeclared identifier 'HWCAP_CPUID'
  if (hwcap & HWCAP_CPUID) {
              ^
1 error generated.Can you revert if the fix is going to be hard. I imagine in this case the preprocessor logic is slightly off, so a forward fix may be easy.
Hi, thanks for the fix. that unblocked our builder. Unfortunately, we still see some errors in tests.
FAIL: Clang :: Driver/aarch64-features.c (7460 of 16622)
******************** TEST 'Clang :: Driver/aarch64-features.c' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-none-linux-gnu -### /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c -fsyntax-only 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 2';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target arm64-none-linux-gnu -### /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c -fsyntax-only 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 10';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-android -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-FMV /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 13';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-android -rtlib=compiler-rt -mno-fmv  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-FMV-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 16';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-FMV-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 19';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target arm64-unknown-linux -rtlib=libgcc  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-FMV-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 26';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-android -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 29';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 32';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target arm64-unknown-linux -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 35';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64--none-eabi -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 38';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-apple-darwin -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 41';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-windows-gnu -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 44';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-unknown-openbsd -rtlib=compiler-rt  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 47';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=libgcc  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-10  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 51';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=libgcc  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-7.5.0  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 55';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=libgcc  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-9.3.1  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 59';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=libgcc  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-9.3.0  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-OFF /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 63';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target arm64-linux -rtlib=compiler-rt -mno-outline-atomics  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck  -check-prefixes=CHECK-OUTLINE-ATOMICS-OFF,CHECK-NO-OUTLINE-ATOMICS /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 67';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-linux-gnu -rtlib=libgcc -mno-outline-atomics  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-10  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck  -check-prefixes=CHECK-OUTLINE-ATOMICS-OFF,CHECK-NO-OUTLINE-ATOMICS /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 72';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-apple-darwin -rtlib=compiler-rt -moutline-atomics  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
: 'RUN: at line 75';   /b/s/w/ir/x/w/staging/llvm_build/bin/clang -target aarch64-windows-gnu -rtlib=libgcc -moutline-atomics  --gcc-toolchain=/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/Inputs/aarch64-linux-gnu-tree/gcc-7.5.0  -### -c /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c 2>&1 | /b/s/w/ir/x/w/staging/llvm_build/bin/FileCheck -check-prefix=CHECK-OUTLINE-ATOMICS-ON /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
--
Exit Code: 1
Command Output (stderr):
--
/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c:22:19: error: CHECK-FMV-OFF: expected string not found in input
// CHECK-FMV-OFF: "-target-feature" "-fmv"
                  ^
<stdin>:1:1: note: scanning from here
Fuchsia clang version 16.0.0 (https://llvm.googlesource.com/llvm-project 9b2fecec406d6a6bcda9fbb9251db2ae202c7400)
^
<stdin>:6:461: note: possible intended match here
 "/b/s/w/ir/x/w/staging/llvm_build/bin/clang-16" "-cc1" "-triple" "aarch64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "--mrelax-relocations" "-disable-free" "-clear-ast-before-backend" "-main-file-name" "aarch64-features.c" "-mrelocation-model" "pic" "-pic-level" "2" "-pic-is-pie" "-mframe-pointer=non-leaf" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8a" "-target-abi" "aapcs" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=/b/s/w/ir/x/w/staging/llvm_build/tools/clang/test/Driver" "-resource-dir" "/b/s/w/ir/x/w/staging/llvm_build/lib/clang/16" "-internal-isystem" "/b/s/w/ir/x/w/staging/llvm_build/lib/clang/16/include" "-internal-isystem" "/usr/local/include" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir=/b/s/w/ir/x/w/staging/llvm_build/tools/clang/test/Driver" "-ferror-limit" "19" "-fno-signed-char" "-fgnuc-version=4.2.1" "-target-feature" "+outline-atomics" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "aarch64-features.o" "-x" "c" "/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c"
                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ^
Input file: <stdin>
Check file: /b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c
-dump-input=help explains the following input dump.
Input was:
<<<<<<
            1: Fuchsia clang version 16.0.0 (https://llvm.googlesource.com/llvm-project 9b2fecec406d6a6bcda9fbb9251db2ae202c7400) 
check:22'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
            2: Target: aarch64-unknown-linux-gnu 
check:22'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            3: Thread model: posix 
check:22'0     ~~~~~~~~~~~~~~~~~~~~
            4: InstalledDir: /b/s/w/ir/x/w/staging/llvm_build/bin 
check:22'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            5:  (in-process) 
check:22'0     ~~~~~~~~~~~~~~
            6:  "/b/s/w/ir/x/w/staging/llvm_build/bin/clang-16" "-cc1" "-triple" "aarch64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "--mrelax-relocations" "-disable-free" "-clear-ast-before-backend" "-main-file-name" "aarch64-features.c" "-mrelocation-model" "pic" "-pic-level" "2" "-pic-is-pie" "-mframe-pointer=non-leaf" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8a" "-target-abi" "aapcs" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=/b/s/w/ir/x/w/staging/llvm_build/tools/clang/test/Driver" "-resource-dir" "/b/s/w/ir/x/w/staging/llvm_build/lib/clang/16" "-internal-isystem" "/b/s/w/ir/x/w/staging/llvm_build/lib/clang/16/include" "-internal-isystem" "/usr/local/include" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir=/b/s/w/ir/x/w/staging/llvm_build/tools/clang/test/Driver" "-ferror-limit" "19" "-fno-signed-char" "-fgnuc-version=4.2.1" "-target-feature" "+outline-atomics" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "aarch64-features.o" "-x" "c" "/b/s/w/ir/x/w/llvm-llvm-project/clang/test/Driver/aarch64-features.c" 
check:22'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:22'1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 ?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  possible intended match
>>>>>>
--
********************If this will be hard to fix, can you revert until this can be sorted out?
The test was fixed bf94eac6a3f7c5cd8941956d44c15524fa3751bd ( it missed the case that Fuchsia has ToolChain::RLT_CompilerRT as default GetDefaultRuntimeLibType ). 
Please let me know if you have any issues remained with the patch.
Thanks! I think we're good now. At least one of our builders is green now. https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8794218564332891457/overview
I'll let you know if anything pops up, but this looks squared away to me.
Hi, this looks like a candidate for breaking the MSan bot: https://lab.llvm.org/buildbot/#/builders/5/builds/30139
Still looking into it and bisecting, will let you know when I have more info. To reproduce the bots, the best way (because MSan setup is tricky because it requires an instrumented libcxx) is to use the scripts from https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild (buildbot_fast.sh is the right one).
Yeah, unfortunately I did track this failure down to this commit and reverted it upstream. If you need help figuring it out, please let me know. You may find that adding -fsanitize-memory-track-origins=2 useful to add to the buildscript as well (which can be done by changing check_stage2_msan to check_stage2_msan_track_origins in buildbot_fast.sh.
It would be great to have more details how to setup up your bot, using buildbot_fast.sh on x86_64 Ubuntu 22.04 LTS leads to error ( pthreads installed ):
CMake Error at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message): Could NOT find Threads (missing: Threads_FOUND) Call Stack (most recent call first): /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE) /usr/share/cmake-3.22/Modules/FindThreads.cmake:238 (FIND_PACKAGE_HANDLE_STANDARD_ARGS) cmake/config-ix.cmake:114 (find_package) CMakeLists.txt:776 (include)
Also MemorySanitizer: use-of-uninitialized-value cases from https://lab.llvm.org/buildbot/#/builders/5/builds/30139 looks fine locally, all values initialized, could MSAN produce false positive results?
MSan can find false positives, but that's only if code ends up in your binary that isn't built with MSan. The buildscripts are written so that everything gets instrumented.
Hmm, not exactly sure what's going on with the could NOT find Threads there. A quick googling seems to point to pthreads.so not being in the right places, but I don't think the buildbot does anything special. Do your regular builds with -DLLVM_ENABLE_PROJECTS="compiler_rt;clang;lld" work?
When you say that it looks fine locally, is that from your own checkout but using -DLLVM_USE_SANITIZER=Memory? First thing to check is that you do end up with MSan in the test (in particular the clang binary that's being produced), which you can do by nm bin/clang-16 | grep __msan_init.
Regular builds works fine for me, pthreads located here "/lib/x86_64-linux-gnu/libpthread.so" "/usr/lib/x86_64-linux-gnu/libpthread.so". Enabling "-DLLVM_USE_SANITIZER=Memory" resulted in many "WARNING: MemorySanitizer: use-of-uninitialized-value" on tblgen like:
cd /data/ReleasesToCommit/llvm-project/build && /data/ReleasesToCommit/llvm-project/build/bin/llvm-tblgen -gen-intrinsic-enums -intrinsic-prefix=s390 -I /data/ReleasesToCommit/llvm-project/llvm/include/llvm/IR -I/data/ReleasesToCommit/llvm-project/build/include -I/data/ReleasesToCommit/llvm-project/llvm/include /data/ReleasesToCommit/llvm-project/llvm/include/llvm/IR/Intrinsics.td --write-if-changed -o include/llvm/IR/IntrinsicsS390.h -d include/llvm/IR/IntrinsicsS390.h.d [build] ==2441251==WARNING: MemorySanitizer: use-of-uninitialized-value
Yeah that's a false-positive because of bad-ordering. MSan is much tricker because it requires an instrumented libcxx. If you can't use the buildscript, an MVP for the right ordering should be something like:
- Build a new clang.
$ cd /tmp/1/ $ cmake \ -DLLVM_ENABLE_PROJECTS="clang;compiler-rt;lld" \ -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi" \ -DCMAKE_C_COMPILER=clang \ -DCMAKE_CXX_COMPILER=clang++ \ -GNinja \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_USE_LINKER=lld \ -DCMAKE_C_FLAGS="-Wall" \ -DCMAKE_CXX_FLAGS="-Wall" \ /path/to/llvm/llvm $ ninja clang lld compiler-rt llvm-symbolizer
- Build a sanitizer libcxx.
$ cd /tmp/2 $ cmake \ -DCMAKE_C_COMPILER=/tmp/1/bin/clang \ -DCMAKE_CXX_COMPILER=/tmp/1/bin/clang++ \ -GNinja \ -DLLVM_USE_SANITIZER=Memory \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_ASSERTIONS=ON \ -DLLVM_ENABLE_RUNTIMES="'libcxx;libcxxabi'" \ -DLLVM_USE_LINKER="'lld'" \ /path/to/llvm/runtimes/ # <-------- Make sure this is *runtimes*, not llvm. $ ninja cxx cxxabi
- Build a msan-ified clang, and use the libcxx from step 2.
$ cd /tmp/3 $ cat .cmake_script.sh #!/bin/bash -e LDFLAGS="-lc++abi" LDFLAGS="$LDFLAGS -Wl,--rpath=/tmp/2/lib" # <---- use the instrumented libcxx from step 2 LDFLAGS="$LDFLAGS -L/tmp/2/lib" CFLAGS="$LDFLAGS" CFLAGS="$CFLAGS -fsanitize=memory" CFLAGS="$CFLAGS -nostdinc++" CFLAGS="$CFLAGS -isystem /tmp/2/include" # <---- use the instrumented libcxx from step 2 CFLAGS="$CFLAGS -isystem /tmp/2/include/c++/v1" CFLAGS="$CFLAGS -w" CFLAGS="$CFLAGS -fsanitize-memory-use-after-dtor -fsanitize-memory-param-retval" USE_SANITIZER="Memory" # USE_SANITIZER="MemoryWithOrigins" # <-------------------+-- uncomment these and comment the USE_SANITIZER="Memory" above to get track-origins. # CFLAGS="$CFLAGS -fsanitize-memory-track-origins=2" # <--+ cmake \ -DCMAKE_C_COMPILER=/tmp/1/bin/clang \ -DCMAKE_CXX_COMPILER=/tmp/1/bin/clang++ \ -DLLVM_ENABLE_LIBCXX=ON \ -GNinja \ -DLLVM_USE_SANITIZER="$USE_SANITIZER" \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_ASSERTIONS=ON \ -DCMAKE_C_FLAGS="$CFLAGS" \ -DCMAKE_CXX_FLAGS="$CFLAGS" \ -DCMAKE_EXE_LINKER_FLAGS="$LDFLAGS" \ -DLLVM_ENABLE_PROJECTS="clang;lld;clang-tools-extra" \ -DLLVM_USE_LINKER="lld" \ /path/to/llvm/llvm $ . .cmake_script.sh $ ninja clang lld $ ninja check-clang check-llvm # <----- finally, run your tests
I've managed to reproduce "MemorySanitizer: use-of-uninitialized-value" error locally, thank you @hctim for help!
If I understand it right, it seems MSan didn't handle correctly SmallVector - a variable-sized array with some number of elements in-place and heap allocation for additional elements if needed:
clang/lib/Sema/SemaDeclAttr.cpp:3615 SmallVector<SmallString<64>, 2> StringsBuffer;
There were 2 elements in-placed for StringsBuffer and tests which require 3 failed with MSan use-of-uninitialized-value error. 
With number of StringsBuffer in-placed elements set to 3
SmallVector<SmallString<64>, 3> StringsBuffer;
all use-of-uninitialized-value errors have gone.
I'm not sure "MSan didn't handle correctly SmallVector" is the case. Given your diagnosis of 3-elements-vs-2, I'm guessing the root cause is that clang/lib/Sema/SemaDecl.cpp:11369 is wrong:
!std::equal(CurClones->featuresStrs_begin(),
            CurClones->featuresStrs_end(),
            NewClones->featuresStrs_begin()))) {This construction of std::equal is very error-prone, as if NewClones.size() < CurClones.size(), then this invariable leads to buffer-overflow. I'm wondering if that's the underlying cause, it would seem entirely possible that expanding the in-place elements are always "initialized" from MSan's perspective and so the current code has a false-negative, and your new code made it so that the vector is now heap-based, which is revealing the underlying issue. Maybe worth trying one more thing and adding an assert(CurClones->size() <= NewClones->size()); to double check?
I think -Wl,-rpath and -isystem are somewhat different.
sanitizer_ldflags="-Wl,--rpath=$libcxx_out/lib/x86_64-unknown-linux-gnu -L$libcxx_out/lib/x86_64-unknown-linux-gnu -lc++abi" sanitizer_cflags="-nostdinc++ -isystem$libcxx_out/include/x86_64-unknown-linux-gnu/c++/v1 -isystem$libcxx_out/include/c++/v1"
I use:
ninja -C ~/llvm/out/stable llvm-ar clang lld msan compiler-rt-headers
LLVM_COMMON=(-GNinja -DCMAKE_CXX_COMPILER=$HOME/llvm/out/stable/bin/clang++ -DCMAKE_C_COMPILER=$HOME/llvm/out/stable/bin/clang -DCMAKE_CXX_ARCHIVE_CREATE="$HOME/llvm/out/stable/bin/llvm-ar qcS --thin <TARGET> <OBJECTS>" -DCMAKE_CXX_ARCHIVE_FINISH=: -DLLVM_APPEND_VC_REV=OFF -DLLVM_ENABLE_LLD=On -DLLVM_TARGETS_TO_BUILD=host -DLLVM_TABLEGEN=/tmp/Rel/bin/llvm-tblgen -DCLANG_ENABLE_ARCMT=off -DCLANG_ENABLE_STATIC_ANALYZER=off -DCLANG_TABLEGEN=/tmp/Rel/bin/clang-tblgen)
libcxx_out=$HOME/llvm/out/msan-libcxx
cmake -Sruntimes -B$libcxx_out $LLVM_COMMON -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_RUNTIMES='libcxx;libcxxabi' -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on -DLLVM_USE_SANITIZER=Memory
ninja -C $libcxx_out cxx cxxabi
sanitizer_ldflags="-Wl,--rpath=$libcxx_out/lib/x86_64-unknown-linux-gnu -L$libcxx_out/lib/x86_64-unknown-linux-gnu -lc++abi"
sanitizer_cflags="-nostdinc++ -isystem$libcxx_out/include/x86_64-unknown-linux-gnu/c++/v1 -isystem$libcxx_out/include/c++/v1"
# See http://llvm.org/bugs/show_bug.cgi?id=19071, http://www.cmake.org/Bug/view.php?id=15264
sanitizer_cflags="$sanitizer_cflags $sanitizer_ldflags -fsanitize=memory -w"
cmake -Hllvm -B/tmp/out/msan ${LLVM_COMMON} -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;lld' -DCMAKE_C_FLAGS=$sanitizer_cflags -DCMAKE_CXX_FLAGS=$sanitizer_cflags -DCMAKE_EXE_LINKER_FLAGS=$sanitizer_ldflags -DCMAKE_SHARED_LINKER_FLAGS=$sanitizer_ldflags -DLLVM_ENABLE_LIBCXX=On -DLLVM_USE_SPLIT_DWARF=On -DLLVM_USE_SANITIZER=MemoryI don't think std::equal is underlying cause here. We have featuresStrs_size() comparison before calling it:
if (CurClones && NewClones &&
          (CurClones->featuresStrs_size() != NewClones->featuresStrs_size() ||
           !std::equal(CurClones->featuresStrs_begin(),
                       CurClones->featuresStrs_end(),
                       NewClones->featuresStrs_begin()))) {Also even if we completely remove std::equal the use-of-uninitialized-value error still persist.
Sorry for the red herring. MSan is detecting use-after-destructor here.
Strings holds StringRef-references to the memory allocated in StringsBuffer.
When StringsBuffer is template-constructed with two inline elements (i.e. SmallVector<SmallString<64>, 2> StringsBuffer;), the third element that gets added (in clang::Sema::checkTargetClonesAttrString, SemaDeclAttr.cpp:3568) causes the SmallVector to move the three strings to the heap (the two already existing inline and the new addition).
SmallVector cleans up the inline memory by calling the destructors on it. MSan dutifully marks the destroyed memory as uninitialized, to detect use-after-destruction. The reason why ASan doesn't generate a report here is that the memory is still technically "reachable", it's possible to have destroyed memory that's still part of a live stack frame or hasn't had its heap allocation cleaned up yet.
Because Strings captured the reference of the two string objects when they were inline-allocated, as soon as this move-to-heap happens, these two references are dangling. Then, when any caller attempts to iterate over Strings, it finally explodes as MSan correctly detects the use of destroyed memory.
Allocating 3 objects inline solves the issue for now, as there's no uses that end up needing more than 3 elements, but this leaves a footgun for anyone who would add another element later.
Many thanks @hctim for investigation! I refactored Strings and StringsBuffer not to capture dangling references - use-of-uninitialized-value errors were fixed. I would reland this patch if you don't mind.
Checked the changes I'm suggesting and they fix the standalone build.
| clang/lib/Basic/Targets/AArch64.cpp | ||
|---|---|---|
| 650 | That doesn't work for standalone builds. Instead maybe use #include "llvm/TargetParser/AArch64TargetParser.def" ? | |
| 669 | Same comment as on line 650. #include "llvm/TargetParser/AArch64TargetParser.def" | |
| 679 | Same comment as on line 650. #include "llvm/TargetParser/AArch64TargetParser.def" | |
I'm investigating a size increase we observed after this change for Meta's Android apps. This increases the size of compiler-rt by 1.6 KB, which is small by itself, but then compiler-rt is statically linked into each SO, and some of our apps have hundreds of SOs, so the increase adds up to a sizeable total. (We would ideally have far fewer SOs, but that's a pretty involved change.)
-mno-fmv doesn't help. What I've found is that cpu_model.c is getting pulled in from the compiler-rt archive into our SOs because of references to outlined atomics (__aarch64_have_lse_atomics), and then it has the static constructor init_cpu_features, which pulls in init_cpu_features_resolver. We're not actually using multi-versioning anywhere, but we're still paying the size cost for it as a result. Would we consider moving the newly added functions into their own file (or perhaps moving the outlined atomics functions into a different file), so that you can use outlined atomics without also paying the size cost of function multiversioning if you don't need it?
I also had a couple of general questions, since I think I'm missing something obvious:
- How come we need both init_cpu_features and init_cpu_features_resolver? It seems like we're codegenning calls to the latter where needed, so I'm not sure what the former is adding on top.
- From what I can see, we're codegenning calls to init_cpu_features_resolver without any arguments, but the actual definition of that function has two arguments. How does that work?
Function multiversioning expects compiler-rt has __aarch64_cpu_features, it will be broken if compiler-rt miss that ( clang/lib/Driver/ToolChains/Clang.cpp:7231 ). I believe function multiversioning will be used in Android as outline atomics already did.
I also had a couple of general questions, since I think I'm missing something obvious:
- How come we need both init_cpu_features and init_cpu_features_resolver? It seems like we're codegenning calls to the latter where needed, so I'm not sure what the former is adding on top.
- From what I can see, we're codegenning calls to init_cpu_features_resolver without any arguments, but the actual definition of that function has two arguments. How does that work?
hwcaps are ABI specified arguments of ifunc resolver. The constructor init_cpu_features calls getauxval to initialize hwcaps and then pass them explicitly to init_cpu_features_resolver. If resolver called before constructor we get init_cpu_features_resolver hwcap and hwcap2 as arguments from dynamic loader.
We can use -mno-fmv to avoid that dependency, right? We're interested in using that for our own code (where we don't make use of function multi-versioning), and want to prevent the compiler-rt support from being pulled in from the archive unnecessarily. It'd still be available for users who needed it.
I also had a couple of general questions, since I think I'm missing something obvious:
- How come we need both init_cpu_features and init_cpu_features_resolver? It seems like we're codegenning calls to the latter where needed, so I'm not sure what the former is adding on top.
- From what I can see, we're codegenning calls to init_cpu_features_resolver without any arguments, but the actual definition of that function has two arguments. How does that work?
hwcaps are ABI specified arguments of ifunc resolver. The constructor init_cpu_features calls getauxval to initialize hwcaps and then pass them explicitly to init_cpu_features_resolver. If resolver called before constructor we get init_cpu_features_resolver hwcap and hwcap2 as arguments from dynamic loader.
Got it, thanks.
Right, you will need to explicitly provide '-mno-fmv` then. Currently __aarch64_cpu_features cpu_model.c stuff located in bultins (libclang_rt.builtins-aarch64.a). Did I understand you correctly that your apps linked agains libclang_rt.builtins-aarch64.a and if we move function multiversioning part to new library, lets say libclang_rt.cpu_features-aarch64.a, that will resolve your concern ? As a sidenote, builtins/cpu_model.c contains X86 CPU features used in function multiversioning on that target as well.
It doesn't need to be in a separate library. It just needs to be in a different source file than the outlined atomics, so that it only gets included in the link if it's explicitly referenced and not just because outlined atomics are used.
You're right that it conceptually makes sense for this to be in cpu_model.c though. An alternative would be providing an option for compiler-rt to be built without the multiversioning support, e.g. if it's built with -mno-fmv itself. Does that seem reasonable?
Sounds good to me, please look into the patch adding such cmake option https://reviews.llvm.org/D141199
Using cmake -DCOMPILER_RT_DISABLE_AARCH64_FMV=On will exclude Aarch64 multiversioning support from compiler-rt build.
This patch has made it considerably harder to understand what is going on in the TargetParser. If you get a chance, please could you add some clarifying comments and tidy-ups. I appreciate that a lot of this is following the lead of the pre-existing TargetParser code, but lets try to improve it as we go.
| clang/include/clang/Basic/TargetInfo.h | ||
|---|---|---|
| 1345 | I agree that the function name doesn't suggest what the doc comment says this function does. Also, the meaning of the return value (does this affect codegen?) seems unrelated to the other output value (list of dependent options) so I don't understand why this is one function rather than two. | |
| 1345 | Is this really useful for any other targets? It seems very specific to AArch64 anf FMV, and at the only 2 call sites you are guaranteed to have an AArch64TargetInfo. Seems like an unnecessary extension to the interface. | |
| clang/lib/AST/ASTContext.cpp | ||
| 13302 | Is the meaning of this question mark explained anywhere? Could a more meaningful type be used rather than passing around strings? | |
| clang/lib/Basic/Targets/AArch64.cpp | ||
| 664 | FeatureVec is not a vector. | |
| 671 | If I'm reading this right, this is saying that the extension "does not affect codegen" (from the docstring) if the extension's dependent features DEP_FEATURES happens to be the empty string (or an invalid extension name is passed in). IIUC, this is not a reasonable way to indicate that it does not affect codegen; an extra macro parameter/struct field would be more appropriate. | |
| llvm/include/llvm/TargetParser/AArch64TargetParser.def | ||
| 108 | The reader has to know somehow that MAX will be concatenated into FEAT_MAX etc. Please write FEAT_MAX explicitly to make it easier to see what is going on. | |
| 109 | It would be better to add a named variable like FMVMaxPriority than having "none" keep track of the magic value. | |
| 167 | The extensions defined here used to be the things that could be added to march; now it also includes anything that can be specified in attributes? There should be a comment explaining that and making it clear which ones appear in -march now. Reusing AArch64::AEK_NONE to indicate that the entries should not appear on the command line is confusing; it should at least it should be documented or renamed. | |
| 221 | Why do some of these have themselves as dependencies e.g. +sha2 while others don't e.g. +sha1? It looks like the ones that don't are not available through march? Ideally they would all be done consistently. | |
| llvm/include/llvm/TargetParser/AArch64TargetParser.h | ||
| 27 | Could you add an explanation of what these are, how they differ from the ArchExtKind enum, the fact that FEAT_MAX is a sentinel, etc? Could ArchExtKind not have been extended, rather than having two enums modelling essentially the same thing? Also this has to be kept in sync with compiler-rt, which should be documented here. | |
| 183 | Please document what these new fields represent | |
| 241 | This could return a std::string. Also please document what it returns, it is not clear what a "feature option" is and what the distinction from a "feature" is. | |
| 253 | What is a SupportsMask? This seems related to the CPUFeatures enum, please document what is going on. | |
I fully agree, thank you for valuable comments! Let me address them in separate patch.
| compiler-rt/lib/builtins/cpu_model.c | ||
|---|---|---|
| 1338 | This breaks the build with glibc 2.17. | |
| compiler-rt/lib/builtins/cpu_model.c | ||
|---|---|---|
| 1338 | Thanks for https://reviews.llvm.org/D145494 fix. AT_HWCAP2 was not defined. | |
It is concerning that this differs from the above.