Out-of-line helper functions to support LSE deployment added.
This is a port of libgcc implementation:
https://gcc.gnu.org/git/?p=gcc.git;h=33befddcb849235353dc263db1c7d07dc15c9faa
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is a port of libgcc implementation
I take it the licensing stuff all works out because Linaro is the original author and you're working with them?
compiler-rt/lib/builtins/aarch64/lse-init.c | ||
---|---|---|
30 ↗ | (On Diff #304157) | Having an extra constructor isn't great, it would be better to put it in compiler-rt/lib/builtins/cpu_model.c. |
compiler-rt/cmake/builtin-config-ix.cmake | ||
---|---|---|
25 | Question: is there a requirement that compiler-rt be buildable with an assembler which doesn't support the LSE instruction syntax? |
compiler-rt/cmake/builtin-config-ix.cmake | ||
---|---|---|
25 | I wouldn't say it is a requirement, be buildable by various gnu toolchains which include assembler without lse syntax is a plus. |
compiler-rt/lib/builtins/aarch64/lse-init.c | ||
---|---|---|
16 ↗ | (On Diff #304157) | AOSP toolchain build against bionic |
compiler-rt/lib/builtins/cpu_model.c | ||
---|---|---|
773 | You can't include the weak attribute without also testing vs NULL when you use it. Is there a particular system for which you added this attribute? If the function does not exist on a given system, you'd be better off just omitting the entire constructor. |
compiler-rt/lib/builtins/cpu_model.c | ||
---|---|---|
773 | Weak could be helpful if you compiled on old Android API without getauxval but run on newer one that has it. |
compiler-rt/lib/builtins/cpu_model.c | ||
---|---|---|
773 | I don't know if it's worth providing the future API for an old Android. If you omit that, you can remove the defines for AT_HWCAP and HWCAP_ATOMICS as well. We could also reduce the explicit version checks with #if defined(__has_include) # if __has_include(<sys/auxv.h>) # include <sys/auxv.h> # if defined(AT_HWCAP) && defined(HWCAP_ATOMICS) // constructor # endif # endif #endif |
compiler-rt/lib/builtins/cpu_model.c | ||
---|---|---|
773 | Thanks for suggestion, I think we can omit that now and add support for old Android API later if required. HWCAP_ATOMICS is not defined on AOSP so I've added ifndefs. |
This broke building for non-ELF targets, I'll send a patch for fixing that.
compiler-rt/lib/builtins/CMakeLists.txt | ||
---|---|---|
521 | These generated templated files end up placed directly in the root of the builddir (when building standalone, with cmake pointing at compiler-rt/lib/builtins) which I find a bit unusual. Could some subdir be made for them to avoid cluttering the build dir too much? |
Also, I see that this wasn't actually sent to any mailing list - I guess we should have some list auto-added for anything touching compiler-rt?
Plans and questions were sent to llvm-dev http://lists.llvm.org/pipermail/llvm-dev/2020-October/145841.html.
compiler-rt/lib/builtins/CMakeLists.txt | ||
---|---|---|
521 | Yes, if it is more convenient I can create subdir like 'outline_atomic_helpers' for them |
Fair enough, but despite that, afaik the individual patch reviews still should have a public list set as subscriber by principle (either llvm-commits, cfe-commits or libcxx-commits). This becomes a bit easier to miss by the fact that such lists are automatically added as subscribers for certain subprojects, but apparently not for compiler-rt. (Also - even if the patch reviews had been sent to the list, I most certainly wouldn't have picked up and commented on them.)
compiler-rt/lib/builtins/CMakeLists.txt | ||
---|---|---|
521 | Yeah, something like that sounds good to me. |
compiler-rt/lib/builtins/CMakeLists.txt | ||
---|---|---|
521 | Patch for that is here: https://reviews.llvm.org/D92724 |
I tested this change on Graviton2 aarch64-linux with clang -moutline-atomics.
clang was configured with compiler-rt:
cmake -v -G Ninja \ -DCLANG_DEFAULT_RTLIB:STRING=compiler-rt \ -DLLVM_ENABLE_PROJECTS:STRING="clang;compiler-rt;libunwind" \ -DCLANG_DEFAULT_UNWINDLIB:STRING=libunwind \ -DCMAKE_BUILD_TYPE:STRING=Release \ -DCMAKE_INSTALL_PREFIX:PATH=/home/ubuntu/llvm/usr/ \ ../llvm
I built and tested https://github.com/xianyi/OpenBLAS and https://github.com/boostorg/boost.git and there are no new fails.
I also tested the micro-benchmark as described in https://reviews.llvm.org/D91157
+ export LD_LIBRARY_PATH=/home/ubuntu/llvm/usr/lib + LD_LIBRARY_PATH=/home/ubuntu/llvm/usr/lib + /home/ubuntu/llvm/usr/bin/clang++ -o clang-generic-v8 a.cc -std=c++11 -O2 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread + ./clang-generic-v8 2020-12-16 23:22:45 Running ./clang-generic-v8 Run on (64 X 243.75 MHz CPU s) CPU Caches: L1 Data 64 KiB (x64) L1 Instruction 64 KiB (x64) L2 Unified 1024 KiB (x64) L3 Unified 32768 KiB (x1) Load Average: 0.33, 2.58, 7.68 ***WARNING*** Library was built as DEBUG. Timings may be affected. ------------------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------------------- BM_atomic_increment 9.21 ns 9.21 ns 76001652 BM_atomic_fetch_add 9.21 ns 9.21 ns 76021489 BM_atomic_compare_exchange 7.61 ns 7.61 ns 92024848 BM_std_atomic_compare_exchange 12.4 ns 12.4 ns 56402874 + /home/ubuntu/llvm/usr/bin/clang++ -o clang-lse -march=armv8-a+lse a.cc -std=c++11 -O2 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread + ./clang-lse 2020-12-16 23:22:49 Running ./clang-lse Run on (64 X 243.75 MHz CPU s) CPU Caches: L1 Data 64 KiB (x64) L1 Instruction 64 KiB (x64) L2 Unified 1024 KiB (x64) L3 Unified 32768 KiB (x1) Load Average: 0.38, 2.55, 7.64 ***WARNING*** Library was built as DEBUG. Timings may be affected. ------------------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------------------- BM_atomic_increment 7.21 ns 7.21 ns 97070064 BM_atomic_fetch_add 7.21 ns 7.21 ns 97124037 BM_atomic_compare_exchange 7.21 ns 7.21 ns 97134576 BM_std_atomic_compare_exchange 11.8 ns 11.8 ns 59414449 + /home/ubuntu/llvm/usr/bin/clang++ -o clang-moutline -moutline-atomics a.cc -std=c++11 -O2 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread + ./clang-moutline 2020-12-16 23:22:52 Running ./clang-moutline Run on (64 X 243.75 MHz CPU s) CPU Caches: L1 Data 64 KiB (x64) L1 Instruction 64 KiB (x64) L2 Unified 1024 KiB (x64) L3 Unified 32768 KiB (x1) Load Average: 0.38, 2.55, 7.64 ***WARNING*** Library was built as DEBUG. Timings may be affected. ------------------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------------------- BM_atomic_increment 7.21 ns 7.21 ns 97090953 BM_atomic_fetch_add 7.21 ns 7.21 ns 97136432 BM_atomic_compare_exchange 7.21 ns 7.21 ns 97132760 BM_std_atomic_compare_exchange 11.6 ns 11.6 ns 60292533
Question: is there a requirement that compiler-rt be buildable with an assembler which doesn't support the LSE instruction syntax?