User Since: Aug 4 2014, 10:15 AM (337 w, 4 d)
Dec 16 2020
I tested this change on Graviton2 aarch64-linux with clang -moutline-atomics.
clang was configured with compiler-rt:
cmake -v -G Ninja \
  -DCLANG_DEFAULT_RTLIB:STRING=compiler-rt \
  -DLLVM_ENABLE_PROJECTS:STRING="clang;compiler-rt;libunwind" \
  -DCLANG_DEFAULT_UNWINDLIB:STRING=libunwind \
  -DCMAKE_BUILD_TYPE:STRING=Release \
  -DCMAKE_INSTALL_PREFIX:PATH=/home/ubuntu/llvm/usr/ \
  ../llvm
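For readers unfamiliar with the flag, here is a minimal sketch of the kind of code that -moutline-atomics affects. The function name `bump` is hypothetical and is not part of the change under review. With the flag enabled on aarch64, an atomic read-modify-write like this one is expected to become a call to a runtime helper (for a relaxed 4-byte add, a symbol along the lines of `__aarch64_ldadd4_relax`) provided by compiler-rt or libgcc, rather than an inline instruction sequence.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical example function: atomically increments the counter and
// returns the new value. fetch_add returns the value held before the add,
// hence the "+ 1".
inline std::uint32_t bump(std::atomic<std::uint32_t>& counter) {
  return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}

// Small usage check: two increments starting from zero.
inline void demo_bump() {
  std::atomic<std::uint32_t> counter{0};
  assert(bump(counter) == 1);
  assert(bump(counter) == 2);
}
```

The observable behaviour is the same with and without the flag; only the generated code differs.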
Dec 5 2020
I tested this change on Graviton2 aarch64-linux by building https://github.com/xianyi/OpenBLAS with clang -O3 -moutline-atomics and make test: all tests pass with and without outline-atomics.
Clang was configured to use libgcc.
Dec 24 2019
This cleanup looks good to me.
Nov 5 2019
Oct 28 2019
Looks good to me. Thanks!
Oct 7 2019
Sep 24 2019
Sep 19 2019
I looked at both the SLP and loop vectorizers, and I think this is more work than I can do right now.
Sep 18 2019
To catch more dot product cases, we need to fix the passes above instruction selection.
Sep 17 2019
The new patch does not use the first argument of the dot product instruction: we now set it to zero.
Patch tested on x86_64-apple-darwin with make check-all.
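To illustrate why zeroing the first argument is enough, here is a scalar sketch of one 32-bit lane of a dot product instruction. It assumes the accumulate-and-add shape used by AArch64 UDOT, where the first operand is an accumulator. The function names are hypothetical and do not appear in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes4 = std::array<std::uint8_t, 4>;

// Scalar model of one lane (assumed semantics):
// result = acc + a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3].
inline std::uint32_t dot4_accumulate(std::uint32_t acc, const Bytes4& a,
                                     const Bytes4& b) {
  for (std::size_t i = 0; i < 4; ++i)
    acc += static_cast<std::uint32_t>(a[i]) * b[i];
  return acc;
}

// Passing a zero accumulator yields the plain dot product.
inline std::uint32_t dot4(const Bytes4& a, const Bytes4& b) {
  return dot4_accumulate(0, a, b);
}

// Small usage check: 1*5 + 2*6 + 3*7 + 4*8 = 70.
inline void demo_dot4() {
  const Bytes4 a{{1, 2, 3, 4}};
  const Bytes4 b{{5, 6, 7, 8}};
  assert(dot4(a, b) == 70);
  assert(dot4_accumulate(10, a, b) == 80);
}
```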
Sep 16 2019
Sep 14 2019
Thanks for catching those patterns.
I still see a link error on aarch64-linux on master:
/usr/bin/ld: tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/AttrImpl.cpp.o: in function `clang::AttributeCommonInfo::getAttributeSpellingListIndex() const':
/home/ubuntu/llvm-project/llvm/tools/clang/include/clang/Basic/AttributeCommonInfo.h:166: undefined reference to `clang::AttributeCommonInfo::calculateAttributeSpellingListIndex() const'
collect2: error: ld returned 1 exit status
I reverted this patch locally and the build now finishes.
Sep 13 2019
Sep 12 2019
Looks good to me, please apply.
Updated the patch to remove the patterns that generate i16. The patch passes make check-all on aarch64-linux.
The tablegen error only happens on the i16 patterns:
def : Pat<(i16 (extractelt (v8i16 V128:$V), (i64 0))),
          (EXTRACT_SUBREG V128:$V, hsub)>;
def : Pat<(i16 (extractelt (v4i16 V64:$V), (i64 0))),
          (EXTRACT_SUBREG V64:$V, hsub)>;
If I remove these two patterns, make check-all passes.
Sep 11 2019
Never mind. You cannot do the other ones as it would call Match too many times and would not follow the semantics of the original code.
Sep 10 2019
I like the patch. Thanks!
Sep 7 2019
Sep 6 2019
Aug 20 2019
Updated patch to current llvm trunk.
Aug 15 2019
The last version of the patch addresses all the comments from reviews.
Ok to commit?
Jun 11 2019
For some reason, asan/tests/asan_noinst_test.cc is not compiled by make check-asan, and that has exposed a compile error that the other tests did not trigger:
sanitizer_double_allocator.h:30:11: error: use of non-static data member 'use_first_' of 'DoubleAllocator' from nested type 'DoubleAllocatorCache'
      if (use_first_)
          ^~~~~~~~~~
The updated patch fixes this by accessing the non-static field of the enclosing class through a this pointer to one of the instances:
- if (use_first_)
+ if (this->use_first_)
The updated patch passes make check-lsan check-asan and is still under test for check-all on aarch64-linux.
The updated patch I will post addresses all the review comments.
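As background on the error above, here is a minimal sketch of the general C++ rule, not of the patch itself. A nested class has no implicit enclosing object, so it can only read a non-static member of the enclosing class through an explicit instance. The names `Allocator`, `Cache`, `owner_`, and `pick` are hypothetical stand-ins, not the sanitizer classes.

```cpp
#include <cassert>

// Hypothetical stand-in for an allocator with a per-instance flag.
class Allocator {
 public:
  explicit Allocator(bool use_first) : use_first_(use_first) {}

  // Nested type: it is a member of Allocator, so it may access private
  // members, but it does not carry an Allocator object of its own.
  class Cache {
   public:
    explicit Cache(const Allocator* owner) : owner_(owner) {}

    // Writing plain `use_first_` here would be rejected with the
    // "use of non-static data member ... from nested type" error.
    int pick() const { return owner_->use_first_ ? 1 : 2; }

   private:
    const Allocator* owner_;  // explicit link to one enclosing instance
  };

 private:
  bool use_first_;
};

// Small usage check: each cache follows the flag of its owner.
inline void demo_nested_access() {
  Allocator first(true), second(false);
  assert(Allocator::Cache(&first).pick() == 1);
  assert(Allocator::Cache(&second).pick() == 2);
}
```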
May 19 2019
Addressed comments from @vitalybuka: factored out the common code of the 3 versions and added more tests.
Passes ninja check-all with no new failures on an AArch64 Graviton A1 instance.
May 10 2019
Fix the x86_64 overflow warning with 1ULL << 48.
Updated patch fixes ASan.
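A short sketch of why the ULL suffix matters. The constant and helper names below are hypothetical and are not taken from the patch. A plain `1 << 48` shifts a 32-bit int by more than its width, which is what the compiler warns about. With `1ULL` the literal is at least 64 bits wide before the shift happens.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: the size of a 48-bit address space.
// `1 << 48` would shift an int past its width; `1ULL << 48` does not.
constexpr std::uint64_t kAddressSpaceSize = 1ULL << 48;

static_assert(kAddressSpaceSize == 0x1000000000000ULL,
              "1ULL << 48 equals 2^48");

// Hypothetical helper: true when addr lies below the 48-bit limit.
inline bool fits_in_48_bits(std::uint64_t addr) {
  return addr < kAddressSpaceSize;
}

// Small usage check on both sides of the limit.
inline void demo_shift() {
  assert(fits_in_48_bits(0xFFFFFFFFFFFFULL));  // 2^48 - 1
  assert(!fits_in_48_bits(1ULL << 48));        // 2^48
}
```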
May 9 2019
I have verified that the updated patch compiles and that it reduces the execution time of a trivial leak-sanitized example.
Apr 12 2019
This patch reduces the number of #ifdefs as suggested by Kostya, and speeds up both the leak and address sanitizers on aarch64.
Passes check-all on x86_64-linux; aarch64-linux is still under test.
Worked with Brian Rzycki @brzycki.
Apr 8 2019
Also, this is changing only the standalone lsan, not lsan used as part of asan. Right?
standalone lsan is not widely used, AFAICT.
Address review comments from Kostya: move AArch64 lsan allocator to a separate file to avoid #ifdefs.
Apr 5 2019
Ok, I will prepare an updated patch.
Thanks Brian and Kostya for your reviews.
Looks good to me.
Apr 3 2019
Rebased patch on today's trunk.
Mar 29 2019
I am accepting the scalable vector types based on the comments in
Feb 21 2019
This seems to be a useful transform that is not yet covered by the current implementation of jump-threading.
I think GCC calls it control dependence DCE.
Please run and report performance numbers on the testsuite and other benchmarks that you have access to.
Nov 1 2018
Maybe you can add the testcase from my previous patch: https://reviews.llvm.org/D53588
Oct 29 2018
Looks good to me.
Oct 25 2018
Oct 24 2018
The change looks good to me. Thanks!
Oct 23 2018
The change looks good with some minor changes. Thanks!
Oct 22 2018
Oct 20 2018
Oct 15 2018
Fixed in https://reviews.llvm.org/rL344566
Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.
Fix two comments from @tejohnson.
I will post a patch to fix the comments from @tejohnson.
Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900 ?
If so, you may want to add the testcases from that bug.
Oct 5 2018
Added an early return for outlining if function entry is cold.
Added a check for invoke calls: invoke should not be marked as cold by back propagation.
Oct 4 2018
Before this patch we had a 10% regression in sqlite with the hot-cold-split pass.
With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.
Oct 2 2018
Oct 1 2018
Please add a testcase that will exercise the new flag.
Sep 27 2018
Sep 21 2018
Sep 12 2018
Looks good to me.
Sep 11 2018
Sep 10 2018
Looks good to me.
Sep 6 2018
The patch looks good to me.
Please address the last two comments and then apply.