User Details
- User Since
- Apr 11 2019, 9:55 AM (104 w, 6 d)
Tue, Mar 30
LGTM.
Dec 14 2020
Nov 19 2020
Nov 18 2020
Nov 4 2020
Nov 3 2020
Nov 2 2020
Oct 30 2020
Address code review comments.
Oct 29 2020
ping
Oct 26 2020
Fix formatting.
Oct 22 2020
Add LLVM_DEBUG stmts.
Note: this failure looks bogus to me as it also happens in other PRs like https://reviews.llvm.org/D89964
linux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp
Script:
--
: 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp
Address comments.
Address comments
Oct 21 2020
Fix clang-format
Oct 2 2020
Looks OK to me, but you should also add other reviewers.
Sep 10 2020
Sep 9 2020
Sep 2 2020
Sep 1 2020
ping
Aug 31 2020
Aug 27 2020
Address code review comments.
Fix quotes in profile.ll
Aug 26 2020
Aug 24 2020
Here are a couple of programs to test the performance of a simple memset vs. a simple initialization loop. On my system (PPC), even this short initialization loop is slower than the memset.
% cat loop.c
#include <string.h>
int main() {
  int A[N];
  for (int n = 0; n < STEPS; ++n)
    for (int i = 0; i < N; ++i)
      A[i] = 0;
  return A[0];
}
% gcc -O0 loop.c -DN=10 -DSTEPS=1000000; time ./a.out
./a.out 0.10s user 0.00s system 99% cpu 0.099 total
% cat memset.c
#include <string.h>
int main() {
  int A[N];
  for (int n = 0; n < STEPS; ++n)
    memset(A, 0, N * sizeof(int));
  return A[0];
}
% gcc -O0 memset.c -DN=10 -DSTEPS=1000000; time ./a.out
./a.out 0.02s user 0.00s system 99% cpu 0.022 total
Transforming a loop into a memset or memcpy is not *always* profitable (it depends on how many elements are initialized or copied, and on how efficiently the target architecture implements those library routines), but it is often better than a loop. The low-level optimizer should change short memset/memcpy calls back into a sequence of assignments. IMO this should not be done in opt, because the exact length below which memset is less profitable than individual assignments is a function of the target architecture. As for this PR, adding an option to disable the optimization is a good thing, as it provides more flexibility to users who, for whatever reason, do not want the transformation to run (and it is also handy for compiler developers when debugging code). And more flexibility is a good thing.
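To make the three forms discussed above concrete, here is a minimal C sketch of the loop idiom, the memset it can be rewritten into, and the kind of straight-line expansion a backend might choose for a short fixed length. The function names and the 4-element length are illustrative assumptions, not anything taken from LLVM or from this PR; the actual profitability cutoff is target dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Loop form: the zero-initialization pattern a loop-idiom pass looks for. */
void zero_with_loop(int *a, size_t n) {
  assert(a != NULL || n == 0);
  for (size_t i = 0; i < n; ++i)
    a[i] = 0;
}

/* Library form: the same effect expressed as a single memset call. */
void zero_with_memset(int *a, size_t n) {
  assert(a != NULL || n == 0);
  memset(a, 0, n * sizeof(int));
}

/* Hypothetical expansion of a short memset (4 elements, chosen only for
   illustration) back into individual assignments. */
void zero_four_expanded(int *a) {
  assert(a != NULL);
  a[0] = 0;
  a[1] = 0;
  a[2] = 0;
  a[3] = 0;
}

/* Returns 1 if the first n elements are all zero, 0 otherwise. */
int all_zero(const int *a, size_t n) {
  for (size_t i = 0; i < n; ++i)
    if (a[i] != 0)
      return 0;
  return 1;
}
```

All three functions leave the array in the same state; they differ only in cost, which is why the choice between them is best left to the part of the compiler that knows the target.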
Aug 21 2020
Aug 13 2020
Jul 31 2020
Jul 30 2020
Jul 29 2020
@uweigand and @Kai I forgot to fix a test case. Now that it is fixed, are you still OK with the test case change?
Fix test case for PPC64 and SystemZ
Fix formatting
Jul 28 2020
Jul 2 2020
Jun 4 2020
May 28 2020
LGTM
May 25 2020
Mar 13 2020
Mar 12 2020
Mar 10 2020
Feb 12 2020
Feb 11 2020
I chose to name the pass Loop Fission to avoid confusion with the existing LoopDistribution pass (eventually Loop Fission should replace that pass). Also, Loop Fission is the opposite of Loop Fusion :-)
Addressed review comments from @Meinersbur
Jan 31 2020
Jan 30 2020
ping
Jan 13 2020
ping
Dec 18 2019
Addressing code review comments.
Dec 10 2019
Based on the feedback received during code review I have added a new public member function called 'getPerfectLoops' which can be used to retrieve a list of loops that are perfect with respect to each other.
For example, given the following loop nest containing 4 loops, 'getPerfectLoops' would return {{L1,L2},{L3,L4}}.
for(i) // L1
  for(j) // L2
    <code>
    for(k) // L3
      for(l) // L4
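The grouping in that example can be sketched in plain C with a deliberately simplified model. This is not the LLVM implementation or its API: the `struct loop` type, the `has_extra_code` flag, and `group_perfect_loops` are hypothetical names, and the model only captures the "extra code between two loops" condition, not the other checks a real analysis would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a loop nest: loops are listed from outermost to
   innermost, each one nested directly inside the previous one. */
struct loop {
  const char *name;
  int has_extra_code; /* nonzero if the body holds statements besides the
                         next inner loop */
};

/* Assigns each loop a group index in group_of[] so that loops sharing an
   index are perfect with respect to each other. Returns the group count. */
size_t group_perfect_loops(const struct loop *loops, size_t n,
                           size_t *group_of) {
  assert((loops != NULL && group_of != NULL) || n == 0);
  size_t group = 0;
  for (size_t i = 0; i < n; ++i) {
    group_of[i] = group;
    /* Extra code between this loop and the next inner loop ends the
       current perfect group; the next loop starts a new one. */
    if (loops[i].has_extra_code && i + 1 < n)
      ++group;
  }
  return n == 0 ? 0 : group + 1;
}
```

For the nest above, marking only L2 as having extra code puts L1 and L2 in group 0 and L3 and L4 in group 1, which corresponds to {{L1,L2},{L3,L4}}.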
Nov 20 2019
Addressing code review comments from Michael.
Nov 4 2019
Oct 29 2019
Partially address code review comments from @Meinersbur.
ping
Oct 16 2019
ping
Oct 11 2019
Oct 10 2019
Oct 8 2019
Oct 4 2019
Aug 13 2019
Aug 12 2019
Fix build warning for operator<< when using GCC 7.