In the OptimizeLEA pass, save each instruction's position within its basic block and use those positions to calculate the distance between two instructions instead of calling std::distance. This reduces the complexity of the pass from O(n^3) to O(n^2), and thus the compile time.
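The idea can be illustrated with a minimal, standalone sketch. It is not the code from this patch: `Instr`, `Block`, `PosMap`, `computePositions`, `distanceByWalk` and `distanceByPos` are hypothetical names, and a `std::list` stands in for the instruction list of a basic block. It only shows how numbering the instructions once lets each later distance query avoid walking the list.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <unordered_map>

// Hypothetical stand-in for a machine instruction (not LLVM's MachineInstr).
struct Instr {
  int Opcode;
};

using Block = std::list<Instr>;
using PosMap = std::unordered_map<const Instr *, int>;

// Walking approach: std::distance on list iterators steps through every
// element between First and Last, so each query is linear in the distance.
int distanceByWalk(Block::const_iterator First, Block::const_iterator Last) {
  return static_cast<int>(std::distance(First, Last));
}

// Number every instruction in the block once; linear in the block size.
PosMap computePositions(const Block &BB) {
  PosMap Pos;
  int Idx = 0;
  for (const Instr &I : BB)
    Pos[&I] = Idx++;
  return Pos;
}

// Saved-position approach: two hash lookups and a subtraction per query.
// Both instructions must have been numbered by computePositions.
int distanceByPos(const PosMap &Pos, const Instr &First, const Instr &Last) {
  assert(Pos.count(&First) && Pos.count(&Last) && "instruction not numbered");
  return Pos.at(&Last) - Pos.at(&First);
}
```

In this sketch the map has to be rebuilt or updated whenever instructions are inserted into or removed from the block; how the actual pass keeps its positions up to date is not shown here.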
Details
Diff Detail
Event Timeline
Comment Actions
Here are the results.
Just -Os (LEA pass is disabled):
$ time ./bin/clang++ -std=c++11 -S a.ll -Os
real 0m8.328s
user 0m8.282s
sys 0m0.041s
-Os with the old LEA pass:
$ time ./bin/clang++ -std=c++11 -S a.ll -Os -mllvm -enable-x86-lea-opt
real 0m11.653s
user 0m11.591s
sys 0m0.059s
-Os with the new LEA pass:
$ time ./bin/clang++ -std=c++11 -S a.ll -Os -mllvm -enable-x86-lea-opt
real 0m8.446s
user 0m8.380s
sys 0m0.064s
a.ll is based on the example in https://llvm.org/bugs/show_bug.cgi?id=25843 and was generated this way:
$ python a.py 5000 > a.cc
$ ./bin/clang++ -std=c++11 -S -emit-llvm -Os a.cc
(this took about 1.5 hours)
Comment Actions
Thanks for this! Just one tiny nit.
lib/Target/X86/X86OptimizeLEAs.cpp
Line 91: Is there a reason that we can't use DenseMap here instead?