Enable LoopLoadElimination by default

Description

Enable LoopLoadElimination by default

Summary:
I re-benchmarked this and results are similar to original results in
D13259:

On ARM64:

SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
SingleSource/Benchmarks/Polybench/stencils/adi                   -19.78%

On x86:

SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog  -27.14%

And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop
Distribution.

In terms of compile time, there is ~5% increase on both
SingleSource/Benchmarks/Misc/oourafft and
SingleSource/Benchmarks/Linkpack/linkpack-pc. These are both very tiny
loop-intensive programs where SCEV computations dominates compile time.

The reason that time spent in SCEV increases has to do with the design
of the old pass manager. If a transform pass does not preserve an
analysis we *invalidate* the analysis even if there was *no*
modification made by the transform pass.

This means that currently we don't take advantage of LLE and LV sharing
the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV
for LLE.

(There should be a way to work around this limitation in the case of
SCEV and LAA since both compute things on demand and internally cache
their result. Thus we could pretend that transform passes preserve
these analyses and manually invalidate them upon actual modification.
On the other hand the new pass manager is supposed to solve so I am not
sure if this is worthwhile.)

Reviewers: hfinkel, dberlin

Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16300

Details

Committed
anemetFeb 29 2016, 12:35 PM
Differential Revision
D16300: Enable LoopLoadElimination by default
Parents
rL262249: Fixup MIPS testcase after r262247 and make it a little more robust.
Branches
Unknown
Tags
Unknown