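This patch assigns a cost to the scaling used in addressing. On many ARM cores, a negated register offset takes longer than a non-negated register offset in a register-offset addressing mode. For instance, compare a load with a non-negated, scaled register offset (1) against the same load with the offset negated (2):
  (1) LDR R0, [R1, R2, LSL #2]
  (2) LDR R0, [R1, -R2, LSL #2]
Above, (1) takes fewer cycles than (2).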
By assigning an appropriate scaling factor cost, we enable LLVM to make the right trade-offs in the optimization and code-selection phases.
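As a rough illustration of the idea (not the code in this patch), a scaling factor cost hook could report a legal negated scale as slightly more expensive than a non-negated one, and an illegal scale as unsupported. The names below (magnitudeSketch, isLegalScaleSketch, scalingFactorCostSketch) and the set of legal scales are assumptions made for this sketch only; the real hook works on LLVM's addressing-mode description rather than on a bare integer.

```cpp
#include <cstdint>

// Hypothetical helper: absolute value usable in constant expressions.
constexpr std::int64_t magnitudeSketch(std::int64_t v) {
  return v < 0 ? -v : v;
}

// Assumed legality rule, for illustration only: the index register may be
// absent (scale 0) or scaled by 1, 2, 4 or 8, in either direction.
constexpr bool isLegalScaleSketch(std::int64_t scale) {
  return magnitudeSketch(scale) == 0 || magnitudeSketch(scale) == 1 ||
         magnitudeSketch(scale) == 2 || magnitudeSketch(scale) == 4 ||
         magnitudeSketch(scale) == 8;
}

// Hypothetical cost function following the convention described above:
//   -1  the scale is not supported by the addressing mode
//    0  the scale is supported and free (non-negated offset)
//    1  the scale is supported but slower (negated offset)
constexpr int scalingFactorCostSketch(std::int64_t scale) {
  return !isLegalScaleSketch(scale) ? -1 : (scale < 0 ? 1 : 0);
}
```

With such a cost available, a client that chooses between equivalent address computations can prefer the form with the non-negated offset whenever both are legal.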
The patch improves performance as follows:
Cortex-A53: spec.twolf: 2.4%, ShootoutC++_matrix: 28.4%, Stanford/Puzzle: 12.4%, IndirectAddressing-dbl: 5.49%
Cortex-A57: spec2006.hmmer: 1.5%, spec2006.lbm: 1.1%
The patch also improves performance on other third-party benchmarks.