Sinking address computations to their users (loads/stores)
is often blocked by call instructions that take the address as
an argument: unless the call is "cold", it is considered a non-foldable use.
Considering the whole call sequence, including passing the arguments,
it is sometimes possible to materialize an address computation directly
into a hard register, in a sense "to fold the addressing mode into the call".
For example, on AArch64 the register-to-register copy
instruction ("C6.2.190 MOV (register)"), which would likely be used to pass
a pre-computed address argument, is an alias of "C6.2.207 ORR (shifted register)"
and typically has the same latency and throughput as an "ADD" instruction.
This change tries to allow sinking of more addresses to load/store instructions
by preventing some call instructions from being blockers.
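As a source-level illustration of the pattern described above, here is a minimal sketch. It is not taken from the patch or its tests; `Node`, `observe`, `use_address` and `g_seen` are hypothetical names. The address `&n->value` is used both as a call argument and by a load in a different block. Without the change, the non-cold call would count as a non-foldable use and keep the address in a register; with it, the load may fold the offset into its addressing mode while the call argument is materialized directly into the argument register. Whether that actually happens depends on the target and the optimization pipeline.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type: `value` sits at a non-zero offset from the base pointer,
// so taking its address is a real "base + constant" address computation.
struct Node {
    std::int64_t header[4];
    std::int64_t value;
};

// Hypothetical side-effect counter so the callee does observable work.
static std::int64_t g_seen = 0;

// Hypothetical callee that receives the address as a parameter.
// (In a real experiment it would have to stay out of line, e.g. live in
// another translation unit, so that the call survives to code generation.)
void observe(const std::int64_t *p) { g_seen += *p; }

std::int64_t use_address(Node *n, bool log) {
    assert(n != nullptr);
    std::int64_t *addr = &n->value;  // address computation: n + 32 bytes
    if (log)
        observe(addr);               // use 1: call argument (the potential blocker)
    return *addr + 1;                // use 2: load that could use a [base, #32] mode
}
```

Calling `use_address` with `log == false` exercises only the load; with `log == true` the same address is also passed to `observe`.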
With this change CodeGenPrepare still does sinking only towards memory
loads/stores. It works in synergy with a MachineSink patch in
https://reviews.llvm.org/D145706, which does sinking towards calls.
This patch (together with the others up/down the stack) improves
SPECv6 500.perlbench_r by about 3.26% and the whole
of SPECv6 intrate by about 0.46% (geomean).
In this case, the addressing mode doesn't actually represent any computation, so it isn't relevant for this transform; when do you expect it to become relevant?