This patch improves the instruction sequence that the load/store optimizer emits to materialize a new base register with an offset applied.
If we have a chain of loads/stores like this:
ldr r0, [r5, #4]
ldr r1, [r5, #8]
ldr r2, [r5, #12]
The pass currently always uses a MOV followed by an 8-bit immediate add (tADDi8, where the source and destination registers are the same) to get a new base:
mov r2, r5
adds r2, #4
ldm r2, {r0, r1, r2}
However, if the immediate fits into 3 bits, as in this case, we can instead generate a single add with separate source and destination registers (tADDi3):
adds r2, r5, #4
ldm r2, {r0, r1, r2}
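
To make the choice between the two sequences concrete, here is a rough sketch of the immediate-range check. It is not the code from the patch: the enum and function names are made up for illustration, and it only models the encodable ranges (0-7 for the 3-bit form, 0-255 for the 8-bit form), not any other conditions the pass has to check.

```cpp
#include <cassert>

// Hypothetical names for illustration only; these are not the
// identifiers used in the patch or in the load/store optimizer.
enum class BaseAddForm {
  AddImm3,        // adds Rd, Rn, #imm3        (tADDi3, one instruction)
  MovThenAddImm8, // mov Rd, Rn; adds Rd, #imm8 (MOV + tADDi8, two instructions)
  Unsupported     // offset does not fit either immediate field
};

// Pick the shortest sequence that can encode the offset.
// A 3-bit immediate holds 0..7, an 8-bit immediate holds 0..255.
constexpr BaseAddForm selectBaseAddForm(unsigned Offset) {
  return Offset <= 7     ? BaseAddForm::AddImm3
         : Offset <= 255 ? BaseAddForm::MovThenAddImm8
                         : BaseAddForm::Unsupported;
}

// Usage example: the offset #4 from the chain above fits in 3 bits,
// while an offset of #12 would still need the MOV + 8-bit add.
inline void checkExamples() {
  assert(selectBaseAddForm(4) == BaseAddForm::AddImm3);
  assert(selectBaseAddForm(12) == BaseAddForm::MovThenAddImm8);
}
```

With the offset of 4 from the example, the sketch selects the single-instruction form, which matches the adds r2, r5, #4 sequence shown above.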
I’ve also added a test case for this and made two existing load/store optimizer tests run with -verify-machineinstrs to catch any problems.
Cheers
Moritz