Currently the findIncDecAfter will only look at the next instruction for post-inc candidates in the load/store optimizer. This extends that to a search through the current BB, until an instruction that modifies or uses the increment reg is found. This allows more post-inc load/stores and ldm/stm's to be created, especially in cases where a schedule might move instructions further apart.
What happens about a case like the following:
mov r0, sp ldrd r1, r2, [sp] ldr r3, [r0] add sp, sp, #8
In this case, the LDR of r3 is not directly reading or modifying SP, but it is nonetheless dependent on SP not having been incremented yet, because it's accessing the stack slot that becomes invalid as soon as SP is increased. I don't see any check for that situation here.
Ah yeah. Good point, because it may handle an exception on the same stack. I had originally believed that this Post-RA Machine scheduler was running after this, and it could do the same kind of re-arranging. Looking again it seems that it's treating the stack pointer as a scheduling barrier.
The SP was not my motivating case. I'll can easily remove it from this patch.
I was lucky to notice it at all – I only spotted it because one of the test changes in the previous version of the patch did lift an increment of SP past two load instructions. As best I could tell it wasn't harmful in that particular case, but that clued me in to look more closely and see whether it could have been harmful in another case.
LGTM with the SP exemption, thanks.
(spelling nit, if you can be bothered: it's → its)