Hi Renato, all,
This patch fixes http://llvm.org/bugs/show_bug.cgi?id=19972 - a Thumb-1 specific codegen bug in the ARM Load/Store optimization pass.
Previously, the basic block was searched for future uses of the base register, and if necessary any writeback to the base register was reset using a SUB instruction (e.g. before calling a function) just before such a use. However, this step happened *before* the merged LDM/STM instruction was built. So if there was (e.g.) a function call directly after the not-yet-formed LDM/STM, the pass would first insert a SUB instruction to reset the base register, and then (at the same location, incorrectly) insert the LDM/STM itself.
Example (taken from Bugzilla):
void bar(int*, int, int); void foo(int *A) { int x = A[0]; int y = A[1]; bar(A, x, y); }
This currently compiles to:
$ clang -target arm -Os -mthumb -mcpu=arm926ej-s -c t.c -S -o -
@ BB#0: @ %entry push {r7, lr} add r7, sp, #0 subs r0, #8 ldm r0!, {r1, r2} bl bar pop {r7, pc}
With the patch applied, it gives the much more reasonable
@ BB#0: push {r7, lr} add r7, sp, #0 ldm r0!, {r1, r2} subs r0, #8 bl bar pop {r7, pc}
(Note that we do need the subs as the ldm writes back to the base register).
The patch moves the code that emits the merged instruction forward so this happens before the call that updates base register uses and potentially inserts the SUB. I've also added a test case derived from the example above to make sure that it now works properly.
Cheers
Moritz
This change looks odd. Why is the previous case not relevant any more?