So, I have this testcase:
    void f(int n, int x[]) {
      if (n < 0)
        return;

      int a[n];

      for (int i = 0; i < n; i++)
        a[i] = x[n - i - 1];

      for (int i = 0; i < n; i++)
        x[i] = a[i] + 1;
    }
that, when compiled with -O1/-Os for AArch64 and X86, generates machine code
that fails to properly restore the stack pointer on function return.
The testcase allocates a VLA, so clang generates calls to llvm.stacksave and
llvm.stackrestore. On AArch64, the latter call is lowered to mov sp, x2.
However, this move does not affect decisions made by the shrink wrapping pass,
because the instruction does not use or define a callee-saved register.
As a result, the register save/restore code is placed so that, along a certain
path, the callee-saved registers and the stack pointer are restored, but the
stack pointer is then overwritten with an incorrect value.
This patch fixes the issue by modifying ShrinkWrap::useOrDefCSROrFI to explicitly check
for the stack pointer register (using TLI.getStackPointerRegisterToSaveRestore) in non-call instructions.
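
To illustrate the idea of the check, here is a minimal standalone model in C. It is not the LLVM code: the types (struct operand, struct instr), the register numbers, and the function names are hypothetical stand-ins invented for this sketch. Passing REG_NONE as the stack pointer register models the behaviour before the patch, when a write to the stack pointer was not considered at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register numbers, chosen only for this sketch. */
enum { REG_X2 = 2, REG_X19 = 19, REG_SP = 31, REG_NONE = 0xFFFF };

/* Simplified stand-ins for machine operands and instructions. */
enum operand_kind { OP_REG, OP_FRAME_INDEX, OP_IMM };

struct operand {
    enum operand_kind kind;
    unsigned reg; /* meaningful only when kind == OP_REG */
};

struct instr {
    bool is_call;
    const struct operand *ops;
    size_t num_ops;
};

static bool is_callee_saved(unsigned reg, const unsigned *csrs, size_t num_csrs)
{
    for (size_t i = 0; i < num_csrs; i++)
        if (csrs[i] == reg)
            return true;
    return false;
}

/* Returns true if the instruction must stay inside the region that is
 * covered by the register save/restore code. */
bool use_or_def_csr_or_fi(const struct instr *mi, const unsigned *csrs,
                          size_t num_csrs, unsigned sp_reg)
{
    for (size_t i = 0; i < mi->num_ops; i++) {
        const struct operand *op = &mi->ops[i];

        if (op->kind == OP_FRAME_INDEX)
            return true;
        if (op->kind != OP_REG)
            continue;
        if (is_callee_saved(op->reg, csrs, num_csrs))
            return true;
        /* The extra check: a non-call instruction that touches the
         * stack pointer. Calls are skipped in this model. */
        if (!mi->is_call && op->reg == sp_reg)
            return true;
    }
    return false;
}

/* Small self-check exercising the model. */
void check_model(void)
{
    const unsigned csrs[] = { REG_X19 };
    const struct operand mov_ops[] = { { OP_REG, REG_SP }, { OP_REG, REG_X2 } };
    const struct instr mov_sp = { false, mov_ops, 2 }; /* mov sp, x2 */
    const struct operand call_ops[] = { { OP_REG, REG_SP } };
    const struct instr call = { true, call_ops, 1 };

    /* Without the stack pointer check, mov sp, x2 goes unnoticed. */
    assert(!use_or_def_csr_or_fi(&mov_sp, csrs, 1, REG_NONE));
    /* With the check, the same instruction is detected. */
    assert(use_or_def_csr_or_fi(&mov_sp, csrs, 1, REG_SP));
    /* A call that mentions the stack pointer is still ignored. */
    assert(!use_or_def_csr_or_fi(&call, csrs, 1, REG_SP));
}
```

In this model, the mov sp, x2 case is only reported once the stack pointer register is passed in, which mirrors why the save/restore placement ignored the move before the change.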