So, I have this testcase:
void f(int n, int x[]) {
  if (n < 0)
    return;
  int a[n];
  for (int i = 0; i < n; i++)
    a[i] = x[n - i - 1];
  for (int i = 0; i < n; i++)
    x[i] = a[i] + 1;
}
which, when compiled with -O1 or -Os for AArch64 and X86, produces machine code
that fails to restore the stack pointer properly on function return.
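For anyone who wants to exercise the test case locally, here is a minimal driver sketch. The helper name check_f is hypothetical and not part of the patch. It only checks the visible results of f on the early-return path and on the VLA path; a passing result does not prove the stack pointer was restored correctly, and how a miscompiled build actually misbehaves is not something this sketch can guarantee to catch.

```c
#include <stdbool.h>

/* The test case, repeated so that this block compiles by itself. */
void f(int n, int x[]) {
  if (n < 0)
    return;
  int a[n]; /* the VLA that makes clang emit stacksave/stackrestore */
  for (int i = 0; i < n; i++)
    a[i] = x[n - i - 1];
  for (int i = 0; i < n; i++)
    x[i] = a[i] + 1;
}

/* Hypothetical helper: runs both the early return and the VLA path. */
bool check_f(void) {
  int x[4] = {1, 2, 3, 4};

  f(-1, x); /* early return: x must be left untouched */
  if (x[0] != 1 || x[1] != 2 || x[2] != 3 || x[3] != 4)
    return false;

  f(4, x); /* x is reversed and each element incremented */
  return x[0] == 5 && x[1] == 4 && x[2] == 3 && x[3] == 2;
}
```

Calling check_f() from a main built with -O1 or -Os for the affected targets is one way to run both paths.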
The testcase allocates a VLA, thus clang generates calls to llvm.stacksave and
llvm.stackrestore. On AArch64, the latter call is lowered to mov sp, x2;
however, this move does not affect the decisions made by the shrink wrapping
pass, because the instruction does not use or define a callee-saved register.
As a result, the register save/restore code is placed such that, along a
certain path, the callee-saved registers and the stack pointer are restored,
but the stack pointer is then overwritten with an incorrect value.
This patch fixes the issue by modifying ShrinkWrap::useOrDefCSROrFI to explicitly check
for the stack pointer register (using TLI.getStackPointerRegisterToSaveRestore) in non-call instructions.
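To illustrate the shape of the check, here is a toy model in plain C. It is a sketch under stated assumptions, not the patch itself: ToyOperand, ToyInstr, the toy_ functions, and the register numbering are all hypothetical stand-ins and are not LLVM types or APIs. Only x19..x28 are modelled as callee-saved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register numbering for this toy only. */
enum { TOY_X1 = 1, TOY_X2 = 2, TOY_X19 = 19, TOY_SP = 31 };

typedef struct {
  bool is_frame_index; /* operand refers to a stack slot */
  unsigned reg;        /* register number; ignored for frame indices */
} ToyOperand;

typedef struct {
  bool is_call;
  size_t num_operands;
  ToyOperand operands[2];
} ToyInstr;

/* Assumption: only x19..x28 are treated as callee-saved here. */
static bool toy_is_callee_saved(unsigned reg) { return reg >= 19 && reg <= 28; }

/* Returns true if the instruction has to stay between the save and
   restore points chosen by the (toy) shrink wrapping analysis. */
bool toy_use_or_def_csr_or_fi(const ToyInstr *mi, unsigned sp) {
  for (size_t i = 0; i < mi->num_operands; i++) {
    const ToyOperand *op = &mi->operands[i];
    if (op->is_frame_index)
      return true;
    if (toy_is_callee_saved(op->reg))
      return true;
    /* The extra check: SP touched by a non-call instruction. */
    if (!mi->is_call && op->reg == sp)
      return true;
  }
  return false;
}

/* Hypothetical helper: builds an instruction with two register operands. */
ToyInstr toy_reg_instr(bool is_call, unsigned r0, unsigned r1) {
  ToyInstr mi = {is_call, 2, {{false, r0}, {false, r1}}};
  return mi;
}
```

In this model, toy_reg_instr(false, TOY_SP, TOY_X2) stands in for mov sp, x2 and is flagged, while a call that merely lists SP among its operands is still skipped.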
I think this is incorrect if the instruction calls a function with a different calling convention, right? In that case, some CSRs would be clobbered even though they are otherwise expected to be preserved.
IIUC, the issue here is that some instructions, such as calls and returns, are marked with let Uses = [SP] but are safe to skip during the analysis.
If I'm not missing anything, to handle all the cases I think it's better to check all the register operands and reg masks, and then skip the instruction only if it uses nothing but SP and isCall (or something similar) holds. I'm not sure if isCall would handle all the cases, but from a quick glance over *InstrInfo.td, other instructions that have Uses = [SP] should definitely affect the placement of the save/restore blocks.
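A rough sketch of what I mean, again as a self-contained toy in C rather than real LLVM code. Everything here is hypothetical: the Sk types, the sk_ functions, the register numbering, and the reg mask encoding (bit r set means register r is clobbered, which is just a convention chosen for this toy). Only x19..x28 are modelled as callee-saved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register numbering for this toy only. */
enum { SK_X1 = 1, SK_X2 = 2, SK_X19 = 19, SK_SP = 31 };

typedef enum { SK_REG, SK_REGMASK, SK_FRAME_INDEX } SkKind;

typedef struct {
  SkKind kind;
  unsigned reg;      /* used when kind == SK_REG */
  uint64_t clobbers; /* used when kind == SK_REGMASK (toy convention) */
} SkOperand;

typedef struct {
  bool is_call;
  size_t num_operands;
  SkOperand operands[2];
} SkInstr;

/* Bits 19..28 set: x19..x28 are the callee-saved registers of this toy. */
static const uint64_t SK_CSR_MASK =
    ((UINT64_C(1) << 29) - 1) & ~((UINT64_C(1) << 19) - 1);

/* Check every operand first; skip an SP-only instruction only if it is a call. */
bool sk_needs_save_restore(const SkInstr *mi, unsigned sp) {
  bool touches_sp = false;
  for (size_t i = 0; i < mi->num_operands; i++) {
    const SkOperand *op = &mi->operands[i];
    switch (op->kind) {
    case SK_FRAME_INDEX:
      return true;
    case SK_REGMASK:
      if (op->clobbers & SK_CSR_MASK)
        return true; /* e.g. a callee with a different calling convention */
      break;
    case SK_REG:
      if (op->reg < 64 && ((SK_CSR_MASK >> op->reg) & 1))
        return true;
      if (op->reg == sp)
        touches_sp = true;
      break;
    }
  }
  return touches_sp && !mi->is_call;
}

/* Hypothetical helpers: a call (uses SP, carries a reg mask) and a reg move. */
SkInstr sk_call(uint64_t clobbers) {
  SkInstr mi = {true, 2, {{SK_REG, SK_SP, 0}, {SK_REGMASK, 0, clobbers}}};
  return mi;
}

SkInstr sk_mov(unsigned dst, unsigned src) {
  SkInstr mi = {false, 2, {{SK_REG, dst, 0}, {SK_REG, src, 0}}};
  return mi;
}
```

With this ordering, a call whose reg mask clobbers x19 is flagged even though it is a call, an ordinary call that only uses SP is skipped, and a non-call instruction that writes SP is flagged.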