Page MenuHomePhabricator

[ARM][ConstantIslands] Fix stack mis-alignment caused by undoLRSpillRestore.
ClosedPublic

Authored by huihuiz on Feb 27 2020, 12:01 PM.

Details

Summary

It is not safe for ARMConstantIslands to undoLRSpillRestore. PrologEpilogInserter is
the one to ensure stack alignment, taking into consideration LR is spilled or not.

For noreturn function with StackAlignment 8 (function contains call/alloc),
undoLRSpillRestore cause stack be mis-aligned. Fixing stack alignment in
ARMConstantIslands doesn't give us much benefit, as undo LR spill/restore only
occur in large function with near branches only, also doesn't have callee-saved LR spill.

Diff Detail

Event Timeline

huihuiz created this revision.Feb 27 2020, 12:01 PM
huihuiz added a comment.EditedFeb 27 2020, 12:05 PM

run: llc -O0 < llvm/test/CodeGen/Thumb/stack-mis-alignment.ll -o -

you can see the mis-alignment, the StackAlignment is 8 for this test

@ %bb.0:                                @ %entry
        .pad    #4
        sub     sp, #4
huihuiz updated this revision to Diff 247112.Feb 27 2020, 2:35 PM

reduce test a bit

I was about to say that completely removing this is overkill, and that we could just limit it to not happen for functions where stack alignment is required, but then I noticed that it has a second bug:

target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "thumbv6m-none-linux-gnueabi"

define i32 @f(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e) noreturn nounwind {
entry:
  call i32 @llvm.arm.space(i32 2048, i32 undef)
  tail call i32 @exit(i32 %e)
  unreachable
}

declare i32 @llvm.arm.space(i32, i32)
declare i32 @exit(i32)

Current generated code (without this patch):

$ llc < test.ll
  f:
        .fnstart
@ %bb.0:                                @ %entry
        .pad    #4
        sub     sp, #4
        .zero   2048
        ldr     r0, [sp, #8]
        bl      exit

The ldr has been generated assuming that SP has been moved down by 8 bytes, removing the push causes the argument to be read from the wrong place on the stack.

This could also be fixed by tracking whether stack addresses above the LR save location have been used, but that's extra complexity, for an optimisation which I wouldn't expect to trigger very often (needs a function > 2048 bytes, but not needing more than 4 registers). Therefore I'm OK with just removing it.

This was the only use of isLRSpilledForFarJump/setLRIsSpilledForFarJump, so they can be removed now.

There is also a comment in ARMFrameLowering referring to this optimisation, which should be updated:

// Force LR to be spilled if the Thumb function size is > 2048. This enables
// use of BL to implement far jump. If it turns out that it's not needed
// then the branch fix up path will undo it.
llvm/test/CodeGen/Thumb/remove-unneeded-push-pop.ll
8

This test can just be removed, there's not much point in testing that we don't do a valid (in this case) optimisation. We also generally prefer to avoid leaving references to code which has been deleted.

for an optimisation which I wouldn't expect to trigger very often (needs a function > 2048 bytes, but not needing more than 4 registers). Therefore I'm OK with just removing it.

+1

huihuiz updated this revision to Diff 247323.Feb 28 2020, 11:21 AM
huihuiz marked an inline comment as done.
huihuiz edited the summary of this revision. (Show Details)
huihuiz edited reviewers, added: ostannard; removed: olista01.

Thank you so much Oliver Stannard for the review!

Getting rid of a few more unnecessary code.

ostannard accepted this revision.Mar 2 2020, 4:57 AM

LGTM, thanks.

This revision is now accepted and ready to land.Mar 2 2020, 4:57 AM
This revision was automatically updated to reflect the committed changes.