This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Split SP adjustment to reduce the offset of callee saved register spill and restore
ClosedPublic

Authored by shiva0217 on Sep 25 2019, 2:27 AM.

Details

Summary

We would like to split the SP adjustment to reduce the instructions in prologue and epilogue as the following case. In this way, the offset of the callee saved register could fit in a single store.

add     sp,sp,-2032
sw      ra,2028(sp)
sw      s0,2024(sp)
sw      s1,2020(sp)
sw      s3,2012(sp)
sw      s4,2008(sp)
add     sp,sp,-64

Diff Detail

Repository
rL LLVM

Event Timeline

shiva0217 created this revision.Sep 25 2019, 2:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2019, 2:27 AM
luismarques added inline comments.Sep 25 2019, 2:55 AM
lib/Target/RISCV/RISCVFrameLowering.h
49

I'm not sure we need to split things into the boolean and the adjustment amount. Why not just have a single function, and if the returned initial adjustment is 0 then we just do the normal SP adjustment, otherwise we do it in two steps?

luismarques added inline comments.Sep 25 2019, 3:07 AM
lib/Target/RISCV/RISCVFrameLowering.cpp
248

If SecondSPAdjustmentAmount == 0 then haven't we done something wrong? The first SP adjustment becomes the only one, so isSplitSPAdjust can't logically be true, right? If you keep the separate boolean/amount, then I would suggest asserting that SecondSPAdjustmentAmount > 0 when isSplitSPAdjust is true.

shiva0217 marked 2 inline comments as done.Sep 25 2019, 6:52 AM
shiva0217 added inline comments.
lib/Target/RISCV/RISCVFrameLowering.cpp
248

When the isSplitSPAdjust return true, the StackSize > 2048, and the FirstSPAdjustAmount = 2048 - StackAlign. So SecondSPAdjustmentAmount = StackSize - FirstSPAdjustAmount should always greater than zero when isSplitSPAdjust return true. We probably could remove the SecondSPAdjustAmount > 0 in the if line. I think it's great idea to add assertion for SecondSPAdjustmentAmount > 0 to catch anything we unexpect even if we merge the functions, thanks.

lib/Target/RISCV/RISCVFrameLowering.h
49

Good idea, I'll merge the functions, thanks.

shiva0217 updated this revision to Diff 221771.Sep 25 2019, 7:47 AM

Update patch to address the comments.

Hi @luismarques, thanks for the review :)

getFirstSPAdjustAmount does a non-trivial amount of work. Instead of calling it repeatedly it would be better to call it once per prologue/epilogue generation, and cache the result in a local variable.
With that fixed it LGTM.

shiva0217 updated this revision to Diff 221882.Sep 25 2019, 7:24 PM

Update patch

  1. Call getFirstSPAdjustAmount() once in per prologue/epilogue generation as @luismarques suggest
  2. Generate CFI directives in large-stack.ll test case to check the CFI generation
shiva0217 updated this revision to Diff 221895.Sep 26 2019, 1:14 AM

Update patch

Reducing FP setting offset by setting FP before second SP adjustment.

Update patch

@shiva0217 Thanks for updating the patch and addressing the previous concerns. This is looking quite good. I have just a couple more concerns before approving this:

  • The boundary condition for doing the two-stage sp adjustment probably needs adjustment. For instance, try this:
int main() {
    char xx[2048-16];
    foo(xx);
}
--
        addi    sp, sp, -2032
        sd      ra, 2024(sp)
        addi    sp, sp, -16
...

I think that one would still fit a single-stage sp update. Related to that, where you have the code if (!isInt<12>(StackSize) && (CSI.size() > 0)), be sure that the isInt<12> check properly accounts for the StackSize, CSI size, StackAlign etc. Which brings me to the second point...

  • Please include tests showing that the boundary condition is correctly computed. For instance, one test where the stack size is just enough not to have to split the sp adjustment and another where the adjustment is needed.

Thanks!

Update patch

@shiva0217 Thanks for updating the patch and addressing the previous concerns. This is looking quite good. I have just a couple more concerns before approving this:

  • The boundary condition for doing the two-stage sp adjustment probably needs adjustment. For instance, try this:
int main() {
    char xx[2048-16];
    foo(xx);
}
--
        addi    sp, sp, -2032
        sd      ra, 2024(sp)
        addi    sp, sp, -16
...

I think that one would still fit a single-stage sp update. Related to that, where you have the code if (!isInt<12>(StackSize) && (CSI.size() > 0)), be sure that the isInt<12> check properly accounts for the StackSize, CSI size, StackAlign etc. Which brings me to the second point...

  • Please include tests showing that the boundary condition is correctly computed. For instance, one test where the stack size is just enough not to have to split the sp adjustment and another where the adjustment is needed.

Thanks!

Hi @luismarques,

Thanks for the boundary test case, I'll add the tests that one will split and the other will not.
The case will split in the prologue because in the epilogue +2048 will not fit in signed 12-bit.
The StackSize will contain the callee saved stack area and the StackSize has been aligned by determineFrameLayout();
The condition CSI.size() > 0 is to check that there exists a callee saved register spill.
GCC will split in the prologue and the epilogue in this case to simplify the split logic.

main:
      add     sp,sp,-2032
      sw      ra,2028(sp)
      add     sp,sp,-16
      mv      a0,sp
      call    foo
      add     sp,sp,16
      lw      ra,2028(sp)
      li      a0,0
      add     sp,sp,2032
      jr      ra
      .size   main, .-main
      .ident  "GCC:

Should we sync the GCC behaviour in this case?

shiva0217 updated this revision to Diff 222078.Sep 26 2019, 9:58 PM

Add splitting SP adjustment boundary test case as @luismarques suggest.

luismarques accepted this revision.Oct 3 2019, 5:07 AM

LGTM, after fixing the comment.

lib/Target/RISCV/RISCVFrameLowering.cpp
194

This is the prologue so it should say "after saving", not "after restoring".

test/CodeGen/RISCV/split-sp-adjust.ll
20 ↗(On Diff #222078)

This test would be more clear if the alloca size and the computed stack size better matched, but getStackSize is overestimating the size, as indicated in a comment copied from Lanai. I think this can be committed as is, but we should investigate that stack overallocation and if/when we fix that we should adjust this test.

This revision is now accepted and ready to land.Oct 3 2019, 5:07 AM
shiva0217 marked 2 inline comments as done.Oct 3 2019, 6:46 PM
shiva0217 added inline comments.
lib/Target/RISCV/RISCVFrameLowering.cpp
194

Thanks for catching this.

test/CodeGen/RISCV/split-sp-adjust.ll
20 ↗(On Diff #222078)

Ok, we could update the test when the stack estimation changing.

shiva0217 updated this revision to Diff 223130.Oct 3 2019, 6:48 PM

Fix comment in the patch.

Hi @luismarques, thanks for the review, really appreciate that.

This revision was automatically updated to reflect the committed changes.