This is an archive of the discontinued LLVM Phabricator instance.

[PPC] Reduce stack frame size by allocating parameter area on an on-demand basis for ELFv2 ABI
ClosedPublic

Authored by inouehrs on Feb 12 2017, 11:11 PM.

Details

Summary

On the ELFv2 ABI, we need to allocate the parameter area in the stack frame only if a callee really uses it, e.g. a vararg function.
However, current LLVM always allocates the parameter area conservatively. This inflates the stack size significantly.

This patch reduces the stack frame size by not allocating the parameter area if it is not required. In the current implementation, LowerFormalArguments_64SVR4 already checks whether the parameter area is needed, but LowerCall_64SVR4 does not (when calculating the stack frame size). This patch makes LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.
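The on-demand decision described above can be sketched roughly as follows. This is a simplified, hypothetical model, not the code in the patch: the names `ArgKind`, `CallInfo`, `needsParameterArea` and `outgoingAreaBytes` are invented for illustration, and the model ignores by-value aggregates, argument alignment and special calling conventions. It assumes the ELFv2 conventions of a 32-byte linkage area, eight doubleword slots covered by GPRs, 13 FPRs and 12 vector registers for argument passing.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified model of the decision; not the LLVM implementation.
enum class ArgKind { Integer, Float, Vector };

constexpr unsigned PtrByteSize = 8;   // doubleword size on 64-bit PowerPC
constexpr unsigned LinkageSize = 32;  // ELFv2 linkage area (assumed)
constexpr unsigned NumGPRSlots = 8;   // doublewords covered by r3-r10
constexpr unsigned NumFPRs = 13;      // f1-f13
constexpr unsigned NumVRs = 12;       // v2-v13

struct CallInfo {
  bool IsVarArg;
  std::vector<ArgKind> Args;
};

// Doublewords an argument occupies in the (virtual) parameter area.
static unsigned slotsFor(ArgKind K) { return K == ArgKind::Vector ? 2 : 1; }

// Returns true if any argument must be passed in memory, or if the callee
// is vararg and may therefore spill its register arguments.
bool needsParameterArea(const CallInfo &CI) {
  if (CI.IsVarArg)
    return true;
  unsigned SlotsUsed = 0, FPRs = NumFPRs, VRs = NumVRs;
  for (ArgKind K : CI.Args) {
    SlotsUsed += slotsFor(K);
    // Floating-point and vector arguments go to their own registers first.
    if (K == ArgKind::Float && FPRs > 0) { --FPRs; continue; }
    if (K == ArgKind::Vector && VRs > 0) { --VRs; continue; }
    // Otherwise the argument needs memory once the GPR slots are exhausted.
    if (SlotsUsed > NumGPRSlots)
      return true;
  }
  return false;
}

// Bytes reserved at the bottom of the caller's frame for an outgoing call.
unsigned outgoingAreaBytes(const CallInfo &CI) {
  if (!needsParameterArea(CI))
    return LinkageSize;  // linkage area only
  unsigned Slots = 0;
  for (ArgKind K : CI.Args)
    Slots += slotsFor(K);
  // When present, the area covers at least the eight register slots.
  return LinkageSize + std::max(Slots, NumGPRSlots) * PtrByteSize;
}
```

In this model a non-vararg call with three integer arguments reserves only the 32-byte linkage area, while the conservative approach would reserve 96 bytes for the same call.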

The average stack frame size for the functions generated while compiling the LLVM source tree is reduced by about 50 bytes (from 182.9 bytes to 133.7 bytes).

Diff Detail

Event Timeline

inouehrs created this revision. Feb 12 2017, 11:11 PM

Please add some regression tests (i.e. in test/CodeGen/PowerPC/).

inouehrs updated this revision to Diff 88484. Feb 14 2017, 8:58 PM
inouehrs added a reviewer: hfinkel.
  • I updated the patch to further reduce the stack frame size. Now the average stack frame size is reduced by about 50 bytes (from 182.9 bytes to 133.7 bytes).
  • I added asserts to confirm that the parameter area is allocated before it is actually used.
  • I added a regression test.
nemanjai edited edge metadata. Feb 23 2017, 3:21 AM

Please provide a message for the asserts to alert the user as to why the function needs a parameter area.

inouehrs updated this revision to Diff 89619. Feb 24 2017, 1:02 AM

I added messages to the asserts.
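For readers unfamiliar with the idiom, an assert that carries a message is usually written by and-ing the condition with a string literal, so the text appears in the failure output. The sketch below is illustrative only: the helper `parameterAreaOffset`, its parameters and the message wording are assumptions, not the code from this revision.

```cpp
#include <cassert>

// Hypothetical helper: byte offset of an argument slot in the parameter area.
// The assert documents why the caller must have reserved the area first.
unsigned parameterAreaOffset(bool HasParameterArea, unsigned LinkageSize,
                             unsigned SlotIndex) {
  assert(HasParameterArea &&
         "Parameter area must exist to pass an argument in memory.");
  return LinkageSize + SlotIndex * 8;  // 8-byte doubleword slots
}
```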

hfinkel edited edge metadata. Mar 1 2017, 11:11 AM

Why is there only code in LowerCall_64SVR4? Don't you also need to add corresponding code to LowerFormalArguments_64SVR4?

LowerFormalArguments_64SVR4 already includes the same analysis. So, LowerCall_64SVR4 and LowerFormalArguments_64SVR4 have been inconsistent (LowerCall_64SVR4 is more conservative than LowerFormalArguments_64SVR4).
This patch makes LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.

hfinkel accepted this revision. Mar 2 2017, 6:46 AM

Okay, please note that in the commit message. LGTM.

This revision is now accepted and ready to land. Mar 2 2017, 6:46 AM
inouehrs edited the summary of this revision. Mar 2 2017, 8:23 AM
This revision was automatically updated to reflect the committed changes.