This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Try to move the stack pointer update instruction later in the prologue and earlier in the epilogue
ClosedPublic

Authored by stefanp on Jan 4 2018, 12:10 PM.

Details

Summary

This should provide a small performance boost in functions that are able to do this.
I've kept this change rather conservative at the moment and functions with frame pointers or base pointers will not try to move the stack pointer update.

Performance numbers on Power 8 for SPEC 2006
429.mcf - 1.5 % improvement
462.libquantum - 1.0 % improvement
464.h264ref - 3.0 % improvement
471.omnetpp - 1.0 % improvement
453.povray - 1.5 % improvement

All other benchmarks changed less than 0.2 % which I consider as "no change".
I have found no degradations in SPEC2006.

Diff Detail

Repository
rL LLVM

Event Timeline

stefanp created this revision.Jan 4 2018, 12:10 PM
sfertile added inline comments.Jan 4 2018, 7:52 PM
lib/Target/PowerPC/PPCFrameLowering.cpp
827 ↗(On Diff #128635)

wil -> will

832 ↗(On Diff #128635)

Can we use !isLargeFrame here instead of isInt<16>(FrameSize) ?

stefanp added inline comments.Jan 5 2018, 6:28 AM
lib/Target/PowerPC/PPCFrameLowering.cpp
832 ↗(On Diff #128635)

No, they are not actually the same thing unfortunately. In the prologue we use this definition:
bool isLargeFrame = !isInt<16>(NegFrameSize);
In the epilogue we use this definition.
bool isLargeFrame = !isInt<16>(FrameSize);
But I need the two conditions to be identical in the two guards that I've set up. So, unfortunately, I can't use the isLargeFrame in the prologue.

stefanp updated this revision to Diff 128762.Jan 5 2018, 9:20 AM

Fixed a typo and a missing set of brackets.

Just to verify, this is effectively hand scheduling the prologue and epilogue?

-eric

Yes, we are hand scheduling the prologue and epilogue.
However, this is a little more than moving instructions around. Since we are moving the stack pointer update instruction, spills that use the stack pointer may have to be updated with new offsets.

echristo accepted this revision.Jan 5 2018, 1:38 PM

One inline comment and nit, but then I'm ok with this. Should elaborate about why we're doing this in the commit message.

lib/Target/PowerPC/PPCFrameLowering.cpp
1217 ↗(On Diff #128762)

No braces here. Also comment.

This revision is now accepted and ready to land.Jan 5 2018, 1:38 PM

These comments are not an indication that another review is necessary, but please address them on the commit.

lib/Target/PowerPC/PPCFrameLowering.cpp
833 ↗(On Diff #128762)

Nit: no spaces in the template parameter list when there's a single one. No space after the ampersand either. Also, fix up your naming convention here and elsewhere - variables start with a capital.

1366 ↗(On Diff #128762)

I think more importantly - it'll allow the CSR restores to hide the latency of the move-from-LR as they're between the mflr and blr.

832 ↗(On Diff #128635)

It would probably be good to put in a quick comment explaining that so someone doesn't change it in the future.

hfinkel added inline comments.Jan 6 2018, 6:10 PM
lib/Target/PowerPC/PPCFrameLowering.cpp
830 ↗(On Diff #128762)

movingStackUpdateDown -> MovingStackUpdateDown

831 ↗(On Diff #128762)

Please explain in the comment above why these particular things are the validity condition for this optimization.

1368 ↗(On Diff #128762)

movingStackUpdateUp -> MovingStackUpdateUp

This revision was automatically updated to reflect the committed changes.