This is an archive of the discontinued LLVM Phabricator instance.

[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)
ClosedPublic

Authored by hans on Mar 30 2016, 2:57 PM.

Details

Summary

For code such as:

void f(int, int);
void g() {
  f(1, 2);
}

compiled for 32-bit X86 Linux at -Os, Clang would previously generate:

subl    $12, %esp
subl    $8, %esp
pushl   $2
pushl   $1
calll   f
addl    $16, %esp
addl    $12, %esp
retl

This patch fixes that by merging adjacent stack adjustments in eliminateCallFramePseudoInstr().

Unfortunately, this means that eliminateCallFramePseudoInstr() doesn't just replace a single machine instruction, but potentially erases the previous and/or next instruction as well. To allow the caller of eliminateCallFramePseudoInstr() to continue iterating, this patch makes the function return an iterator to the next instruction. This makes the code in PEI::replaceFrameIndices() much simpler.

Note that previously, replaceFrameIndices() would try to re-visit the instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment.

I would probably commit the change to replaceFrameIndices() and return type of eliminateCallFramePseudoInstr() separately.

Diff Detail

Repository
rL LLVM

Event Timeline

hans updated this revision to Diff 52127.Mar 30 2016, 2:57 PM
hans retitled this revision from to [X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140).
hans updated this object.
hans added a reviewer: rnk.
hans added a subscriber: llvm-commits.
rnk added inline comments.Mar 30 2016, 4:58 PM
lib/Target/X86/X86FrameLowering.cpp
2561 ↗(On Diff #52127)

How did mergeSPUpdates change the sign here?

test/CodeGen/X86/force-align-stack-alloca.ll
36 ↗(On Diff #52127)

can you check for 3x pushl here to make it clear that we're allocating 32 bytes of stack?

39 ↗(On Diff #52127)

Add a comment about why 4? (deallocate 32 bytes of outgoing callframe for memset, allocate 28 for f)

hans marked 2 inline comments as done.Mar 30 2016, 5:28 PM
hans added inline comments.
lib/Target/X86/X86FrameLowering.cpp
2561 ↗(On Diff #52127)

It doesn't. What's changed is that now the Amount can change above, due to the merging, and that also needs to feed into the CFAOffset.

I figured it was simpler to just spell out that CFAOffset is the opposite of the stack adjustment.

Actually, let me see if I can make this a bit more clear in a new patch..

hans updated this revision to Diff 52162.Mar 30 2016, 5:28 PM

Addressing comments.

rnk accepted this revision.Mar 30 2016, 6:14 PM
rnk edited edge metadata.

lgtm

This revision is now accepted and ready to land.Mar 30 2016, 6:14 PM
jpienaar added inline comments.
lib/Target/Lanai/LanaiFrameLowering.cpp
142 ↗(On Diff #52162)

Thanks for updating this too. 'void' here should probably be 'MachineBasicBlock::iterator'.

hans updated this revision to Diff 52244.Mar 31 2016, 11:07 AM
hans edited edge metadata.

Rebasing on top of r264966 (a few more tests needed updates), and addressing Jacques's comment.

This revision was automatically updated to reflect the committed changes.
hans added a comment.Mar 31 2016, 12:49 PM

The actual merging was committed in r265039.

lib/Target/Lanai/LanaiFrameLowering.cpp
142 ↗(On Diff #52162)

Oops! Thanks for spotting this!