This is an archive of the discontinued LLVM Phabricator instance.

PrologEpilogInserter: Improve API to determine callee save regsiters.
ClosedPublic

Authored by MatzeB on Jul 2 2015, 2:50 PM.

Details

Summary

This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():

  • Rename the function to determineCalleeSaves()
  • Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo.
  • Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 28977.Jul 2 2015, 2:50 PM
MatzeB retitled this revision from to PrologEpilogInserter: Improve API to determine callee save regsiters..
MatzeB updated this object.
MatzeB added reviewers: qcolombet, atrick, asl.
MatzeB set the repository for this revision to rL LLVM.
MatzeB added a subscriber: llvm-commits.
MatzeB added a comment.Jul 2 2015, 3:04 PM

+CC Daniel:

While doing this I hit some strange issues in MipsFrameLowering where the frame and base pointer declared to be saved and the one in the callee save list don't always match (32bit vs. 64bit version of the register). This worked by accident before because PhysRegUsed was tracking register units. I put in a register alias iterator to fix it in this patch, but I assume this mismatch was not intentional.

MatzeB added a subscriber: dsanders.Jul 2 2015, 3:05 PM

+CC Daniel (this time for real):

While doing this I hit some strange issues in MipsFrameLowering where the frame and base pointer declared to be saved and the one in the callee save list don't always match (32bit vs. 64bit version of the register). This worked by accident before because PhysRegUsed was tracking register units. I put in a register alias iterator to fix it in this patch, but I assume this mismatch was not intentional.

... but I assume this mismatch was not intentional.

I can understand why it looks weird but I think it was intentional. For the O32 and N64 ABI's the register in the callee-saved list and actually-used frame/base register should always be the same. However our N32 ABI has 64-bit registers and 32-bit pointers. In N32, the register to be saved/restored by the calling convention is the 64-bit register (because we don't know what it holds) but the register used for the frame/base pointer is the 32-bit register.

... but I assume this mismatch was not intentional.

I can understand why it looks weird but I think it was intentional. For the O32 and N64 ABI's the register in the callee-saved list and actually-used frame/base register should always be the same. However our N32 ABI has 64-bit registers and 32-bit pointers. In N32, the register to be saved/restored by the calling convention is the 64-bit register (because we don't know what it holds) but the register used for the frame/base pointer is the 32-bit register.

Ok, just wanted to make sure. It's easy enough to use the register alias iterator here.

hfinkel accepted this revision.Jul 13 2015, 8:59 PM
hfinkel added a reviewer: hfinkel.
hfinkel added a subscriber: hfinkel.

LGTM.

lib/Target/AArch64/AArch64FrameLowering.cpp
356 ↗(On Diff #28977)

Please don't remove the descriptive string from the assert.

lib/Target/ARM/ARMFrameLowering.cpp
803 ↗(On Diff #28977)

Same here.

This revision is now accepted and ready to land.Jul 13 2015, 8:59 PM
This revision was automatically updated to reflect the committed changes.
jfb added a subscriber: jfb.Jul 14 2015, 2:46 PM

FYI, I have a follow-up patch to this: http://reviews.llvm.org/D11199

atrick edited edge metadata.Jul 14 2015, 10:59 PM

Very nice cleanup + optimization.

Just to satisfy my own curiosity, what if the current function has a UWTable attribute, but the call site is noreturn+nounwind? I suppose we still need to spill the clobbered callee saves for debugging (as you have done)?

Only one correctness question, why don't you check for implicit defs here?

+ for (const MachineOperand &MO : make_range(def_begin(*AI), def_end()))