I began looking at how to implement the packed stack layout and got kind of stuck on understanding the current code. The use of the LocalAreaOffset with a value of -160 does not seem to have any real benefit, although it certainly is confusing to me... Just the notion of "the local area" per
/// getOffsetOfLocalArea - This method returns the offset of the local area /// from the stack pointer on entrance to a function. /// int getOffsetOfLocalArea() const { return LocalAreaOffset; }
is problematic since we don't really have a single constant offset to a single area for locals. We have the register save area between 0 and +160 from EntrySP, and then we have another local area for e.g.spills, which would begin at SP+160 *after the prologue*. Incoming stack arguments would be at +160 and above from EntrySP. The use of this offset in different context is very confusing to me, and it seems much more readable to instead just allocate the stack objects with fixed offsets explicitly relative to EntrySP.
This offset is used when PEI::calculateFrameObjectOffsets() computes the offsets. This becomes -168 for the first spilled reg. TargetFrameLowering::getFrameIndexReference() then subtracts the LAO, which is negative, so we get -8 back again (plus stack size). I don't see any other uses, and it doesn't seem like we gain anything from this.
The debug dumps are currently broken, which may not be a real problem, but still:
fi#-3: size=8, align=8, fixed, at location [SP+280] fi#-2: size=8, align=8, fixed, at location [SP+272] fi#-1: size=8, align=8, fixed, at location [SP+160]
This is actually one incoming stack argument at +0, and then %r14 and %r15 saved at +112 and +120 while the MachineFrameInfo::print() has subtracted the LocalAreaOffset of -160 to them all.
This patch seems to work (knock-knock), by simply setting this offset to 0 and instead add a SystemZMC::RegisterSaveAreaSize constant (160) in places as needed instead.
- Removed SystemZMC::CallFrameSize - slightly confusing to me to have this as a constant since a "call frame" could also include stack arguments in my world. Use instead SystemZMC::RegisterSaveAreaSize to represent the 160-byte base area.
- All SystemZ calls to CreateFixedObject() needs to be adjusted:
LowerFormalArguments():
- incoming stack arguments needs RegisterSaveAreaSize added to be +160.
- varargs:
- add RegisterSaveAreaSize for setVarArgsFrameIndex() to get +160 from incoming SP for first stack arg.
- RegSaveIndex is +0 from incoming SP...
- No need to add offset for storing of FPR to the fixed slots since the table has the correct offset from incoming SP.
lowerFRAMEADDR(): The frame address should be the incoming SP, which is also the address of the back chain, so just give 0 as offset.
- Users of getObjectOffset():
PEI::calculateFrameObjectOffsets() / MachineFrameInfo::estimateStackSize(): Should return the same value as before, since even though we now give an offset of 160 for a stack arg, this is negated and therefore not counted. This makes sense I think since it's a check into the local area to be allocated.
TargetFrameLowering::getFrameIndexReference(): As before except the offset is now included in the SP Offsets for each object while LAO is 0.
SystemZFrameLowering::processFunctionBeforeFrameFinalized(): The +160 offset is already part of the stackarg ObjectOffset so don't add it again. In fact, the current code is conservatively wrong, because it adds +160 also to the register saves, which already have the final offsets from the table into the Register save area.
SelectionDAGAddressAnalysis.cpp, in BaseIndexOffset::equalBaseIndex(): This is just a relative difference so it shouldn't matter if the LAO is included or not.
- I don't quite understand
SystemZMC::CFAOffsetFromInitialSP: A little confused as to why this has an offset from the incoming SP, as the Canonical Frame Address is supposed to be the incoming SP, per definition, or? It seems that we are pretending that the 160-byte base area is part of the call, so the CFA is above it. If I make this 0, then two DebugInfo/SystemZ/ tests fail, so it seems this is needed, unless I just forgot to update some other place as well.
the asserts in TargetInstrInfo::foldMemoryOperand() / TargetLoweringBase::emitPatchPoint():
assert(MFI.getObjectOffset(FI) != -1)
All tests pass, and SPEC is NFC except for one file where it seems that the emergency spill slots are no longer allocated (see above under processFunctionBeforeFrameFinalized()).