This is an archive of the discontinued LLVM Phabricator instance.

Differential D70800

Fix AArch64 AAPCS frame record chain
ClosedPublic

Authored by logan on Nov 27 2019, 11:49 PM.

Details

Reviewers

efriedma
sdesmalen
t.p.northover
cferris

Commits

rGe9d9a612084b: Reapply D70800: Fix AArch64 AAPCS frame record chain
rGd4e10e6adb1b: AArch64: Fix frame record chain

Summary

After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.
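To illustrate why simple unwinders depend on this guarantee, here is a minimal sketch of a frame-record walker over a simulated stack. The names `Memory` and `walkFrameRecords` are hypothetical and exist only for this example; a real unwinder reads process memory directly. The sketch assumes only what the summary states: FP points at a two-word record holding the caller's FP followed by the saved LR, and the chain ends at a null FP.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simulated memory: address -> 64-bit word. Hypothetical stand-in for the
// process stack; a real unwinder would dereference addresses directly.
using Memory = std::map<uint64_t, uint64_t>;

// Walk the chain of frame records starting at FP and collect return addresses.
// Each record is assumed to be two consecutive words:
//   [FP]     = caller's FP (address of the next record)
//   [FP + 8] = saved LR (return address)
std::vector<uint64_t> walkFrameRecords(const Memory &Mem, uint64_t FP) {
  std::vector<uint64_t> ReturnAddresses;
  while (FP != 0) {
    auto SavedFP = Mem.find(FP);
    auto SavedLR = Mem.find(FP + 8);
    if (SavedFP == Mem.end() || SavedLR == Mem.end())
      break; // Unreadable record: stop rather than guess.
    ReturnAddresses.push_back(SavedLR->second);
    uint64_t Next = SavedFP->second;
    // Callers live at higher addresses; anything else would loop forever.
    if (Next != 0 && Next <= FP)
      break;
    FP = Next;
  }
  return ReturnAddresses;
}
```

If FP pointed somewhere else in the frame, for example at a spilled floating-point register, the same loop would interpret unrelated spill slots as a saved FP and LR, which is the failure mode this revision addresses.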

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

logan created this revision. Nov 27 2019, 11:49 PM

Herald added a project: Restricted Project. Nov 27 2019, 11:49 PM

Herald added subscribers: llvm-commits, hiraditya, kristof.beyls.

Thanks for creating a fix for this @logan!

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459	There is a condition here that is not yet tested. If the frame record is saved, that means both LR and FP are saved, not just FP, so is this case needed?

logan marked an inline comment as done. Nov 28 2019, 10:22 PM

logan added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459	I think we don't have to check LR here because AAPCS64 guarantees FP and LR will be spilled to consecutive words. Besides, we only care about the address of the spilled FP.

sdesmalen added inline comments. Nov 29 2019, 2:08 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459	If `LR` is also spilled, shouldn't `isPaired()` be true? This is testing two layouts `(LR, FP)` and `(FP, <something else>)`, where the latter is not a frame-record. This makes me think that this code also needs a condition for `hasFP(MF)`, because that guarantees the existence of a frame-record (as opposed to ordinary spills of FP/LR that don't necessarily constitute the framerecord)

logan marked an inline comment as done. Nov 29 2019, 11:07 PM

logan added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459	I am concerned about the case where FP is the second register in the pair (the next one). Since the `EmitMI` function only gets a single `RegisterPairInfo`, I cannot see the next `RegisterPairInfo`. Let me think about how to revise this tomorrow. `hasFP(MF)` makes sense to me. I'll update the code in the next revision.

logan updated this revision to Diff 231642. Dec 1 2019, 11:20 PM

logan marked 3 inline comments as done.

logan added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459	Revised. Please take a look. Thanks.

sdesmalen added inline comments. Dec 2 2019, 2:56 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1184	Having calculated the offset to the FrameRecord explicitly, can we now replace this with: int FPOffset = AFI->getFrameRecordOffset(); ?
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134	Is this description correct? The current meaning of `FrameRecordOffset` seems to be the offset from SP _after_ allocating the callee-save area.

logan updated this revision to Diff 231820. Dec 2 2019, 9:52 PM

logan marked 4 inline comments as done.

logan added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1184	Yes. Replaced in the latest Diff.
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134	Thanks. Reworded.

logan updated this revision to Diff 231828. Dec 2 2019, 11:02 PM

logan marked 2 inline comments as done.

Functionally the patch is looking good, nearly there!

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134	Thanks! Is it worth renaming the variable to something like `OffsetToFrameRecordFromCalleeSaveBase` to make the meaning of the variable easier to understand in the places it is used? (I couldn't think of something shorter :))
llvm/test/CodeGen/AArch64/framelayout-frame-record.ll
12 ↗	(On Diff #231828)	This has some implicit knowledge that d9 and d8 will be used to keep d0 and d1. This is probably better tested with a MIR test like:

    #RUN: llc -mtriple=aarch64-- -start-before prologepilog %s -o - | FileCheck %s
    ---
    name: TestFrameRecordLocation
    tracksRegLiveness: true
    frameInfo:
      isFrameAddressTaken: true
    body: |
      bb.0:
        $d8 = IMPLICIT_DEF
        $d9 = IMPLICIT_DEF
        RET_ReallyLR
    ...

which explicitly marks d8 and d9 as callee-saved. The `isFrameAddressTaken: true` will trigger the use of the frame-pointer, and thus storing of the frame-record. You only need to add a few check lines.
13 ↗	(On Diff #231828)	nit: instead of checking `[[FP_OFFSET:16]]` as a variable, you can just check `16` directly.

logan updated this revision to Diff 232037. Dec 3 2019, 11:27 PM

logan marked 5 inline comments as done. Dec 3 2019, 11:31 PM

logan added inline comments.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2234	Just noticed that Windows has a different ordering.
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134	The naming convention in the file seems to be `{get,set,}XXXOffset`, thus I renamed this to `CalleeSaveBaseToFrameRecordOffset`.

logan marked an inline comment as done. Dec 3 2019, 11:32 PM

Thanks @logan , LGTM!

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2234	Yes, good spot!
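The pair-matching rule discussed in this thread can be sketched in isolation. This is a simplified illustration, not the patch itself: `Reg`, `SpillPair` and `findFrameRecordOffset` are hypothetical names, and the only behaviour taken from the revision is that the frame record is the `(LR, FP)` pair on non-Windows targets and the `(FP, LR)` pair on Windows, and that the recorded value defaults to 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical register identifiers for the sketch; LLVM uses AArch64::FP,
// AArch64::LR and so on.
enum class Reg { None, X19, FP, LR, D8, D9 };

// Simplified stand-in for the per-pair spill information in the patch.
struct SpillPair {
  Reg Reg1;
  Reg Reg2;
  int ByteOffset; // Offset from SP after the callee-save area is allocated.
};

// Return the byte offset of the frame record within the callee-save area,
// or 0 if no frame record is needed or no matching pair is found.
int findFrameRecordOffset(const std::vector<SpillPair> &Pairs,
                          bool NeedsFrameRecord, bool IsWindows) {
  int FrameRecordOffset = 0;
  for (const SpillPair &P : Pairs) {
    assert(P.ByteOffset >= 0 && "sketch only models non-negative offsets");
    // The register order inside the pair differs between the two layouts.
    bool IsRecord = IsWindows ? (P.Reg1 == Reg::FP && P.Reg2 == Reg::LR)
                              : (P.Reg1 == Reg::LR && P.Reg2 == Reg::FP);
    if (NeedsFrameRecord && IsRecord)
      FrameRecordOffset = P.ByteOffset;
  }
  return FrameRecordOffset;
}
```

With the layout from the `framelayout-frame-record.mir` test in this revision (`d9, d8` at offset 0, `x29, x30` at offset 16, `x19` at offset 32), the function returns 16 on a non-Windows target, which matches the `add x29, sp, #16` that test checks for.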

This revision is now accepted and ready to land. Dec 4 2019, 2:10 PM

nickdesaulniers added a subscriber: nickdesaulniers. Dec 10 2019, 9:16 AM

srhines added a reviewer: cferris. Dec 10 2019, 9:20 AM

srhines added a subscriber: srhines.

Run git clang-format and resolve rebase conflicts.

I am currently converting my SVN commit access to GitHub access. I'll land this as soon as possible.

Closed by commit rGd4e10e6adb1b: AArch64: Fix frame record chain (authored by logan). Dec 14 2019, 10:40 AM

This revision was automatically updated to reflect the committed changes.

I reverted the CL because I encountered the following assertion:

assert((DestReg != AArch64::SP || Bytes % 16 == 0) &&
       "SP increment/decrement not 16-byte aligned");

I will upload a revision soon.

This revision is now accepted and ready to land. Dec 16 2019, 10:17 PM

Rebase to latest LLVM master branch

Herald added a subscriber: danielkiss. Apr 14 2020, 9:36 PM

Harbormaster failed remote builds in B53291: Diff 257595! Apr 14 2020, 10:17 PM

In D70800#1982722, @logan wrote:

Rebase to latest LLVM master branch

Did you make any other changes other than the rebase?

In D70800#1982889, @sdesmalen wrote:

In D70800#1982722, @logan wrote:

Rebase to latest LLVM master branch

Did you make any other changes other than the rebase?

Nope. I am still debugging the assertion failure. One of the emitFrameOffset calls in emitEpilogue gets a non-zero NumBytes that is not a multiple of 16, but from the limited backtrace I have had difficulty spotting the problem.

BTW, I cannot locally reproduce it with the same command. On my side, the command can run without problems.

resistor mentioned this in rG9936455204fd: Reapply D70800: Fix AArch64 AAPCS frame record chain. Aug 26 2020, 12:38 PM

mstorsjo mentioned this in rG04879086b443: Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain". Aug 26 2020, 11:40 PM

This revision was landed with ongoing or failed builds. Aug 27 2020, 10:30 AM

Closed by commit rGe9d9a612084b: Reapply D70800: Fix AArch64 AAPCS frame record chain (authored by resistor).

This revision was automatically updated to reflect the committed changes.

resistor added a commit: rGe9d9a612084b: Reapply D70800: Fix AArch64 AAPCS frame record chain.

paulwalker-arm added a reverting change: rGbc9a29b9ee6a: Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain". Sep 1 2020, 8:10 AM

I've had to revert this patch because it caused runtime failures when building spec2k/eon with -march=armv8-a+sve -mllvm -aarch64-sve-vector-bits-min=256.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
3445–3446	I don't believe this is a safe fix for the issue mentioned when the patch was previously reverted by https://reviews.llvm.org/rG04879086b44348cad600a0a1ccbe1f7776cc3cf9. The stack always being 16-byte aligned is a requirement of the AAPCS.

But doesn't that mean the assert is now protecting less than it was? Without the knowledge of FP's displacement, you cannot know if Bytes % 8 == 0 is safe or not. Is the knowledge of FP's displacement recorded anyway so the original intent of the assert is not compromised? From an SVE point of view, FP displacement will either need to be prevented or undone if it's used to access SVE stack slots. This is because SVE offsets are implicitly scaled and thus any byte based displacement cannot be encoded into its instructions.

It's my belief that at a minimum the assert should be tightened up to pre-D70800 levels before reapplying the patch.

Paul!!!

(In reply to "Owen Anderson" <resistor@icloud.com>, 01/09/2020, 16:28.)
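The SVE concern raised above can be made concrete with a small sketch. It assumes, as the comment says, that SVE stack accesses encode their offset as a multiple of the vector length rather than in bytes. The function name `isEncodableAsScaledOffset` is hypothetical, and the default immediate range of -256 to 255 is borrowed from the assert on scalable register pairs in this revision's diff purely as an illustrative bound.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if ByteOffset can be written as Imm * VectorBytes with Imm in
// [MinImm, MaxImm]. A byte displacement that is not a whole number of vectors
// has no such encoding.
bool isEncodableAsScaledOffset(int64_t ByteOffset, int64_t VectorBytes,
                               int64_t MinImm = -256, int64_t MaxImm = 255) {
  assert(VectorBytes > 0 && "vector length must be positive");
  if (ByteOffset % VectorBytes != 0)
    return false; // Not a whole number of vectors.
  int64_t Imm = ByteOffset / VectorBytes;
  return Imm >= MinImm && Imm <= MaxImm;
}
```

For 256-bit vectors (32 bytes), an offset of 64 bytes is two vectors and fits, while an FP displaced by 8 bytes turns that into 72 bytes, which cannot be expressed as a vector multiple.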

Sure, but that means there needs to be a way to detect when FP is displaced (hence my original question). If such data is available (and it needs to be for SVE code generation to function correctly) then I don't see why the assert cannot use it to assert Bytes is correctly aligned and displaced.

(In reply to "Owen Anderson" <resistor@icloud.com>, 01/09/2020, 18:29.)

The assertion is protecting less than it was, because the original was overly strict. Please refer to the test case I mentioned, which provides an example where AAPCS-compliant code generation (frame record layout is part of AAPCS!) is impossible under the original assertion. As such, I don’t see a path to returning the assertion to its original form. Again, correct AAPCS-compliant code generation for this test is impossible with the assertion as it was.

I suspect that if you use that test case as a starting point and add an SVE stack object to it, you will find that the SVE frame layout code needs to handle the case of an unaligned frame pointer. That’s most likely the underlying cause of the failure you were seeing.
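The arithmetic behind this disagreement can be worked through with the numbers from the `framelayout-unaligned-fp.ll` test in this revision, where the prologue sets `x29 = sp + 8` and the epilogue restores SP with `sub sp, x29, #8`. The sketch below is only an illustration of the two assertion forms; `adjustmentAllowed` and `isSPAligned` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the shape of the check in emitFrameOffset: an adjustment written to
// SP must be a multiple of Align bytes. Adjustments to other registers are
// always accepted.
bool adjustmentAllowed(bool DestIsSP, int64_t Bytes, int64_t Align) {
  return !DestIsSP || Bytes % Align == 0;
}

// True if Addr satisfies the 16-byte stack alignment discussed in the review.
bool isSPAligned(uint64_t Addr) { return Addr % 16 == 0; }

// Apply a signed byte adjustment to an address.
uint64_t applyAdjustment(uint64_t Addr, int64_t Bytes) {
  return static_cast<uint64_t>(static_cast<int64_t>(Addr) + Bytes);
}
```

With a 16-byte-aligned callee-save base and a frame record 8 bytes above it, the restoring adjustment is -8: the original `% 16` form rejects it, the relaxed `% 8` form accepts it, and the resulting SP is still 16-byte aligned because FP itself sits at an address that is 8 modulo 16. The relaxed form also accepts adjustments that would leave SP misaligned, which is the loss of protection the reviewer objects to.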

Fair enough, thanks for the information Owen.

Clearly my preference would be to not fix one AArch64 bug by introducing another. You sound more knowledgeable in this area than myself so perhaps could create your suggested SVE test cases to ensure this isn't the case. That said, if you prefer to just reapply the patch as is, at least we've recorded the issue for others working on SVE.

Paul!!!

(In reply to "Owen Anderson" <resistor@mac.com>, 01/09/2020, 19:12.)

resistor mentioned this in rG5987da8764b7: Revert "Revert "Reapply D70800: Fix AArch64 AAPCS frame record chain"". Sep 1 2020, 12:29 PM

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp	36 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp	4 lines
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h	11 lines
llvm/test/CodeGen/AArch64/framelayout-fp-csr.ll	22 lines
llvm/test/CodeGen/AArch64/framelayout-frame-record.mir	29 lines
llvm/test/CodeGen/AArch64/framelayout-unaligned-fp.ll	42 lines

Diff 288386

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

@@ 1,018 lines not shown @@
 }

 static bool needsWinCFI(const MachineFunction &MF) {
   const Function &F = MF.getFunction();
   return MF.getTarget().getMCAsmInfo()->usesWindowsCFI() &&
          F.needsUnwindTableEntry();
 }

-static bool isTargetDarwin(const MachineFunction &MF) {
-  return MF.getSubtarget<AArch64Subtarget>().isTargetDarwin();
-}
-
 static bool isTargetWindows(const MachineFunction &MF) {
   return MF.getSubtarget<AArch64Subtarget>().isTargetWindows();
 }

 // Convenience function to determine whether I is an SVE callee save.
 static bool IsSVECalleeSave(MachineBasicBlock::iterator I) {
   switch (I->getOpcode()) {
   default:
@@ 141 lines not shown @@ if (CombineSPBump)
       fixupCalleeSaveRestoreStackOffset(*MBBI, AFI->getLocalStackSize(),
                                         NeedsWinCFI, &HasWinCFI);
     ++MBBI;
   }

   // For funclets the FP belongs to the containing function.
   if (!IsFunclet && HasFP) {
     // Only set up FP if we actually need to.
-    int64_t FPOffset = isTargetDarwin(MF) ? (AFI->getCalleeSavedStackSize() - 16) : 0;
+    int64_t FPOffset = AFI->getCalleeSaveBaseToFrameRecordOffset();

     if (CombineSPBump)
       FPOffset += AFI->getLocalStackSize();

     // Issue sub fp, sp, FPOffset or
     // mov fp,sp when FPOffset is zero.
     // Note: All stores of callee-saved registers are marked as "FrameSetup".
     // This code marks the instruction(s) that set the FP also.
@@ 207 lines not shown @@ if (isAsynchronousEHPersonality(Per)) {
       BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::COPY), AArch64::FP)
           .addReg(AArch64::X1)
           .setMIFlag(MachineInstr::FrameSetup);
       MBB.addLiveIn(AArch64::X1);
     }
   }

   if (needsFrameMoves) {
-    const DataLayout &TD = MF.getDataLayout();
-    const int StackGrowth = isTargetDarwin(MF)
-                                ? (2 * -TD.getPointerSize(0))
-                                : -AFI->getCalleeSavedStackSize();
-    Register FramePtr = RegInfo->getFrameRegister(MF);
     // An example of the prologue:
     //
     // .globl __foo
     // .align 2
     // __foo:
     // Ltmp0:
     // .cfi_startproc
     // .cfi_personality 155, ___gxx_personality_v0
@@ 51 lines not shown @@ if (needsFrameMoves) {
     // Ltmp3:
     // .cfi_offset w29, -16
     // Ltmp4:
     // .cfi_offset w27, -24
     // Ltmp5:
     // .cfi_offset w28, -32

     if (HasFP) {
+      const int OffsetToFirstCalleeSaveFromFP =
+          AFI->getCalleeSaveBaseToFrameRecordOffset() -
+          AFI->getCalleeSavedStackSize();
+      Register FramePtr = RegInfo->getFrameRegister(MF);
+
       // Define the current CFA rule to use the provided FP.
       unsigned Reg = RegInfo->getDwarfRegNum(FramePtr, true);
       unsigned CFIIndex = MF.addFrameInst(
-          MCCFIInstruction::cfiDefCfa(nullptr, Reg, FixedObject - StackGrowth));
+          MCCFIInstruction::cfiDefCfa(nullptr, Reg, FixedObject - OffsetToFirstCalleeSaveFromFP));
       BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
           .addCFIIndex(CFIIndex)
           .setMIFlags(MachineInstr::FrameSetup);
     } else {
       unsigned CFIIndex;
       if (SVEStackSize) {
         const TargetSubtargetInfo &STI = MF.getSubtarget();
         const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
@@ 274 lines not shown @@ if (!hasFP(MF)) {
     NumBytes = 0;
   }

   // Restore the original stack pointer.
   // FIXME: Rather than doing the math here, we should instead just use
   // non-post-indexed loads for the restores if we aren't actually going to
   // be able to save any instructions.
   if (!IsFunclet && (MFI.hasVarSizedObjects() || AFI->isStackRealigned())) {
-    int64_t OffsetToFrameRecord =
-        isTargetDarwin(MF) ? (-(int64_t)AFI->getCalleeSavedStackSize() + 16) : 0;
     emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::FP,
-                    {OffsetToFrameRecord, MVT::i8},
+                    {-AFI->getCalleeSaveBaseToFrameRecordOffset(), MVT::i8},
                     TII, MachineInstr::FrameDestroy, false, NeedsWinCFI);
   } else if (NumBytes)
     emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
                     {NumBytes, MVT::i8}, TII, MachineInstr::FrameDestroy, false,
                     NeedsWinCFI);

   // This must be placed after the callee-save restore code because that code
   // assumes the SP is at the same location as it was after the callee-save save
@@ 44 lines not shown @@ int AArch64FrameLowering::getNonLocalFrameIndexReference(
   return getSEHFrameIndexOffset(MF, FI);
 }

 static StackOffset getFPOffset(const MachineFunction &MF, int64_t ObjectOffset) {
   const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
   const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
   bool IsWin64 =
       Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());

   unsigned FixedObject =
       getFixedObjectSize(MF, AFI, IsWin64, /*IsFunclet=*/false);
-  unsigned FPAdjust = isTargetDarwin(MF)
-      ? 16 : AFI->getCalleeSavedStackSize(MF.getFrameInfo());
+  int64_t CalleeSaveSize = AFI->getCalleeSavedStackSize(MF.getFrameInfo());
+  int64_t FPAdjust =
+      CalleeSaveSize - AFI->getCalleeSaveBaseToFrameRecordOffset();
   return {ObjectOffset + FixedObject + FPAdjust, MVT::i8};
 }

 static StackOffset getStackOffset(const MachineFunction &MF, int64_t ObjectOffset) {
   const auto &MFI = MF.getFrameInfo();
   return {ObjectOffset + (int64_t)MFI.getStackSize(), MVT::i8};
 }

@@ 371 lines not shown @@ for (unsigned i = 0; i < Count; ++i) {
     int Offset = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
     assert(Offset % Scale == 0);
     RPI.Offset = Offset / Scale;

     assert(((!RPI.isScalable() && RPI.Offset >= -64 && RPI.Offset <= 63) ||
             (RPI.isScalable() && RPI.Offset >= -256 && RPI.Offset <= 255)) &&
            "Offset out of bounds for LDP/STP immediate");

+    // Save the offset to frame record so that the FP register can point to the
+    // innermost frame record (spilled FP and LR registers).
+    if (NeedsFrameRecord && ((!IsWindows && RPI.Reg1 == AArch64::LR &&
+                              RPI.Reg2 == AArch64::FP) ||
+                             (IsWindows && RPI.Reg1 == AArch64::FP &&
+                              RPI.Reg2 == AArch64::LR)))
+      AFI->setCalleeSaveBaseToFrameRecordOffset(Offset);
+
     RegPairs.push_back(RPI);
     if (RPI.isPaired())
       ++i;
   }
 }

 bool AArch64FrameLowering::spillCalleeSavedRegisters(
     MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
@@ 207 lines not shown @@ auto EmitMI = [&](const RegPairInfo &RPI) {
     unsigned FrameIdxReg1 = RPI.FrameIdx;
     unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
     if (NeedsWinCFI && RPI.isPaired()) {
       std::swap(Reg1, Reg2);
       std::swap(FrameIdxReg1, FrameIdxReg2);
     }
     MachineInstrBuilder MIB = BuildMI(MBB, MI, DL, TII.get(LdrOpc));
     if (RPI.isPaired()) {
       MIB.addReg(Reg2, getDefRegState(true));
       MIB.addMemOperand(MF.getMachineMemOperand(
           MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
           MachineMemOperand::MOLoad, Size, Alignment));
     }
     MIB.addReg(Reg1, getDefRegState(true))
         .addReg(AArch64::SP)
         .addImm(RPI.Offset) // [sp, #offset*scale]
                             // where factor*scale is implicit
@@ 789 lines not shown @@

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

@@ 3,436 lines not shown @@ void llvm::emitFrameOffset(MachineBasicBlock &MBB,
                            StackOffset Offset, const TargetInstrInfo *TII,
                            MachineInstr::MIFlag Flag, bool SetNZCV,
                            bool NeedsWinCFI, bool *HasWinCFI) {
   int64_t Bytes, NumPredicateVectors, NumDataVectors;
   Offset.getForFrameOffset(Bytes, NumPredicateVectors, NumDataVectors);

   // First emit non-scalable frame offsets, or a simple 'mov'.
   if (Bytes || (!Offset && SrcReg != DestReg)) {
-    assert((DestReg != AArch64::SP || Bytes % 16 == 0) &&
-           "SP increment/decrement not 16-byte aligned");
+    assert((DestReg != AArch64::SP || Bytes % 8 == 0) &&
+           "SP increment/decrement not 8-byte aligned");
     unsigned Opc = SetNZCV ? AArch64::ADDSXri : AArch64::ADDXri;
     if (Bytes < 0) {
       Bytes = -Bytes;
       Opc = SetNZCV ? AArch64::SUBSXri : AArch64::SUBXri;
     }
     emitFrameOffsetAdj(MBB, MBBI, DL, DestReg, SrcReg, Bytes, Opc, TII, Flag,
                        NeedsWinCFI, HasWinCFI);
     SrcReg = DestReg;
@@ 3,544 lines not shown @@

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h

@@ 125 lines not shown @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
   /// that must be forwarded to every musttail call.
   SmallVector<ForwardedRegister, 1> ForwardedMustTailRegParms;

   // Offset from SP-at-entry to the tagged base pointer.
   // Tagged base pointer is set up to point to the first (lowest address) tagged
   // stack slot.
   unsigned TaggedBasePointerOffset = 0;

   /// OutliningStyle denotes, if a function was outined, how it was outlined,
   /// e.g. Tail Call, Thunk, or Function if none apply.
   Optional<std::string> OutliningStyle;

+  // Offset from SP-after-callee-saved-spills (i.e. SP-at-entry minus
+  // CalleeSavedStackSize) to the address of the frame record.
+  int CalleeSaveBaseToFrameRecordOffset = 0;
+
 public:
   AArch64FunctionInfo() = default;

   explicit AArch64FunctionInfo(MachineFunction &MF) {
     (void)MF;

     // If we already know that the function doesn't have a redzone, set
     // HasRedZone here.
@@ 187 lines not shown @@ #endif

   unsigned getTaggedBasePointerOffset() const {
     return TaggedBasePointerOffset;
   }
   void setTaggedBasePointerOffset(unsigned Offset) {
     TaggedBasePointerOffset = Offset;
   }

+  int getCalleeSaveBaseToFrameRecordOffset() const {
+    return CalleeSaveBaseToFrameRecordOffset;
+  }
+  void setCalleeSaveBaseToFrameRecordOffset(int Offset) {
+    CalleeSaveBaseToFrameRecordOffset = Offset;
+  }
+
 private:
   // Hold the lists of LOHs.
   MILOHContainer LOHContainerSet;
   SetOfInstructions LOHRelated;

   DenseMap<int, std::pair<unsigned, MCSymbol *>> JumpTableEntryInfo;
 };

@@ 22 lines not shown @@

llvm/test/CodeGen/AArch64/framelayout-fp-csr.ll

This file was added.

				; RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu -disable-post-ra --frame-pointer=all < %s | FileCheck %s

				; The purpose of this test is to verify that frame pointer (x29)
				; is correctly setup in the presence of callee-saved floating
				; point registers. The frame pointer should point to the frame
				; record, which is located 16 bytes above the end of the CSR
				; space when a single FP CSR is in use.
				define void @test1(i32) #26 {
				entry:
				call void asm sideeffect "nop", "~{d8}"() #26
				ret void
				}
				; CHECK-LABEL: test1:
				; CHECK: str d8, [sp, #-32]!
				; CHECK-NEXT: stp x29, x30, [sp, #16]
				; CHECK-NEXT: add x29, sp, #16
				; CHECK: nop
				; CHECK: ldp x29, x30, [sp, #16]
				; CHECK-NEXT: ldr d8, [sp], #32
				; CHECK-NEXT: ret

				attributes #26 = { nounwind }

llvm/test/CodeGen/AArch64/framelayout-frame-record.mir

This file was added.

# RUN: llc -mtriple=aarch64-linux-gnu -start-before prologepilog %s -o - | FileCheck %s

---
name: TestFrameRecordLocation
tracksRegLiveness: true
frameInfo:
  isFrameAddressTaken: true
body: |
  bb.0:
    $d8 = IMPLICIT_DEF
    $d9 = IMPLICIT_DEF
    $x19 = IMPLICIT_DEF
    RET_ReallyLR

# CHECK-LABEL: TestFrameRecordLocation

# CHECK: stp d9, d8, [sp, #-48]!
# CHECK: stp x29, x30, [sp, #16]
# CHECK: str x19, [sp, #32]

# CHECK: add x29, sp, #16

# CHECK: .cfi_def_cfa w29, 32
# CHECK: .cfi_offset w19, -16
# CHECK: .cfi_offset w30, -24
# CHECK: .cfi_offset w29, -32
# CHECK: .cfi_offset b8, -40
# CHECK: .cfi_offset b9, -48
...
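The CFI values checked by this test follow from the expressions in the `AArch64FrameLowering.cpp` change. The sketch below reproduces that arithmetic outside LLVM: `cfaOffsetFromFP` and `cfiOffsetOfSlot` are hypothetical helper names, and the second helper assumes there are no fixed objects, so that the CFA equals SP plus the callee-save area size.

```cpp
#include <cassert>
#include <cstdint>

// CFA offset relative to FP, following the expression used in the patch:
//   OffsetToFirstCalleeSaveFromFP = FrameRecordOffset - CalleeSavedStackSize
//   CFA = FP + (FixedObject - OffsetToFirstCalleeSaveFromFP)
int64_t cfaOffsetFromFP(int64_t FixedObject, int64_t CalleeSavedStackSize,
                        int64_t FrameRecordOffset) {
  assert(FrameRecordOffset <= CalleeSavedStackSize &&
         "frame record must lie inside the callee-save area");
  int64_t OffsetToFirstCalleeSaveFromFP =
      FrameRecordOffset - CalleeSavedStackSize;
  return FixedObject - OffsetToFirstCalleeSaveFromFP;
}

// CFA-relative offset of a spill slot located SlotOffsetFromSP bytes above
// SP, assuming CFA = SP + CalleeSavedStackSize (no fixed objects).
int64_t cfiOffsetOfSlot(int64_t SlotOffsetFromSP,
                        int64_t CalleeSavedStackSize) {
  return SlotOffsetFromSP - CalleeSavedStackSize;
}
```

For this test the callee-save area is 48 bytes and the frame record sits at offset 16, so the CFA is FP + 32, matching `.cfi_def_cfa w29, 32`. The slot at `[sp, #32]` (x19) maps to -16 and the slot at `[sp, #16]` (x29) maps to -32, matching the `.cfi_offset` lines.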

llvm/test/CodeGen/AArch64/framelayout-unaligned-fp.ll

This file was added.

				; RUN: llc -verify-machineinstrs < %s | FileCheck %s

				; The purpose of this test is to construct a scenario where an odd number
				; of callee-saved GPRs as well as an odd number of callee-saved FPRs are
				; used. This caused the frame pointer to be aligned to a multiple of 8
				; on non-Darwin platforms, rather than a multiple of 16 as usual.

				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-unknown-linux-gnu"

				@a = global i64 0, align 4


				define i64 @b() {
				entry:
				%call = tail call i64 @d()
				%0 = alloca i8, i64 ptrtoint (i64 ()* @d to i64), align 16
				%1 = ptrtoint i8* %0 to i64
				store i64 %1, i64* @a, align 4
				%call1 = call i64 @e()
				%conv = sitofp i64 %call1 to float
				%2 = load i64, i64* @a, align 4
				%call2 = call i64 @f(i64 %2)
				%conv3 = fptosi float %conv to i64
				ret i64 %conv3
				}

				; CHECK-LABEL: b:
				; CHECK: str d8, [sp, #-32]!
				; CHECK-NEXT: stp x29, x30, [sp, #8]
				; CHECK-NEXT: str x19, [sp, #24]
				; CHECK-NEXT: add x29, sp, #8

				; CHECK: sub sp, x29, #8
				; CHECK-NEXT: ldr x19, [sp, #24]
				; CHECK-NEXT: ldp x29, x30, [sp, #8]
				; CHECK-NEXT: ldr d8, [sp], #32
				; CHECK-NEXT: ret

				declare i64 @d()
				declare i64 @e()
				declare i64 @f(i64)
