This is an archive of the discontinued LLVM Phabricator instance.

Differential D40876

AArch64: Fix emergency spillslot being out of reach for large callframes
ClosedPublic

Authored by MatzeB on Dec 5 2017, 6:11 PM.

Details

Reviewers

t.p.northover
qcolombet
aemerson
rengolin
eli.friedman
thegameg
gberry
efriedma

Commits

rG5c290dc20601: AArch64: Fix emergency spillslot being out of reach for large callframes
rGb42ffa1283ae: AArch64: Fix emergency spillslot being out of reach for large callframes
rL322919: AArch64: Fix emergency spillslot being out of reach for large callframes
rL322200: AArch64: Fix emergency spillslot being out of reach for large callframes

Summary

Large callframes (calls with several hundreds or thousands of
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.
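The reachability problem can be illustrated with a small sketch. It assumes a simplified layout in which SP points at the bottom of the outgoing argument area and the emergency spillslot sits directly above it; the helper names are hypothetical and not part of LLVM. The only hard fact used is that `ldur`/`stur` encode a signed 9-bit byte offset.

```cpp
#include <cassert>
#include <cstdint>

// ldur/stur take a signed 9-bit unscaled byte offset: [-256, 255].
constexpr int64_t MinUnscaledImm = -256;
constexpr int64_t MaxUnscaledImm = 255;

constexpr bool fitsUnscaledImm(int64_t Offset) {
  return Offset >= MinUnscaledImm && Offset <= MaxUnscaledImm;
}

// Hypothetical, simplified layout: SP points at the bottom of the outgoing
// argument area and the emergency spillslot sits directly above that area,
// so its SP-relative offset equals the maximum callframe size.
constexpr int64_t spRelativeSlotOffset(int64_t MaxCallFrameSize) {
  return MaxCallFrameSize;
}

// Usage sketch: a byval [4096 x i64] argument needs 4096 * 8 = 32768 bytes.
inline void checkLargeCallFrame() {
  assert(fitsUnscaledImm(spRelativeSlotOffset(64)));        // small callframe
  assert(!fitsUnscaledImm(spRelativeSlotOffset(4096 * 8))); // out of reach
  assert(fitsUnscaledImm(-16)); // an assumed slot just below FP is reachable
}
```

Under these assumptions, any callframe larger than 255 bytes pushes the slot beyond a single unscaled immediate, which is the situation the frame pointer is meant to avoid.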

This commit does several things:

Compute max callframe size at the end of instruction selection.
Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file.
Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes.
Always place the emergency spillslot close to FP if we have a frame pointer.
Note that useFPForScavengingIndex() would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early.
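The change to useFPForScavengingIndex() described in the last item can be sketched as two predicates over a plain struct. `FrameFacts` and the function names are hypothetical stand-ins for the queries the real code makes; the two conditions themselves are taken from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the facts the real predicates query.
struct FrameFacts {
  bool HasVarSizedObjects;
  bool HasBasePointer;
  bool HasFP;
};

// Condition before the patch: only place the slot near FP when there are
// variable-sized objects and no base pointer is available.
inline bool oldUseFPForScavengingIndex(const FrameFacts &F) {
  return F.HasVarSizedObjects && !F.HasBasePointer;
}

// Condition after the patch: place the slot near FP whenever there is one.
inline bool newUseFPForScavengingIndex(const FrameFacts &F) {
  return F.HasFP;
}

// The base-pointer case mentioned in the summary: the old predicate declines,
// the new one allocates the slot early.
inline void checkBasePointerCase() {
  FrameFacts F{/*HasVarSizedObjects=*/true, /*HasBasePointer=*/true,
               /*HasFP=*/true};
  assert(!oldUseFPForScavengingIndex(F));
  assert(newUseFPForScavengingIndex(F));
}
```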

I'm still in the process of coming up with a reliable and good testcase. As testcases with hundreds of arguments are unwieldy I will probably end up placing a correctness test into the llvm test-suite for this.

Diff Detail

Repository: rL LLVM

Event Timeline

MatzeB created this revision. Dec 5 2017, 6:11 PM

Herald added subscribers: kristof.beyls, javed.absar, mcrosier. Dec 5 2017, 6:12 PM

MatzeB added inline comments. Dec 5 2017, 6:13 PM

lib/Target/AArch64/AArch64FrameLowering.cpp
1269 ↗	(On Diff #125663)	Please ignore this line: I accidentally didn't take out this debug helper.

As testcases with hundreds of arguments are unwieldy

You might be able to get a similar effect with "define void @foo([1000 x i32] %x) {" or something like that. (I'm not sure when exactly we need the emergency spill slot on aarch64, so I'm not sure that helps.)

I'm still in the process of coming up with a reliable and good testcase. As testcases with hundreds of arguments are unwieldy I will probably end up placing a correctness test into the llvm test-suite for this.

FWIW, a while ago, I added the following test to the test-suite to test a whole range of different frame layout aspects. Maybe that test could be extended for the case of a large frame somehow?
Probably, the place to look into extending this test would be to add a new template parameter to function check_frame_variant.
http://llvm.org/viewvc/llvm-project/test-suite/trunk/MultiSource/UnitTests/C%2B%2B11/frame_layout/frame_layout.cpp?view=markup

So this works as a testcase, however I wouldn't want to check this in either; as no matter what I do, for huge callframes we always produce a huge number of stores in selectiondag (or fastisel) that take a long time to create and optimize away even when the argument is simply undef...

target triple = "arm64--"

declare void @extfunc([4096 x i64] %p)

define void @func() {
  %lvar = alloca i8

  %v00 = load volatile i8, i8* %lvar
  %v01 = load volatile i8, i8* %lvar
  %v02 = load volatile i8, i8* %lvar
  %v03 = load volatile i8, i8* %lvar
  %v04 = load volatile i8, i8* %lvar
  %v05 = load volatile i8, i8* %lvar
  %v06 = load volatile i8, i8* %lvar
  %v07 = load volatile i8, i8* %lvar
  %v08 = load volatile i8, i8* %lvar
  %v09 = load volatile i8, i8* %lvar
  %v10 = load volatile i8, i8* %lvar
  %v11 = load volatile i8, i8* %lvar
  %v12 = load volatile i8, i8* %lvar
  %v13 = load volatile i8, i8* %lvar
  %v14 = load volatile i8, i8* %lvar
  %v15 = load volatile i8, i8* %lvar
  %v16 = load volatile i8, i8* %lvar
  %v17 = load volatile i8, i8* %lvar
  %v18 = load volatile i8, i8* %lvar
  %v19 = load volatile i8, i8* %lvar
  %v20 = load volatile i8, i8* %lvar
  %v21 = load volatile i8, i8* %lvar
  %v22 = load volatile i8, i8* %lvar
  %v23 = load volatile i8, i8* %lvar
  %v24 = load volatile i8, i8* %lvar
  %v25 = load volatile i8, i8* %lvar
  %v26 = load volatile i8, i8* %lvar
  %v27 = load volatile i8, i8* %lvar
  %v28 = load volatile i8, i8* %lvar
  %v29 = load volatile i8, i8* %lvar
  %v30 = load volatile i8, i8* %lvar
  %v31 = load volatile i8, i8* %lvar

  store volatile i8 %v00, i8* %lvar
  store volatile i8 %v01, i8* %lvar
  store volatile i8 %v02, i8* %lvar
  store volatile i8 %v03, i8* %lvar
  store volatile i8 %v04, i8* %lvar
  store volatile i8 %v05, i8* %lvar
  store volatile i8 %v06, i8* %lvar
  store volatile i8 %v07, i8* %lvar
  store volatile i8 %v08, i8* %lvar
  store volatile i8 %v09, i8* %lvar
  store volatile i8 %v10, i8* %lvar
  store volatile i8 %v11, i8* %lvar
  store volatile i8 %v12, i8* %lvar
  store volatile i8 %v13, i8* %lvar
  store volatile i8 %v14, i8* %lvar
  store volatile i8 %v15, i8* %lvar
  store volatile i8 %v16, i8* %lvar
  store volatile i8 %v17, i8* %lvar
  store volatile i8 %v18, i8* %lvar
  store volatile i8 %v19, i8* %lvar
  store volatile i8 %v20, i8* %lvar
  store volatile i8 %v21, i8* %lvar
  store volatile i8 %v22, i8* %lvar
  store volatile i8 %v23, i8* %lvar
  store volatile i8 %v24, i8* %lvar
  store volatile i8 %v25, i8* %lvar
  store volatile i8 %v26, i8* %lvar
  store volatile i8 %v27, i8* %lvar
  store volatile i8 %v28, i8* %lvar
  store volatile i8 %v29, i8* %lvar
  store volatile i8 %v30, i8* %lvar
  store volatile i8 %v31, i8* %lvar

  call void @extfunc([4096 x i64] undef)
  ret void
}

(this takes an embarrassing 50 seconds to run on my machine)
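Since the testcase above is highly repetitive, it could also be produced by a small generator. The following is only a sketch: `makeBigCallFrameIR` and its parameters are hypothetical, and it simply reproduces the textual shape of the IR shown above (the volatile loads followed by stores presumably serve to keep many values live at once).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: emits IR of the same shape as the testcase above.
// NumValues volatile loads are followed by NumValues volatile stores, then a
// call passing an undef [ArrayElems x i64] by value.
inline std::string makeBigCallFrameIR(unsigned NumValues, unsigned ArrayElems) {
  assert(NumValues > 0 && ArrayElems > 0);
  const std::string ArrayTy = "[" + std::to_string(ArrayElems) + " x i64]";

  std::string IR = "target triple = \"arm64--\"\n\n";
  IR += "declare void @extfunc(" + ArrayTy + " %p)\n\n";
  IR += "define void @func() {\n  %lvar = alloca i8\n\n";

  char Name[16]; // "%v" plus at most 10 digits and the terminator
  for (unsigned I = 0; I < NumValues; ++I) {
    std::snprintf(Name, sizeof(Name), "%%v%02u", I);
    IR += std::string("  ") + Name + " = load volatile i8, i8* %lvar\n";
  }
  IR += "\n";
  for (unsigned I = 0; I < NumValues; ++I) {
    std::snprintf(Name, sizeof(Name), "%%v%02u", I);
    IR += std::string("  store volatile i8 ") + Name + ", i8* %lvar\n";
  }
  IR += "\n  call void @extfunc(" + ArrayTy + " undef)\n  ret void\n}\n";
  return IR;
}
```

Calling `makeBigCallFrameIR(32, 4096)` yields text of the same shape as the testcase above; it does nothing about the slow compile time mentioned here.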

I can reproduce the crash with the following:

target triple = "arm64--"
declare void @extfunc([4096 x i64]* byval %p)
define void @func([4096 x i64]* %z) {
  %lvar = alloca [31 x i8]
  %v00 = load volatile [31 x i8], [31 x i8]* %lvar
  store volatile [31 x i8] %v00, [31 x i8]* %lvar
  call void @extfunc([4096 x i64]* byval %z)
  ret void
}

In D40876#947550, @efriedma wrote:

I can reproduce the crash with the following:

Oh indeed the byval variant seems to compile fast, thanks!

Added testcase

ping

One thing I noticed when looking at this: it seems fragile to me to have MachineFrameInfo::computeMaxCallFrameSize() and PEI::calculateFrameInfo() doing essentially the same calculation with duplicated code.
Would it make sense to add an assert to PEI::calculateFrameInfo that checks that the previously calculated MaxCallFrameSize isn't smaller than the one calculated later?

lib/Target/AArch64/AArch64FrameLowering.cpp
207 ↗	(On Diff #125865)	Could you get rid of the magic 255 constant here?

In D40876#958683, @gberry wrote:

One thing I noticed when looking at this: it seems fragile to me to have MachineFrameInfo::computeMaxCallFrameSize() and PEI::calculateFrameInfo() doing essentially the same calculation with duplicated code.
Would it make sense to add an assert to PEI::calculateFrameInfo that checks that the previously calculated MaxCallFrameSize isn't smaller than the one calculated later?

After thinking about this some more, adding this assert doesn't make much sense, since e.g. spilling will increase the MaxCallFrameSize. Speaking of which, don't we still have a problem if e.g. spilling puts us over the large frame size threshold?

In D40876#958683, @gberry wrote:

One thing I noticed when looking at this: it seems fragile to me to have MachineFrameInfo::computeMaxCallFrameSize() and PEI::calculateFrameInfo() doing essentially the same calculation with duplicated code.
Would it make sense to add an assert to PEI::calculateFrameInfo that checks that the previously calculated MaxCallFrameSize isn't smaller than the one calculated later?

Indeed, that's why I added this to calculateCallFrameInfo() a while back:

assert(!MFI.isMaxCallFrameSizeComputed() ||
       (MFI.getMaxCallFrameSize() == MaxCallFrameSize &&
        MFI.adjustsStack() == AdjustsStack));
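As a rough illustration of the calculation that both places perform, the sketch below scans a modelled instruction list for callframe setup pseudos and takes the largest frame size. `InstrModel` and `computeMaxCallFrameSizeModel` are hypothetical simplifications, not the LLVM data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of a machine instruction: only records whether it is a
// callframe setup pseudo and, if so, how many bytes it reserves.
struct InstrModel {
  bool IsCallFrameSetup;
  unsigned FrameSize;
};

// Largest callframe requested by any setup pseudo in the function body.
inline unsigned
computeMaxCallFrameSizeModel(const std::vector<InstrModel> &Instrs) {
  unsigned Max = 0;
  for (const InstrModel &I : Instrs)
    if (I.IsCallFrameSetup)
      Max = std::max(Max, I.FrameSize);
  return Max;
}

// The early value (end of ISel) and the late value (PEI) agree as long as the
// set of callframe pseudos is unchanged in between.
inline void checkEarlyMatchesLate() {
  std::vector<InstrModel> Body = {{true, 16}, {false, 0}, {true, 32768}};
  unsigned Early = computeMaxCallFrameSizeModel(Body);
  unsigned Late = computeMaxCallFrameSizeModel(Body);
  assert(Early == 32768 && Early == Late);
}
```

In this model, a body whose setup pseudos have already been eliminated reports 0, which matches the caveat in the patch about computing the value for .mir files after PEI.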

In D40876#958718, @gberry wrote:

In D40876#958683, @gberry wrote:

One thing I noticed when looking at this: it seems fragile to me to have MachineFrameInfo::computeMaxCallFrameSize() and PEI::calculateFrameInfo() doing essentially the same calculation with duplicated code.
Would it make sense to add an assert to PEI::calculateFrameInfo that checks that the previously calculated MaxCallFrameSize isn't smaller than the one calculated later?

After thinking about this some more, adding this assert doesn't make much sense, since e.g. spilling will increase the MaxCallFrameSize. Speaking of which, don't we still have a problem if e.g. spilling puts us over the large frame size threshold?

Spilling is no problem as the emergency spillslot is placed last in the stackframe (when doing SP relative access). The problem with callframes is just that they have to come after the stackframe. With this change the problematic cases are switched to use a frame pointer, which means we will move the emergency spillslot to the beginning of the stackframe where it will always be in reach for the FP.

Factor out magic number into a global static constant.

And add a comment of why the DefaultSafeSPDisplacement is good enough in this case and we don't care about vector loads/stores supporting no offset at all.

I think what is bothering me about this change is that the return value of hasFP() now seems more dynamic. Did you consider a potentially simpler fix of just creating the spill slot as a fixed stack object with a hard-coded offset that would guarantee it is directly addressable from the SP/FP?

In D40876#966714, @gberry wrote:

I think what is bothering me about this change is that the return value of hasFP() now seems more dynamic. Did you consider a potentially simpler fix of just creating the spill slot as a fixed stack object with a hard-coded offset that would guarantee it is directly addressable from the SP/FP?

I do create a spill slot that is easily accessible from FP, that's the whole point of this patch. Unfortunately I have to force FP usage in these cases, hence the hasFP changes.

I don't see any way to do this SP relative, since at the point of the call SP has to point to the callframe, no way around that. And the callframe is so large in this situation that we cannot reach the spillslot before it by a simple immediate.

Okay, I get it now, I had the wrong case in mind. I was thinking that you could put the scavenge slot close to [sp], but that doesn't work since you have a large outgoing stack parameter area.
This change looks good to me, but you might want to get a second opinion since my thinking on this hasn't been that clear.

Potential alternative approach: could we allocate the emergency spill slot in the red zone?

lib/Target/AArch64/AArch64FrameLowering.cpp
213 ↗	(On Diff #127419)	isMaxCallFrameSizeComputed() is false until after register allocation, right? That means register allocation will never allocate a value into the frame pointer register. This needs better comments, and I'm not sure we want to unconditionally reserve the frame pointer register just to fix an obscure bug with very large call frames.

I'm not sure using the redzone is safe on all targets.

lib/Target/AArch64/AArch64FrameLowering.cpp
213 ↗	(On Diff #127419)	Part of this change is to call computeMaxCallFrameSize() from finalizeLowering() (called at the end of ISel), so I don't think this is a problem?
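The part of hasFP() under discussion can be sketched in isolation. `CallFrameState` and `largeCallFrameForcesFP` are hypothetical names; the condition and the constant mirror the diff, including the conservative answer when the size has not been computed yet.

```cpp
#include <cassert>

// Same value the patch introduces as DefaultSafeSPDisplacement.
constexpr unsigned DefaultSafeSPDisplacement = 255;

// Hypothetical stand-in for the two MachineFrameInfo queries involved.
struct CallFrameState {
  bool IsMaxCallFrameSizeComputed;
  unsigned MaxCallFrameSize;
};

// Returns true when the large-callframe rule alone would force a frame
// pointer. Before the size is known the answer is conservatively "true".
inline bool largeCallFrameForcesFP(const CallFrameState &S) {
  if (!S.IsMaxCallFrameSizeComputed)
    return true;
  return S.MaxCallFrameSize > DefaultSafeSPDisplacement;
}

inline void checkTiming() {
  // Size not yet computed: FP is reserved regardless of the real size.
  assert(largeCallFrameForcesFP({false, 0}));
  // Size computed: small callframes no longer force a frame pointer.
  assert(!largeCallFrameForcesFP({true, 64}));
}
```

This is why the point at which the maximum callframe size gets computed matters: in the sketch, only the not-yet-computed state reserves the frame pointer unconditionally.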

efriedma added inline comments. Jan 3 2018, 1:50 PM

lib/Target/AArch64/AArch64FrameLowering.cpp
213 ↗	(On Diff #127419)	Oh, sorry, missed that part. That's much less scary, then. The only thing I can think of it might impact is inline asm. Do we have any testcases with inline asm using the frame pointer as an operand or result?

MatzeB added inline comments. Jan 8 2018, 9:49 AM

lib/Target/AArch64/AArch64FrameLowering.cpp
213 ↗	(On Diff #127419)	I also added an early `computeMaxCallFrameSize()` to the ARM target a while ago and so far it is working fine. As for inline assembly: Note that this only changes behavior in code with very unusual calls having a huge number of parameters. I would consider omitting the frame pointer an optimization that you cannot/should not rely on, as there are various factors outside the author's control that could influence whether the compiler is able to perform it. There is no inline asm constraint letter defined for FP as far as I can see (so you only could rely on it indirectly by having so many register arguments to inline assembly that you need to use the FP).

There is no inline asm constraint letter defined for FP as far as I can see (so you only could rely on it indirectly by having so many register arguments to inline assembly that you need to use the FP).

You can constrain an asm operand to a specific register using a local register variable. Granted, the behavior of that is kind of confusing even without this patch, so maybe it doesn't matter.

LGTM

Inline asm explicitly specifying fp probably works as well as it ever did (which is not very well, given you can very easily generate bad assembly), and I can't see any other issues, especially if ARM is already doing something similar with computeMaxCallFrameSize.

This revision is now accepted and ready to land. Jan 8 2018, 4:03 PM

Closed by commit rL322200: AArch64: Fix emergency spillslot being out of reach for large callframes (authored by matze). Jan 10 2018, 10:17 AM

This revision was automatically updated to reflect the committed changes.

thegameg added a child revision: D45358: [AArch64] Use FP to access the emergency spill slot. Apr 6 2018, 2:47 AM

Revision Contents

Path	Size
llvm/trunk/include/llvm/CodeGen/TargetSubtargetInfo.h	3 lines
llvm/trunk/lib/CodeGen/MIRParser/MIRParser.cpp	2 lines
llvm/trunk/lib/CodeGen/TargetSubtargetInfo.cpp	3 lines
llvm/trunk/lib/Target/AArch64/AArch64FrameLowering.cpp	32 lines
llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h	2 lines
llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp	5 lines
llvm/trunk/lib/Target/AArch64/AArch64RegisterInfo.cpp	12 lines
llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h	2 lines
llvm/trunk/lib/Target/AArch64/AArch64Subtarget.cpp	10 lines
llvm/trunk/test/CodeGen/AArch64/big-callframe.ll	15 lines

Diff 129299

llvm/trunk/include/llvm/CodeGen/TargetSubtargetInfo.h

public:
/// Enable tracking of subregister liveness in register allocator.		/// Enable tracking of subregister liveness in register allocator.
/// Please use MachineRegisterInfo::subRegLivenessEnabled() instead where		/// Please use MachineRegisterInfo::subRegLivenessEnabled() instead where
/// possible.		/// possible.
virtual bool enableSubRegLiveness() const { return false; }		virtual bool enableSubRegLiveness() const { return false; }

/// Returns string representation of scheduler comment		/// Returns string representation of scheduler comment
std::string getSchedInfoStr(const MachineInstr &MI) const override;		std::string getSchedInfoStr(const MachineInstr &MI) const override;
std::string getSchedInfoStr(MCInst const &MCI) const override;		std::string getSchedInfoStr(MCInst const &MCI) const override;

		/// This is called after a .mir file was loaded.
		virtual void mirFileLoaded(MachineFunction &MF) const;
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_CODEGEN_TARGETSUBTARGETINFO_H		#endif // LLVM_CODEGEN_TARGETSUBTARGETINFO_H

llvm/trunk/lib/CodeGen/MIRParser/MIRParser.cpp

MIRParserImpl::initializeMachineFunction(const yaml::MachineFunction &YamlMF,
}		}
PFS.SM = &SM;		PFS.SM = &SM;

if (setupRegisterInfo(PFS, YamlMF))		if (setupRegisterInfo(PFS, YamlMF))
return true;		return true;

computeFunctionProperties(MF);		computeFunctionProperties(MF);

		MF.getSubtarget().mirFileLoaded(MF);

MF.verify();		MF.verify();
return false;		return false;
}		}

bool MIRParserImpl::parseRegisterInfo(PerFunctionMIParsingState &PFS,		bool MIRParserImpl::parseRegisterInfo(PerFunctionMIParsingState &PFS,
const yaml::MachineFunction &YamlMF) {		const yaml::MachineFunction &YamlMF) {
MachineFunction &MF = PFS.MF;		MachineFunction &MF = PFS.MF;
MachineRegisterInfo &RegInfo = MF.getRegInfo();		MachineRegisterInfo &RegInfo = MF.getRegInfo();

llvm/trunk/lib/CodeGen/TargetSubtargetInfo.cpp

else if (TSchedModel.hasInstrItineraries()) {
Latency = ItinData->getStageLatency(		Latency = ItinData->getStageLatency(
getInstrInfo()->get(MCI.getOpcode()).getSchedClass());		getInstrInfo()->get(MCI.getOpcode()).getSchedClass());
} else		} else
return std::string();		return std::string();
Optional<double> RThroughput =		Optional<double> RThroughput =
TSchedModel.computeInstrRThroughput(MCI.getOpcode());		TSchedModel.computeInstrRThroughput(MCI.getOpcode());
return createSchedInfoStr(Latency, RThroughput);		return createSchedInfoStr(Latency, RThroughput);
}		}

		void TargetSubtargetInfo::mirFileLoaded(MachineFunction &MF) const {
		}

llvm/trunk/lib/Target/AArch64/AArch64FrameLowering.cpp

#define DEBUG_TYPE "frame-info"		#define DEBUG_TYPE "frame-info"

static cl::opt<bool> EnableRedZone("aarch64-redzone",		static cl::opt<bool> EnableRedZone("aarch64-redzone",
cl::desc("enable use of redzone on AArch64"),		cl::desc("enable use of redzone on AArch64"),
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);

STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");		STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");

		/// This is the biggest offset to the stack pointer we can encode in aarch64
		/// instructions (without using a separate calculation and a temp register).
		/// Note that the exception here are vector stores/loads which cannot encode any
		/// displacements (see estimateRSStackSizeLimit(), isAArch64FrameOffsetLegal()).
		static const unsigned DefaultSafeSPDisplacement = 255;

/// Look at each instruction that references stack frames and return the stack		/// Look at each instruction that references stack frames and return the stack
/// size limit beyond which some of these instructions will require a scratch		/// size limit beyond which some of these instructions will require a scratch
/// register during their expansion later.		/// register during their expansion later.
static unsigned estimateRSStackSizeLimit(MachineFunction &MF) {		static unsigned estimateRSStackSizeLimit(MachineFunction &MF) {
// FIXME: For now, just conservatively guestimate based on unscaled indexing		// FIXME: For now, just conservatively guestimate based on unscaled indexing
// range. We'll end up allocating an unnecessary spill slot a lot, but		// range. We'll end up allocating an unnecessary spill slot a lot, but
// realistically that's not a big deal at this stage of the game.		// realistically that's not a big deal at this stage of the game.
for (MachineBasicBlock &MBB : MF) {		for (MachineBasicBlock &MBB : MF) {
Show All 9 Lines	for (MachineInstr &MI : MBB) {

int Offset = 0;		int Offset = 0;
if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==		if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==
AArch64FrameOffsetCannotUpdate)		AArch64FrameOffsetCannotUpdate)
return 0;		return 0;
}		}
}		}
}		}
return 255;		return DefaultSafeSPDisplacement;
}		}

bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {		bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
if (!EnableRedZone)		if (!EnableRedZone)
return false;		return false;
// Don't use the red zone if the function explicitly asks us not to.		// Don't use the red zone if the function explicitly asks us not to.
// This is typically used for kernel code.		// This is typically used for kernel code.
if (MF.getFunction().hasFnAttribute(Attribute::NoRedZone))		if (MF.getFunction().hasFnAttribute(Attribute::NoRedZone))
return false;		return false;

const MachineFrameInfo &MFI = MF.getFrameInfo();		const MachineFrameInfo &MFI = MF.getFrameInfo();
const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();		const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
unsigned NumBytes = AFI->getLocalStackSize();		unsigned NumBytes = AFI->getLocalStackSize();

return !(MFI.hasCalls() || hasFP(MF) || NumBytes > 128);		return !(MFI.hasCalls() || hasFP(MF) || NumBytes > 128);
}		}

/// hasFP - Return true if the specified function should have a dedicated frame		/// hasFP - Return true if the specified function should have a dedicated frame
/// pointer register.		/// pointer register.
bool AArch64FrameLowering::hasFP(const MachineFunction &MF) const {		bool AArch64FrameLowering::hasFP(const MachineFunction &MF) const {
const MachineFrameInfo &MFI = MF.getFrameInfo();		const MachineFrameInfo &MFI = MF.getFrameInfo();
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();		const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
// Retain behavior of always omitting the FP for leaf functions when possible.		// Retain behavior of always omitting the FP for leaf functions when possible.
return (MFI.hasCalls() &&		if (MFI.hasCalls() && MF.getTarget().Options.DisableFramePointerElim(MF))
MF.getTarget().Options.DisableFramePointerElim(MF)) ||		return true;
MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() ||		if (MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() ||
MFI.hasStackMap() || MFI.hasPatchPoint() ||		MFI.hasStackMap() || MFI.hasPatchPoint() ||
RegInfo->needsStackRealignment(MF);		RegInfo->needsStackRealignment(MF))
		return true;
		// With large callframes around we may need to use FP to access the scavenging
		// emergency spillslot.
		//
		// Unfortunately some calls to hasFP() like machine verifier ->
		// getReservedReg() -> hasFP in the middle of global isel are too early
		// to know the max call frame size. Hopefully conservatively returning "true"
		// in those cases is fine.
		// DefaultSafeSPDisplacement is fine as we only emergency spill GP regs.
		if (!MFI.isMaxCallFrameSizeComputed() ||
		MFI.getMaxCallFrameSize() > DefaultSafeSPDisplacement)
		return true;

		return false;
}		}

/// hasReservedCallFrame - Under normal circumstances, when a frame pointer is		/// hasReservedCallFrame - Under normal circumstances, when a frame pointer is
/// not required, we reserve argument space for call sites in the function		/// not required, we reserve argument space for call sites in the function
/// immediately on entry to the current function. This eliminates the need for		/// immediately on entry to the current function. This eliminates the need for
/// add/sub sp brackets around call sites. Returns true if the call frame is		/// add/sub sp brackets around call sites. Returns true if the call frame is
/// included as part of the stack frame.		/// included as part of the stack frame.
bool		bool

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h

private:
bool getPostIndexedAddressParts(SDNode *N, SDNode *Op, SDValue &Base,		bool getPostIndexedAddressParts(SDNode *N, SDNode *Op, SDValue &Base,
SDValue &Offset, ISD::MemIndexedMode &AM,		SDValue &Offset, ISD::MemIndexedMode &AM,
SelectionDAG &DAG) const override;		SelectionDAG &DAG) const override;

void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue> &Results,		void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue> &Results,
SelectionDAG &DAG) const override;		SelectionDAG &DAG) const override;

bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;		bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;

		void finalizeLowering(MachineFunction &MF) const override;
};		};

namespace AArch64 {		namespace AArch64 {
FastISel *createFastISel(FunctionLoweringInfo &funcInfo,		FastISel *createFastISel(FunctionLoweringInfo &funcInfo,
const TargetLibraryInfo *libInfo);		const TargetLibraryInfo *libInfo);
} // end namespace AArch64		} // end namespace AArch64

} // end namespace llvm		} // end namespace llvm

#endif		#endif

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp

	unsigned			unsigned
	AArch64TargetLowering::getVaListSizeInBits(const DataLayout &DL) const {			AArch64TargetLowering::getVaListSizeInBits(const DataLayout &DL) const {
	if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows())			if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows())
	return getPointerTy(DL).getSizeInBits();			return getPointerTy(DL).getSizeInBits();

	return 3 * getPointerTy(DL).getSizeInBits() + 2 * 32;			return 3 * getPointerTy(DL).getSizeInBits() + 2 * 32;
	}			}

				void AArch64TargetLowering::finalizeLowering(MachineFunction &MF) const {
				MF.getFrameInfo().computeMaxCallFrameSize(MF);
				TargetLoweringBase::finalizeLowering(MF);
				}

llvm/trunk/lib/Target/AArch64/AArch64RegisterInfo.cpp

	bool AArch64RegisterInfo::requiresVirtualBaseRegisters(			bool AArch64RegisterInfo::requiresVirtualBaseRegisters(
	const MachineFunction &MF) const {			const MachineFunction &MF) const {
	return true;			return true;
	}			}

	bool			bool
	AArch64RegisterInfo::useFPForScavengingIndex(const MachineFunction &MF) const {			AArch64RegisterInfo::useFPForScavengingIndex(const MachineFunction &MF) const {
	const MachineFrameInfo &MFI = MF.getFrameInfo();			// This function indicates whether the emergency spillslot should be placed
	// AArch64FrameLowering::resolveFrameIndexReference() can always fall back			// close to the beginning of the stackframe (closer to FP) or the end
	// to the stack pointer, so only put the emergency spill slot next to the			// (closer to SP).
	// FP when there's no better way to access it (SP or base pointer).			//
	return MFI.hasVarSizedObjects() && !hasBasePointer(MF);			// The beginning works most reliably if we have a frame pointer.
				const AArch64FrameLowering &TFI = *getFrameLowering(MF);
				return TFI.hasFP(MF);
	}			}

	bool AArch64RegisterInfo::requiresFrameIndexScavenging(			bool AArch64RegisterInfo::requiresFrameIndexScavenging(
	const MachineFunction &MF) const {			const MachineFunction &MF) const {
	return true;			return true;
	}			}

	bool			bool

llvm/trunk/lib/Target/AArch64/AArch64Subtarget.h

bool isCallingConvWin64(CallingConv::ID CC) const {
case CallingConv::C:		case CallingConv::C:
return isTargetWindows();		return isTargetWindows();
case CallingConv::Win64:		case CallingConv::Win64:
return true;		return true;
default:		default:
return false;		return false;
}		}
}		}

		void mirFileLoaded(MachineFunction &MF) const override;
};		};
} // End llvm namespace		} // End llvm namespace

#endif		#endif

llvm/trunk/lib/Target/AArch64/AArch64Subtarget.cpp

bool AArch64Subtarget::supportsAddressTopByteIgnored() const {

return false;		return false;
}		}

std::unique_ptr<PBQPRAConstraint>		std::unique_ptr<PBQPRAConstraint>
AArch64Subtarget::getCustomPBQPConstraints() const {		AArch64Subtarget::getCustomPBQPConstraints() const {
return balanceFPOps() ? llvm::make_unique<A57ChainingConstraint>() : nullptr;		return balanceFPOps() ? llvm::make_unique<A57ChainingConstraint>() : nullptr;
}		}

		void AArch64Subtarget::mirFileLoaded(MachineFunction &MF) const {
		// We usually compute max call frame size after ISel. Do the computation now
		// if the .mir file didn't specify it. Note that this will probably give you
		// bogus values after PEI has eliminated the callframe setup/destroy pseudo
		// instructions, specify explicitely if you need it to be correct.
		MachineFrameInfo &MFI = MF.getFrameInfo();
		if (!MFI.isMaxCallFrameSizeComputed())
		MFI.computeMaxCallFrameSize(MF);
		}

llvm/trunk/test/CodeGen/AArch64/big-callframe.ll

				; RUN: llc -o - %s | FileCheck %s
				; Make sure we use a frame pointer and fp relative addressing for the emergency
				; spillslot when we have gigantic callframes.
				; CHECK-LABEL: func:
				; CHECK: stur {{.*}}, [x29, #{{.*}}] // 8-byte Folded Spill
				; CHECK: ldur {{.*}}, [x29, #{{.*}}] // 8-byte Folded Reload
				target triple = "aarch64--"
				declare void @extfunc([4096 x i64]* byval %p)
				define void @func([4096 x i64]* %z) {
				%lvar = alloca [31 x i8]
				%v = load volatile [31 x i8], [31 x i8]* %lvar
				store volatile [31 x i8] %v, [31 x i8]* %lvar
				call void @extfunc([4096 x i64]* byval %z)
				ret void
				}
