This is an archive of the discontinued LLVM Phabricator instance.

Differential D14147

Handling of aligned allocas on a target that does not align stack pointer.
ClosedPublic

Authored by jonpa on Oct 28 2015, 7:05 AM.

Details

Reviewers

hfinkel

Summary

[Stack realignment] Handling of aligned allocas.

This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.
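
The classification rule described above can be sketched as a small predicate. This is only an illustration: `TargetStackInfo`, `AllocaKind`, and `classifyAlloca` are hypothetical names, not LLVM APIs; the condition itself mirrors the one the patch adds to FunctionLoweringInfo.

```cpp
#include <cassert>

// Hypothetical stand-in for the two target properties the rule depends on.
struct TargetStackInfo {
  unsigned StackAlign;   // default stack alignment in bytes
  bool StackRealignable; // can the target realign its stack pointer?
};

enum class AllocaKind { StaticFrameObject, DynamicStackAlloc };

// A static alloca is folded into the fixed frame only if the target can
// realign the stack or the requested alignment already fits the default
// stack alignment. Everything else goes through DYNAMIC_STACKALLOC.
inline AllocaKind classifyAlloca(bool IsStaticAlloca, unsigned Align,
                                 const TargetStackInfo &TSI) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  if (IsStaticAlloca && (TSI.StackRealignable || Align <= TSI.StackAlign))
    return AllocaKind::StaticFrameObject;
  return AllocaKind::DynamicStackAlloc;
}
```

With an 8-byte stack alignment and `StackRealignable` set to false, a static alloca aligned to 128 bytes is classified as dynamic, while the same alloca on a realignable target stays a regular frame object.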

It would be good to group aligned allocas into a single big alloca as
an optimization, but this remains to be done.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

Diff Detail

Event Timeline

jonpa updated this revision to Diff 38654. Oct 28 2015, 7:05 AM

jonpa retitled this revision to "Handling of aligned allocas on a target that does not align stack pointer."

jonpa updated this object.

jonpa added subscribers: llvm-commits, uweigand.

ping!

This patch has been on Phabricator for a while, and still needs review and approval.

SystemZ maintains normal SP alignment always, and instead dynamically realigns stack objects when needed.

Before we get into the details, please explain this statement. Why can you not implement dynamic stack realignment using a base pointer like other targets?

lib/CodeGen/MachineFunction.cpp
515	You've added this extra parameter and made a number of other changes just to produce a debug message. Please don't do that. Either make this a member function so it can compute the necessary condition itself, or remove this completely.
lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
117	Indentation looks odd here.

In D14147#287293, @hfinkel wrote:

SystemZ maintains normal SP alignment always, and instead dynamically realigns stack objects when needed.

Before we get into the details, please explain this statement. Why can you not implement dynamic stack realignment using a base pointer like other targets?

Given that Jonas' patch is based on a suggestion of mine, I'll jump in here :-)

Of course, we *can* implement dynamic stack realignment using a new reserved hard register like other targets. The point is rather that we don't *need* to. On SystemZ, the only parts of the stack frame that require non-default alignment are local variables that were manually over-aligned by the programmer. It is easily possible to implement this without any target support by just doing a bigger alloca and manually aligning a pointer within that area. (In fact, you can do this even in the source code without any compiler support.)

This is different from the situation on other platforms like Intel, where some of the "special" areas of the frame may need non-default alignment, like parameter areas, register save areas, or spill slots. In those cases, it is not possible to implement the alignment requirement without special target support, and that's where the special prolog/epilog code using an extra base register comes in.

Because of this difference, GCC supports two flavors of stack realignment: for those platforms that require it, you can use the extra base register and related code (implemented by the target back-end); but for those platforms that do *not* require it (which is actually most of them), common code simply implements alignment for local variables using generic code (no back-end changes required).

This not only minimizes code changes (most back-ends require no extra code), but also results in more efficient code for targets like SystemZ, since we do not need to reserve an extra hard register.

Jonas' patch is trying to implement a similar scheme for LLVM: back-ends may choose to implement realignment via an extra base pointer, but for those that don't (need to), common code will still handle the local variable case via generic code. (As Jonas said in the initial submission, this generic implementation is still not quite as efficient as it could be, but that can be improved later ...)
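
The "bigger alloca plus manually aligned pointer" approach mentioned above can be illustrated at source level. The following is a minimal sketch, not code from the patch; `alignUp` and `useAlignedLocal` are hypothetical helpers. It pads by `Align - 1` because a plain byte buffer has no alignment guarantee, whereas the patch pads by `RequiredAlign - StackAlign` because the stack pointer is already aligned to `StackAlign`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round Ptr up to the next multiple of Align (a power of two).
inline unsigned char *alignUp(unsigned char *Ptr, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  std::uintptr_t Addr = reinterpret_cast<std::uintptr_t>(Ptr);
  std::uintptr_t Aligned =
      (Addr + Align - 1) & ~(static_cast<std::uintptr_t>(Align) - 1);
  return Ptr + (Aligned - Addr);
}

// Carve a 64-byte, 128-byte-aligned region out of an ordinary local buffer
// and report whether it is aligned and lies inside the buffer.
inline bool useAlignedLocal() {
  constexpr std::size_t Size = 64, Align = 128;
  unsigned char Raw[Size + Align - 1]; // over-allocate by the worst-case pad
  unsigned char *P = alignUp(Raw, Align);
  P[0] = 10;
  P[Size - 1] = 10;
  bool InBounds = P >= Raw && P + Size <= Raw + sizeof(Raw);
  bool Aligned = reinterpret_cast<std::uintptr_t>(P) % Align == 0;
  return InBounds && Aligned;
}
```

No stack-pointer realignment or reserved base register is involved; the cost is the wasted padding and one extra pointer per over-aligned object.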

Uli, thanks for explaining.

but also results in more efficient code for targets like SystemZ, since we do not need to reserve an extra hard register

This seems untrue. Even if you don't reserve a register for the base pointer, by handling these as dynamic allocations, you force yourself to keep separate pointers to each overaligned stack allocation. In short, you trade one reserved register for N virtual ones (one for each over-aligned stack object). It is true that you might spill those virtual registers when they're not needed, but that's a current-infrastructure problem not a theoretical one, and even so, unlikely to be a win.

In general, over-aligned objects are a performance feature, and should be implemented in a high-performance way. The question here seems to be: Do we want to have a suboptimal, but still functionally correct, support for over-aligned stack objects? I think having this capability makes sense, so long as it comes with appropriate comments and explanations in the code about the downsides. Let's proceed.

In D14147#294457, @hfinkel wrote:

but also results in more efficient code for targets like SystemZ, since we do not need to reserve an extra hard register

This seems untrue. Even if you don't reserve a register for the base pointer, by handling these as dynamic allocations, you force yourself to keep separate pointers to each overaligned stack allocation. In short, you trade one reserved register for N virtual ones (one for each over-aligned stack object). It is true that you might spill those virtual registers when they're not needed, but that's a current-infrastructure problem not a theoretical one, and even so, unlikely to be a win.

Just to clarify: my "more efficient" comment referred to the way this feature is implemented in GCC today, where common code already groups all overaligned variables into a "secondary stack frame" and uses only a single alloca to allocate this frame and only a single virtual register to access it. I certainly agree that the (initial) LLVM implementation in Jonas' patch is not generally more efficient, and that it has the drawbacks you mention. I still think Jonas' patch is a good first step:

It actually implements overaligned variables correctly for all platforms without requiring target-specific code.
It can always be improved upon in the future to be more like the GCC implementation described above.
It should already be more efficient for the (likely common) case of functions using a single overaligned variable.
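
The "secondary stack frame" idea described above can be sketched as a layout computation: all overaligned variables get offsets inside one block, so a single allocation and a single base pointer suffice. This is a simplified illustration, not GCC's algorithm and not part of the patch; `OveralignedVar`, `SecondaryFrame`, and `layoutSecondaryFrame` are hypothetical names, and variables are placed in declaration order without reordering to reduce padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct OveralignedVar {
  uint64_t Size;  // size in bytes
  uint64_t Align; // required alignment, a power of two
};

struct SecondaryFrame {
  uint64_t TotalSize = 0;        // bytes occupied by the laid-out block
  uint64_t MaxAlign = 1;         // alignment the block's base must have
  std::vector<uint64_t> Offsets; // offset of each variable from the base
};

// Assign each variable an offset within one block. If the block's base is
// aligned to MaxAlign, every variable ends up correctly aligned.
inline SecondaryFrame
layoutSecondaryFrame(const std::vector<OveralignedVar> &Vars) {
  SecondaryFrame F;
  uint64_t Offset = 0;
  for (const OveralignedVar &V : Vars) {
    assert(V.Align != 0 && (V.Align & (V.Align - 1)) == 0 &&
           "alignment must be a power of two");
    Offset = (Offset + V.Align - 1) & ~(V.Align - 1); // round up
    F.Offsets.push_back(Offset);
    Offset += V.Size;
    F.MaxAlign = std::max(F.MaxAlign, V.Align);
  }
  F.TotalSize = Offset;
  return F;
}
```

For three variables of (size, alignment) (8, 128), (4, 64), and (16, 256), this yields offsets 0, 64, and 256 in a 272-byte block whose base must be 256-byte aligned, in contrast to three separate dynamic allocations under the patch as submitted.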

In general, over-aligned objects are a performance feature, and should be implemented in a high-performance way. The question here seems to be: Do we want to have a suboptimal, but still functionally correct, support for over-aligned stack objects? I think having this capability makes sense, so long as it comes with appropriate comments and explanations in the code about the downsides. Let's proceed.

Agreed that we should have comments explaining the downsides of the current implementation and suggestions for future improvements.

Thanks for the review!

jonpa updated this object. Nov 25 2015, 7:57 AM

I removed the messy arguments for the warning, and also fixed the indentation.
I also added a comment in MachineFrameInfo.h explaining the current lack of optimization.

LGTM.

include/llvm/CodeGen/MachineFrameInfo.h

128

/// Targets that set this to false don't have the ability to overalign
/// their stack frame, and thus, overaligned allocas are all treated
/// as dynamic allocations and the target must handle them as part
/// of DYNAMIC_STACKALLOC lowering.
/// FIXME: There is room for improvement in this case, in terms of
/// grouping overaligned allocas into a "secondary stack frame" and
/// then only use a single alloca to allocate this frame and only a
/// single virtual register to access it. Currently, without such an
/// optimization, each such alloca gets its own dynamic
/// realignment.

lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp

118–119

Add:

// FIXME: Overaligned static allocas should be grouped into a single dynamic allocation instead of using a separate stack allocation for each one.

This revision is now accepted and ready to land. Nov 26 2015, 8:18 AM

Thanks for the review; committed as rL254227.

jonpa closed this revision. Nov 28 2015, 3:08 AM

Revision Contents

Path                                                 Size

include/llvm/CodeGen/MachineFrameInfo.h              3 lines
lib/CodeGen/MachineFunction.cpp                      27 lines
lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp    24 lines
lib/Target/SystemZ/SystemZFrameLowering.cpp          3 lines
lib/Target/SystemZ/SystemZISelLowering.cpp           32 lines
test/CodeGen/SystemZ/alloca-03.ll                    84 lines
test/CodeGen/SystemZ/alloca-04.ll                    14 lines

Diff 38654

include/llvm/CodeGen/MachineFrameInfo.h

@@ StackObject(uint64_t Sz, unsigned Al, int64_t SP, bool IM,
       : SPOffset(SP), Size(Sz), Alignment(Al), isImmutable(IM),
         isSpillSlot(isSS), Alloca(Val), PreAllocated(false), isAliased(A) {}
   };

   /// The alignment of the stack.
   unsigned StackAlignment;

   /// Can the stack be realigned.
+  /// Targets that set this to false have to handle alignment of all
+  /// allocas themselves, i.e. while lowering DYNAMIC_STACKALLOC
+  /// nodes.
   bool StackRealignable;

   /// The list of stack objects allocated.
   std::vector<StackObject> Objects;

   /// This contains the number of fixed objects contained on
   /// the stack. Because fixed objects are stored at a negative index in the
   /// Objects list, this is also the index to the 0th object in the list.

lib/CodeGen/MachineFunction.cpp

@@ void MachineFrameInfo::ensureMaxAlignment(unsigned Align) {
   if (!StackRealignable || !RealignOption)
     assert(Align <= StackAlignment &&
            "For targets without stack realignment, Align is out of limit!");
   if (MaxAlignment < Align) MaxAlignment = Align;
 }

 /// Clamp the alignment if requested and emit a warning.
 static inline unsigned clampStackAlignment(bool ShouldClamp, unsigned Align,
-                                           unsigned StackAlign) {
+                                           unsigned StackAlign,
+                                           bool ShouldWarn) {
   if (!ShouldClamp || Align <= StackAlign)
     return Align;
-  DEBUG(dbgs() << "Warning: requested alignment " << Align
+  DEBUG(if (ShouldWarn)
+          dbgs() << "Warning: requested alignment " << Align
                  << " exceeds the stack alignment " << StackAlign
                  << " when stack realignment is off" << '\n');

   return StackAlign;
 }

 /// Create a new statically sized stack object, returning a nonnegative
 /// identifier to represent it.
 int MachineFrameInfo::CreateStackObject(uint64_t Size, unsigned Alignment,
                       bool isSS, const AllocaInst *Alloca) {
   assert(Size != 0 && "Cannot allocate zero size stack objects!");
   Alignment = clampStackAlignment(!StackRealignable || !RealignOption,
-                                  Alignment, StackAlignment);
+                                  Alignment, StackAlignment,
+                                  StackRealignable || !RealignOption);
   Objects.push_back(StackObject(Size, Alignment, 0, false, isSS, Alloca,
                                 !isSS));
   int Index = (int)Objects.size() - NumFixedObjects - 1;
   assert(Index >= 0 && "Bad frame index!");
   ensureMaxAlignment(Alignment);
   return Index;
 }

 /// Create a new statically sized stack object that represents a spill slot,
 /// returning a nonnegative identifier to represent it.
 int MachineFrameInfo::CreateSpillStackObject(uint64_t Size,
                                              unsigned Alignment) {
   Alignment = clampStackAlignment(!StackRealignable || !RealignOption,
-                                  Alignment, StackAlignment);
+                                  Alignment, StackAlignment,
+                                  StackRealignable || !RealignOption);
   CreateStackObject(Size, Alignment, true);
   int Index = (int)Objects.size() - NumFixedObjects - 1;
   ensureMaxAlignment(Alignment);
   return Index;
 }

 /// Notify the MachineFrameInfo object that a variable sized object has been
 /// created. This must be created whenever a variable sized object is created,
 /// whether or not the index returned is actually used.
 int MachineFrameInfo::CreateVariableSizedObject(unsigned Alignment,
                                                 const AllocaInst *Alloca) {
   HasVarSizedObjects = true;
   Alignment = clampStackAlignment(!StackRealignable || !RealignOption,
-                                  Alignment, StackAlignment);
+                                  Alignment, StackAlignment,
+                                  StackRealignable || !RealignOption);
   Objects.push_back(StackObject(0, Alignment, 0, false, false, Alloca, true));
   ensureMaxAlignment(Alignment);
   return (int)Objects.size()-NumFixedObjects-1;
 }

 /// Create a new object at a fixed location on the stack.
 /// All fixed objects should be created before other objects are created for
 /// efficiency. By default, fixed objects are immutable. This returns an
 /// index with a negative value.
 int MachineFrameInfo::CreateFixedObject(uint64_t Size, int64_t SPOffset,
                                         bool Immutable, bool isAliased) {
   assert(Size != 0 && "Cannot allocate zero size fixed stack objects!");
   // The alignment of the frame index can be determined from its offset from
   // the incoming frame position. If the frame object is at offset 32 and
   // the stack is guaranteed to be 16-byte aligned, then we know that the
   // object is 16-byte aligned.
   unsigned Align = MinAlign(SPOffset, StackAlignment);
   Align = clampStackAlignment(!StackRealignable || !RealignOption, Align,
-                              StackAlignment);
+                              StackAlignment,
+                              StackRealignable || !RealignOption);
   Objects.insert(Objects.begin(), StackObject(Size, Align, SPOffset, Immutable,
                                               /*isSS*/ false,
                                               /*Alloca*/ nullptr, isAliased));
   return -++NumFixedObjects;
 }

 /// Create a spill slot at a fixed location on the stack.
 /// Returns an index with a negative value.
 int MachineFrameInfo::CreateFixedSpillStackObject(uint64_t Size,
                                                   int64_t SPOffset) {
   unsigned Align = MinAlign(SPOffset, StackAlignment);
   Align = clampStackAlignment(!StackRealignable || !RealignOption, Align,
-                              StackAlignment);
+                              StackAlignment,
+                              StackRealignable || !RealignOption);
   Objects.insert(Objects.begin(), StackObject(Size, Align, SPOffset,
                                               /*Immutable*/ true,
                                               /*isSS*/ true,
                                               /*Alloca*/ nullptr,
                                               /*isAliased*/ false));
   return -++NumFixedObjects;
 }


lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp

 void FunctionLoweringInfo::set(const Function &fn, MachineFunction &mf,
                                SelectionDAG *DAG) {
   Fn = &fn;
   MF = &mf;
   TLI = MF->getSubtarget().getTargetLowering();
   RegInfo = &MF->getRegInfo();
   MachineModuleInfo &MMI = MF->getMMI();
+  const TargetFrameLowering *TFI = MF->getSubtarget().getFrameLowering();

   // Check whether the function can return without sret-demotion.
   SmallVector<ISD::OutputArg, 4> Outs;
   GetReturnInfo(Fn->getReturnType(), Fn->getAttributes(), Outs, *TLI,
                 mf.getDataLayout());
   CanLowerReturn = TLI->CanLowerReturn(Fn->getCallingConv(), *MF,
                                        Fn->isVarArg(), Outs, Fn->getContext());

   // Initialize the mapping of values to registers. This is only set up for
   // instruction values that are used outside of the block that defines
   // them.
   Function::const_iterator BB = Fn->begin(), EB = Fn->end();
   for (; BB != EB; ++BB)
     for (BasicBlock::const_iterator I = BB->begin(), E = BB->end();
          I != E; ++I) {
       if (const AllocaInst *AI = dyn_cast<AllocaInst>(I)) {
-        // Static allocas can be folded into the initial stack frame adjustment.
-        if (AI->isStaticAlloca()) {
-          const ConstantInt *CUI = cast<ConstantInt>(AI->getArraySize());
         Type *Ty = AI->getAllocatedType();
-          uint64_t TySize = MF->getDataLayout().getTypeAllocSize(Ty);
         unsigned Align =
             std::max((unsigned)MF->getDataLayout().getPrefTypeAlignment(Ty),
                      AI->getAlignment());
+        unsigned StackAlign = TFI->getStackAlignment();
+
+        // Static allocas can be folded into the initial stack frame
+        // adjustment. For targets that don't realign the stack, don't
+        // do this if there is an extra alignment requirement.
+        if (AI->isStaticAlloca() &&
+            (TFI->isStackRealignable() || (Align <= StackAlign))) {
+          const ConstantInt *CUI = cast<ConstantInt>(AI->getArraySize());
+          uint64_t TySize = MF->getDataLayout().getTypeAllocSize(Ty);

           TySize *= CUI->getZExtValue(); // Get total allocated size.
           if (TySize == 0) TySize = 1; // Don't create zero-sized stack objects.

           StaticAllocaMap[AI] =
               MF->getFrameInfo()->CreateStackObject(TySize, Align, false, AI);

         } else {
-          unsigned Align =
-              std::max((unsigned)MF->getDataLayout().getPrefTypeAlignment(
-                           AI->getAllocatedType()),
-                       AI->getAlignment());
-          unsigned StackAlign =
-              MF->getSubtarget().getFrameLowering()->getStackAlignment();
           if (Align <= StackAlign)
             Align = 0;
           // Inform the Frame Information that we have variable-sized objects.
           MF->getFrameInfo()->CreateVariableSizedObject(Align ? Align : 1, AI);
         }
       }

       // Look for inline asm that clobbers the SP register.

lib/Target/SystemZ/SystemZFrameLowering.cpp

@@ static const TargetFrameLowering::SpillSlot SpillOffsetTable[] = {
   { SystemZ::F2D, 0x88 },
   { SystemZ::F4D, 0x90 },
   { SystemZ::F6D, 0x98 }
 };
 } // end anonymous namespace

 SystemZFrameLowering::SystemZFrameLowering()
     : TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 8,
-                          -SystemZMC::CallFrameSize, 8) {
+                          -SystemZMC::CallFrameSize, 8,
+                          false /* StackRealignable */) {
   // Create a mapping from register number to save slot offset.
   RegSpillOffsets.grow(SystemZ::NUM_TARGET_REGS);
   for (unsigned I = 0, E = array_lengthof(SpillOffsetTable); I != E; ++I)
     RegSpillOffsets[SpillOffsetTable[I].Reg] = SpillOffsetTable[I].Offset;
 }

 const TargetFrameLowering::SpillSlot *
 SystemZFrameLowering::getCalleeSavedSpillSlots(unsigned &NumEntries) const {

lib/Target/SystemZ/SystemZISelLowering.cpp

@@ SDValue SystemZTargetLowering::lowerVACOPY(SDValue Op,
   return DAG.getMemcpy(Chain, DL, DstPtr, SrcPtr, DAG.getIntPtrConstant(32, DL),
                        /*Align*/8, /*isVolatile*/false, /*AlwaysInline*/false,
                        /*isTailCall*/false,
                        MachinePointerInfo(DstSV), MachinePointerInfo(SrcSV));
 }

 SDValue SystemZTargetLowering::
 lowerDYNAMIC_STACKALLOC(SDValue Op, SelectionDAG &DAG) const {
+  const TargetFrameLowering *TFI = Subtarget.getFrameLowering();
+  bool RealignOpt = !DAG.getMachineFunction().getFunction()->
+    hasFnAttribute("no-realign-stack");
+
   SDValue Chain = Op.getOperand(0);
   SDValue Size = Op.getOperand(1);
+  SDValue Align = Op.getOperand(2);
   SDLoc DL(Op);

+  // If user has set the no alignment function attribute, ignore
+  // alloca alignments.
+  uint64_t AlignVal = (RealignOpt ?
+      dyn_cast<ConstantSDNode>(Align)->getZExtValue() : 0);
+
+  uint64_t StackAlign = TFI->getStackAlignment();
+  uint64_t RequiredAlign = std::max(AlignVal, StackAlign);
+  uint64_t ExtraAlignSpace = RequiredAlign - StackAlign;
+
   unsigned SPReg = getStackPointerRegisterToSaveRestore();
+  SDValue NeededSpace = Size;

   // Get a reference to the stack pointer.
   SDValue OldSP = DAG.getCopyFromReg(Chain, DL, SPReg, MVT::i64);

+  // Add extra space for alignment if needed.
+  if (ExtraAlignSpace)
+    NeededSpace = DAG.getNode(ISD::ADD, DL, MVT::i64, NeededSpace,
+                              DAG.getConstant(ExtraAlignSpace, DL, MVT::i64));
+
   // Get the new stack pointer value.
-  SDValue NewSP = DAG.getNode(ISD::SUB, DL, MVT::i64, OldSP, Size);
+  SDValue NewSP = DAG.getNode(ISD::SUB, DL, MVT::i64, OldSP, NeededSpace);

   // Copy the new stack pointer back.
   Chain = DAG.getCopyToReg(Chain, DL, SPReg, NewSP);

   // The allocated data lives above the 160 bytes allocated for the standard
   // frame, plus any outgoing stack arguments. We don't know how much that
   // amounts to yet, so emit a special ADJDYNALLOC placeholder.
   SDValue ArgAdjust = DAG.getNode(SystemZISD::ADJDYNALLOC, DL, MVT::i64);
   SDValue Result = DAG.getNode(ISD::ADD, DL, MVT::i64, NewSP, ArgAdjust);

+  // Dynamically realign if needed.
+  if (RequiredAlign > StackAlign) {
+    Result =
+      DAG.getNode(ISD::ADD, DL, MVT::i64, Result,
+                  DAG.getConstant(ExtraAlignSpace, DL, MVT::i64));
+    Result =
+      DAG.getNode(ISD::AND, DL, MVT::i64, Result,
+                  DAG.getConstant(~(RequiredAlign - 1), DL, MVT::i64));
+  }
+
   SDValue Ops[2] = { Result, Chain };
   return DAG.getMergeValues(Ops, DL);
 }

 SDValue SystemZTargetLowering::lowerSMUL_LOHI(SDValue Op,
                                               SelectionDAG &DAG) const {
   EVT VT = Op.getValueType();
   SDLoc DL(Op);
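
The arithmetic performed by the patched lowerDYNAMIC_STACKALLOC can be modelled with plain integers. This is a hedged sketch rather than LLVM code: `DynAllocResult` and `modelDynamicStackAlloc` are hypothetical names, `ArgAdjust` stands in for the value the ADJDYNALLOC placeholder eventually resolves to, and the model assumes that value is a multiple of the stack alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct DynAllocResult {
  uint64_t NewSP;  // stack pointer after the allocation
  uint64_t Result; // address handed back to the alloca's users
};

// Integer model of the lowering: over-allocate by the worst-case padding,
// then round the returned address to the required alignment.
inline DynAllocResult modelDynamicStackAlloc(uint64_t OldSP, uint64_t Size,
                                             uint64_t AlignVal,
                                             uint64_t StackAlign,
                                             uint64_t ArgAdjust) {
  assert(OldSP % StackAlign == 0 && "SP must keep its default alignment");
  uint64_t RequiredAlign = std::max(AlignVal, StackAlign);
  uint64_t ExtraAlignSpace = RequiredAlign - StackAlign;
  uint64_t NeededSpace = Size + ExtraAlignSpace;
  uint64_t NewSP = OldSP - NeededSpace;
  uint64_t Result = NewSP + ArgAdjust;
  if (RequiredAlign > StackAlign)
    Result = (Result + ExtraAlignSpace) & ~(RequiredAlign - 1);
  return {NewSP, Result};
}
```

For an 8-byte object aligned to 128 with an 8-byte stack alignment, `ExtraAlignSpace` is 120 and `NeededSpace` is 128, which corresponds to the `aghi %r1, -128` in test f2 of alloca-03.ll. With `ArgAdjust` of 160, the pre-mask offset is 160 + 120 = 280 (`la %r1, 280(%r1)`), and the low 16 bits of the mask `~127` are 65408 (`nill %r1, 65408`). Because the unaligned address is a multiple of 8, adding 120 and masking never moves the result below the start of the allocated area or past its end.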

test/CodeGen/SystemZ/alloca-03.ll

This file was added.

; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s

; Allocate 8 bytes, no need to align stack.
define void @f0() {
; CHECK-LABEL: f0:
; CHECK: aghi %r15, -168
; CHECK-NOT: nil
; CHECK: mvghi 160(%r15), 10
; CHECK: aghi %r15, 168
  %x = alloca i64
  store volatile i64 10, i64* %x
  ret void
}

; Allocate %len * 8, no need to align stack.
define void @f1(i64 %len) {
; CHECK-LABEL: f1:
; CHECK: sllg %r0, %r2, 3
; CHECK: lgr %r1, %r15
; CHECK: sgr %r1, %r0
; CHECK-NOT: ngr
; CHECK: lgr %r15, %r1
; CHECK: la %r1, 160(%r1)
; CHECK: mvghi 0(%r1), 10
  %x = alloca i64, i64 %len
  store volatile i64 10, i64* %x
  ret void
}

; Static alloca, align 128.
define void @f2() {
; CHECK-LABEL: f2:
; CHECK: aghi %r1, -128
; CHECK: lgr %r15, %r1
; CHECK: la %r1, 280(%r1)
; CHECK: nill %r1, 65408
; CHECK: mvghi 0(%r1), 10
  %x = alloca i64, i64 1, align 128
  store volatile i64 10, i64* %x, align 128
  ret void
}

; Dynamic alloca, align 128.
define void @f3(i64 %len) {
; CHECK-LABEL: f3:
; CHECK: sllg %r1, %r2, 3
; CHECK: la %r0, 120(%r1)
; CHECK: lgr %r1, %r15
; CHECK: sgr %r1, %r0
; CHECK: lgr %r15, %r1
; CHECK: la %r1, 280(%r1)
; CHECK: nill %r1, 65408
; CHECK: mvghi 0(%r1), 10
  %x = alloca i64, i64 %len, align 128
  store volatile i64 10, i64* %x, align 128
  ret void
}

; Static alloca w/out alignment - part of frame.
define void @f4() {
; CHECK-LABEL: f4:
; CHECK: aghi %r15, -168
; CHECK: mvhi 164(%r15), 10
; CHECK: aghi %r15, 168
  %x = alloca i32
  store volatile i32 10, i32* %x
  ret void
}

; Static alloca of one i32, aligned by 128.
define void @f5() {
; CHECK-LABEL: f5:

; CHECK: lgr %r1, %r15
; CHECK: aghi %r1, -128
; CHECK: lgr %r15, %r1
; CHECK: la %r1, 280(%r1)
; CHECK: nill %r1, 65408
; CHECK: mvhi 0(%r1), 10
  %x = alloca i32, i64 1, align 128
  store volatile i32 10, i32* %x
  ret void
}

test/CodeGen/SystemZ/alloca-04.ll

This file was added.

; Check the "no-realign-stack" function attribute. We should get a warning.

; RUN: llc < %s -mtriple=s390x-linux-gnu -debug-only=codegen 2>&1 | \
; RUN:   FileCheck %s


define void @f6() "no-realign-stack" {
  %x = alloca i64, i64 1, align 128
  store volatile i64 10, i64* %x, align 128
  ret void
}

; CHECK: Warning: requested alignment 128 exceeds the stack alignment 8
; CHECK-NOT: nill
