This is an archive of the discontinued LLVM Phabricator instance.

Differential D70080

[AArch64][SVE] Allocate locals that are scalable vectors.
Closed, Public

Authored by sdesmalen on Nov 11 2019, 6:50 AM.

Details

Reviewers

ostannard
efriedma
rengolin
cameron.mcinally

Commits

rG9a1c243aa5de: [AArch64][SVE] Allocate locals that are scalable vectors.

Summary

This patch adds a target interface to set the StackID for a given type,
which allows scalable vectors (e.g. <vscale x 16 x i8>) to be assigned a
'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.
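
A minimal sketch of the hook described above, written against hypothetical stand-in types (`StackID`, `FrameLoweringSketch`, `AArch64FrameLoweringSketch`, `stackIDForAlloca`) rather than the real LLVM classes. It follows the final form of the interface in this revision, where the hook takes no type argument, and only illustrates the override pattern:

```cpp
#include <cassert>

// Hypothetical stand-ins for LLVM's TargetStackID and TargetFrameLowering.
// Simplified for illustration; these are not the real class definitions.
enum class StackID { Default, SVEVector, NoAlloc };

struct FrameLoweringSketch {
  virtual ~FrameLoweringSketch() = default;
  // Generic targets keep scalable vectors in the default stack area.
  virtual StackID getStackIDForScalableVectors() const {
    return StackID::Default;
  }
};

struct AArch64FrameLoweringSketch : FrameLoweringSketch {
  // AArch64 places scalable vectors in the SVE area of the stack frame.
  StackID getStackIDForScalableVectors() const override {
    return StackID::SVEVector;
  }
};

// Models the check added during lowering: only allocas of scalable vector
// type ask the target for a StackID; everything else stays in Default.
StackID stackIDForAlloca(const FrameLoweringSketch &TFI,
                         bool IsScalableVector) {
  return IsScalableVector ? TFI.getStackIDForScalableVectors()
                          : StackID::Default;
}
```

Calling `stackIDForAlloca` with an `AArch64FrameLoweringSketch` and `IsScalableVector == true` selects the SVE area, while a target that does not override the hook keeps the object in the default area.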

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sdesmalen created this revision. Nov 11 2019, 6:50 AM

Herald added a project: Restricted Project. Nov 11 2019, 6:50 AM

Herald added subscribers: psnobl, rkruppe, hiraditya and 2 others.

LGTM. Will give other reviewers some time to review though...

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2554–2582	Nit: It doesn't actually sort them, right? Or did I miss it? I'm assuming that we'll want to eventually order SVE stack objects by size.

efriedma added inline comments. Nov 11 2019, 12:40 PM

llvm/include/llvm/CodeGen/TargetFrameLowering.h
369	We don't want to encourage people to key backend behavior off of the IR type of an alloca. Please change the API so it only passes down whether the type has a variable size.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2575	Assertions should only be used to check conditions we know are false. I don't see any code that would actually guarantee this, given arbitrary IR as input. Not sure what layer is best to address this. I mean, I guess we could honor the request by turning the alloca into a dynamic alloca during isel.

efriedma added inline comments. Nov 11 2019, 12:44 PM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2554–2582	I think we need to do something here if it's possible to have objects that aren't "16 x vscale" bytes long, to ensure natural alignment. I'm fine leaving that as a FIXME for now, though.

sdesmalen marked 4 inline comments as done. Nov 12 2019, 4:12 AM

sdesmalen added inline comments.

llvm/include/llvm/CodeGen/TargetFrameLowering.h
369	Thanks, I didn't know that. I'll change the API!
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2554–2582	> Nit: It doesn't actually sort them, right? Or did I miss it? I'm assuming that we'll want to eventually order SVE stack objects by size.
	Yeah, I thought that made sense but realised that PEI does not seem to sort the objects either, probably because that would undo the ordering as calculated by StackSlotColoring, which orders the slots by # uses (i.e. more uses, closer to SP). I do have a separate patch that reorders the slots based on the availability of the FP, which I can share as a follow-up to this patch.
2554–2582	> I think we need to do something here if it's possible to have objects that aren't "16 x vscale" bytes long, to ensure natural alignment. I'm fine leaving that as a FIXME for now, though.
	Objects that aren't `16 x vscale` bytes long (but have an alignment < 16 bytes) are supported by this code [1]. If you're suggesting that we should reorder the objects based on natural alignment for performance reasons, we can indeed sort the ObjectsToAllocate list to group all the predicates (`vscale x 2 bytes`) together, and all the data vectors (`vscale x 16 bytes`) together (but leave their order otherwise intact to benefit from the order defined by StackSlotColoring). I can fix that in a separate patch and add a FIXME for now.
	[1] All objects in the SVE area on the stack have the same `vscale` scaling. That means that if we have two objects, e.g. one of `vscale x 16 bytes` (data vector) and one of `vscale x 2 bytes` (predicate), this will be aligned to (`vscale *`) 16 bytes by allocating `vscale x 32 bytes` by the call to `alignTo` below on line 2578.
2575	Without having thought about it too deeply, that indeed seems like a feasible way to support it. For now, are you happy for me to leave the assert here (plus an extra FIXME comment) to leave the case unsupported until we fix this in a later patch?

efriedma added inline comments. Nov 12 2019, 11:36 AM

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2554–2582	Oh, I see; vscale just multiplies all the stack offsets, so anything that's aligned before vscale is applied is still aligned afterwards. Sorry, I got confused somehow. Sure, we can put off sorting for now.
2575	Please at least change it to report_fatal_error. I'm okay putting off fixing it.
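
The alignment argument in this thread can be checked with a small stand-alone sketch. `SVEObject`, `alignUp` and `assignOffsets` are hypothetical names, and the loop is a simplified model of the allocation loop in this patch, not the LLVM code itself. All sizes and alignments are in bytes before multiplication by `vscale`:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of one SVE stack object.
// Size and Align are in bytes before scaling by vscale.
struct SVEObject {
  int64_t Size;
  int64_t Align;
};

// Round Value up to the next multiple of Align (the same idea as
// llvm::alignTo, reimplemented here so the sketch is self-contained).
int64_t alignUp(int64_t Value, int64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Assign each object a negative offset from the old SP. StartOffset is the
// number of bytes already taken, for example by callee saves.
std::vector<int64_t> assignOffsets(const std::vector<SVEObject> &Objects,
                                   int64_t StartOffset) {
  std::vector<int64_t> Offsets;
  int64_t Offset = StartOffset;
  for (const SVEObject &Obj : Objects) {
    // Reserve the object's size, then round up to its alignment.
    Offset = alignUp(Offset + Obj.Size, Obj.Align);
    Offsets.push_back(-Offset);
  }
  return Offsets;
}
```

With a starting offset of 48 bytes of callee saves and objects of (size, alignment) (32, 16), (4, 2), (16, 16), (2, 2), (16, 16), (2, 2), this model yields the offsets -80, -84, -112, -114, -144 and -146, which match the expected layout in the `frame_layout` test of this revision. Because every offset is later multiplied by `vscale`, an offset that is a multiple of 16 remains a multiple of 16.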

Changed getStackIDForType(const Type *T) -> getStackIDForScalableVectors()
Changed assert into report_fatal_error for >16 byte alignment of SVE stack objects and added a FIXME.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2554–2582	No worries, I understand this is quite confusing. Thanks!

LGTM

This revision is now accepted and ready to land. Nov 12 2019, 4:27 PM

Closed by commit rG9a1c243aa5de: [AArch64][SVE] Allocate locals that are scalable vectors. (authored by sdesmalen). Nov 13 2019, 1:55 AM

This revision was automatically updated to reflect the committed changes.

sdesmalen marked an inline comment as done.

Thanks for the review @efriedma and @cameron.mcinally!

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2501–2502	Before I committed the patch, I moved the initialization of Min and Max to the start of the function. The MemorySanitizer found it was reading uninitialized memory when calling this function through `estimateSVEStackObjectOffsets`.
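
A minimal sketch of the pattern behind this fix, using a hypothetical helper `getSlotRange` in place of the real function: the out-parameters are written before any early return, so a caller that ignores the return value never reads indeterminate values.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical simplified "find slot range" helper. Min and Max are
// initialized first, so they hold defined sentinel values even when the
// function returns early.
bool getSlotRange(const std::vector<int> &Slots, bool InfoValid, int &Min,
                  int &Max) {
  Min = std::numeric_limits<int>::max();
  Max = std::numeric_limits<int>::min();

  if (!InfoValid)
    return false; // Min and Max are already well defined here.

  for (int Slot : Slots) {
    Min = std::min(Min, Slot);
    Max = std::max(Max, Slot);
  }
  // True only if at least one slot was seen.
  return Min != std::numeric_limits<int>::max();
}
```

If the initialization were placed after the early return instead, a caller passing uninitialized `int Min, Max;` and then reading them would trigger exactly the kind of report described above.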

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/TargetFrameLowering.h	5 lines
llvm/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp	9 lines
llvm/lib/Target/AArch64/AArch64FrameLowering.h	1 line
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp	54 lines
llvm/test/CodeGen/AArch64/framelayout-sve.mir	67 lines
llvm/test/CodeGen/AArch64/sve-alloca-stackid.ll	17 lines

Diff 229032

llvm/include/llvm/CodeGen/TargetFrameLowering.h

[357 lines not shown; context: public:]
   /// \p MBB will be correctly handled by the target.
   /// As soon as the target enable shrink-wrapping without overriding
   /// this method, we assume that each basic block is a valid
   /// epilogue.
   virtual bool canUseAsEpilogue(const MachineBasicBlock &MBB) const {
     return true;
   }

+  /// Returns the StackID that scalable vectors should be associated with.
+  virtual TargetStackID::Value getStackIDForScalableVectors() const {
+    return TargetStackID::Default;
+  }
+
   virtual bool isSupportedStackID(TargetStackID::Value ID) const {
     switch (ID) {
     default:
       return false;
     case TargetStackID::Default:
     case TargetStackID::NoAlloc:
       return true;
     }
[23 lines not shown]

llvm/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp

[138 lines not shown; context: for (const Instruction &I : BB) {]
                  AI->getAlignment());

         // Static allocas can be folded into the initial stack frame
         // adjustment. For targets that don't realign the stack, don't
         // do this if there is an extra alignment requirement.
         if (AI->isStaticAlloca() &&
             (TFI->isStackRealignable() || (Align <= StackAlign))) {
           const ConstantInt *CUI = cast<ConstantInt>(AI->getArraySize());
-          uint64_t TySize = MF->getDataLayout().getTypeAllocSize(Ty);
+          uint64_t TySize =
+              MF->getDataLayout().getTypeAllocSize(Ty).getKnownMinSize();

           TySize *= CUI->getZExtValue(); // Get total allocated size.
           if (TySize == 0) TySize = 1; // Don't create zero-sized stack objects.
           int FrameIndex = INT_MAX;
           auto Iter = CatchObjects.find(AI);
           if (Iter != CatchObjects.end() && TLI->needsFixedCatchObjects()) {
             FrameIndex = MF->getFrameInfo().CreateFixedObject(
                 TySize, 0, /*IsImmutable=*/false, /*isAliased=*/true);
             MF->getFrameInfo().setObjectAlignment(FrameIndex, Align);
           } else {
             FrameIndex =
                 MF->getFrameInfo().CreateStackObject(TySize, Align, false, AI);
           }

+          // Scalable vectors may need a special StackID to distinguish
+          // them from other (fixed size) stack objects.
+          if (Ty->isVectorTy() && Ty->getVectorIsScalable())
+            MF->getFrameInfo().setStackID(FrameIndex,
+                                          TFI->getStackIDForScalableVectors());
+
           StaticAllocaMap[AI] = FrameIndex;
           // Update the catch handler information.
           if (Iter != CatchObjects.end()) {
             for (int *CatchObjPtr : Iter->second)
               *CatchObjPtr = FrameIndex;
           }
         } else {
           // FIXME: Overaligned static allocas should be grouped into
[377 lines not shown]

llvm/lib/Target/AArch64/AArch64FrameLowering.h

[66 lines not shown; context: void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs,]
                             RegScavenger *RS) const override;

   /// Returns true if the target will correctly handle shrink wrapping.
   bool enableShrinkWrapping(const MachineFunction &MF) const override {
     return true;
   }

   bool enableStackSlotScavenging(const MachineFunction &MF) const override;
+  TargetStackID::Value getStackIDForScalableVectors() const override;

   void processFunctionBeforeFrameFinalized(MachineFunction &MF,
                                            RegScavenger *RS) const override;

   unsigned getWinEHParentFrameOffset(const MachineFunction &MF) const override;

   unsigned getWinEHFuncletFrameSize(const MachineFunction &MF) const;

[31 lines not shown]

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp

[200 lines not shown; context: for (MachineInstr &MI : MBB) {]
             AArch64FrameOffsetCannotUpdate)
           return 0;
       }
     }
   }
   return DefaultSafeSPDisplacement;
 }

+TargetStackID::Value
+AArch64FrameLowering::getStackIDForScalableVectors() const {
+  return TargetStackID::SVEVector;
+}
+
 /// Returns the size of the entire SVE stackframe (calleesaves + spills).
 static StackOffset getSVEStackSize(const MachineFunction &MF) {
   const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
   return {(int64_t)AFI->getStackSizeSVE(), MVT::nxv1i8};
 }

 bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
   if (!EnableRedZone)
[2,266 lines not shown; context: bool AArch64FrameLowering::enableStackSlotScavenging(]
     const MachineFunction &MF) const {
   const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
   return AFI->hasCalleeSaveStackFreeSpace();
 }

 /// returns true if there are any SVE callee saves.
 static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI,
                                       int &Min, int &Max) {
+  Min = std::numeric_limits<int>::max();
+  Max = std::numeric_limits<int>::min();
+
   if (!MFI.isCalleeSavedInfoValid())
     return false;

-  Min = std::numeric_limits<int>::max();
-  Max = std::numeric_limits<int>::min();
   const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
   for (auto &CS : CSI) {
     if (AArch64::ZPRRegClass.contains(CS.getReg()) ||
         AArch64::PPRRegClass.contains(CS.getReg())) {
       assert((Max == std::numeric_limits<int>::min() ||
               Max + 1 == CS.getFrameIdx()) &&
              "SVE CalleeSaves are not consecutive");

       Min = std::min(Min, CS.getFrameIdx());
[16 lines not shown; context: static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI,]
   int64_t Offset = 0;
   for (int I = MFI.getObjectIndexBegin(); I != 0; ++I)
     if (MFI.getStackID(I) == TargetStackID::SVEVector) {
       int64_t FixedOffset = -MFI.getObjectOffset(I);
       if (FixedOffset > Offset)
         Offset = FixedOffset;
     }

+  auto Assign = [&MFI](int FI, int64_t Offset) {
+    LLVM_DEBUG(dbgs() << "alloc FI(" << FI << ") at SP[" << Offset << "]\n");
+    MFI.setObjectOffset(FI, Offset);
+  };
+
   // Then process all callee saved slots.
   if (getSVECalleeSaveSlotRange(MFI, MinCSFrameIndex, MaxCSFrameIndex)) {
     // Make sure to align the last callee save slot.
     MFI.setObjectAlignment(MaxCSFrameIndex, 16U);

     // Assign offsets to the callee save slots.
     for (int I = MinCSFrameIndex; I <= MaxCSFrameIndex; ++I) {
       Offset += MFI.getObjectSize(I);
       Offset = alignTo(Offset, MFI.getObjectAlignment(I));
-      if (AssignOffsets) {
-        LLVM_DEBUG(dbgs() << "alloc FI(" << I << ") at SP[" << Offset
-                          << "]\n");
-        MFI.setObjectOffset(I, -Offset);
-      }
+      if (AssignOffsets)
+        Assign(I, -Offset);
     }
   }

-  // Note: We don't take allocatable stack objects into
-  // account yet, because allocation for those is not yet
-  // implemented.
+  // Create a buffer of SVE objects to allocate and sort it.
+  SmallVector<int, 8> ObjectsToAllocate;
+  for (int I = 0, E = MFI.getObjectIndexEnd(); I != E; ++I) {
+    unsigned StackID = MFI.getStackID(I);
+    if (StackID != TargetStackID::SVEVector)
+      continue;
+    if (MaxCSFrameIndex >= I && I >= MinCSFrameIndex)
+      continue;
+    if (MFI.isDeadObjectIndex(I))
+      continue;
+
+    ObjectsToAllocate.push_back(I);
+  }
+
+  // Allocate all SVE locals and spills
+  for (unsigned FI : ObjectsToAllocate) {
+    unsigned Align = MFI.getObjectAlignment(FI);
+    // FIXME: Given that the length of SVE vectors is not necessarily a power of
+    // two, we'd need to align every object dynamically at runtime if the
+    // alignment is larger than 16. This is not yet supported.
+    if (Align > 16)
+      report_fatal_error(
+          "Alignment of scalable vectors > 16 bytes is not yet supported");
+
+    Offset = alignTo(Offset + MFI.getObjectSize(FI), Align);
+    if (AssignOffsets)
+      Assign(FI, -Offset);
+  }
+
   return Offset;
 }

 int64_t AArch64FrameLowering::estimateSVEStackObjectOffsets(
     MachineFrameInfo &MFI) const {
   int MinCSFrameIndex, MaxCSFrameIndex;
   return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex, false);
 }
[83 lines not shown]

llvm/test/CodeGen/AArch64/framelayout-sve.mir

[28 lines not shown; context: --- |]
   define void @test_address_sve() nounwind { entry: unreachable }
   define void @test_address_sve_fp() nounwind { entry: unreachable }
   define void @test_stack_arg_sve() nounwind { entry: unreachable }
   define void @test_address_sve_out_of_range() nounwind { entry: unreachable }
   define aarch64_sve_vector_pcs void @save_restore_pregs_sve() nounwind { entry: unreachable }
   define aarch64_sve_vector_pcs void @save_restore_zregs_sve() nounwind { entry: unreachable }
   define aarch64_sve_vector_pcs void @save_restore_sve() nounwind { entry: unreachable }
   define aarch64_sve_vector_pcs void @save_restore_sve_realign() nounwind { entry: unreachable }
+  define aarch64_sve_vector_pcs void @frame_layout() nounwind { entry: unreachable }

 ...
 # +----------+
 # |scratchreg| // x29 is used as scratch reg.
 # +----------+
 # | %fixed-  | // scalable SVE object of n * 18 bytes, aligned to 16 bytes,
 # | stack.0  | // to be materialized with 2*ADDVL (<=> 2 * n * 16bytes)
 # +----------+
[462 lines not shown; context: bb.0.entry:]
     $p11 = IMPLICIT_DEF
     $p12 = IMPLICIT_DEF
     $p13 = IMPLICIT_DEF
     $p14 = IMPLICIT_DEF
     $p15 = IMPLICIT_DEF

     RET_ReallyLR
 ---
+# Frame layout should be:
+# +---------------------+ <- Old SP
+# | callee save z8      |@ -16
+# | callee save z23     |@ -32
+# | callee save p4      |@ -34
+# | callee save p15     |@ -48
+# | id #0 (size 32)     |@ -80
+# | id #1 (size 4)      |@ -84
+# | id #2 (size 16)     |@ -112
+# | id #3 (size 2)      |@ -114
+# | id #4 (size 16)     |@ -144
+# | id #5 (size 2)      |@ -146
+# +- - - - - - - - - - -+ <- New SP @-160
+# CHECK-LABEL: name: frame_layout
+# CHECK: stack:
+# CHECK: - { id: 0, name: '', type: default, offset: -80, size: 32, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 1, name: '', type: default, offset: -84, size: 4, alignment: 2,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 2, name: '', type: default, offset: -112, size: 16, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 3, name: '', type: default, offset: -114, size: 2, alignment: 2,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 4, name: '', type: spill-slot, offset: -144, size: 16, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 5, name: '', type: spill-slot, offset: -146, size: 2, alignment: 2,
+# CHECK-NEXT: stack-id: sve-vec,
+# CHECK: - { id: 6, name: '', type: spill-slot, offset: -16, size: 16, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec, callee-saved-register: '$z8',
+# CHECK: - { id: 7, name: '', type: spill-slot, offset: -32, size: 16, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec, callee-saved-register: '$z23',
+# CHECK: - { id: 8, name: '', type: spill-slot, offset: -34, size: 2, alignment: 2,
+# CHECK-NEXT: stack-id: sve-vec, callee-saved-register: '$p4',
+# CHECK: - { id: 9, name: '', type: spill-slot, offset: -48, size: 2, alignment: 16,
+# CHECK-NEXT: stack-id: sve-vec, callee-saved-register: '$p15',
+# CHECK: - { id: 10, name: '', type: spill-slot, offset: -16, size: 8, alignment: 16,
+# CHECK-NEXT: stack-id: default, callee-saved-register: '$fp',
+#
+# CHECK: bb.0.entry:
+# CHECK-NEXT: $sp = frame-setup STRXpre killed $[[SCRATCH:[a-z0-9]+]], $sp, -16
+# CHECK-NEXT: $sp = frame-setup ADDVL_XXI $sp, -3
+# CHECK-NEXT: STR_PXI killed $p15, $sp, 6
+# CHECK-NEXT: STR_PXI killed $p4, $sp, 7
+# CHECK-NEXT: STR_ZXI killed $z23, $sp, 1
+# CHECK-NEXT: STR_ZXI killed $z8, $sp, 2
+# CHECK-NEXT: $sp = frame-setup ADDVL_XXI $sp, -7
+name: frame_layout
+stack:
+  - { id: 0, type: default, size: 32, alignment: 16, stack-id: sve-vec }
+  - { id: 1, type: default, size: 4, alignment: 2, stack-id: sve-vec }
+  - { id: 2, type: default, size: 16, alignment: 16, stack-id: sve-vec }
+  - { id: 3, type: default, size: 2, alignment: 2, stack-id: sve-vec }
+  - { id: 4, type: spill-slot, size: 16, alignment: 16, stack-id: sve-vec }
+  - { id: 5, type: spill-slot, size: 2, alignment: 2, stack-id: sve-vec }
+body: |
+  bb.0.entry:
+
+    ; Trigger some callee saves
+    $z8 = IMPLICIT_DEF
+    $z23 = IMPLICIT_DEF
+    $p4 = IMPLICIT_DEF
+    $p15 = IMPLICIT_DEF
+
+    RET_ReallyLR
+
+---

llvm/test/CodeGen/AArch64/sve-alloca-stackid.ll

This file was added.

+; RUN: llc -mtriple=aarch64 -mattr=+sve < %s | FileCheck %s --check-prefix=CHECKCG
+; RUN: llc -mtriple=aarch64 -mattr=+sve -stop-after=finalize-isel < %s | FileCheck %s --check-prefix=CHECKISEL
+
+; CHECKCG-LABEL: foo:
+; CHECKCG: addvl sp, sp, #-1
+
+; CHECKISEL-LABEL: name: foo
+; CHECKISEL: stack:
+; CHECKISEL: id: 0, name: ptr, type: default, offset: 0, size: 16, alignment: 16,
+; CHECKISEL-NEXT: stack-id: sve-vec
+define i32 @foo(<vscale x 16 x i8> %val) {
+  %ptr = alloca <vscale x 16 x i8>
+  %res = call i32 @bar(<vscale x 16 x i8>* %ptr)
+  ret i32 %res
+}
+
+declare i32 @bar(<vscale x 16 x i8>* %ptr);
