This is an archive of the discontinued LLVM Phabricator instance.

Differential D42759

[CGP] Split large data structres to sink more GEPs
Closed, Public

Authored by haicheng on Jan 31 2018, 12:28 PM.

Details

Reviewers

gberry
qcolombet
skatkov
john.brawn
reames
javed.absar
efriedma

Commits

rG0aae2bc26079: [CGP] Split large data structres to sink more GEPs
rL332015: [CGP] Split large data structres to sink more GEPs

Summary

Accessing the members of a large data structure requires many GEPs, which usually have large offsets because of the size of the underlying structure. If an offset is too large to fit into the r+i addressing mode, the GEP cannot be sunk into its users' blocks, and extra registers are then needed to carry the GEP values across blocks.
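
To make the "fits into the r+i addressing mode" condition concrete, here is a minimal C++ sketch. It assumes a simplified model of a base + immediate load whose immediate is an unsigned 12-bit field scaled by the access size, similar in spirit to the AArch64 unsigned-offset form. The function name fitsScaledImm12 is hypothetical; the patch itself asks the target through TLI.isLegalAddressingMode, which covers more forms than this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a base + immediate addressing mode:
// the offset must be non-negative, a multiple of the access size, and the
// scaled value must fit in an unsigned 12-bit field (0..4095).
bool fitsScaledImm12(int64_t Offset, int64_t AccessSize) {
  if (Offset < 0 || AccessSize <= 0 || Offset % AccessSize != 0)
    return false;
  return Offset / AccessSize <= 4095;
}
```

Under this model a 4-byte access can fold offsets up to 16380, so a field at offset 40000 in a large struct would keep its GEP alive in a register across blocks.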

This patch tries to split a large data structure starting from %base, as follows.

Before:

BB0:
  %base     =

BB1:
  %gep0     = gep %base, off0
  %gep1     = gep %base, off1
  %gep2     = gep %base, off2

BB2:
  %load1    = load %gep0
  %load2    = load %gep1
  %load3    = load %gep2

After:

BB0:
  %base     =
  %new_base = gep %base, off0

BB1:
  %new_gep0 = %new_base
  %new_gep1 = gep %new_base, off1 - off0
  %new_gep2 = gep %new_base, off2 - off0

BB2:
  %load1    = load i32, i32* %new_gep0
  %load2    = load i32, i32* %new_gep1
  %load3    = load i32, i32* %new_gep2

In the above example, the struct is split into two parts. The first part still starts from %base and the second part starts from %new_base. After the split, %new_gep1 and %new_gep2 have smaller offsets and can then be sunk to BB2 and folded into their users.

The algorithm for splitting a data structure is simple and very similar to the one for merging SExts. First, it collects GEPs that have large offsets while iterating over the blocks. Second, it splits the underlying data structures and updates the collected GEPs to use smaller offsets.
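
The second step can be sketched without any LLVM types. The following is an illustration under stated assumptions, not the patch's implementation: offsets are plain integers, and MaxFoldable is a hypothetical stand-in for the target's addressing-mode check. The helper name splitOffsets is also made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// For each input offset, the result holds (base offset, offset relative to
// that base), in ascending order of the original offsets. MaxFoldable is a
// hypothetical stand-in for the target's addressing-mode check.
std::vector<std::pair<int64_t, int64_t>>
splitOffsets(std::vector<int64_t> Offsets, int64_t MaxFoldable) {
  std::sort(Offsets.begin(), Offsets.end());
  std::vector<std::pair<int64_t, int64_t>> Result;
  int64_t Base = 0;
  bool HaveBase = false;
  for (int64_t Off : Offsets) {
    // Start a new base when there is none yet, or when the distance to the
    // current base is too large to fold into the memory access.
    if (!HaveBase || Off - Base > MaxFoldable) {
      Base = Off;
      HaveBase = true;
    }
    Result.emplace_back(Base, Off - Base);
  }
  return Result;
}
```

With offsets 40000, 40004, 40008 and 80000 and a limit of 4095, the first three share the base 40000 with relative offsets 0, 4 and 8, and 80000 starts a second base.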

The code size and performance results for spec20xx are listed below.

Benchmark	Code Size (%) (- is smaller)	Performance (%) (+ is faster)
spec2006/bzip2	-3.47	+2.89
spec2017/imagick	-0.01	+1.02
spec2000/mesa	-0.76	+0.71
spec2017/x264	-0.91	+0.39
spec2017/leela	-0.01	+0.31
spec2000/ammp	-0.04	+0.17
spec2017/parest	0	0
spec2017/xalancbmk	0	-0.13
spec2017/blender	0	-0.18
spec2006/xalancbmk	0	-0.51

Diff Detail

Repository: rL LLVM

Event Timeline

haicheng created this revision. Jan 31 2018, 12:28 PM

Herald added subscribers: javed.absar, mcrosier. Jan 31 2018, 12:28 PM

efriedma added a subscriber: efriedma. Jan 31 2018, 12:36 PM

efriedma added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
3758 ↗	(On Diff #132235)	I haven't really reviewed the whole patch closely, so I might be missing something, but why does it matter that the element type is a struct, as opposed to something else large like an array?

junbuml added a subscriber: junbuml. Jan 31 2018, 12:58 PM

haicheng added inline comments. Jan 31 2018, 1:05 PM

lib/CodeGen/CodeGenPrepare.cpp
3758 ↗	(On Diff #132235)	Nothing special, I am just being conservative in the beginning. I find that at this stage, LLVM can bitcast arbitrary pointers to i8* and do random stuff. I am thinking of supporting other large data structures as the next step.

tobiasvk added a subscriber: tobiasvk. Feb 6 2018, 10:52 AM

tobiasvk added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
428 ↗	(On Diff #132235)	Drive-by comment: it's not a good idea to modify a cl::opt here as this will create (benign) races between multiple backend threads in ThinLTO.

I made two changes to address the comments.

Add support for splitting large arrays.

Introduce a target hook to enable/disable the change.

I also welcome any alternative implementation suggestions.
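
A minimal sketch of the target-hook approach, assuming nothing beyond what the diff shows: shouldConsiderGEPOffsetSplit is the hook name from this revision, while TargetLoweringSketch, OptInTargetSketch and shouldSplit are hypothetical stand-ins so the example compiles without LLVM. The point is that the pass only reads the command-line flag and the hook, so the earlier concern about writing to a cl::opt from several backend threads does not arise.

```cpp
#include <cassert>

// Hypothetical stand-in for TargetLowering.
struct TargetLoweringSketch {
  virtual ~TargetLoweringSketch() = default;
  // Off by default, so each target has to opt in explicitly.
  virtual bool shouldConsiderGEPOffsetSplit() const { return false; }
};

// Hypothetical stand-in for a target that opts in.
struct OptInTargetSketch : TargetLoweringSketch {
  bool shouldConsiderGEPOffsetSplit() const override { return true; }
};

// The pass combines the flag and the hook without modifying either, so
// concurrent callers share no mutable state.
bool shouldSplit(bool EnableFlag, const TargetLoweringSketch &TLI) {
  return EnableFlag && TLI.shouldConsiderGEPOffsetSplit();
}
```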

A few quick comments. Will follow up with a more complete review later this week.

lib/CodeGen/CodeGenPrepare.cpp
3743 ↗	(On Diff #133782)	Perhaps something like: // Record GEPs with a non-zero offsets as candidates for splitting in the event that the offset cannot fit into the r+i addressing mode.
3746 ↗	(On Diff #133782)	Add isa<GetElementPtrInst>(AddrInst) check?
3755 ↗	(On Diff #133782)	You could reduce the indent here by putting an 'isa<GetElementPtrInst>(AddrInst)' check in the outermost if statement and then making this a cast<>. See comment above.
3756 ↗	(On Diff #133782)	Also, I need some clarification here. The first check (BaseI && GEP->getParent() != BaseI->getParent()) seems to be in line with the comments below, but I don't completely follow the second check (i.e., checking that the PHI is not in the entry block, if BaseI is null).

efriedma added inline comments. Feb 12 2018, 11:38 AM

lib/Target/AArch64/AArch64ISelLowering.cpp
8012 ↗	(On Diff #133782)	The base of a GEP is always a pointer.
8014 ↗	(On Diff #133782)	I'm still not sure what the purpose of this check is; why does it matter how many dimensions an array has?

Completely remove the check of base types. So, any large data structure is supported.

Clarify some comments.

Thank you for the review.

Made some changes to improve the readability.

Kindly ping. I would appreciate any advice.

Kindly Ping #2

Kindly Ping #3

Just minor nits inlined.

lib/CodeGen/CodeGenPrepare.cpp
3746 ↗	(On Diff #135450)	a non-zero offsets -> a non-zero offset
3752 ↗	(On Diff #135450)	It seems that you can move this check to "else" by changing "else" to "else if".
4293 ↗	(On Diff #135450)	You may want to have a new variable for GetElementPtrInst instead of doing LargeOffsetGEP.first several times.
4295 ↗	(On Diff #135450)	!NewGEPBases.count(LargeOffsetGEP.first)
4297 ↗	(On Diff #135450)	I'm not clear about this comment. Can you clarify it a little bit? typo: spllit
lib/Target/AArch64/AArch64ISelLowering.cpp
8033 ↗	(On Diff #135450)	Should we have BaseType as parameter ?

Address Jun's comments.

Thank you very much for taking a look.

skatkov added inline comments. Mar 26 2018, 10:12 PM

lib/CodeGen/CodeGenPrepare.cpp
3757 ↗	(On Diff #139315)	Why do you need the check isa<GetElementPtrInst>(AddrInst)? You are in "case Instruction::GetElementPtr" based on "switch (Opcode) {", where Opcode is actually AddrInst->getOpcode(). So to me it is redundant...
3776 ↗	(On Diff #139315)	getParent()->getParent() => getFunction()
4819 ↗	(On Diff #139315)	Please add a comment for nontrivial comparison function.
4873 ↗	(On Diff #139315)	I wonder: if you sort the array anyway, why not use a set-like data structure to keep them and avoid the erase of non-unique elements? Is there a hidden reason for that?
4885 ↗	(On Diff #139315)	How can we get GEP == nullptr here?
4929 ↗	(On Diff #139315)	getParent()->getParent() => getFunction()

Update to address Serguei's comments. Thank you for taking a look.

lib/CodeGen/CodeGenPrepare.cpp
3757 ↗	(On Diff #139315)	AddrInst is a user. It can be a GetElementPtrConstantExpr which is not supported by CGP yet.
4873 ↗	(On Diff #139315)	I need a data structure that can be iterated in sorted order (ascending by offset) and in which all elements (pointers to GEPs) are unique. I think the data in a set are unique, but I cannot access them in the order that I want if I use a set.
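
The requirement stated in the reply above (iterate in ascending offset order, with each GEP present once) can be met with a plain vector that is sorted and then de-duplicated. The sketch below is an illustration only: integer ids stand in for GEP pointers, and the names Entry and sortAndUnique are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Entry = std::pair<int, int64_t>; // (hypothetical GEP id, offset)

// Sort by offset, then drop exact duplicates. A given GEP always carries
// the same offset, so its duplicates end up adjacent after sorting and
// std::unique can remove them.
void sortAndUnique(std::vector<Entry> &Entries) {
  std::sort(Entries.begin(), Entries.end(),
            [](const Entry &L, const Entry &R) {
              if (L.second != R.second)
                return L.second < R.second;
              return L.first < R.first;
            });
  Entries.erase(std::unique(Entries.begin(), Entries.end()), Entries.end());
}
```

A set keyed on the GEP pointer would also guarantee uniqueness, but it would iterate in key order rather than in offset order, which is the objection raised in the reply.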

Kindly Ping

efriedma added inline comments. Apr 9 2018, 5:06 PM

lib/CodeGen/CodeGenPrepare.cpp
2548 ↗	(On Diff #139959)	`AssertingVH<GetElementPtrInst>`
4892 ↗	(On Diff #139959)	This is going to visit GEPs with the same offset in non-deterministic order. I guess it might not be that important, but it'll probably mess with the use-lists, which could affect some later pass. I would rather stay on the safe side and sort GEPs with the same offset based on the order they were added to the map, or some other deterministic ordering.
4899 ↗	(On Diff #139959)	The result type of the GEP might not be the type of the memory access... but I don't know how you'd get the right type off the top of my head, and it might not be important in practice. Maybe worth noting with a comment, though.
4933 ↗	(On Diff #139959)	I think you have to be a bit more careful here to avoid inserting a GEP into a block with a catchswitch. (This is Windows-only exception-handling, so maybe difficult to trigger.)
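
The ordering concern in the comment on line 4892 can be illustrated with a comparator that breaks ties by a serial number assigned at collection time instead of by pointer value. This is a sketch under assumptions: Node is a hypothetical stand-in for a GEP, and a std::map plays the role that LargeOffsetGEPID plays in the final diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Node { const char *Name; }; // hypothetical stand-in for a GEP

using Entry = std::pair<const Node *, int64_t>; // (GEP, offset)

// Orders by offset first. Equal offsets fall back to the serial number
// recorded when the entry was collected, not to the pointer value, which
// can differ from run to run.
void sortDeterministically(std::vector<Entry> &Entries,
                           const std::map<const Node *, int> &SerialOf) {
  std::sort(Entries.begin(), Entries.end(),
            [&](const Entry &L, const Entry &R) {
              if (L.second != R.second)
                return L.second < R.second;
              return SerialOf.at(L.first) < SerialOf.at(R.first);
            });
}
```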

Address Eli's comments. Thank you very much, Eli.

haicheng marked 2 inline comments as done. Apr 18 2018, 1:36 PM

haicheng added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
4933 ↗	(On Diff #139959)	Now I check if the blocks contain catchswitch when collecting GEPs.

This is looking good. Just a few more small comments.

lib/CodeGen/CodeGenPrepare.cpp
278 ↗	(On Diff #142989)	Probably these maps should also use AssertingVH.
3781 ↗	(On Diff #142989)	I'm pretty sure it's impossible for `BaseI` to be a `BinaryOperator`, given it's a pointer.

Herald added a reviewer: javed.absar. Apr 23 2018, 6:02 PM

Address Eli's comments. Thank you.

haicheng marked 2 inline comments as done. Apr 26 2018, 9:54 AM

efriedma added inline comments. Apr 27 2018, 2:35 PM

lib/CodeGen/CodeGenPrepare.cpp
278 ↗	(On Diff #142989)	By "these maps", I meant NewGEPBases and LargeOffsetGEPMap too.

Herald added a subscriber: mgrang. Apr 27 2018, 2:35 PM

Add more AssertingVH.

efriedma added inline comments. May 1 2018, 11:16 AM

lib/CodeGen/CodeGenPrepare.cpp
278 ↗	(On Diff #142989)	LargeOffsetGEPMap still has a `GetElementPtrInst *` that isn't an AssertingVH.

Add one more AssertingVH

Kindly Ping

LGTM

This revision is now accepted and ready to land. May 9 2018, 3:23 PM

Closed by commit rL332015: [CGP] Split large data structres to sink more GEPs (authored by haicheng). May 10 2018, 11:31 AM

This revision was automatically updated to reflect the committed changes.

jonpa added a subscriber: jonpa. Jun 19 2022, 8:36 AM

jonpa added inline comments.

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816	Is there any reason to not use '!=' as opposed to '>' here (which would match the comment below)? On SystemZ negative offsets are not supported on vector instructions.

Herald added projects: Restricted Project, Restricted Project. Jun 19 2022, 8:36 AM

efriedma added inline comments. Jun 20 2022, 3:41 PM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816	It could work. Would probably need a few other changes to make sure we don't do anything silly like grouping together positive and negative offsets.

jonpa added inline comments. Jun 21 2022, 4:09 AM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816	I may be missing something, but as far as I can tell there should not be any problem just because the offset is negative. The type for offsets is already int64_t, and the algorithm of sorting them by offset and inserting a new GEP whenever needed seems to work regardless of a negative offset, since the most negative offset is generated first and then the remainder will be positive, or?

efriedma added inline comments. Jun 21 2022, 10:19 AM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816	I don't remember exactly how the algorithm for grouping offsets together functions; maybe it just works.
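
The arithmetic behind the negative-offset argument in this thread can be checked with a small sketch. It only shows that rebasing a sorted group on its smallest offset leaves non-negative relative offsets; it does not settle whether the grouping logic in CodeGenPrepare handles mixed signs correctly, which the thread leaves open. The helper name rebaseOnSmallest is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns each offset relative to the smallest one in the group, in
// ascending order. The smallest offset becomes the new base (relative 0).
std::vector<int64_t> rebaseOnSmallest(std::vector<int64_t> Offsets) {
  std::sort(Offsets.begin(), Offsets.end());
  if (Offsets.empty())
    return Offsets;
  const int64_t Base = Offsets.front();
  for (int64_t &Off : Offsets)
    Off -= Base;
  return Offsets;
}
```

For the offsets -40000, -39992 and 8, the new base sits at -40000 and the relative offsets are 0, 8 and 40008. The last one is non-negative but may still be too large to fold, in which case a further base would be needed.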

Revision Contents

Path	Size
llvm/trunk/include/llvm/CodeGen/TargetLowering.h	5 lines
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp	262 lines
llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h	2 lines
llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp	5 lines
llvm/trunk/test/Transforms/CodeGenPrepare/AArch64/large-offset-gep.ll	147 lines

Diff 146171

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

[...] public:
}		}

// Return true if it is profitable to use a scalar input to a BUILD_VECTOR		// Return true if it is profitable to use a scalar input to a BUILD_VECTOR
// even if the vector itself has multiple uses.		// even if the vector itself has multiple uses.
virtual bool aggressivelyPreferBuildVectorSources(EVT VecVT) const {		virtual bool aggressivelyPreferBuildVectorSources(EVT VecVT) const {
return false;		return false;
}		}

		// Return true if CodeGenPrepare should consider splitting large offset of a
		// GEP to make the GEP fit into the addressing mode and can be sunk into the
		// same blocks of its users.
		virtual bool shouldConsiderGEPOffsetSplit() const { return false; }

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Runtime Library hooks		// Runtime Library hooks
//		//

/// Rename the default libcall routine name for the specified libcall.		/// Rename the default libcall routine name for the specified libcall.
void setLibcallName(RTLIB::Libcall Call, const char *Name) {		void setLibcallName(RTLIB::Libcall Call, const char *Name) {
LibcallRoutineNames[Call] = Name;		LibcallRoutineNames[Call] = Name;
}		}

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp

[...]
static cl::opt<bool> AddrSinkCombineBaseOffs(		static cl::opt<bool> AddrSinkCombineBaseOffs(
"addr-sink-combine-base-offs", cl::Hidden, cl::init(true),		"addr-sink-combine-base-offs", cl::Hidden, cl::init(true),
cl::desc("Allow combining of BaseOffs field in Address sinking."));		cl::desc("Allow combining of BaseOffs field in Address sinking."));

static cl::opt<bool> AddrSinkCombineScaledReg(		static cl::opt<bool> AddrSinkCombineScaledReg(
"addr-sink-combine-scaled-reg", cl::Hidden, cl::init(true),		"addr-sink-combine-scaled-reg", cl::Hidden, cl::init(true),
cl::desc("Allow combining of ScaledReg field in Address sinking."));		cl::desc("Allow combining of ScaledReg field in Address sinking."));

		static cl::opt<bool>
		EnableGEPOffsetSplit("cgp-split-large-offset-gep", cl::Hidden,
		cl::init(true),
		cl::desc("Enable splitting large offset of GEP."));

namespace {		namespace {

using SetOfInstrs = SmallPtrSet<Instruction *, 16>;		using SetOfInstrs = SmallPtrSet<Instruction *, 16>;
using TypeIsSExt = PointerIntPair<Type *, 1, bool>;		using TypeIsSExt = PointerIntPair<Type *, 1, bool>;
using InstrToOrigTy = DenseMap<Instruction *, TypeIsSExt>;		using InstrToOrigTy = DenseMap<Instruction *, TypeIsSExt>;
using SExts = SmallVector<Instruction *, 16>;		using SExts = SmallVector<Instruction *, 16>;
using ValueToSExts = DenseMap<Value *, SExts>;		using ValueToSExts = DenseMap<Value *, SExts>;

[...] class CodeGenPrepare : public FunctionPass {
InstrToOrigTy PromotedInsts;		InstrToOrigTy PromotedInsts;

/// Keep track of instructions removed during promotion.		/// Keep track of instructions removed during promotion.
SetOfInstrs RemovedInsts;		SetOfInstrs RemovedInsts;

/// Keep track of sext chains based on their initial value.		/// Keep track of sext chains based on their initial value.
DenseMap<Value *, Instruction *> SeenChainsForSExt;		DenseMap<Value *, Instruction *> SeenChainsForSExt;

		/// Keep track of GEPs accessing the same data structures such as structs or
		/// arrays that are candidates to be split later because of their large
		/// size.
		DenseMap<
		AssertingVH<Value>,
		SmallVector<std::pair<AssertingVH<GetElementPtrInst>, int64_t>, 32>>
		LargeOffsetGEPMap;

		/// Keep track of new GEP base after splitting the GEPs having large offset.
		SmallSet<AssertingVH<Value>, 2> NewGEPBases;

		/// Map serial numbers to Large offset GEPs.
		DenseMap<AssertingVH<GetElementPtrInst>, int> LargeOffsetGEPID;

/// Keep track of SExt promoted.		/// Keep track of SExt promoted.
ValueToSExts ValToSExtendedUses;		ValueToSExts ValToSExtendedUses;

/// True if CFG is modified in any way.		/// True if CFG is modified in any way.
bool ModifiedDT;		bool ModifiedDT;

/// True if optimizing for size.		/// True if optimizing for size.
bool OptSize;		bool OptSize;
[...] private:
bool placeDbgValues(Function &F);		bool placeDbgValues(Function &F);
bool canFormExtLd(const SmallVectorImpl<Instruction *> &MovedExts,		bool canFormExtLd(const SmallVectorImpl<Instruction *> &MovedExts,
LoadInst &LI, Instruction &Inst, bool HasPromoted);		LoadInst &LI, Instruction &Inst, bool HasPromoted);
bool tryToPromoteExts(TypePromotionTransaction &TPT,		bool tryToPromoteExts(TypePromotionTransaction &TPT,
const SmallVectorImpl<Instruction *> &Exts,		const SmallVectorImpl<Instruction *> &Exts,
SmallVectorImpl<Instruction *> &ProfitablyMovedExts,		SmallVectorImpl<Instruction *> &ProfitablyMovedExts,
unsigned CreatedInstsCost = 0);		unsigned CreatedInstsCost = 0);
bool mergeSExts(Function &F);		bool mergeSExts(Function &F);
		bool splitLargeGEPOffsets();
bool performAddressTypePromotion(		bool performAddressTypePromotion(
Instruction *&Inst,		Instruction *&Inst,
bool AllowPromotionWithoutCommonHeader,		bool AllowPromotionWithoutCommonHeader,
bool HasPromoted, TypePromotionTransaction &TPT,		bool HasPromoted, TypePromotionTransaction &TPT,
SmallVectorImpl<Instruction *> &SpeculativelyMovedExts);		SmallVectorImpl<Instruction *> &SpeculativelyMovedExts);
bool splitBranchCondition(Function &F);		bool splitBranchCondition(Function &F);
bool simplifyOffsetableRelocate(Instruction &I);		bool simplifyOffsetableRelocate(Instruction &I);
};		};
[...] bool CodeGenPrepare::runOnFunction(Function &F) {
EverMadeChange |= SplitIndirectBrCriticalEdges(F);		EverMadeChange |= SplitIndirectBrCriticalEdges(F);

bool MadeChange = true;		bool MadeChange = true;
while (MadeChange) {		while (MadeChange) {
MadeChange = false;		MadeChange = false;
SeenChainsForSExt.clear();		SeenChainsForSExt.clear();
ValToSExtendedUses.clear();		ValToSExtendedUses.clear();
RemovedInsts.clear();		RemovedInsts.clear();
		LargeOffsetGEPMap.clear();
		LargeOffsetGEPID.clear();
for (Function::iterator I = F.begin(); I != F.end(); ) {		for (Function::iterator I = F.begin(); I != F.end(); ) {
BasicBlock *BB = &*I++;		BasicBlock *BB = &*I++;
bool ModifiedDTOnIteration = false;		bool ModifiedDTOnIteration = false;
MadeChange |= optimizeBlock(*BB, ModifiedDTOnIteration);		MadeChange |= optimizeBlock(*BB, ModifiedDTOnIteration);

// Restart BB iteration if the dominator tree of the Function was changed		// Restart BB iteration if the dominator tree of the Function was changed
if (ModifiedDTOnIteration)		if (ModifiedDTOnIteration)
break;		break;
}		}
if (EnableTypePromotionMerge && !ValToSExtendedUses.empty())		if (EnableTypePromotionMerge && !ValToSExtendedUses.empty())
MadeChange |= mergeSExts(F);		MadeChange |= mergeSExts(F);
		if (!LargeOffsetGEPMap.empty())
		MadeChange |= splitLargeGEPOffsets();

// Really free removed instructions during promotion.		// Really free removed instructions during promotion.
for (Instruction *I : RemovedInsts)		for (Instruction *I : RemovedInsts)
I->deleteValue();		I->deleteValue();

EverMadeChange |= MadeChange;		EverMadeChange |= MadeChange;
}		}

[...] class AddressingModeMatcher {
const SetOfInstrs &InsertedInsts;		const SetOfInstrs &InsertedInsts;

/// A map from the instructions to their type before promotion.		/// A map from the instructions to their type before promotion.
InstrToOrigTy &PromotedInsts;		InstrToOrigTy &PromotedInsts;

/// The ongoing transaction where every action should be registered.		/// The ongoing transaction where every action should be registered.
TypePromotionTransaction &TPT;		TypePromotionTransaction &TPT;

		// A GEP which has too large offset to be folded into the addressing mode.
		std::pair<AssertingVH<GetElementPtrInst>, int64_t> &LargeOffsetGEP;

/// This is set to true when we should not do profitability checks.		/// This is set to true when we should not do profitability checks.
/// When true, IsProfitableToFoldIntoAddressingMode always returns true.		/// When true, IsProfitableToFoldIntoAddressingMode always returns true.
bool IgnoreProfitability;		bool IgnoreProfitability;

AddressingModeMatcher(SmallVectorImpl<Instruction *> &AMI,		AddressingModeMatcher(
const TargetLowering &TLI,		SmallVectorImpl<Instruction *> &AMI, const TargetLowering &TLI,
const TargetRegisterInfo &TRI,		const TargetRegisterInfo &TRI, Type *AT, unsigned AS, Instruction *MI,
Type *AT, unsigned AS,		ExtAddrMode &AM, const SetOfInstrs &InsertedInsts,
Instruction *MI, ExtAddrMode &AM,		InstrToOrigTy &PromotedInsts, TypePromotionTransaction &TPT,
const SetOfInstrs &InsertedInsts,		std::pair<AssertingVH<GetElementPtrInst>, int64_t> &LargeOffsetGEP)
InstrToOrigTy &PromotedInsts,
TypePromotionTransaction &TPT)
: AddrModeInsts(AMI), TLI(TLI), TRI(TRI),		: AddrModeInsts(AMI), TLI(TLI), TRI(TRI),
DL(MI->getModule()->getDataLayout()), AccessTy(AT), AddrSpace(AS),		DL(MI->getModule()->getDataLayout()), AccessTy(AT), AddrSpace(AS),
MemoryInst(MI), AddrMode(AM), InsertedInsts(InsertedInsts),		MemoryInst(MI), AddrMode(AM), InsertedInsts(InsertedInsts),
PromotedInsts(PromotedInsts), TPT(TPT) {		PromotedInsts(PromotedInsts), TPT(TPT), LargeOffsetGEP(LargeOffsetGEP) {
IgnoreProfitability = false;		IgnoreProfitability = false;
}		}

public:		public:
/// Find the maximal addressing mode that a load/store of V can fold,		/// Find the maximal addressing mode that a load/store of V can fold,
/// give an access type of AccessTy. This returns a list of involved		/// give an access type of AccessTy. This returns a list of involved
/// instructions in AddrModeInsts.		/// instructions in AddrModeInsts.
/// \p InsertedInsts The instructions inserted by other CodeGenPrepare		/// \p InsertedInsts The instructions inserted by other CodeGenPrepare
/// optimizations.		/// optimizations.
/// \p PromotedInsts maps the instructions to their type before promotion.		/// \p PromotedInsts maps the instructions to their type before promotion.
/// \p The ongoing transaction where every action should be registered.		/// \p The ongoing transaction where every action should be registered.
static ExtAddrMode Match(Value *V, Type *AccessTy, unsigned AS,		static ExtAddrMode
Instruction *MemoryInst,		Match(Value *V, Type *AccessTy, unsigned AS, Instruction *MemoryInst,
SmallVectorImpl<Instruction*> &AddrModeInsts,		SmallVectorImpl<Instruction *> &AddrModeInsts,
const TargetLowering &TLI,		const TargetLowering &TLI, const TargetRegisterInfo &TRI,
const TargetRegisterInfo &TRI,		const SetOfInstrs &InsertedInsts, InstrToOrigTy &PromotedInsts,
const SetOfInstrs &InsertedInsts,		TypePromotionTransaction &TPT,
InstrToOrigTy &PromotedInsts,		std::pair<AssertingVH<GetElementPtrInst>, int64_t> &LargeOffsetGEP) {
TypePromotionTransaction &TPT) {
ExtAddrMode Result;		ExtAddrMode Result;

bool Success = AddressingModeMatcher(AddrModeInsts, TLI, TRI,		bool Success = AddressingModeMatcher(AddrModeInsts, TLI, TRI, AccessTy, AS,
AccessTy, AS,
MemoryInst, Result, InsertedInsts,		MemoryInst, Result, InsertedInsts,
PromotedInsts, TPT).matchAddr(V, 0);		PromotedInsts, TPT, LargeOffsetGEP)
		.matchAddr(V, 0);
(void)Success; assert(Success && "Couldn't select anything?");		(void)Success; assert(Success && "Couldn't select anything?");
return Result;		return Result;
}		}

private:		private:
bool matchScaledValue(Value *ScaleReg, int64_t Scale, unsigned Depth);		bool matchScaledValue(Value *ScaleReg, int64_t Scale, unsigned Depth);
bool matchAddr(Value *V, unsigned Depth);		bool matchAddr(Value *V, unsigned Depth);
bool matchOperationAddr(User *Operation, unsigned Opcode, unsigned Depth,		bool matchOperationAddr(User *Operation, unsigned Opcode, unsigned Depth,
[...] case Instruction::GetElementPtr: {
// just add it to the disp field and check validity.		// just add it to the disp field and check validity.
if (VariableOperand == -1) {		if (VariableOperand == -1) {
AddrMode.BaseOffs += ConstantOffset;		AddrMode.BaseOffs += ConstantOffset;
if (ConstantOffset == 0 ||		if (ConstantOffset == 0 ||
TLI.isLegalAddressingMode(DL, AddrMode, AccessTy, AddrSpace)) {		TLI.isLegalAddressingMode(DL, AddrMode, AccessTy, AddrSpace)) {
// Check to see if we can fold the base pointer in too.		// Check to see if we can fold the base pointer in too.
if (matchAddr(AddrInst->getOperand(0), Depth+1))		if (matchAddr(AddrInst->getOperand(0), Depth+1))
return true;		return true;
		} else if (EnableGEPOffsetSplit && isa<GetElementPtrInst>(AddrInst) &&
		TLI.shouldConsiderGEPOffsetSplit() && Depth == 0 &&
		ConstantOffset > 0) {
		// Record GEPs with non-zero offsets as candidates for splitting in the
		// event that the offset cannot fit into the r+i addressing mode.
		// Simple and common case that only one GEP is used in calculating the
		// address for the memory access.
		Value *Base = AddrInst->getOperand(0);
		auto *BaseI = dyn_cast<Instruction>(Base);
		auto *GEP = cast<GetElementPtrInst>(AddrInst);
		if (isa<Argument>(Base) || isa<GlobalValue>(Base) ||
		(BaseI && !isa<CastInst>(BaseI) &&
		!isa<GetElementPtrInst>(BaseI))) {
		// If the base is an instruction, make sure the GEP is not in the same
		// basic block as the base. If the base is an argument or global
		// value, make sure the GEP is not in the entry block. Otherwise,
		// instruction selection can undo the split. Also make sure the
		// parent block allows inserting non-PHI instructions before the
		// terminator.
		BasicBlock *Parent =
		BaseI ? BaseI->getParent() : &GEP->getFunction()->getEntryBlock();
		if (GEP->getParent() != Parent && !Parent->getTerminator()->isEHPad())
		LargeOffsetGEP = std::make_pair(GEP, ConstantOffset);
		}
}		}
AddrMode.BaseOffs -= ConstantOffset;		AddrMode.BaseOffs -= ConstantOffset;
return false;		return false;
}		}

// Save the valid addressing mode in case we can't match.		// Save the valid addressing mode in case we can't match.
ExtAddrMode BackupAddrMode = AddrMode;		ExtAddrMode BackupAddrMode = AddrMode;
unsigned OldSize = AddrModeInsts.size();		unsigned OldSize = AddrModeInsts.size();
[...] if (!AddrTy)
return false;		return false;
Type *AddressAccessTy = AddrTy->getElementType();		Type *AddressAccessTy = AddrTy->getElementType();
unsigned AS = AddrTy->getAddressSpace();		unsigned AS = AddrTy->getAddressSpace();

// Do a match against the root of this address, ignoring profitability. This		// Do a match against the root of this address, ignoring profitability. This
// will tell us if the addressing mode for the memory operation will		// will tell us if the addressing mode for the memory operation will
// actually cover the shared instruction.		// actually cover the shared instruction.
ExtAddrMode Result;		ExtAddrMode Result;
		std::pair<AssertingVH<GetElementPtrInst>, int64_t> LargeOffsetGEP(nullptr,
		0);
TypePromotionTransaction::ConstRestorationPt LastKnownGood =		TypePromotionTransaction::ConstRestorationPt LastKnownGood =
TPT.getRestorationPoint();		TPT.getRestorationPoint();
AddressingModeMatcher Matcher(MatchedAddrModeInsts, TLI, TRI,		AddressingModeMatcher Matcher(
AddressAccessTy, AS,		MatchedAddrModeInsts, TLI, TRI, AddressAccessTy, AS, MemoryInst, Result,
MemoryInst, Result, InsertedInsts,		InsertedInsts, PromotedInsts, TPT, LargeOffsetGEP);
PromotedInsts, TPT);
Matcher.IgnoreProfitability = true;		Matcher.IgnoreProfitability = true;
bool Success = Matcher.matchAddr(Address, 0);		bool Success = Matcher.matchAddr(Address, 0);
(void)Success; assert(Success && "Couldn't select anything?");		(void)Success; assert(Success && "Couldn't select anything?");

// The match was to check the profitability, the changes made are not		// The match was to check the profitability, the changes made are not
// part of the original matcher. Therefore, they should be dropped		// part of the original matcher. Therefore, they should be dropped
// otherwise the original matcher will not present the right state.		// otherwise the original matcher will not present the right state.
TPT.rollback(LastKnownGood);		TPT.rollback(LastKnownGood);
[...] if (SelectInst *SI = dyn_cast<SelectInst>(V)) {
PhiOrSelectSeen = true;		PhiOrSelectSeen = true;
continue;		continue;
}		}

// For non-PHIs, determine the addressing mode being computed. Note that		// For non-PHIs, determine the addressing mode being computed. Note that
// the result may differ depending on what other uses our candidate		// the result may differ depending on what other uses our candidate
// addressing instructions might have.		// addressing instructions might have.
AddrModeInsts.clear();		AddrModeInsts.clear();
		std::pair<AssertingVH<GetElementPtrInst>, int64_t> LargeOffsetGEP(nullptr,
		0);
ExtAddrMode NewAddrMode = AddressingModeMatcher::Match(		ExtAddrMode NewAddrMode = AddressingModeMatcher::Match(
V, AccessTy, AddrSpace, MemoryInst, AddrModeInsts, TLI, TRI,		V, AccessTy, AddrSpace, MemoryInst, AddrModeInsts, TLI, TRI,
InsertedInsts, PromotedInsts, TPT);		InsertedInsts, PromotedInsts, TPT, LargeOffsetGEP);
NewAddrMode.OriginalValue = V;
		GetElementPtrInst *GEP = LargeOffsetGEP.first;
		if (GEP && GEP->getParent() != MemoryInst->getParent() &&
		!NewGEPBases.count(GEP)) {
		// If splitting the underlying data structure can reduce the offset of a
		// GEP, collect the GEP. Skip the GEPs that are the new bases of
		// previously split data structures.
		LargeOffsetGEPMap[GEP->getPointerOperand()].push_back(LargeOffsetGEP);
		if (LargeOffsetGEPID.find(GEP) == LargeOffsetGEPID.end())
		LargeOffsetGEPID[GEP] = LargeOffsetGEPID.size();
		}

		NewAddrMode.OriginalValue = V;
if (!AddrModes.addNewAddrMode(NewAddrMode))		if (!AddrModes.addNewAddrMode(NewAddrMode))
break;		break;
}		}

// Try to combine the AddrModes we've collected. If we couldn't collect any,		// Try to combine the AddrModes we've collected. If we couldn't collect any,
// or we have multiple but either couldn't combine them or combining them		// or we have multiple but either couldn't combine them or combining them
// wouldn't do anything useful, bail out now.		// wouldn't do anything useful, bail out now.
if (!AddrModes.combineAddrModes()) {		if (!AddrModes.combineAddrModes()) {
[...] for (Instruction *Inst : Insts) {
}		}
if (!inserted)		if (!inserted)
CurPts.push_back(Inst);		CurPts.push_back(Inst);
}		}
}		}
return Changed;		return Changed;
}		}

		// Split large data structures so that the GEPs accessing them can have
		// smaller offsets and can therefore be sunk to the same blocks as their
		// users. For example, a large struct starting from %base is split into two
		// parts, where the second part starts from %new_base.
		//
		// Before:
		// BB0:
		// %base =
		//
		// BB1:
		// %gep0 = gep %base, off0
		// %gep1 = gep %base, off1
		// %gep2 = gep %base, off2
		//
		// BB2:
		// %load1 = load %gep0
		// %load2 = load %gep1
		// %load3 = load %gep2
		//
		// After:
		// BB0:
		// %base =
		// %new_base = gep %base, off0
		//
		// BB1:
		// %new_gep0 = %new_base
		// %new_gep1 = gep %new_base, off1 - off0
		// %new_gep2 = gep %new_base, off2 - off0
		//
		// BB2:
		// %load1 = load i32, i32* %new_gep0
		// %load2 = load i32, i32* %new_gep1
		// %load3 = load i32, i32* %new_gep2
		//
		// %new_gep1 and %new_gep2 can be sunk to BB2 after the splitting because
		// their offsets are small enough to fit into the addressing mode.
		bool CodeGenPrepare::splitLargeGEPOffsets() {
		bool Changed = false;
		for (auto &Entry : LargeOffsetGEPMap) {
		Value *OldBase = Entry.first;
		SmallVectorImpl<std::pair<AssertingVH<GetElementPtrInst>, int64_t>>
		&LargeOffsetGEPs = Entry.second;
		auto compareGEPOffset =
		[&](const std::pair<GetElementPtrInst *, int64_t> &LHS,
		const std::pair<GetElementPtrInst *, int64_t> &RHS) {
		if (LHS.first == RHS.first)
		return false;
		if (LHS.second != RHS.second)
		return LHS.second < RHS.second;
		return LargeOffsetGEPID[LHS.first] < LargeOffsetGEPID[RHS.first];
		};
		// Sort all the GEPs of the same data structure by offset.
		llvm::sort(LargeOffsetGEPs.begin(), LargeOffsetGEPs.end(),
		compareGEPOffset);
		LargeOffsetGEPs.erase(
		std::unique(LargeOffsetGEPs.begin(), LargeOffsetGEPs.end()),
		LargeOffsetGEPs.end());
		// Skip if all the GEPs have the same offsets.
		if (LargeOffsetGEPs.front().second == LargeOffsetGEPs.back().second)
		continue;
		GetElementPtrInst *BaseGEP = LargeOffsetGEPs.begin()->first;
		int64_t BaseOffset = LargeOffsetGEPs.begin()->second;
		Value *NewBaseGEP = nullptr;

		auto LargeOffsetGEP = LargeOffsetGEPs.begin();
		while (LargeOffsetGEP != LargeOffsetGEPs.end()) {
		GetElementPtrInst *GEP = LargeOffsetGEP->first;
		int64_t Offset = LargeOffsetGEP->second;
		if (Offset != BaseOffset) {
		TargetLowering::AddrMode AddrMode;
		AddrMode.BaseOffs = Offset - BaseOffset;
		// The result type of the GEP might not be the type of the memory
		// access.
		if (!TLI->isLegalAddressingMode(*DL, AddrMode,
		GEP->getResultElementType(),
		GEP->getAddressSpace())) {
		// We need to create a new base if the offset to the current base is
		// too large to fit into the addressing mode. So, a very large struct
		// may be split into several parts.
		BaseGEP = GEP;
		BaseOffset = Offset;
		NewBaseGEP = nullptr;
		}
		}

		// Generate a new GEP to replace the current one.
		IRBuilder<> Builder(GEP);
		Type *IntPtrTy = DL->getIntPtrType(GEP->getType());
		Type *I8PtrTy =
		Builder.getInt8PtrTy(GEP->getType()->getPointerAddressSpace());
		Type *I8Ty = Builder.getInt8Ty();

		if (!NewBaseGEP) {
		// Create a new base if we don't have one yet. Find the insertion
		// point for the new base first.
		BasicBlock::iterator NewBaseInsertPt;
		BasicBlock *NewBaseInsertBB;
		if (auto *BaseI = dyn_cast<Instruction>(OldBase)) {
		// If the base of the struct is an instruction, the new base will be
		// inserted close to it.
		NewBaseInsertBB = BaseI->getParent();
		if (isa<PHINode>(BaseI))
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		else if (InvokeInst *Invoke = dyn_cast<InvokeInst>(BaseI)) {
		NewBaseInsertBB =
		SplitEdge(NewBaseInsertBB, Invoke->getNormalDest());
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		} else
		NewBaseInsertPt = std::next(BaseI->getIterator());
		} else {
		// If the current base is an argument or global value, the new base
		// will be inserted to the entry block.
		NewBaseInsertBB = &BaseGEP->getFunction()->getEntryBlock();
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		}
		IRBuilder<> NewBaseBuilder(NewBaseInsertBB, NewBaseInsertPt);
		// Create a new base.
		Value *BaseIndex = ConstantInt::get(IntPtrTy, BaseOffset);
		NewBaseGEP = OldBase;
		if (NewBaseGEP->getType() != I8PtrTy)
		NewBaseGEP = NewBaseBuilder.CreatePointerCast(NewBaseGEP, I8PtrTy);
		NewBaseGEP =
		NewBaseBuilder.CreateGEP(I8Ty, NewBaseGEP, BaseIndex, "splitgep");
		NewGEPBases.insert(NewBaseGEP);
		}

		Value *NewGEP = NewBaseGEP;
		if (Offset == BaseOffset) {
		if (GEP->getType() != I8PtrTy)
		NewGEP = Builder.CreatePointerCast(NewGEP, GEP->getType());
		} else {
		// Calculate the new offset for the new GEP.
		Value *Index = ConstantInt::get(IntPtrTy, Offset - BaseOffset);
		NewGEP = Builder.CreateGEP(I8Ty, NewBaseGEP, Index);

		if (GEP->getType() != I8PtrTy)
		NewGEP = Builder.CreatePointerCast(NewGEP, GEP->getType());
		}
		GEP->replaceAllUsesWith(NewGEP);
		LargeOffsetGEPID.erase(GEP);
		LargeOffsetGEP = LargeOffsetGEPs.erase(LargeOffsetGEP);
		GEP->eraseFromParent();
		Changed = true;
		}
		}
		return Changed;
		}

/// Return true, if an ext(load) can be formed from an extension in		/// Return true, if an ext(load) can be formed from an extension in
/// \p MovedExts.		/// \p MovedExts.
bool CodeGenPrepare::canFormExtLd(		bool CodeGenPrepare::canFormExtLd(
const SmallVectorImpl<Instruction *> &MovedExts, LoadInst *&LI,		const SmallVectorImpl<Instruction *> &MovedExts, LoadInst *&LI,
Instruction *&Inst, bool HasPromoted) {		Instruction *&Inst, bool HasPromoted) {
for (auto *MovedExtInst : MovedExts) {		for (auto *MovedExtInst : MovedExts) {
if (isa<LoadInst>(MovedExtInst->getOperand(0))) {		if (isa<LoadInst>(MovedExtInst->getOperand(0))) {
LI = cast<LoadInst>(MovedExtInst->getOperand(0));		LI = cast<LoadInst>(MovedExtInst->getOperand(0));
...
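
The grouping logic in splitLargeGEPOffsets() can be illustrated outside of LLVM. The following is a minimal standalone sketch, not code from the patch: splitOffsets and SplitAccess are hypothetical names, and the IsLegalDelta predicate is an assumed stand-in for the TLI->isLegalAddressingMode query. Unlike the patch, it does not skip groups whose offsets are all equal, and it works on plain integers instead of GEP instructions.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// One rewritten access: the offset of the chosen base from the original
// pointer, plus the remaining delta from that base.
struct SplitAccess {
  int64_t Base;
  int64_t Delta;
};

// Hypothetical helper (not part of the patch): walk the sorted, deduplicated
// offsets and start a new base whenever the delta to the current base is not
// accepted by IsLegalDelta.
std::vector<SplitAccess>
splitOffsets(std::vector<int64_t> Offsets,
             const std::function<bool(int64_t)> &IsLegalDelta) {
  std::sort(Offsets.begin(), Offsets.end());
  Offsets.erase(std::unique(Offsets.begin(), Offsets.end()), Offsets.end());

  std::vector<SplitAccess> Result;
  if (Offsets.empty())
    return Result;

  int64_t Base = Offsets.front();
  for (int64_t Off : Offsets) {
    // Mirror the patch: only query legality when the offset differs from the
    // current base, and rebase when the delta does not fit.
    if (Off != Base && !IsLegalDelta(Off - Base))
      Base = Off;
    Result.push_back({Base, Off - Base});
  }
  return Result;
}
```

For example, with offsets 40004, 40000 and 80000 and a predicate that accepts deltas in [0, 4096), the sketch produces base 40000 with deltas 0 and 4, followed by a second base at 80000 with delta 0.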

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h

...
		bool lowerInterleavedLoad(LoadInst *LI,
ArrayRef<unsigned> Indices,		ArrayRef<unsigned> Indices,
unsigned Factor) const override;		unsigned Factor) const override;
bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,		bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,
unsigned Factor) const override;		unsigned Factor) const override;

bool isLegalAddImmediate(int64_t) const override;		bool isLegalAddImmediate(int64_t) const override;
bool isLegalICmpImmediate(int64_t) const override;		bool isLegalICmpImmediate(int64_t) const override;

		bool shouldConsiderGEPOffsetSplit() const override;

EVT getOptimalMemOpType(uint64_t Size, unsigned DstAlign, unsigned SrcAlign,		EVT getOptimalMemOpType(uint64_t Size, unsigned DstAlign, unsigned SrcAlign,
bool IsMemset, bool ZeroMemset, bool MemcpyStrSrc,		bool IsMemset, bool ZeroMemset, bool MemcpyStrSrc,
MachineFunction &MF) const override;		MachineFunction &MF) const override;

/// Return true if the addressing mode represented by AM is legal for this		/// Return true if the addressing mode represented by AM is legal for this
/// target, for a load/store of the specified type.		/// target, for a load/store of the specified type.
bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM, Type *Ty,		bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM, Type *Ty,
unsigned AS,		unsigned AS,
...

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp

...
		if (!AM.Scale) {
return false;		return false;
}		}

// Check reg1 + SIZE_IN_BYTES * reg2 and reg1 + reg2		// Check reg1 + SIZE_IN_BYTES * reg2 and reg1 + reg2

return AM.Scale == 1 || (AM.Scale > 0 && (uint64_t)AM.Scale == NumBytes);		return AM.Scale == 1 || (AM.Scale > 0 && (uint64_t)AM.Scale == NumBytes);
}		}

		bool AArch64TargetLowering::shouldConsiderGEPOffsetSplit() const {
		// Consider splitting large offset of struct or array.
		return true;
		}

int AArch64TargetLowering::getScalingFactorCost(const DataLayout &DL,		int AArch64TargetLowering::getScalingFactorCost(const DataLayout &DL,
const AddrMode &AM, Type *Ty,		const AddrMode &AM, Type *Ty,
unsigned AS) const {		unsigned AS) const {
// Scaling factors are not free at all.		// Scaling factors are not free at all.
// Operands | Rt Latency		// Operands | Rt Latency
// -------------------------------------------		// -------------------------------------------
// Rt, [Xn, Xm] | 4		// Rt, [Xn, Xm] | 4
// -------------------------------------------		// -------------------------------------------
...

llvm/trunk/test/Transforms/CodeGenPrepare/AArch64/large-offset-gep.ll

				; RUN: llc -mtriple=aarch64-linux-gnu -verify-machineinstrs -o - %s | FileCheck %s

				%struct_type = type { [10000 x i32], i32, i32 }

				define void @test1(%struct_type** %s, i32 %n) {
				; CHECK-LABEL: test1
				entry:
				%struct = load %struct_type*, %struct_type** %s
				br label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				define void @test2(%struct_type* %struct, i32 %n) {
				; CHECK-LABEL: test2
				entry:
				%cmp = icmp eq %struct_type* %struct, null
				br i1 %cmp, label %while_end, label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp1 = icmp slt i32 %phi, %n
				br i1 %cmp1, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				define void @test3(%struct_type* %s1, %struct_type* %s2, i1 %cond, i32 %n) {
				; CHECK-LABEL: test3
				entry:
				br i1 %cond, label %if_true, label %if_end

				if_true:
				br label %if_end

				if_end:
				%struct = phi %struct_type* [ %s1, %entry ], [ %s2, %if_true ]
				%cmp = icmp eq %struct_type* %struct, null
				br i1 %cmp, label %while_end, label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %if_end ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp1 = icmp slt i32 %phi, %n
				br i1 %cmp1, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				declare %struct_type* @foo()

				define void @test4(i32 %n) personality i32 (...)* @__FrameHandler {
				; CHECK-LABEL: test4
				entry:
				%struct = invoke %struct_type* @foo() to label %while_cond unwind label %cleanup

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void

				cleanup:
				landingpad { i8*, i32 } cleanup
				unreachable
				}

				declare i32 @__FrameHandler(...)

				define void @test5([65536 x i32]** %s, i32 %n) {
				; CHECK-LABEL: test5
				entry:
				%struct = load [65536 x i32]*, [65536 x i32]** %s
				br label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #14464
				; CHECK-NOT: mov w{{[0-9]+}}, #14468
				%gep0 = getelementptr [65536 x i32], [65536 x i32]* %struct, i64 0, i32 20000
				%gep1 = getelementptr [65536 x i32], [65536 x i32]* %struct, i64 0, i32 20001
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}
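
The constants in the CHECK lines above follow from layout arithmetic, which the sketch below reproduces. It assumes a 4-byte i32 with no padding; StructType, elementOffset and low16 are illustrative names rather than anything from the patch. That 14464 shows up because the constant is materialized in 16-bit pieces is an inference from the numbers, not something the test states.

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors %struct_type = type { [10000 x i32], i32, i32 } from the test,
// assuming i32 is 4 bytes and the fields are laid out without padding.
struct StructType {
  int32_t Array[10000];
  int32_t F1;
  int32_t F2;
};

// Byte offset of element Index in an array of i32.
constexpr int64_t elementOffset(int64_t Index) { return Index * 4; }

// Low 16 bits of a constant.
constexpr int64_t low16(int64_t V) { return V & 0xFFFF; }

// test1 to test4: the two fields sit at 40000 and 40004, so after splitting
// only #40000 needs materializing and the second store uses offset #4.
static_assert(offsetof(StructType, F1) == 40000, "field 1 offset");
static_assert(offsetof(StructType, F2) == 40004, "field 2 offset");

// test5: elements 20000 and 20001 sit at 80000 and 80004; the low 16 bits
// of those values are 14464 and 14468.
static_assert(elementOffset(20000) == 80000, "element 20000 offset");
static_assert(low16(elementOffset(20000)) == 14464, "low half of 80000");
static_assert(low16(elementOffset(20001)) == 14468, "low half of 80004");
```

In each test the CHECK-NOT line guards against the second offset being materialized separately, and the `#4` store offset is the delta between the two accesses.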