This is an archive of the discontinued LLVM Phabricator instance.

Differential D42759

[CGP] Split large data structres to sink more GEPs
ClosedPublic

Authored by haicheng on Jan 31 2018, 12:28 PM.

Details

Reviewers

gberry
qcolombet
skatkov
john.brawn
reames
javed.absar
efriedma

Commits

rG0aae2bc26079: [CGP] Split large data structres to sink more GEPs
rL332015: [CGP] Split large data structres to sink more GEPs

Summary

Accessing the members of a large data structure requires many GEPs, which usually have large offsets because of the size of the underlying data structure. If the offsets are too large to fit into the r+i addressing mode, these GEPs cannot be sunk to their users' blocks, and many extra registers are then needed to carry the values of these GEPs.

This patch tries to split a large data structure starting from %base, as in the following example.

Before:

BB0:
  %base     =

BB1:
  %gep0     = gep %base, off0
  %gep1     = gep %base, off1
  %gep2     = gep %base, off2

BB2:
  %load1    = load %gep0
  %load2    = load %gep1
  %load3    = load %gep2

After:

BB0:
  %base     =
  %new_base = gep %base, off0

BB1:
  %new_gep0 = %new_base
  %new_gep1 = gep %new_base, off1 - off0
  %new_gep2 = gep %new_base, off2 - off0

BB2:
  %load1    = load i32, i32* %new_gep0
  %load2    = load i32, i32* %new_gep1
  %load3    = load i32, i32* %new_gep2

In the above example, the struct is split into two parts. The first part still starts from %base and the second part starts from %new_base. After the splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to BB2 and folded into their users.

The algorithm for splitting a data structure is simple and very similar to the one used for merging SExts. First, it collects GEPs that have large offsets while iterating over the blocks. Second, it splits the underlying data structures and updates the collected GEPs to use smaller offsets.
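
The two steps can be sketched on plain integers. This is a minimal illustration, not the code in the patch: it assumes a hypothetical target whose r+i addressing mode accepts immediates in [0, 4095], and the names `MaxImm`, `fitsAddressingMode`, and `splitLargeOffsets` are made up for the example. The real pass works on GEP instructions and asks the target through `isLegalAddressingMode`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Assumption for illustration only: the target folds immediates in [0, MaxImm].
constexpr int64_t MaxImm = 4095;

bool fitsAddressingMode(int64_t Offset) {
  return Offset >= 0 && Offset <= MaxImm;
}

// For every collected offset from %base, return {NewBaseOffset, Remainder}
// with NewBaseOffset + Remainder == Offset. Results are in ascending order
// of the original offset.
std::vector<std::pair<int64_t, int64_t>>
splitLargeOffsets(std::vector<int64_t> Offsets) {
  // Step 2 of the description works on offsets in ascending order.
  std::sort(Offsets.begin(), Offsets.end());
  std::vector<std::pair<int64_t, int64_t>> Result;
  bool HaveBase = false;
  int64_t BaseOffset = 0;
  for (int64_t Off : Offsets) {
    // Start a new base whenever the current base cannot reach this offset.
    if (!HaveBase || !fitsAddressingMode(Off - BaseOffset)) {
      BaseOffset = Off;
      HaveBase = true;
    }
    assert(fitsAddressingMode(Off - BaseOffset));
    Result.emplace_back(BaseOffset, Off - BaseOffset);
  }
  return Result;
}
```

With the offsets 40000, 40004, 40100, and 80000, the first three share a new base at 40000 (remainders 0, 4, 100), and 80000 gets its own base because 40000 is out of the assumed immediate range.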

The code size and performance results for spec20xx are listed below.

Benchmark	Code Size (%) (- is smaller)	Performance (%) (+ is faster)
spec2006/bzip2	-3.47	+2.89
spec2017/imagick	-0.01	+1.02
spec2000/mesa	-0.76	+0.71
spec2017/x264	-0.91	+0.39
spec2017/leela	-0.01	+0.31
spec2000/ammp	-0.04	+0.17
spec2017/parest	0	0
spec2017/xalancbmk	0	-0.13
spec2017/blender	0	-0.18
spec2006/xalancbmk	0	-0.51

Diff Detail

Repository: rL LLVM

Event Timeline

haicheng created this revision. Jan 31 2018, 12:28 PM

Herald added subscribers: javed.absar, mcrosier. Jan 31 2018, 12:28 PM

efriedma added a subscriber: efriedma. Jan 31 2018, 12:36 PM

efriedma added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
3751	I haven't really reviewed the whole patch closely, so I might be missing something, but why does it matter that the element type is a struct, as opposed to something else large like an array?

junbuml added a subscriber: junbuml. Jan 31 2018, 12:58 PM

haicheng added inline comments. Jan 31 2018, 1:05 PM

lib/CodeGen/CodeGenPrepare.cpp
3751	Nothing special, I am just being conservative in the beginning. I find that at this stage, LLVM can bitcast arbitrary pointers to i8* and do arbitrary things with them. I am thinking of supporting other large data structures as the next step.

tobiasvk added a subscriber: tobiasvk. Feb 6 2018, 10:52 AM

tobiasvk added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
429	Drive-by comment: it's not a good idea to modify a cl::opt here as this will create (benign) races between multiple backend threads in ThinLTO.

I made two changes to address the comments.

Add the support of splitting large arrays.

Introduce a target hook to enable/disable the change.

I also welcome any suggestions for alternative implementations.
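
The target-hook approach can be sketched as follows. This is a simplified illustration under stated assumptions: `TargetHooks` and `OptInTarget` are hypothetical stand-ins for `TargetLowering` and a target that opts in, the hook here takes no parameters, and the flag is a read-only constant standing in for the `cl::opt`. The point is that the pass only reads the flag and combines it with the hook, so nothing global is modified at run time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for llvm::TargetLowering.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Default: a target does not opt in to GEP offset splitting.
  virtual bool shouldConsiderGEPOffsetSplit() const { return false; }
};

// Hypothetical target that enables the transformation.
struct OptInTarget : TargetHooks {
  bool shouldConsiderGEPOffsetSplit() const override { return true; }
};

// Stand-in for the command-line option; it is only ever read.
const bool EnableGEPOffsetSplit = true;

// Mirrors the shape of the candidate check in the diff: the option, the
// target hook, a top-level match (Depth == 0), and a positive offset.
bool shouldRecordCandidate(const TargetHooks &TLI, unsigned Depth,
                           int64_t ConstantOffset) {
  return EnableGEPOffsetSplit && TLI.shouldConsiderGEPOffsetSplit() &&
         Depth == 0 && ConstantOffset > 0;
}
```

A default target never records candidates, while an opted-in target records them only for top-level GEPs with a positive constant offset.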

A few quick comments. Will follow up with a more complete review later this week.

lib/CodeGen/CodeGenPrepare.cpp
3743	Perhaps something like: // Record GEPs with a non-zero offsets as candidates for splitting in the event that the offset cannot fit into the r+i addressing mode.
3746	Add isa<GetElementPtrInst>(AddrInst) check?
3755	You could reduce the indent here by putting an 'isa<GetElementPtrInst>(AddrInst)' in the outermost if statement and then making this a cast<>. See comment above.
3756	Also, I need some clarification here. The first check (BaseI && GEP->getParent() != BaseI->getParent()) seems to be in line with the comments below, but I don't completely follow the second check (i.e., checking that the PHI is not in the entry block, if BaseI is null).

efriedma added inline comments. Feb 12 2018, 11:38 AM

lib/Target/AArch64/AArch64ISelLowering.cpp
8012	The base of a GEP is always a pointer.
8014	I'm still not sure what the purpose of this check is; why does it matter how many dimensions an array has?

Completely remove the check of base types. So, any large data structure is supported.

Clarify some comments.

Thank you for the review.

Made some changes to improve the readability.

Kindly ping. I would appreciate any advice.

Kindly Ping #2

Kindly Ping #3

Just minor nits inlined.

lib/CodeGen/CodeGenPrepare.cpp
3743	a non-zero offsets -> a non-zero offset
3749	It seems that you can move this check to "else" by changing "else" to "else if".
4293	You may want to have a new variable for GetElementPtrInst instead of doing LargeOffsetGEP.first several times.
4295	!NewGEPBases.count(LargeOffsetGEP.first)
4297	I'm not clear about this comment. Can you clarify it a little bit? typo: spllit
lib/Target/AArch64/AArch64ISelLowering.cpp
8011	Should we have BaseType as parameter ?

Address Jun's comments.

Thank you very much for taking a look.

skatkov added inline comments. Mar 26 2018, 10:12 PM

lib/CodeGen/CodeGenPrepare.cpp
3742	Why do you need the check isa<GetElementPtrInst>(AddrInst)? You are in "case Instruction::GetElementPtr" based on "switch (Opcode) {" where Opcode is actually AddrInst->getOpcode(). So to me it is redundant...
3761	getParent()->getParent() => getFunction()
4802	Please add a comment for nontrivial comparison function.
4856	If you sort the array anyway, why not use a set-like data structure to keep the elements and avoid erasing duplicates? Is there a hidden reason for that?
4868	How can we get GEP == nullptr here?
4912	getParent()->getParent() => getFunction()

Update to address Serguei's comments. Thank you for taking a look.

lib/CodeGen/CodeGenPrepare.cpp
3742	AddrInst is a user. It can be a GetElementPtrConstantExpr which is not supported by CGP yet.
4856	I need a data structure that can be iterated in a sorted order (ascending by offsets) and in which all elements (pointers to GEPs) are unique. The data in a set are unique, but I cannot access them in the order that I want if I use a set.

Kindly Ping

efriedma added inline comments. Apr 9 2018, 5:06 PM

lib/CodeGen/CodeGenPrepare.cpp
2545	`AssertingVH<GetElementPtrInst>`
4866	This is going to visit GEPs with the same offset in non-deterministic order. I guess it might not be that important, but it'll probably mess with the use-lists, which could affect some later pass. I would rather stay on the safe side and sort GEPs with the same offset based on the order they were added to the map, or some other deterministic ordering.
4873	The result type of the GEP might not be the type of the memory access... but I don't know how you'd get the right type off the top of my head, and it might not be important in practice. Maybe worth noting with a comment, though.
4907	I think you have to be a bit more careful here to avoid inserting a GEP into a block with a catchswitch. (This is Windows-only exception-handling, so maybe difficult to trigger.)
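
One generic way to get both properties discussed above (unique entries, ascending offsets, and a deterministic order among equal offsets) is sketched below. It is an assumption-laden illustration, not the code that landed: `Candidate`, `Id`, and `sortAndUnique` are hypothetical names, with `Id` standing in for a GEP pointer. Duplicates are dropped in collection order first, and a stable sort then keeps that order for ties, so the result never depends on pointer values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical record: Id stands in for a GetElementPtrInst pointer.
struct Candidate {
  int Id;
  int64_t Offset;
};

std::vector<Candidate> sortAndUnique(const std::vector<Candidate> &Collected) {
  // Keep only the first occurrence of each Id, preserving collection order.
  std::unordered_set<int> Seen;
  std::vector<Candidate> Unique;
  for (const Candidate &C : Collected)
    if (Seen.insert(C.Id).second)
      Unique.push_back(C);

  // Stable sort: ascending by offset, ties stay in collection order.
  std::stable_sort(Unique.begin(), Unique.end(),
                   [](const Candidate &A, const Candidate &B) {
                     return A.Offset < B.Offset;
                   });
  return Unique;
}
```

For the collected sequence (2, 400), (1, 400), (2, 400), (3, 100), the result is (3, 100), (2, 400), (1, 400): the duplicate of Id 2 is dropped and the two entries at offset 400 keep their collection order.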

Address Eli's comments. Thank you very much, Eli.

haicheng marked 2 inline comments as done. Apr 18 2018, 1:36 PM

haicheng added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
4907	Now I check if the blocks contain catchswitch when collecting GEPs.

This is looking good. Just a few more small comments.

lib/CodeGen/CodeGenPrepare.cpp
278	Probably these maps should also use AssertingVH.
3753	I'm pretty sure it's impossible for `BaseI` to be a `BinaryOperator`, given it's a pointer.

Herald added a reviewer: javed.absar. Apr 23 2018, 6:02 PM

Address Eli's comments. Thank you.

haicheng marked 2 inline comments as done. Apr 26 2018, 9:54 AM

efriedma added inline comments. Apr 27 2018, 2:35 PM

lib/CodeGen/CodeGenPrepare.cpp
278	By "these maps", I meant NewGEPBases and LargeOffsetGEPMap too.

Herald added a subscriber: mgrang. Apr 27 2018, 2:35 PM

Add more AssertingVH.

efriedma added inline comments. May 1 2018, 11:16 AM

lib/CodeGen/CodeGenPrepare.cpp
278	LargeOffsetGEPMap still has a `GetElementPtrInst *` that isn't an AssertingVH.

Add one more AssertingVH

Kindly Ping

LGTM

This revision is now accepted and ready to land. May 9 2018, 3:23 PM

Closed by commit rL332015: [CGP] Split large data structres to sink more GEPs (authored by haicheng). May 10 2018, 11:31 AM

This revision was automatically updated to reflect the committed changes.

jonpa added a subscriber: jonpa. Jun 19 2022, 8:36 AM

jonpa added inline comments.

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816 ↗	(On Diff #146171)	Is there any reason not to use '!=' as opposed to '>' here (which would match the comment below)? On SystemZ negative offsets are not supported on vector instructions.

Herald added projects: Restricted Project, Restricted Project. Jun 19 2022, 8:36 AM

efriedma added inline comments. Jun 20 2022, 3:41 PM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816 ↗	(On Diff #146171)	It could work. Would probably need a few other changes to make sure we don't do anything silly like grouping together positive and negative offsets.

jonpa added inline comments. Jun 21 2022, 4:09 AM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816 ↗	(On Diff #146171)	I may be missing something, but as far as I can tell there should not be any problem just because the offset is negative. The type for offsets is already int64_t, and the algorithm of sorting them by offset and inserting a new GEP whenever needed seems to work regardless of a negative offset, since the most negative offset is generated first and then the remainder will be positive, or?
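
The argument can be checked on a small model. This sketch assumes, hypothetically, that offsets are sorted ascending and that a new base is started at the current offset whenever the distance to the current base exceeds `MaxImm`; `remaindersAfterSplit` and `MaxImm` are illustrative names. It says nothing about whether the rest of the pass handles negative offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns, for each offset in ascending order, the remainder it would use
// relative to the base of its group.
std::vector<int64_t> remaindersAfterSplit(std::vector<int64_t> Offsets,
                                          int64_t MaxImm) {
  assert(MaxImm >= 0 && "immediate range must be non-negative");
  std::sort(Offsets.begin(), Offsets.end());
  std::vector<int64_t> Remainders;
  int64_t Base = 0;
  bool HaveBase = false;
  for (int64_t Off : Offsets) {
    if (!HaveBase || Off - Base > MaxImm) {
      Base = Off; // the smallest offset of the group becomes the new base
      HaveBase = true;
    }
    // Off >= Base holds because the offsets are visited in ascending order.
    Remainders.push_back(Off - Base);
  }
  return Remainders;
}
```

For the offsets -8192, -8188, and 16 with `MaxImm` set to 4095, the remainders are 0, 4, and 0: the most negative offset becomes the first base, and every remainder is non-negative.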

efriedma added inline comments. Jun 21 2022, 10:19 AM

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp
3816 ↗	(On Diff #146171)	I don't remember exactly how the algorithm for grouping offsets together functions; maybe it just works.

Revision Contents

Path	Size
include/llvm/CodeGen/TargetLowering.h	7 lines
lib/CodeGen/CodeGenPrepare.cpp	231 lines
lib/Target/AArch64/AArch64ISelLowering.h	2 lines
lib/Target/AArch64/AArch64ISelLowering.cpp	9 lines
test/Transforms/CodeGenPrepare/AArch64/large-offset-gep.ll	147 lines

Diff 133782

include/llvm/CodeGen/TargetLowering.h

@@ public:
   }

   // Return true if it is profitable to use a scalar input to a BUILD_VECTOR
   // even if the vector itself has multiple uses.
   virtual bool aggressivelyPreferBuildVectorSources(EVT VecVT) const {
     return false;
   }

+  // Return true if CodeGenPrepare should consider splitting large offset of a
+  // GEP to make the GEP fit into the addressing mode and can be sunk into the
+  // same blocks of its users.
+  virtual bool shouldConsiderGEPOffsetSplit(Type* BaseType) const {
+    return false;
+  }
+
   //===--------------------------------------------------------------------===//
   // Runtime Library hooks
   //

   /// Rename the default libcall routine name for the specified libcall.
   void setLibcallName(RTLIB::Libcall Call, const char *Name) {
     LibcallRoutineNames[Call] = Name;
   }

lib/CodeGen/CodeGenPrepare.cpp

@@
 static cl::opt<bool> AddrSinkCombineBaseOffs(
     "addr-sink-combine-base-offs", cl::Hidden, cl::init(true),
     cl::desc("Allow combining of BaseOffs field in Address sinking."));

 static cl::opt<bool> AddrSinkCombineScaledReg(
     "addr-sink-combine-scaled-reg", cl::Hidden, cl::init(true),
     cl::desc("Allow combining of ScaledReg field in Address sinking."));

+static cl::opt<bool>
+    EnableGEPOffsetSplit("cgp-split-large-offset-gep", cl::Hidden,
+                         cl::init(true),
+                         cl::desc("Enable splitting large offset of GEP."));
+
 namespace {

 using SetOfInstrs = SmallPtrSet<Instruction *, 16>;
 using TypeIsSExt = PointerIntPair<Type *, 1, bool>;
 using InstrToOrigTy = DenseMap<Instruction *, TypeIsSExt>;
 using SExts = SmallVector<Instruction *, 16>;
 using ValueToSExts = DenseMap<Value *, SExts>;

@@ class CodeGenPrepare : public FunctionPass {
     InstrToOrigTy PromotedInsts;

     /// Keep track of instructions removed during promotion.
     SetOfInstrs RemovedInsts;

     /// Keep track of sext chains based on their initial value.
     DenseMap<Value *, Instruction *> SeenChainsForSExt;

+    // Keep track of GEPs accessing the same data structures such as structs or
+    // arrays that are candidates to be split later because of their large size.
+    DenseMap<Value *, SmallVector<std::pair<GetElementPtrInst *, int64_t>, 32>>
+        LargeOffsetGEPMap;
+
     /// Keep track of SExt promoted.
     ValueToSExts ValToSExtendedUses;

     /// True if CFG is modified in any way.
     bool ModifiedDT;

     /// True if optimizing for size.
     bool OptSize;

     /// DataLayout for the Function being processed.
     const DataLayout *DL = nullptr;

   public:
     static char ID; // Pass identification, replacement for typeid
@@ private:
     bool placeDbgValues(Function &F);
     bool canFormExtLd(const SmallVectorImpl<Instruction *> &MovedExts,
                       LoadInst *&LI, Instruction *&Inst, bool HasPromoted);
     bool tryToPromoteExts(TypePromotionTransaction &TPT,
                           const SmallVectorImpl<Instruction *> &Exts,
                           SmallVectorImpl<Instruction *> &ProfitablyMovedExts,
                           unsigned CreatedInstsCost = 0);
     bool mergeSExts(Function &F);
+    bool splitLargeGEPOffsets();
     bool performAddressTypePromotion(
         Instruction *&Inst,
         bool AllowPromotionWithoutCommonHeader,
         bool HasPromoted, TypePromotionTransaction &TPT,
         SmallVectorImpl<Instruction *> &SpeculativelyMovedExts);
     bool splitBranchCondition(Function &F);
     bool simplifyOffsetableRelocate(Instruction &I);
   };
@@ bool CodeGenPrepare::runOnFunction(Function &F) {
   EverMadeChange |= SplitIndirectBrCriticalEdges(F);

   bool MadeChange = true;
   while (MadeChange) {
     MadeChange = false;
     SeenChainsForSExt.clear();
     ValToSExtendedUses.clear();
     RemovedInsts.clear();
+    LargeOffsetGEPMap.clear();
     for (Function::iterator I = F.begin(); I != F.end(); ) {
       BasicBlock *BB = &*I++;
       bool ModifiedDTOnIteration = false;
       MadeChange |= optimizeBlock(*BB, ModifiedDTOnIteration);

       // Restart BB iteration if the dominator tree of the Function was changed
       if (ModifiedDTOnIteration)
         break;
     }
     if (EnableTypePromotionMerge && !ValToSExtendedUses.empty())
       MadeChange |= mergeSExts(F);
+    if (!LargeOffsetGEPMap.empty())
+      MadeChange |= splitLargeGEPOffsets();

     // Really free removed instructions during promotion.
     for (Instruction *I : RemovedInsts)
       I->deleteValue();

     EverMadeChange |= MadeChange;
   }

@@ class AddressingModeMatcher {
   const SetOfInstrs &InsertedInsts;

   /// A map from the instructions to their type before promotion.
   InstrToOrigTy &PromotedInsts;

   /// The ongoing transaction where every action should be registered.
   TypePromotionTransaction &TPT;

+  // A GEP which has too large offset to be folded into the addressing mode.
+  std::pair<GetElementPtrInst *, int64_t> &LargeOffsetGEP;
+
   /// This is set to true when we should not do profitability checks.
   /// When true, IsProfitableToFoldIntoAddressingMode always returns true.
   bool IgnoreProfitability;

   AddressingModeMatcher(SmallVectorImpl<Instruction *> &AMI,
                         const TargetLowering &TLI,
-                        const TargetRegisterInfo &TRI,
-                        Type *AT, unsigned AS,
+                        const TargetRegisterInfo &TRI, Type *AT, unsigned AS,
                         Instruction *MI, ExtAddrMode &AM,
                         const SetOfInstrs &InsertedInsts,
                         InstrToOrigTy &PromotedInsts,
-                        TypePromotionTransaction &TPT)
+                        TypePromotionTransaction &TPT,
+                        std::pair<GetElementPtrInst *, int64_t> &LargeOffsetGEP)
       : AddrModeInsts(AMI), TLI(TLI), TRI(TRI),
         DL(MI->getModule()->getDataLayout()), AccessTy(AT), AddrSpace(AS),
         MemoryInst(MI), AddrMode(AM), InsertedInsts(InsertedInsts),
-        PromotedInsts(PromotedInsts), TPT(TPT) {
+        PromotedInsts(PromotedInsts), TPT(TPT), LargeOffsetGEP(LargeOffsetGEP) {
     IgnoreProfitability = false;
   }

 public:
   /// Find the maximal addressing mode that a load/store of V can fold,
   /// give an access type of AccessTy. This returns a list of involved
   /// instructions in AddrModeInsts.
   /// \p InsertedInsts The instructions inserted by other CodeGenPrepare
   /// optimizations.
   /// \p PromotedInsts maps the instructions to their type before promotion.
   /// \p The ongoing transaction where every action should be registered.
-  static ExtAddrMode Match(Value *V, Type *AccessTy, unsigned AS,
-                           Instruction *MemoryInst,
-                           SmallVectorImpl<Instruction*> &AddrModeInsts,
-                           const TargetLowering &TLI,
-                           const TargetRegisterInfo &TRI,
-                           const SetOfInstrs &InsertedInsts,
-                           InstrToOrigTy &PromotedInsts,
-                           TypePromotionTransaction &TPT) {
+  static ExtAddrMode
+  Match(Value *V, Type *AccessTy, unsigned AS, Instruction *MemoryInst,
+        SmallVectorImpl<Instruction *> &AddrModeInsts,
+        const TargetLowering &TLI, const TargetRegisterInfo &TRI,
+        const SetOfInstrs &InsertedInsts, InstrToOrigTy &PromotedInsts,
+        TypePromotionTransaction &TPT,
+        std::pair<GetElementPtrInst *, int64_t> &LargeOffsetGEP) {
     ExtAddrMode Result;

-    bool Success = AddressingModeMatcher(AddrModeInsts, TLI, TRI,
-                                         AccessTy, AS,
-                                         MemoryInst, Result, InsertedInsts,
-                                         PromotedInsts, TPT).matchAddr(V, 0);
+    bool Success = AddressingModeMatcher(AddrModeInsts, TLI, TRI, AccessTy, AS,
+                                         MemoryInst, Result, InsertedInsts,
+                                         PromotedInsts, TPT, LargeOffsetGEP)
+                       .matchAddr(V, 0);
     (void)Success; assert(Success && "Couldn't select anything?");
     return Result;
   }

 private:
   bool matchScaledValue(Value *ScaleReg, int64_t Scale, unsigned Depth);
   bool matchAddr(Value *V, unsigned Depth);
   bool matchOperationAddr(User *Operation, unsigned Opcode, unsigned Depth,
@@ case Instruction::GetElementPtr: {
     // just add it to the disp field and check validity.
     if (VariableOperand == -1) {
       AddrMode.BaseOffs += ConstantOffset;
       if (ConstantOffset == 0 ||
           TLI.isLegalAddressingMode(DL, AddrMode, AccessTy, AddrSpace)) {
         // Check to see if we can fold the base pointer in too.
         if (matchAddr(AddrInst->getOperand(0), Depth+1))
           return true;
+      } else {
+        // Record a GEP which has too large offset that cannot fit into the r+i
+        // addressing mode. The GEP is the candidate to be split later.
+        Value *Base = AddrInst->getOperand(0);
+        if (EnableGEPOffsetSplit &&
+            TLI.shouldConsiderGEPOffsetSplit(Base->getType()) && Depth == 0 &&
+            ConstantOffset > 0) {
+          // Simple and common case that only one GEP is used in calculating the
+          // address for the memory access.
+          Instruction *BaseI = dyn_cast<Instruction>(Base);
+          if (isa<Argument>(Base) || isa<GlobalValue>(Base) ||
+              (BaseI && !isa<CastInst>(BaseI) && !isa<BinaryOperator>(BaseI) &&
+               !isa<GetElementPtrInst>(BaseI)))
+            if (auto *GEP = dyn_cast<GetElementPtrInst>(AddrInst))
+              if ((BaseI && GEP->getParent() != BaseI->getParent()) ||
+                  (!BaseI &&
+                   GEP->getParent() !=
+                       &GEP->getParent()->getParent()->getEntryBlock()))
+                // Make sure the base is not in the same basic block as the
+                // GEP. Otherwise, the split might be undone.
+                LargeOffsetGEP = std::make_pair(GEP, ConstantOffset);
+        }
       }
       AddrMode.BaseOffs -= ConstantOffset;
       return false;
     }

     // Save the valid addressing mode in case we can't match.
     ExtAddrMode BackupAddrMode = AddrMode;
     unsigned OldSize = AddrModeInsts.size();
@@ if (!AddrTy)
       return false;
     Type *AddressAccessTy = AddrTy->getElementType();
     unsigned AS = AddrTy->getAddressSpace();

     // Do a match against the root of this address, ignoring profitability. This
     // will tell us if the addressing mode for the memory operation will
     // actually cover the shared instruction.
     ExtAddrMode Result;
+    std::pair<GetElementPtrInst *, int64_t> LargeOffsetGEP(nullptr, 0);
     TypePromotionTransaction::ConstRestorationPt LastKnownGood =
         TPT.getRestorationPoint();
-    AddressingModeMatcher Matcher(MatchedAddrModeInsts, TLI, TRI,
-                                  AddressAccessTy, AS,
-                                  MemoryInst, Result, InsertedInsts,
-                                  PromotedInsts, TPT);
+    AddressingModeMatcher Matcher(
+        MatchedAddrModeInsts, TLI, TRI, AddressAccessTy, AS, MemoryInst, Result,
+        InsertedInsts, PromotedInsts, TPT, LargeOffsetGEP);
     Matcher.IgnoreProfitability = true;
     bool Success = Matcher.matchAddr(Address, 0);
     (void)Success; assert(Success && "Couldn't select anything?");

     // The match was to check the profitability, the changes made are not
     // part of the original matcher. Therefore, they should be dropped
     // otherwise the original matcher will not present the right state.
     TPT.rollback(LastKnownGood);
▲ Show 20 Lines • Show All 85 Lines • ▼ Show 20 Lines	if (SelectInst *SI = dyn_cast<SelectInst>(V)) {
PhiOrSelectSeen = true;		PhiOrSelectSeen = true;
continue;		continue;
}		}

// For non-PHIs, determine the addressing mode being computed. Note that		// For non-PHIs, determine the addressing mode being computed. Note that
// the result may differ depending on what other uses our candidate		// the result may differ depending on what other uses our candidate
// addressing instructions might have.		// addressing instructions might have.
AddrModeInsts.clear();		AddrModeInsts.clear();
		std::pair<GetElementPtrInst *, int64_t> LargeOffsetGEP(nullptr, 0);
ExtAddrMode NewAddrMode = AddressingModeMatcher::Match(		ExtAddrMode NewAddrMode = AddressingModeMatcher::Match(
V, AccessTy, AddrSpace, MemoryInst, AddrModeInsts, TLI, TRI,		V, AccessTy, AddrSpace, MemoryInst, AddrModeInsts, TLI, TRI,
InsertedInsts, PromotedInsts, TPT);		InsertedInsts, PromotedInsts, TPT, LargeOffsetGEP);
NewAddrMode.OriginalValue = V;
		if (LargeOffsetGEP.first &&
		LargeOffsetGEP.first->getParent() != MemoryInst->getParent())
		// Only collect GEPs that cannot be sunk unless the underlying data
		// structures are split.
		LargeOffsetGEPMap[LargeOffsetGEP.first->getPointerOperand()].push_back(
		LargeOffsetGEP);

		NewAddrMode.OriginalValue = V;
		junbuml (Done): You may want to have a new variable for GetElementPtrInst instead of doing LargeOffsetGEP.first several times.
if (!AddrModes.addNewAddrMode(NewAddrMode))		if (!AddrModes.addNewAddrMode(NewAddrMode))
break;		break;
		junbuml (Done): !NewGEPBases.count(LargeOffsetGEP.first)
}		}

		junbuml (Done): I'm not clear about this comment. Can you clarify it a little bit? typo: spllit
// Try to combine the AddrModes we've collected. If we couldn't collect any,		// Try to combine the AddrModes we've collected. If we couldn't collect any,
// or we have multiple but either couldn't combine them or combining them		// or we have multiple but either couldn't combine them or combining them
// wouldn't do anything useful, bail out now.		// wouldn't do anything useful, bail out now.
if (!AddrModes.combineAddrModes()) {		if (!AddrModes.combineAddrModes()) {
TPT.rollback(LastKnownGood);		TPT.rollback(LastKnownGood);
return false;		return false;
}		}
TPT.commit();		TPT.commit();
[488 lines not shown]
		for (Instruction *Inst : Insts) {
}		}
if (!inserted)		if (!inserted)
CurPts.push_back(Inst);		CurPts.push_back(Inst);
}		}
}		}
return Changed;		return Changed;
}		}

		static int
		skatkov (Done): Please add a comment for nontrivial comparison function.
		compareGEPOffset(const std::pair<GetElementPtrInst *, int64_t> *LHS,
		const std::pair<GetElementPtrInst *, int64_t> *RHS) {
		if (LHS->first == RHS->first)
		return 0;
		if (LHS->second != RHS->second)
		return LHS->second > RHS->second ? 1 : -1;
		return LHS->first > RHS->first ? 1 : -1;
		}
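
Note on the comparator above: it breaks ties between equal offsets by comparing pointer values, which is the source of the non-deterministic visiting order raised in review. The following is a minimal standalone sketch, not the patch's code, of the alternative the reviewer suggests: sort by offset and break ties by insertion order. `CollectedGEP`, `Id`, and `sortAndUnique` are hypothetical names introduced for illustration, and the sketch assumes that repeated entries for one GEP always carry the same offset, so duplicates end up adjacent after sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a collected GEP. Id is the order in which the
// entry was first added to the map; Offset is its constant byte offset.
struct CollectedGEP {
  unsigned Id;
  int64_t Offset;
};

// Sort ascending by offset, break ties by insertion order, then drop
// repeated entries for the same Id. Assumes one Id always has one offset.
std::vector<CollectedGEP> sortAndUnique(std::vector<CollectedGEP> GEPs) {
  std::sort(GEPs.begin(), GEPs.end(),
            [](const CollectedGEP &L, const CollectedGEP &R) {
              if (L.Offset != R.Offset)
                return L.Offset < R.Offset;
              return L.Id < R.Id; // deterministic tie-break
            });
  GEPs.erase(std::unique(GEPs.begin(), GEPs.end(),
                         [](const CollectedGEP &L, const CollectedGEP &R) {
                           return L.Id == R.Id;
                         }),
             GEPs.end());
  return GEPs;
}
```

Because the tie-break no longer depends on allocation addresses, two runs over the same input visit equal-offset entries in the same order.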

		// Split large data structures so that the GEPs accessing them have smaller
		// offsets and can be sunk to the same blocks as their users.
		// For example, a large struct starting from %base is split into two parts
		// where the second part starts from %new_base.
		//
		// Before:
		// BB0:
		// %base =
		//
		// BB1:
		// %gep0 = gep %base, off0
		// %gep1 = gep %base, off1
		// %gep2 = gep %base, off2
		//
		// BB2:
		// %load1 = load %gep0
		// %load2 = load %gep1
		// %load3 = load %gep2
		//
		// After:
		// BB0:
		// %base =
		// %new_base = gep %base, off0
		//
		// BB1:
		// %new_gep0 = %new_base
		// %new_gep1 = gep %new_base, off1 - off0
		// %new_gep2 = gep %new_base, off2 - off0
		//
		// BB2:
		// %load1 = load i32, i32* %new_gep0
		// %load2 = load i32, i32* %new_gep1
		// %load3 = load i32, i32* %new_gep2
		//
		// After the splitting, %new_gep1 and %new_gep2 can be sunk to BB2 because
		// their offsets are small enough to fit into the addressing mode.
		bool CodeGenPrepare::splitLargeGEPOffsets() {
		bool Changed = false;
		for (auto &Entry : LargeOffsetGEPMap) {
		SmallVectorImpl<std::pair<GetElementPtrInst *, int64_t>> &LargeOffsetGEPs =
		Entry.second;
		// Sorting all the GEPs of the same data structures based on the offsets.
		array_pod_sort(LargeOffsetGEPs.begin(), LargeOffsetGEPs.end(),
		compareGEPOffset);
		LargeOffsetGEPs.erase(
		skatkov (Not Done): I wonder, if you sort the array anyway, why not use a set-like data structure to keep them and avoid erasing the duplicate elements? Is there any hidden reason for that?
		haicheng (Author, Not Done): I need a data structure that can be iterated in sorted order (ascending by offsets) and in which all elements (pointers to GEPs) are unique. I think the data in a set are unique, but I cannot access them in the order that I want if I use a set.
		std::unique(LargeOffsetGEPs.begin(), LargeOffsetGEPs.end()),
		LargeOffsetGEPs.end());
		// Skip if all the GEPs have the same offsets.
		if (LargeOffsetGEPs.front().second == LargeOffsetGEPs.back().second)
		continue;
		GetElementPtrInst *BaseGEP = LargeOffsetGEPs.begin()->first;
		int64_t BaseOffset = LargeOffsetGEPs.begin()->second;
		Value *NewBaseGEP = nullptr;
		for (auto &LargeOffsetGEP : LargeOffsetGEPs) {
		GetElementPtrInst *GEP = LargeOffsetGEP.first;
		efriedma (Done): This is going to visit GEPs with the same offset in non-deterministic order. I guess it might not be that important, but it'll probably mess with the use-lists, which could affect some later pass. I would rather stay on the safe side and sort GEPs with the same offset based on the order they were added to the map, or some other deterministic ordering.
		int64_t Offset = LargeOffsetGEP.second;
		if (!GEP)
		skatkov (Done): How can we get GEP == nullptr here?
		continue;
		if (Offset != BaseOffset) {
		TargetLowering::AddrMode AddrMode;
		AddrMode.BaseOffs = Offset - BaseOffset;
		if (!TLI->isLegalAddressingMode(*DL, AddrMode,
		efriedma (Done): The result type of the GEP might not be the type of the memory access... but I don't know how you'd get the right type off the top of my head, and it might not be important in practice. Maybe worth noting with a comment, though.
		GEP->getResultElementType(),
		GEP->getAddressSpace())) {
		// We need to create a new base if the offset to the current base is
		// too large to fit into the addressing mode. So, a very large struct
		// may be split into several parts.
		BaseGEP = GEP;
		BaseOffset = Offset;
		NewBaseGEP = nullptr;
		}
		}

		// Generate a new GEP to replace the current one.
		IRBuilder<> Builder(GEP);
		Type *IntPtrTy = DL->getIntPtrType(GEP->getType());
		Type *I8PtrTy =
		Builder.getInt8PtrTy(GEP->getType()->getPointerAddressSpace());
		Type *I8Ty = Builder.getInt8Ty();

		if (!NewBaseGEP) {
		// Create a new base if we don't have one yet. Find the insertion
		// point for the new base first.
		BasicBlock::iterator NewBaseInsertPt;
		BasicBlock *NewBaseInsertBB;
		if (auto *BaseI = dyn_cast<Instruction>(Entry.first)) {
		// If the base of the struct is an instruction, the new base will be
		// inserted close to it.
		NewBaseInsertBB = BaseI->getParent();
		if (isa<PHINode>(BaseI))
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		else if (InvokeInst *Invoke = dyn_cast<InvokeInst>(BaseI)) {
		NewBaseInsertBB =
		SplitEdge(NewBaseInsertBB, Invoke->getNormalDest());
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		} else
		efriedma (Done): I think you have to be a bit more careful here to avoid inserting a GEP into a block with a catchswitch. (This is Windows-only exception-handling, so maybe difficult to trigger.)
		haicheng (Author, Not Done): Now I check if the blocks contain catchswitch when collecting GEPs.
		NewBaseInsertPt = std::next(BaseI->getIterator());
		} else {
		// If the current base is an argument or global value, the new base
		// will be inserted into the entry block.
		NewBaseInsertBB = &BaseGEP->getParent()->getParent()->getEntryBlock();
		skatkov (Done): getParent()->getParent() => getFunction()
		NewBaseInsertPt = NewBaseInsertBB->getFirstInsertionPt();
		}
		IRBuilder<> NewBaseBuilder(NewBaseInsertBB, NewBaseInsertPt);
		// Create a new base.
		Value *BaseIndex = ConstantInt::get(IntPtrTy, BaseOffset);
		NewBaseGEP = Entry.first;
		if (NewBaseGEP->getType() != I8PtrTy)
		NewBaseGEP = NewBaseBuilder.CreatePointerCast(NewBaseGEP, I8PtrTy);
		NewBaseGEP =
		NewBaseBuilder.CreateGEP(I8Ty, NewBaseGEP, BaseIndex, "splitgep");
		}

		Value *NewGEP = NewBaseGEP;
		if (Offset == BaseOffset) {
		if (GEP->getType() != I8PtrTy)
		NewGEP = Builder.CreatePointerCast(NewGEP, GEP->getType());
		} else {
		// Calculate the new offset for the new GEP.
		Value *Index = ConstantInt::get(IntPtrTy, Offset - BaseOffset);
		NewGEP = Builder.CreateGEP(I8Ty, NewBaseGEP, Index);

		if (GEP->getType() != I8PtrTy)
		NewGEP = Builder.CreatePointerCast(NewGEP, GEP->getType());
		}
		GEP->replaceAllUsesWith(NewGEP);
		GEP->eraseFromParent();
		Changed = true;
		}
		}
		return Changed;
		}
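
The base-selection loop in splitLargeGEPOffsets() can be hard to follow between the IR-building calls. The following standalone sketch models only the offset bookkeeping, under stated assumptions: offsets arrive already sorted and de-duplicated, and target legality is reduced to a caller-supplied predicate. `Rebased`, `splitOffsets`, and `IsLegalOffset` are hypothetical names standing in for the patch's BaseOffset tracking and its TLI->isLegalAddressingMode() query; this is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Result for one GEP: which new base it hangs off, and the remaining offset.
struct Rebased {
  int64_t Base;  // offset of the new base relative to the original pointer
  int64_t Delta; // offset of the access relative to that new base
};

// Walk the sorted offsets. The first offset becomes the first base. Whenever
// the distance to the current base is not a legal immediate, start a new base
// at the current offset, so a very large object may be split several times.
std::vector<Rebased>
splitOffsets(const std::vector<int64_t> &SortedOffsets,
             const std::function<bool(int64_t)> &IsLegalOffset) {
  std::vector<Rebased> Result;
  if (SortedOffsets.empty())
    return Result;
  int64_t Base = SortedOffsets.front();
  for (int64_t Off : SortedOffsets) {
    assert(Off >= Base && "offsets must be sorted in ascending order");
    if (Off != Base && !IsLegalOffset(Off - Base))
      Base = Off;
    Result.push_back({Base, Off - Base});
  }
  return Result;
}
```

For example, with a made-up legality rule that accepts deltas in [0, 4096), the offsets 40000, 40004, and 50000 yield two bases: 40000 (serving deltas 0 and 4) and 50000 (serving delta 0).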

/// Return true, if an ext(load) can be formed from an extension in		/// Return true, if an ext(load) can be formed from an extension in
/// \p MovedExts.		/// \p MovedExts.
bool CodeGenPrepare::canFormExtLd(		bool CodeGenPrepare::canFormExtLd(
const SmallVectorImpl<Instruction *> &MovedExts, LoadInst *&LI,		const SmallVectorImpl<Instruction *> &MovedExts, LoadInst *&LI,
Instruction *&Inst, bool HasPromoted) {		Instruction *&Inst, bool HasPromoted) {
for (auto *MovedExtInst : MovedExts) {		for (auto *MovedExtInst : MovedExts) {
if (isa<LoadInst>(MovedExtInst->getOperand(0))) {		if (isa<LoadInst>(MovedExtInst->getOperand(0))) {
LI = cast<LoadInst>(MovedExtInst->getOperand(0));		LI = cast<LoadInst>(MovedExtInst->getOperand(0));
[1,834 lines not shown]

lib/Target/AArch64/AArch64ISelLowering.h

[326 lines not shown]
		bool lowerInterleavedLoad(LoadInst *LI,
ArrayRef<unsigned> Indices,		ArrayRef<unsigned> Indices,
unsigned Factor) const override;		unsigned Factor) const override;
bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,		bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,
unsigned Factor) const override;		unsigned Factor) const override;

bool isLegalAddImmediate(int64_t) const override;		bool isLegalAddImmediate(int64_t) const override;
bool isLegalICmpImmediate(int64_t) const override;		bool isLegalICmpImmediate(int64_t) const override;

		bool shouldConsiderGEPOffsetSplit(Type *BaseType) const override;

EVT getOptimalMemOpType(uint64_t Size, unsigned DstAlign, unsigned SrcAlign,		EVT getOptimalMemOpType(uint64_t Size, unsigned DstAlign, unsigned SrcAlign,
bool IsMemset, bool ZeroMemset, bool MemcpyStrSrc,		bool IsMemset, bool ZeroMemset, bool MemcpyStrSrc,
MachineFunction &MF) const override;		MachineFunction &MF) const override;

/// Return true if the addressing mode represented by AM is legal for this		/// Return true if the addressing mode represented by AM is legal for this
/// target, for a load/store of the specified type.		/// target, for a load/store of the specified type.
bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM, Type *Ty,		bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM, Type *Ty,
unsigned AS,		unsigned AS,
[323 lines not shown]

lib/Target/AArch64/AArch64ISelLowering.cpp

[8,002 lines not shown]
		if (!AM.Scale) {
return false;		return false;
}		}

// Check reg1 + SIZE_IN_BYTES * reg2 and reg1 + reg2		// Check reg1 + SIZE_IN_BYTES * reg2 and reg1 + reg2

return AM.Scale == 1 || (AM.Scale > 0 && (uint64_t)AM.Scale == NumBytes);		return AM.Scale == 1 || (AM.Scale > 0 && (uint64_t)AM.Scale == NumBytes);
}		}

		bool AArch64TargetLowering::shouldConsiderGEPOffsetSplit(Type *BaseType) const {
		junbuml (Done): Should we have BaseType as parameter?
		if (auto *Ty = dyn_cast<PointerType>(BaseType))
		efriedma (Done): The base of a GEP is always a pointer.
		// Consider splitting large offset of struct or array.
		if (Ty->getElementType()->isAggregateType())
		efriedma (Done): I'm still not sure what the purpose of this check is; why does it matter how many dimensions an array has?
		return true;

		return false;
		}

int AArch64TargetLowering::getScalingFactorCost(const DataLayout &DL,		int AArch64TargetLowering::getScalingFactorCost(const DataLayout &DL,
const AddrMode &AM, Type *Ty,		const AddrMode &AM, Type *Ty,
unsigned AS) const {		unsigned AS) const {
// Scaling factors are not free at all.		// Scaling factors are not free at all.
// Operands | Rt Latency		// Operands | Rt Latency
// -------------------------------------------		// -------------------------------------------
// Rt, [Xn, Xm] | 4		// Rt, [Xn, Xm] | 4
// -------------------------------------------		// -------------------------------------------
[3,042 lines not shown]

test/Transforms/CodeGenPrepare/AArch64/large-offset-gep.ll

This file was added.

				; RUN: llc -mtriple=aarch64-linux-gnu -verify-machineinstrs -o - %s | FileCheck %s

				%struct_type = type { [10000 x i32], i32, i32 }

				define void @test1(%struct_type** %s, i32 %n) {
				; CHECK-LABEL: test1
				entry:
				%struct = load %struct_type*, %struct_type** %s
				br label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				define void @test2(%struct_type* %struct, i32 %n) {
				; CHECK-LABEL: test2
				entry:
				%cmp = icmp eq %struct_type* %struct, null
				br i1 %cmp, label %while_end, label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp1 = icmp slt i32 %phi, %n
				br i1 %cmp1, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				define void @test3(%struct_type* %s1, %struct_type* %s2, i1 %cond, i32 %n) {
				; CHECK-LABEL: test3
				entry:
				br i1 %cond, label %if_true, label %if_end

				if_true:
				br label %if_end

				if_end:
				%struct = phi %struct_type* [ %s1, %entry ], [ %s2, %if_true ]
				%cmp = icmp eq %struct_type* %struct, null
				br i1 %cmp, label %while_end, label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %if_end ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp1 = icmp slt i32 %phi, %n
				br i1 %cmp1, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}

				declare %struct_type* @foo()

				define void @test4(i32 %n) personality i32 (...)* @__FrameHandler {
				; CHECK-LABEL: test4
				entry:
				%struct = invoke %struct_type* @foo() to label %while_cond unwind label %cleanup

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #40000
				; CHECK-NOT: mov w{{[0-9]+}}, #40004
				%gep0 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 1
				%gep1 = getelementptr %struct_type, %struct_type* %struct, i64 0, i32 2
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void

				cleanup:
				landingpad { i8*, i32 } cleanup
				unreachable
				}

				declare i32 @__FrameHandler(...)

				define void @test5([65536 x i32]** %s, i32 %n) {
				; CHECK-LABEL: test5
				entry:
				%struct = load [65536 x i32]*, [65536 x i32]** %s
				br label %while_cond

				while_cond:
				%phi = phi i32 [ 0, %entry ], [ %i, %while_body ]
				; CHECK: mov w{{[0-9]+}}, #14464
				; CHECK-NOT: mov w{{[0-9]+}}, #14468
				%gep0 = getelementptr [65536 x i32], [65536 x i32]* %struct, i64 0, i32 20000
				%gep1 = getelementptr [65536 x i32], [65536 x i32]* %struct, i64 0, i32 20001
				%cmp = icmp slt i32 %phi, %n
				br i1 %cmp, label %while_body, label %while_end

				while_body:
				; CHECK: str w{{[0-9]+}}, [x{{[0-9]+}}, #4]
				%i = add i32 %phi, 1
				store i32 %i, i32* %gep0
				store i32 %phi, i32* %gep1
				br label %while_cond

				while_end:
				ret void
				}
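
The constants in the CHECK lines follow from the byte offsets of the accessed elements. The sketch below reproduces that arithmetic, assuming 4-byte i32 elements and no padding in %struct_type. The helper names are hypothetical. The reading of #14464 in test5 as the low 16 bits of the 80000-byte offset is an inference from the numbers, not something the test states.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of element Index in an array of i32 (assumed 4 bytes each).
constexpr int64_t i32ElementOffset(int64_t Index) { return Index * 4; }

// %struct_type = { [10000 x i32], i32, i32 }, assuming no padding:
// field 1 follows the 10000-element array, field 2 follows field 1.
constexpr int64_t Field1Offset = i32ElementOffset(10000);
constexpr int64_t Field2Offset = Field1Offset + 4;

// Low 16 bits of a constant, i.e. the part a single 16-bit move immediate
// could carry when a larger value is built in pieces.
constexpr int64_t low16(int64_t V) { return V & 0xFFFF; }

static_assert(Field1Offset == 40000, "matches 'mov w, #40000' in test1-4");
static_assert(Field2Offset - Field1Offset == 4, "matches 'str ..., #4'");
static_assert(low16(i32ElementOffset(20000)) == 14464,
              "matches 'mov w, #14464' in test5");
```

In each test the second store is expected to use the split base plus an immediate of 4, which is why a second large constant (#40004 or #14468) must not be materialized.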