This is an archive of the discontinued LLVM Phabricator instance.

[Polly] Consolidate invariant loads
ClosedPublic

Authored by jdoerfert on Oct 1 2015, 4:13 AM.


Details

Reviewers

Meinersbur
grosser

Commits

rG697fdf891c50: Consolidate invariant loads
rPLO249853: Consolidate invariant loads
rL249853: Consolidate invariant loads

Summary

                                                                                                                                               
If an (assumed) invariant location is loaded multiple times we
generated a parameter for each location. However, this caused compile
time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
BT in the NAS benchmarks). Additionally, the code we generate is
suboptimal as we preload the same location multiple times and perform
the same checks on all the parameters that refer to the same value.

With this patch we consolidate the invariant loads in three steps:
  1) During SCoP initialization required invariant loads are put in
     equivalence classes based on their pointer operand. One
     representing load is used to generate a parameter for the whole
     class, thus we never generate multiple parameters for the same
     location.
  2) During the SCoP simplification we remove invariant memory
     accesses that are in the same equivalence class. While doing so
     we build the union of all execution domains as it is only
     important that the location is at least accessed once.
  3) During code generation we only preload one element of each
     equivalence class with the unified execution domain. All others
     are mapped to that preloaded value.
     equivalence classes based on their pointer operand. One
     representing load is used to generate a parameter for the whole
     class, thus we never generate multiple parameters for the same
     location.
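
The three steps above can be sketched in isolation. This is a minimal illustration, not the code of the patch: it assumes a string stands in for the pointer operand (the patch compares SCEV pointers) and a set of integers stands in for an execution domain (the patch uses isl_set). All names below are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins: a string names the pointer operand and a set of
// ints models the execution domain of a load.
using PointerKey = std::string;
using Domain = std::set<int>;

struct InvariantLoad {
  std::string Name;   // name of the load instruction
  PointerKey Pointer; // location the load reads from
  Domain ExecDomain;  // iterations under which the load executes
};

struct EquivClass {
  PointerKey Pointer;               // key shared by all members
  std::vector<std::string> Members; // Members.front() is the representative
  Domain UnifiedDomain;             // union of all member domains
};

// Steps 1 and 2: put each load into the class of its pointer operand and
// widen the class domain to the union of the member domains.
std::vector<EquivClass> consolidate(const std::vector<InvariantLoad> &Loads) {
  std::vector<EquivClass> Classes;
  for (const InvariantLoad &L : Loads) {
    EquivClass *Found = nullptr;
    for (EquivClass &C : Classes) {
      if (C.Pointer == L.Pointer) {
        Found = &C;
        break;
      }
    }
    if (!Found) {
      Classes.push_back({L.Pointer, {}, {}});
      Found = &Classes.back();
    }
    Found->Members.push_back(L.Name);
    Found->UnifiedDomain.insert(L.ExecDomain.begin(), L.ExecDomain.end());
  }
  return Classes;
}

// Step 3: only the representative of a class is preloaded; every other
// member is mapped to it. Loads outside any class are returned unchanged.
std::string preloadedValueFor(const std::vector<EquivClass> &Classes,
                              const std::string &LoadName) {
  for (const EquivClass &C : Classes) {
    assert(!C.Members.empty() && "classes are created with one member");
    for (const std::string &M : C.Members)
      if (M == LoadName)
        return C.Members.front();
  }
  return LoadName;
}
```

With two loads of the same location the sketch yields a single class, so only one value would be preloaded and checked, and its domain covers every iteration in which either load executes.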

Diff Detail

Repository: rL LLVM

Event Timeline

jdoerfert updated this revision to Diff 36221. Oct 1 2015, 4:13 AM

jdoerfert retitled this revision from to [Polly] Consolidate invariant loads.

jdoerfert added reviewers: grosser, Meinersbur.

jdoerfert updated this object.

jdoerfert added a subscriber: Restricted Project.

Herald added a subscriber: sanjoy. Oct 1 2015, 4:13 AM

Diff against D13195 to highlight the changes

jdoerfert added a parent revision: D13195: [Polly] Allow invariant loads in the SCoP description. Oct 1 2015, 4:15 AM

Could you please describe what the code is doing not only in the commit message, but also as source code comment?

Why is ScopDetection involved at all? Shouldn't it be ScopInfo alone which decides what that Scop's parameters are?

In D13338#257504, @Meinersbur wrote:

Could you please describe what the code is doing not only in the commit message, but also as source code comment?

I think the source is well documented. If you disagree please inline a comment so I know what part you refer to.

Why is ScopDetection involved at all?

Because in ScopInfo we cannot build the equivalence classes until the SCoP is completed, and to build the SCoP in reasonable time we need equivalence classes.
For example in the SCEVAffinator (that is used throughout the SCoP creation) we need to normalize required invariant load parameters otherwise we would introduce different parameters for each invariant load.
To normalize these parameters we already need equivalence classes but in the expression that is translated at that point there might only be a reference to one of the invariant loads (most certainly not to all that are equivalent).
Thus, to determine the representing element for an equivalence class we need to know all elements of it before we use the SCEVAffinator for the first time.
The only way to build the equivalence classes before we use the SCEVAffinator is to do it in the ScopDetection where we actually see all required invariant loads.
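
The ordering constraint described in this reply can be illustrated with a small model. It is only a sketch under stated assumptions: loads and pointer operands are plain strings, the representative is picked arbitrarily (here the lexicographically smallest load name), and ParameterTable is a hypothetical class, not a Polly type.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: maps each required invariant load to the pointer
// operand it reads from.
using LoadToPointer = std::map<std::string, std::string>;

class ParameterTable {
  LoadToPointer Pointers;                      // load -> pointer operand
  std::map<std::string, std::string> ClassRep; // pointer -> representative
  std::vector<std::string> Parameters;         // one entry per dimension

public:
  // All required loads must be known here, before any parameter is added;
  // otherwise a later load could not be folded into an existing dimension.
  explicit ParameterTable(const LoadToPointer &RequiredLoads)
      : Pointers(RequiredLoads) {
    // std::map iterates in key order, so the lexicographically smallest
    // load name of each class becomes its representative.
    for (const auto &Entry : Pointers)
      ClassRep.emplace(Entry.second, Entry.first);
  }

  // Replace a required invariant load by the representative of its class;
  // any other parameter is returned unchanged.
  std::string normalize(const std::string &Param) const {
    auto It = Pointers.find(Param);
    if (It == Pointers.end())
      return Param;
    return ClassRep.at(It->second);
  }

  // Add a parameter after normalization and return its dimension index.
  std::size_t addParam(const std::string &Param) {
    const std::string Rep = normalize(Param);
    for (std::size_t I = 0; I < Parameters.size(); ++I)
      if (Parameters[I] == Rep)
        return I;
    Parameters.push_back(Rep);
    return Parameters.size() - 1;
  }

  std::size_t size() const { return Parameters.size(); }
};
```

Because the table is seeded with every required load up front, an expression that mentions only one load of a class still resolves to the shared dimension.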

Shouldn't it be ScopInfo alone which decides what that Scop's parameters are?

ScopInfo actually never "really" decides what the parameters are, it only normalizes them to a certain degree (and now even more). The parameters are collected and given to the SCoP by the SCEVAffinator and the SCEVValidator.

In D13338#257549, @jdoerfert wrote:

In D13338#257504, @Meinersbur wrote:

Could you please describe what the code is doing not only in the commit message, but also as source code comment?

I think the source is well documented. If you disagree please inline a comment so I know what part you refer to.

The commit message describes 3 phases, but there are just 1.5 notable new comments. What about the other phases?

I added some inline comments where I think you could write a bit more. (These are not questions you need to answer to me, I got them from either some other comment or the commit log).

Mmh, I mixed them with my other remarks that I found.

Why is ScopDetection involved at all?

Because in ScopInfo we cannot build the equivalence classes until the SCoP is completed, and to build the SCoP in reasonable time we need equivalence classes.
For example in the SCEVAffinator (that is used throughout the SCoP creation) we need to normalize required invariant load parameters otherwise we would introduce different parameters for each invariant load.
To normalize these parameters we already need equivalence classes but in the expression that is translated at that point there might only be a reference to one of the invariant loads (most certainly not to all that are equivalent).
Thus, to determine the representing element for an equivalence class we need to know all elements of it before we use the SCEVAffinator for the first time.
The only way to build the equivalence classes before we use the SCEVAffinator is to do it in the ScopDetection where we actually see all required invariant loads.

Correct me if I am wrong, but SCEVAffinator is only used by the ScopInfo pass. ScopInfo also gets the list of invariant loads using getRequiredInvariantLoads(). It can create the equivalence classes by itself by going through the map.

This might be additional work because ScopDetection in the patch does the equivalence classes already on the fly. However, it would be better for layering as ScopDetection shouldn't care about hoisting; it just determines the size of scop regions.

Shouldn't it be ScopInfo alone which decides what that Scop's parameters are?

ScopInfo actually never "really" decides what the parameters are, it only normalizes them to a certain degree (and now even more). The parameters are collected and given to the SCoP by the SCEVAffinator and the SCEVValidator.

Isn't it ScopInfo::addParams() which collects the parameters?

include/polly/ScopDetection.h
150 ↗	(On Diff #36222)	What are they remembered for?
188 ↗	(On Diff #36222)	What are the equivalence classes? Why are there equivalence classes?
254 ↗	(On Diff #36222)	What is the condition for that?
402 ↗	(On Diff #36222)	Why are they required?
include/polly/ScopInfo.h
936 ↗	(On Diff #36222)	Why the rename?
1219 ↗	(On Diff #36222)	If those are two actions, why not put them into different functions? I can't find the changes for simplifySCoP().
1241 ↗	(On Diff #36222)	I prefer the longer name from ScopDetection
include/polly/Support/SCEVAffinator.h
57 ↗	(On Diff #36222)	What is it used for?
include/polly/Support/SCEVValidator.h
51 ↗	(On Diff #36222)	@param missing
include/polly/Support/ScopHelper.h
46 ↗	(On Diff #36222)	What is the order?
49 ↗	(On Diff #36222)	What is the key?
152 ↗	(On Diff #36222)	When is it applicable? What is the special property of the representative SCEV?
lib/Analysis/ScopDetection.cpp
302 ↗	(On Diff #36222)	Describe under which condition the invariant valid load is required
326 ↗	(On Diff #36222)	return onlyValidRequiredInvariantLoads(AccessILC, Context)
705 ↗	(On Diff #36222)	This is 1)?
1148 ↗	(On Diff #36222)	Is it dummy (could you pass NULL instead)? Or does it serve as scratch storage?
lib/Analysis/ScopInfo.cpp
1368 ↗	(On Diff #36222)	Why the rename?
1424 ↗	(On Diff #36222)	Ideas how to improve this?
1437 ↗	(On Diff #36222)	Is this 2)? Can you describe why it is the union of the two?
1476 ↗	(On Diff #36222)	Why is one parameter correct but not the other?
lib/CodeGen/IslNodeBuilder.cpp
908 ↗	(On Diff #36222)	Why the rename? Doesn't "auto" know by itself that it's a const reference?
930 ↗	(On Diff #36222)	This is 3) ?
lib/Support/ScopHelper.cpp
385 ↗	(On Diff #36222)	Describe the conditions

jdoerfert updated this object. Oct 1 2015, 5:01 PM

jdoerfert edited edge metadata.

Updated according to Michael's idea. The SCoP will now hide most of the equivalence class magic.

Cool! Looks less impactful now!

Hi Johannes,

the patch looks conceptually good and it has a very useful commit message. I do not have any code changes, but would suggest a couple of additional comments. Some of the information that I miss as source code comments can probably just be taken from your commit message.

Some minor comments:

There is an incomplete sentence in part 3) of the commit message

Cool! Looks less impactful now!

@Michael: Thanks for reviewing! One point: in this message it seems you finished your review, but it is unclear if the patch is good to go, if you would prefer Johannes to still address some of your open comments or if you prefer me to have another look. It would help if you could state this explicitly.

Best,
Tobias

include/polly/ScopInfo.h
702 ↗	(On Diff #36323)	ordered
939 ↗	(On Diff #36323)	As Michael mentioned, the rename seems unrelated. If it is, I would prefer to commit it separately before this commit to reduce the actual diff.
lib/Analysis/ScopInfo.cpp
1442 ↗	(On Diff #36323)	As Michael mentioned, adding the information of part 2) of your commit message as a comment here would make the code more understandable. As this function is getting large, it might indeed make sense to split it into two subfunctions, each with its own comment.
1477 ↗	(On Diff #36323)	You mention the term equivalence classes here and in one header, but do not explain which equivalence classes exist. It would be helpful to explicitly state at one of these locations what are the elements that are sorted into equivalence classes, how do they differ and which properties are used to sort them into equivalence classes.
1506 ↗	(On Diff #36323)	Michael commented: "Why is one parameter correct but not the other?" This does not yet seem to be addressed and is not clear to me either.
lib/CodeGen/IslNodeBuilder.cpp
914–915 ↗	(On Diff #36323)	As Michael commented, this rename seems unrelated. (I like the rename, but please just commit it separately ahead of time) @Michael: auto can derive 'const' and '*' but in some cases we still add them to make clear that something is a ptr or a const. Not sure if this information adds additional value here though.
936 ↗	(On Diff #36323)	As Michael mentioned, this seems to be 3) from your commit message. Adding the information from your commit message in the source code would be useful.
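
The question about 'auto' in the inline comments above can be checked with a small, self-contained example. It is a sketch that is unrelated to the Polly sources; Access and sumFirst are hypothetical names. It shows that plain 'auto' in a range-based for loop copies each element, while pointee constness is deduced through 'auto *'.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical element type standing in for an invariant access entry.
using Access = std::pair<int, int>;

int sumFirst(const std::vector<Access> &Accesses) {
  assert(!Accesses.empty() && "expects at least one access");
  int Sum = 0;

  // Spelled-out qualifiers: IA is a const reference, nothing is copied.
  for (const auto &IA : Accesses) {
    static_assert(std::is_same<decltype(IA), const Access &>::value,
                  "const auto & binds by const reference");
    Sum += IA.first;
  }

  // Plain auto drops the reference and top-level const: IA is a mutable copy.
  for (auto IA : Accesses) {
    static_assert(std::is_same<decltype(IA), Access>::value,
                  "plain auto makes a copy");
    IA.first = 0; // modifies the copy only, not the container
  }

  // Pointee constness is deduced even when only 'auto *' is written.
  auto *First = &Accesses.front();
  static_assert(std::is_same<decltype(First), const Access *>::value,
                "auto * deduces a pointer to const");
  (void)First;

  return Sum;
}
```

Writing 'const auto &' therefore changes behavior compared to plain 'auto' in a loop header, whereas adding 'const' or '*' to a pointer deduction only documents what is deduced anyway.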

Closed by commit rL249853: Consolidate invariant loads (authored by jdoerfert). Oct 9 2015, 10:14 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
polly/
trunk/
include/
polly/
ScopInfo.h	43 lines
lib/
Analysis/
ScopInfo.cpp	125 lines
CodeGen/
IslNodeBuilder.cpp	18 lines
test/
Isl/
CodeGen/
OpenMP/
invariant_base_pointers_preloaded.ll	35 lines
invariant_load_outermost.ll	37 lines
whole-scop-non-affine-subregion.ll	16 lines
ScopInfo/
intra_and_inter_bb_scalar_dep.ll	4 lines
invariant_loads_complicated_dependences.ll	16 lines
invariant_loop_bounds.ll	20 lines
invariant_same_loop_bound_multiple_times-1.ll	106 lines
invariant_same_loop_bound_multiple_times-2.ll	109 lines
required-invariant-loop-bounds.ll	4 lines

Diff 36959

polly/trunk/include/polly/ScopInfo.h

Show First 20 Lines • Show All 674 Lines • ▼ Show 20 Lines	llvm::raw_ostream &operator<<(llvm::raw_ostream &OS,
MemoryAccess::ReductionType RT);		MemoryAccess::ReductionType RT);

/// @brief Ordered list type to hold accesses.		/// @brief Ordered list type to hold accesses.
using MemoryAccessList = std::forward_list<MemoryAccess *>;		using MemoryAccessList = std::forward_list<MemoryAccess *>;

/// @brief Type for invariant memory accesses and their domain context.		/// @brief Type for invariant memory accesses and their domain context.
using InvariantAccessTy = std::pair<MemoryAccess *, isl_set *>;		using InvariantAccessTy = std::pair<MemoryAccess *, isl_set *>;

		/// @brief Type for an ordered list of invariant accesses.
		using InvariantAccessListTy = std::forward_list<InvariantAccessTy>;

		/// @brief Type for a class of equivalent invariant memory accesses.
		using InvariantEquivClassTy = std::pair<const SCEV *, InvariantAccessListTy>;

/// @brief Type for multiple invariant memory accesses and their domain context.		/// @brief Type for multiple invariant memory accesses and their domain context.
using InvariantAccessesTy = SmallVector<InvariantAccessTy, 8>;		using InvariantAccessesTy = SmallVector<InvariantEquivClassTy, 8>;

///===----------------------------------------------------------------------===//		///===----------------------------------------------------------------------===//
/// @brief Statement of the Scop		/// @brief Statement of the Scop
///		///
/// A Scop statement represents an instruction in the Scop.		/// A Scop statement represents an instruction in the Scop.
///		///
/// It is further described by its iteration domain, its schedule and its data		/// It is further described by its iteration domain, its schedule and its data
/// accesses.		/// accesses.
▲ Show 20 Lines • Show All 208 Lines • ▼ Show 20 Lines	void setBasicBlock(BasicBlock *Block) {
// the entry block was split and needs to be changed in the region R.		// the entry block was split and needs to be changed in the region R.
assert(BB && "Cannot set a block for a region statement");		assert(BB && "Cannot set a block for a region statement");
BB = Block;		BB = Block;
}		}

/// @brief Add @p Access to this statement's list of accesses.		/// @brief Add @p Access to this statement's list of accesses.
void addAccess(MemoryAccess *Access);		void addAccess(MemoryAccess *Access);

/// @brief Move the memory access in @p InvMAs to @p TargetList.		/// @brief Move the memory access in @p InvMAs to @p InvariantEquivClasses.
///		///
/// Note that scalar accesses that are caused by any access in @p InvMAs will		/// Note that scalar accesses that are caused by any access in @p InvMAs will
/// be eliminated too.		/// be eliminated too.
void hoistMemoryAccesses(MemoryAccessList &InvMAs,		void hoistMemoryAccesses(MemoryAccessList &InvMAs,
InvariantAccessesTy &TargetList);		InvariantAccessesTy &InvariantEquivClasses);

typedef MemoryAccessVec::iterator iterator;		typedef MemoryAccessVec::iterator iterator;
typedef MemoryAccessVec::const_iterator const_iterator;		typedef MemoryAccessVec::const_iterator const_iterator;

iterator begin() { return MemAccs.begin(); }		iterator begin() { return MemAccs.begin(); }
iterator end() { return MemAccs.end(); }		iterator end() { return MemAccs.end(); }
const_iterator begin() const { return MemAccs.begin(); }		const_iterator begin() const { return MemAccs.begin(); }
const_iterator end() const { return MemAccs.end(); }		const_iterator end() const { return MemAccs.end(); }
▲ Show 20 Lines • Show All 207 Lines • ▼ Show 20 Lines	private:
/// we would probably generate two alias groups, one for the int pointers and		/// we would probably generate two alias groups, one for the int pointers and
/// one for the float pointers.		/// one for the float pointers.
///		///
/// During code generation we will create a runtime alias check for each alias		/// During code generation we will create a runtime alias check for each alias
/// group to ensure the SCoP is executed in an alias free environment.		/// group to ensure the SCoP is executed in an alias free environment.
MinMaxVectorPairVectorTy MinMaxAliasGroups;		MinMaxVectorPairVectorTy MinMaxAliasGroups;

/// @brief List of invariant accesses.		/// @brief List of invariant accesses.
InvariantAccessesTy InvariantAccesses;		InvariantAccessesTy InvariantEquivClasses;

/// @brief Scop constructor; invoked from ScopInfo::buildScop.		/// @brief Scop constructor; invoked from ScopInfo::buildScop.
Scop(Region &R, AccFuncMapType &AccFuncMap, ScopDetection &SD,		Scop(Region &R, AccFuncMapType &AccFuncMap, ScopDetection &SD,
ScalarEvolution &SE, DominatorTree &DT, LoopInfo &LI, isl_ctx *ctx,		ScalarEvolution &SE, DominatorTree &DT, LoopInfo &LI, isl_ctx *ctx,
unsigned MaxLoopDepth);		unsigned MaxLoopDepth);

/// @brief Initialize this ScopInfo .		/// @brief Initialize this ScopInfo .
void init(AliasAnalysis &AA);		void init(AliasAnalysis &AA);
Show All 34 Lines	private:
/// @brief Simplify the SCoP representation		/// @brief Simplify the SCoP representation
///		///
/// At the moment we perform the following simplifications:		/// At the moment we perform the following simplifications:
/// - removal of no-op statements		/// - removal of no-op statements
/// @param RemoveIgnoredStmts If true, also removed ignored statments.		/// @param RemoveIgnoredStmts If true, also removed ignored statments.
/// @see isIgnored()		/// @see isIgnored()
void simplifySCoP(bool RemoveIgnoredStmts);		void simplifySCoP(bool RemoveIgnoredStmts);

		/// @brief Create equivalence classes for required invariant accesses.
		///
		/// These classes will consolidate multiple required invariant loads from the
		/// same address in order to keep the number of dimensions in the SCoP
		/// description small. For each such class equivalence class only one
		/// representing element, hence one required invariant load, will be chosen
		/// and modeled as parameter. The method
		/// Scop::getRepresentingInvariantLoadSCEV() will replace each element from an
		/// equivalence class with the representing element that is modeled. As a
		/// consequence Scop::getIdForParam() will only return an id for the
		/// representing element of each equivalence class, thus for each required
		/// invariant location.
		void buildInvariantEquivalenceClasses();

/// @brief Hoist invariant memory loads and check for required ones.		/// @brief Hoist invariant memory loads and check for required ones.
///		///
/// We first identify "common" invariant loads, thus loads that are invariant		/// We first identify "common" invariant loads, thus loads that are invariant
/// and can be hoisted. Then we check if all required invariant loads have		/// and can be hoisted. Then we check if all required invariant loads have
/// been identified as (common) invariant. A load is a required invariant load		/// been identified as (common) invariant. A load is a required invariant load
/// if it was assumed to be invariant during SCoP detection, e.g., to assume		/// if it was assumed to be invariant during SCoP detection, e.g., to assume
/// loop bounds to be affine or runtime alias checks to be placeable. In case		/// loop bounds to be affine or runtime alias checks to be placeable. In case
/// a required invariant load was not identified as (common) invariant we will		/// a required invariant load was not identified as (common) invariant we will
Show All 18 Lines	private:
void addUserContext();		void addUserContext();

/// @brief Add the bounds of the parameters to the context.		/// @brief Add the bounds of the parameters to the context.
void addParameterBounds();		void addParameterBounds();

/// @brief Simplify the assumed and boundary context.		/// @brief Simplify the assumed and boundary context.
void simplifyContexts();		void simplifyContexts();

		/// @brief Get the representing SCEV for @p S if applicable, otherwise @p S.
		///
		/// Invariant loads of the same location are put in an equivalence class and
		/// only one of them is chosen as a representing element that will be
		/// modeled as a parameter. The others have to be normalized, i.e.,
		/// replaced by the representing element of their equivalence class, in order
		/// to get the correct parameter value, e.g., in the SCEVAffinator.
		///
		/// @param S The SCEV to normalize.
		///
		/// @return The representing SCEV for invariant loads or @p S if none.
		const SCEV *getRepresentingInvariantLoadSCEV(const SCEV *S) const;

/// @brief Create a new SCoP statement for either @p BB or @p R.		/// @brief Create a new SCoP statement for either @p BB or @p R.
///		///
/// Either @p BB or @p R should be non-null. A new statement for the non-null		/// Either @p BB or @p R should be non-null. A new statement for the non-null
/// argument will be created and added to the statement vector and map.		/// argument will be created and added to the statement vector and map.
///		///
/// @param BB The basic block we build the statement for (or null)		/// @param BB The basic block we build the statement for (or null)
/// @param R The region we build the statement for (or null).		/// @param R The region we build the statement for (or null).
ScopStmt *addScopStmt(BasicBlock *BB, Region *R);		ScopStmt *addScopStmt(BasicBlock *BB, Region *R);
▲ Show 20 Lines • Show All 104 Lines • ▼ Show 20 Lines	public:

/// @brief Get the maximum depth of the loop.		/// @brief Get the maximum depth of the loop.
///		///
/// @return The maximum depth of the loop.		/// @return The maximum depth of the loop.
inline unsigned getMaxLoopDepth() const { return MaxLoopDepth; }		inline unsigned getMaxLoopDepth() const { return MaxLoopDepth; }

/// @brief Return the set of invariant accesses.		/// @brief Return the set of invariant accesses.
const InvariantAccessesTy &getInvariantAccesses() const {		const InvariantAccessesTy &getInvariantAccesses() const {
return InvariantAccesses;		return InvariantEquivClasses;
}		}

/// @brief Mark the SCoP as optimized by the scheduler.		/// @brief Mark the SCoP as optimized by the scheduler.
void markAsOptimized() { IsOptimized = true; }		void markAsOptimized() { IsOptimized = true; }

/// @brief Check if the SCoP has been optimized by the scheduler.		/// @brief Check if the SCoP has been optimized by the scheduler.
bool isOptimized() const { return IsOptimized; }		bool isOptimized() const { return IsOptimized; }


polly/trunk/lib/Analysis/ScopInfo.cpp

Show First 20 Lines • Show All 1,350 Lines • ▼ Show 20 Lines	void ScopStmt::print(raw_ostream &OS) const {

for (MemoryAccess *Access : MemAccs)		for (MemoryAccess *Access : MemAccs)
Access->print(OS);		Access->print(OS);
}		}

void ScopStmt::dump() const { print(dbgs()); }		void ScopStmt::dump() const { print(dbgs()); }

void ScopStmt::hoistMemoryAccesses(MemoryAccessList &InvMAs,		void ScopStmt::hoistMemoryAccesses(MemoryAccessList &InvMAs,
InvariantAccessesTy &TargetList) {		InvariantAccessesTy &InvariantEquivClasses) {

// Remove all memory accesses in @p InvMAs from this statement together		// Remove all memory accesses in @p InvMAs from this statement together
// with all scalar accesses that were caused by them. The tricky iteration		// with all scalar accesses that were caused by them. The tricky iteration
// order uses is needed because the MemAccs is a vector and the order in		// order uses is needed because the MemAccs is a vector and the order in
// which the accesses of each memory access list (MAL) are stored in this		// which the accesses of each memory access list (MAL) are stored in this
// vector is reversed.		// vector is reversed.
for (MemoryAccess *MA : InvMAs) {		for (MemoryAccess *MA : InvMAs) {
auto &MAL = *lookupAccessesFor(MA->getAccessInstruction());		auto &MAL = *lookupAccessesFor(MA->getAccessInstruction());
Show All 36 Lines	if (SE.isSCEVable(AccInst->getType())) {
if (ParamId) {		if (ParamId) {
int Dim = isl_set_find_dim_by_id(DomainCtx, isl_dim_param, ParamId);		int Dim = isl_set_find_dim_by_id(DomainCtx, isl_dim_param, ParamId);
DomainCtx = isl_set_eliminate(DomainCtx, isl_dim_param, Dim, 1);		DomainCtx = isl_set_eliminate(DomainCtx, isl_dim_param, Dim, 1);
}		}
isl_id_free(ParamId);		isl_id_free(ParamId);
}		}
}		}

for (MemoryAccess *MA : InvMAs)		for (MemoryAccess *MA : InvMAs) {
TargetList.push_back(std::make_pair(MA, isl_set_copy(DomainCtx)));
		// Check for another invariant access that accesses the same location as
		// MA and if found consolidate them. Otherwise create a new equivalence
		// class at the end of InvariantEquivClasses.
		LoadInst *LInst = cast<LoadInst>(MA->getAccessInstruction());
		const SCEV *PointerSCEV = SE.getSCEV(LInst->getPointerOperand());
		bool Consolidated = false;

		for (auto &IAClass : InvariantEquivClasses) {
		const SCEV *ClassPointerSCEV = IAClass.first;
		if (PointerSCEV != ClassPointerSCEV)
		continue;

		Consolidated = true;

		// We created empty equivalence classes for required invariant loads
		// in the beginning and might encounter one of them here. If so, this
		// MA will be the first in that equivalence class.
		auto &ClassList = IAClass.second;
		if (ClassList.empty()) {
		ClassList.push_front(std::make_pair(MA, isl_set_copy(DomainCtx)));
		break;
		}

		// If the equivalence class for MA is not empty we unify the execution
		// context and add MA to the list of accesses that are in this class.
		isl_set *IAClassDomainCtx = IAClass.second.front().second;
		IAClassDomainCtx =
		isl_set_union(IAClassDomainCtx, isl_set_copy(DomainCtx));
		ClassList.push_front(std::make_pair(MA, IAClassDomainCtx));
		break;
		}

		if (Consolidated)
		continue;

		// If we did not consolidate MA, thus did not find an equivalence class
		// that for it, we create a new one.
		InvariantAccessTy IA = std::make_pair(MA, isl_set_copy(DomainCtx));
		InvariantEquivClasses.emplace_back(InvariantEquivClassTy(
		std::make_pair(PointerSCEV, InvariantAccessListTy({IA}))));
		}

isl_set_free(DomainCtx);		isl_set_free(DomainCtx);
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
/// Scop class implement		/// Scop class implement

void Scop::setContext(__isl_take isl_set *NewContext) {		void Scop::setContext(__isl_take isl_set *NewContext) {
NewContext = isl_set_align_params(NewContext, isl_set_get_space(Context));		NewContext = isl_set_align_params(NewContext, isl_set_get_space(Context));
isl_set_free(Context);		isl_set_free(Context);
Context = NewContext;		Context = NewContext;
}		}

		const SCEV *Scop::getRepresentingInvariantLoadSCEV(const SCEV *S) const {
		const SCEVUnknown *SU = dyn_cast_or_null<SCEVUnknown>(S);
		if (!SU)
		return S;

		LoadInst *LInst = dyn_cast<LoadInst>(SU->getValue());
		if (!LInst)
		return S;

		// Try to find an equivalence class for the load, if found return
		// the SCEV for the representing element, otherwise return S.
		const SCEV *PointerSCEV = SE->getSCEV(LInst->getPointerOperand());
		for (const InvariantEquivClassTy &IAClass : InvariantEquivClasses) {
		const SCEV *ClassPointerSCEV = IAClass.first;
		if (ClassPointerSCEV == PointerSCEV)
		return ClassPointerSCEV;
		}

		return S;
		}

void Scop::addParams(std::vector<const SCEV *> NewParameters) {		void Scop::addParams(std::vector<const SCEV *> NewParameters) {
for (const SCEV *Parameter : NewParameters) {		for (const SCEV *Parameter : NewParameters) {
Parameter = extractConstantFactor(Parameter, *SE).second;		Parameter = extractConstantFactor(Parameter, *SE).second;

		// Normalize the SCEV to get the representing element for an invariant load.
		Parameter = getRepresentingInvariantLoadSCEV(Parameter);

if (ParameterIds.find(Parameter) != ParameterIds.end())		if (ParameterIds.find(Parameter) != ParameterIds.end())
continue;		continue;

int dimension = Parameters.size();		int dimension = Parameters.size();

Parameters.push_back(Parameter);		Parameters.push_back(Parameter);
ParameterIds[Parameter] = dimension;		ParameterIds[Parameter] = dimension;
}		}
}		}

__isl_give isl_id *Scop::getIdForParam(const SCEV *Parameter) const {		__isl_give isl_id *Scop::getIdForParam(const SCEV *Parameter) const {
		// Normalize the SCEV to get the representing element for an invariant load.
		Parameter = getRepresentingInvariantLoadSCEV(Parameter);

ParamIdType::const_iterator IdIter = ParameterIds.find(Parameter);		ParamIdType::const_iterator IdIter = ParameterIds.find(Parameter);

if (IdIter == ParameterIds.end())		if (IdIter == ParameterIds.end())
return nullptr;		return nullptr;

std::string ParameterName;		std::string ParameterName;

if (const SCEVUnknown *ValueParameter = dyn_cast<SCEVUnknown>(Parameter)) {		if (const SCEVUnknown *ValueParameter = dyn_cast<SCEVUnknown>(Parameter)) {
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	UserContext =
isl_set_set_dim_id(UserContext, isl_dim_param, i,		isl_set_set_dim_id(UserContext, isl_dim_param, i,
isl_space_get_dim_id(Space, isl_dim_param, i));		isl_space_get_dim_id(Space, isl_dim_param, i));
}		}

Context = isl_set_intersect(Context, UserContext);		Context = isl_set_intersect(Context, UserContext);
isl_space_free(Space);		isl_space_free(Space);
}		}

		void Scop::buildInvariantEquivalenceClasses() {
		const InvariantLoadsSetTy &RIL = *SD.getRequiredInvariantLoads(&getRegion());
		SmallPtrSet<const SCEV *, 4> ClassPointerSet;
		for (LoadInst *LInst : RIL) {
		const SCEV *PointerSCEV = SE->getSCEV(LInst->getPointerOperand());

		// Skip the load if we already have a equivalence class for the pointer.
		if (!ClassPointerSet.insert(PointerSCEV).second)
		continue;

		InvariantEquivClasses.emplace_back(InvariantEquivClassTy(
		std::make_pair(PointerSCEV, InvariantAccessListTy())));
		}
		}

void Scop::buildContext() {		void Scop::buildContext() {
isl_space *Space = isl_space_params_alloc(IslCtx, 0);		isl_space *Space = isl_space_params_alloc(IslCtx, 0);
Context = isl_set_universe(isl_space_copy(Space));		Context = isl_set_universe(isl_space_copy(Space));
AssumedContext = isl_set_universe(Space);		AssumedContext = isl_set_universe(Space);
}		}

void Scop::addParameterBounds() {		void Scop::addParameterBounds() {
for (const auto &ParamID : ParameterIds) {		for (const auto &ParamID : ParameterIds) {
▲ Show 20 Lines • Show All 806 Lines • ▼ Show 20 Lines	Scop::Scop(Region &R, AccFuncMapType &AccFuncMap, ScopDetection &SD,
: LI(LI), DT(DT), SE(&ScalarEvolution), SD(SD), R(R),		: LI(LI), DT(DT), SE(&ScalarEvolution), SD(SD), R(R),
AccFuncMap(AccFuncMap), IsOptimized(false),		AccFuncMap(AccFuncMap), IsOptimized(false),
HasSingleExitEdge(R.getExitingBlock()), MaxLoopDepth(MaxLoopDepth),		HasSingleExitEdge(R.getExitingBlock()), MaxLoopDepth(MaxLoopDepth),
IslCtx(Context), Context(nullptr), Affinator(this),		IslCtx(Context), Context(nullptr), Affinator(this),
AssumedContext(nullptr), BoundaryContext(nullptr), Schedule(nullptr) {}		AssumedContext(nullptr), BoundaryContext(nullptr), Schedule(nullptr) {}

void Scop::init(AliasAnalysis &AA) {		void Scop::init(AliasAnalysis &AA) {
buildContext();		buildContext();
		buildInvariantEquivalenceClasses();

buildDomains(&R);		buildDomains(&R);

// Remove empty and ignored statements.		// Remove empty and ignored statements.
// Exit early in case there are no executable statements left in this scop.		// Exit early in case there are no executable statements left in this scop.
simplifySCoP(true);		simplifySCoP(true);
if (Stmts.empty())		if (Stmts.empty())
return;		return;

Show All 35 Lines	for (MinMaxAccessTy &MMA : MinMaxAccessPair.first) {
isl_pw_multi_aff_free(MMA.second);		isl_pw_multi_aff_free(MMA.second);
}		}
for (MinMaxAccessTy &MMA : MinMaxAccessPair.second) {		for (MinMaxAccessTy &MMA : MinMaxAccessPair.second) {
isl_pw_multi_aff_free(MMA.first);		isl_pw_multi_aff_free(MMA.first);
isl_pw_multi_aff_free(MMA.second);		isl_pw_multi_aff_free(MMA.second);
}		}
}		}

for (const auto &IA : InvariantAccesses)		for (const auto &IAClass : InvariantEquivClasses)
isl_set_free(IA.second);		if (!IAClass.second.empty())
		isl_set_free(IAClass.second.front().second);
}		}

void Scop::updateAccessDimensionality() {		void Scop::updateAccessDimensionality() {
for (auto &Stmt : *this)		for (auto &Stmt : *this)
for (auto &Access : Stmt)		for (auto &Access : Stmt)
Access->updateDimensionality();		Access->updateDimensionality();
}		}

[72 lines not shown]	for (ScopStmt &Stmt : *this) {
// We inserted invariant accesses always in the front but need them to be		// We inserted invariant accesses always in the front but need them to be
// sorted in a "natural order". The statements are already sorted in reverse		// sorted in a "natural order". The statements are already sorted in reverse
// post order and that suffices for the accesses too. The reason we require		// post order and that suffices for the accesses too. The reason we require
// an order in the first place is the dependences between invariant loads		// an order in the first place is the dependences between invariant loads
// that can be caused by indirect loads.		// that can be caused by indirect loads.
InvMAs.reverse();		InvMAs.reverse();

// Transfer the memory access from the statement to the SCoP.		// Transfer the memory access from the statement to the SCoP.
Stmt.hoistMemoryAccesses(InvMAs, InvariantAccesses);		Stmt.hoistMemoryAccesses(InvMAs, InvariantEquivClasses);

isl_set_free(Domain);		isl_set_free(Domain);
}		}
isl_union_map_free(Writes);		isl_union_map_free(Writes);

if (!InvariantAccesses.empty())		if (!InvariantEquivClasses.empty())
IsOptimized = true;		IsOptimized = true;

		auto &ScopRIL = *SD.getRequiredInvariantLoads(&getRegion());
// Check required invariant loads that were tagged during SCoP detection.		// Check required invariant loads that were tagged during SCoP detection.
for (LoadInst *LI : *SD.getRequiredInvariantLoads(&getRegion())) {		for (LoadInst *LI : ScopRIL) {
assert(LI && getRegion().contains(LI));		assert(LI && getRegion().contains(LI));
ScopStmt *Stmt = getStmtForBasicBlock(LI->getParent());		ScopStmt *Stmt = getStmtForBasicBlock(LI->getParent());
if (Stmt && Stmt->lookupAccessesFor(LI) != nullptr) {		if (Stmt && Stmt->lookupAccessesFor(LI) != nullptr) {
DEBUG(dbgs() << "\n\nWARNING: Load (" << *LI		DEBUG(dbgs() << "\n\nWARNING: Load (" << *LI
<< ") is required to be invariant but was not marked as "		<< ") is required to be invariant but was not marked as "
"such. SCoP for "		"such. SCoP for "
<< getRegion() << " will be dropped\n\n");		<< getRegion() << " will be dropped\n\n");
addAssumption(isl_set_empty(getParamSpace()));		addAssumption(isl_set_empty(getParamSpace()));
return;		return;
}		}
}		}

// We want invariant accesses to be sorted in a "natural order" because there		// We want invariant accesses to be sorted in a "natural order" because there
// might be dependences between invariant loads. These can be caused by		// might be dependences between invariant loads. These can be caused by
// indirect loads but also because an invariant load is only conditionally		// indirect loads but also because an invariant load is only conditionally
// executed and the condition is dependent on another invariant load. As we		// executed and the condition is dependent on another invariant load. As we
// want to do code generation in a straight forward way, e.g., preload the		// want to do code generation in a straight forward way, e.g., preload the
// accesses in the list one after another, we sort them such that the		// accesses in the list one after another, we sort them such that the
// preloaded values needed in the conditions will always be in front. Before		// preloaded values needed in the conditions will always be in front. Before
// we already ordered the accesses such that indirect loads can be resolved,		// we already ordered the accesses such that indirect loads can be resolved,
// thus we use a stable sort here.		// thus we use a stable sort here.

auto compareInvariantAccesses = [this](const InvariantAccessTy &IA0,		auto compareInvariantAccesses = [this](
const InvariantAccessTy &IA1) {		const InvariantEquivClassTy &IAClass0,
		const InvariantEquivClassTy &IAClass1) {
		const InvariantAccessTy &IA0 = IAClass0.second.front();
		const InvariantAccessTy &IA1 = IAClass1.second.front();

Instruction *AI0 = IA0.first->getAccessInstruction();		Instruction *AI0 = IA0.first->getAccessInstruction();
Instruction *AI1 = IA1.first->getAccessInstruction();		Instruction *AI1 = IA1.first->getAccessInstruction();

const SCEV *S0 =		const SCEV *S0 =
SE->isSCEVable(AI0->getType()) ? SE->getSCEV(AI0) : nullptr;		SE->isSCEVable(AI0->getType()) ? SE->getSCEV(AI0) : nullptr;
const SCEV *S1 =		const SCEV *S1 =
SE->isSCEVable(AI1->getType()) ? SE->getSCEV(AI1) : nullptr;		SE->isSCEVable(AI1->getType()) ? SE->getSCEV(AI1) : nullptr;

[25 lines not shown]	auto compareInvariantAccesses = [this](
assert(!(Involves0Id1 && Involves1Id0));		assert(!(Involves0Id1 && Involves1Id0));

isl_id_free(Id0);		isl_id_free(Id0);
isl_id_free(Id1);		isl_id_free(Id1);

return Involves1Id0;		return Involves1Id0;
};		};

std::stable_sort(InvariantAccesses.begin(), InvariantAccesses.end(),		std::stable_sort(InvariantEquivClasses.begin(), InvariantEquivClasses.end(),
compareInvariantAccesses);		compareInvariantAccesses);
}		}

const ScopArrayInfo *		const ScopArrayInfo *
Scop::getOrCreateScopArrayInfo(Value *BasePtr, Type *AccessType,		Scop::getOrCreateScopArrayInfo(Value *BasePtr, Type *AccessType,
ArrayRef<const SCEV *> Sizes, bool IsPHI) {		ArrayRef<const SCEV *> Sizes, bool IsPHI) {
auto &SAI = ScopArrayInfoMap[std::make_pair(BasePtr, IsPHI)];		auto &SAI = ScopArrayInfoMap[std::make_pair(BasePtr, IsPHI)];
if (!SAI) {		if (!SAI) {
[168 lines not shown]
}		}

void Scop::print(raw_ostream &OS) const {		void Scop::print(raw_ostream &OS) const {
OS.indent(4) << "Function: " << getRegion().getEntry()->getParent()->getName()		OS.indent(4) << "Function: " << getRegion().getEntry()->getParent()->getName()
<< "\n";		<< "\n";
OS.indent(4) << "Region: " << getNameStr() << "\n";		OS.indent(4) << "Region: " << getNameStr() << "\n";
OS.indent(4) << "Max Loop Depth: " << getMaxLoopDepth() << "\n";		OS.indent(4) << "Max Loop Depth: " << getMaxLoopDepth() << "\n";
OS.indent(4) << "Invariant Accesses: {\n";		OS.indent(4) << "Invariant Accesses: {\n";
for (const auto &IA : InvariantAccesses) {		for (const auto &IAClass : InvariantEquivClasses) {
IA.first->print(OS);		if (IAClass.second.empty()) {
OS.indent(12) << "Execution Context: " << IA.second << "\n";		OS.indent(12) << "Class Pointer: " << IAClass.first << "\n";
		} else {
		IAClass.second.front().first->print(OS);
		OS.indent(12) << "Execution Context: " << IAClass.second.front().second
		<< "\n";
		}
}		}
OS.indent(4) << "}\n";		OS.indent(4) << "}\n";
printContext(OS.indent(4));		printContext(OS.indent(4));
printArrayInfo(OS.indent(4));		printArrayInfo(OS.indent(4));
printAliasAssumptions(OS);		printAliasAssumptions(OS);
printStatements(OS.indent(4));		printStatements(OS.indent(4));
}		}

[812 lines not shown]
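
The grouping that the ScopInfo changes above rely on (one class per pointer operand, with the front element acting as the representative) can be sketched in isolation. This is a toy model under stated assumptions, not Polly's code: `ToyLoad`, `ToyEquivClasses`, `addToClass`, and `numParameters` are hypothetical names, and plain strings stand in for LLVM values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an invariant load; not a Polly type.
struct ToyLoad {
  std::string Name;    // e.g. "%tmp"
  std::string Pointer; // the pointer operand the load reads from
};

// One class: the shared pointer operand plus every load that reads it.
using ToyEquivClass = std::pair<std::string, std::vector<ToyLoad>>;
using ToyEquivClasses = std::vector<ToyEquivClass>;

// Put Load into the class of its pointer operand, creating the class the
// first time that pointer is seen. front() acts as the representative.
inline void addToClass(ToyEquivClasses &Classes, const ToyLoad &Load) {
  for (ToyEquivClass &Class : Classes) {
    if (Class.first == Load.Pointer) {
      Class.second.push_back(Load);
      return;
    }
  }
  Classes.emplace_back(Load.Pointer, std::vector<ToyLoad>{Load});
}

// In this model one parameter is needed per class, not one per load.
inline std::size_t numParameters(const ToyEquivClasses &Classes) {
  return Classes.size();
}
```

With three loads of `@bounds` and one load of `%A`, this model ends up with two classes, which mirrors the "one parameter per location" goal stated in the summary.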

polly/trunk/lib/CodeGen/IslNodeBuilder.cpp

[900 lines not shown]	if (AlwaysExecuted) {
MergePHI->addIncoming(Constant::getNullValue(AccInstTy), CondBB);		MergePHI->addIncoming(Constant::getNullValue(AccInstTy), CondBB);

return MergePHI;		return MergePHI;
}		}
}		}

void IslNodeBuilder::preloadInvariantLoads() {		void IslNodeBuilder::preloadInvariantLoads() {

const auto &InvAccList = S.getInvariantAccesses();		const auto &InvariantEquivClasses = S.getInvariantAccesses();
if (InvAccList.empty())		if (InvariantEquivClasses.empty())
return;		return;

const Region &R = S.getRegion();		const Region &R = S.getRegion();
BasicBlock *EntryBB = &Builder.GetInsertBlock()->getParent()->getEntryBlock();		BasicBlock *EntryBB = &Builder.GetInsertBlock()->getParent()->getEntryBlock();

BasicBlock *PreLoadBB =		BasicBlock *PreLoadBB =
SplitBlock(Builder.GetInsertBlock(), Builder.GetInsertPoint(), &DT, &LI);		SplitBlock(Builder.GetInsertBlock(), Builder.GetInsertPoint(), &DT, &LI);
PreLoadBB->setName("polly.preload.begin");		PreLoadBB->setName("polly.preload.begin");
Builder.SetInsertPoint(PreLoadBB->begin());		Builder.SetInsertPoint(PreLoadBB->begin());

isl_ast_build *Build =		isl_ast_build *Build =
isl_ast_build_from_context(isl_set_universe(S.getParamSpace()));		isl_ast_build_from_context(isl_set_universe(S.getParamSpace()));

for (const auto &IA : InvAccList) {		// For each equivalence class of invariant loads we pre-load the representing
MemoryAccess *MA = IA.first;		// element with the unified execution context. However, we have to map all
		// elements of the class to the one preloaded load as they are referenced
		// during the code generation and therefore need to be mapped.
		for (const auto &IAClass : InvariantEquivClasses) {

		MemoryAccess *MA = IAClass.second.front().first;
assert(!MA->isImplicit());		assert(!MA->isImplicit());

isl_set *Domain = isl_set_copy(IA.second);		isl_set *Domain = isl_set_copy(IAClass.second.front().second);
Instruction *AccInst = MA->getAccessInstruction();		Instruction *AccInst = MA->getAccessInstruction();
Value *PreloadVal = preloadInvariantLoad(MA, Domain, Build);		Value *PreloadVal = preloadInvariantLoad(MA, Domain, Build);
ValueMap[AccInst] = PreloadVal;		for (const InvariantAccessTy &IA : IAClass.second)
		ValueMap[IA.first->getAccessInstruction()] = PreloadVal;

if (SE.isSCEVable(AccInst->getType())) {		if (SE.isSCEVable(AccInst->getType())) {
isl_id *ParamId = S.getIdForParam(SE.getSCEV(AccInst));		isl_id *ParamId = S.getIdForParam(SE.getSCEV(AccInst));
if (ParamId)		if (ParamId)
IDToValue[ParamId] = PreloadVal;		IDToValue[ParamId] = PreloadVal;
isl_id_free(ParamId);		isl_id_free(ParamId);
}		}

[60 lines not shown]
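
The preload loop above emits one load per equivalence class and then points every member of the class at that value. A minimal sketch of that mapping, assuming a toy representation: `ToyClass` and `preloadAll` are hypothetical, an `int` stands in for the preloaded `Value *`, and the empty-class check is an assumption of the sketch rather than something the hunk above shows.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class layout: pointer operand, then the names of its loads.
using ToyClass = std::pair<std::string, std::vector<std::string>>;

// Preload once per class and map every member to the preloaded value.
// PreloadCount reports how many loads would be emitted.
inline std::map<std::string, int>
preloadAll(const std::vector<ToyClass> &Classes, int &PreloadCount) {
  std::map<std::string, int> ValueMap;
  PreloadCount = 0;
  for (const ToyClass &Class : Classes) {
    if (Class.second.empty())
      continue; // assumption of this sketch: skip classes without accesses
    int PreloadVal = PreloadCount++; // stand-in for the single emitted load
    for (const std::string &Load : Class.second)
      ValueMap[Load] = PreloadVal; // all members share the one value
  }
  return ValueMap;
}
```

The design point is that later code generation can look up any member of a class and still find a value, even though only the representative was actually loaded.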

polly/trunk/test/Isl/CodeGen/OpenMP/invariant_base_pointers_preloaded.ll

				; RUN: opt %loadPolly -polly-codegen -polly-parallel \
				; RUN: -polly-parallel-force -S < %s | FileCheck %s
				;
				; Test to verify that we hand down the preloaded A[0] to the OpenMP subfunction.
				;
				; void f(float *A) {
				; for (int i = 1; i < 1000; i++)
				; A[i] += A[0] + A[0];
				; }
				;
				; CHECK: %polly.subfn.storeaddr.polly.access.A.load = getelementptr inbounds
				; CHECK: store float %polly.access.A.load, float* %polly.subfn.storeaddr.polly.access.A.load
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(float* nocapture %A) {
				entry:
				br label %for.body

				for.cond.cleanup: ; preds = %for.body
				ret void

				for.body: ; preds = %for.body, %entry
				%indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
				%tmp = load float, float* %A, align 4
				%tmp2 = load float, float* %A, align 4
				%tmpadd = fadd float %tmp, %tmp2
				%arrayidx1 = getelementptr inbounds float, float* %A, i64 %indvars.iv
				%tmp1 = load float, float* %arrayidx1, align 4
				%add = fadd float %tmp2, %tmp1
				store float %add, float* %arrayidx1, align 4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, 1000
				br i1 %exitcond, label %for.cond.cleanup, label %for.body
				}

polly/trunk/test/Isl/CodeGen/invariant_load_outermost.ll

				; RUN: opt %loadPolly -polly-codegen -S < %s | FileCheck %s

				; CHECK: polly.start

				; void f(int *A) {
				; if (*A > 42)
				; *A = *A + 1;
				; else
				; *A = *A - 1;
				; }
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(i32* %A) {
				entry:
				br label %entry.split

				entry.split:
				%tmp = load i32, i32* %A, align 4
				%cmp = icmp sgt i32 %tmp, 42
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%tmp1 = load i32, i32* %A, align 4
				%add = add nsw i32 %tmp1, 1
				br label %if.end

				if.else: ; preds = %entry
				%tmp2 = load i32, i32* %A, align 4
				%sub = add nsw i32 %tmp2, -1
				br label %if.end

				if.end: ; preds = %if.else, %if.then
				%storemerge = phi i32 [ %sub, %if.else ], [ %add, %if.then ]
				store i32 %storemerge, i32* %A, align 4
				ret void
				}

polly/trunk/test/Isl/CodeGen/whole-scop-non-affine-subregion.ll

	; RUN: opt %loadPolly \			; RUN: opt %loadPolly \
	; RUN: -polly-codegen -S < %s | FileCheck %s			; RUN: -polly-codegen -S < %s | FileCheck %s

	; CHECK: polly.start			; CHECK: polly.start
				; int /* pure */ g()
	; void f(int *A) {			; void f(int *A) {
	; if (*A > 42)			; if (g())
	; *A = *A + 1;			; *A = *A + 1;
	; else			; else
	; *A = *A - 1;			; *A = *A - 1;
	; }			; }
	;			;
	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

	define void @f(i32* %A) {			define void @f(i32* %A) {
	entry:			entry:
	br label %entry.split			br label %entry.split

	entry.split:			entry.split:
	%tmp = load i32, i32* %A, align 4			%call = call i32 @g()
	%cmp = icmp sgt i32 %tmp, 42			%cmp = icmp eq i32 %call, 0
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	%tmp1 = load i32, i32* %A, align 4			%tmp1 = load i32, i32* %A, align 4
	%add = add nsw i32 %tmp1, 1			%add = add nsw i32 %tmp1, 1
				store i32 %add, i32* %A, align 4
	br label %if.end			br label %if.end

	if.else: ; preds = %entry			if.else: ; preds = %entry
	%tmp2 = load i32, i32* %A, align 4			%tmp2 = load i32, i32* %A, align 4
	%sub = add nsw i32 %tmp2, -1			%sub = add nsw i32 %tmp2, -1
				store i32 %sub, i32* %A, align 4
	br label %if.end			br label %if.end

	if.end: ; preds = %if.else, %if.then			if.end: ; preds = %if.else, %if.then
	%storemerge = phi i32 [ %sub, %if.else ], [ %add, %if.then ]
	store i32 %storemerge, i32* %A, align 4
	ret void			ret void
	}			}

				declare i32 @g() #0

				attributes #0 = { nounwind readnone }

polly/trunk/test/ScopInfo/intra_and_inter_bb_scalar_dep.ll

	[11 lines not shown]
	; }			; }
	; }			; }

	target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"			target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"

	; CHECK: Invariant Accesses: {			; CHECK: Invariant Accesses: {
	; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK: MemRef_init_ptr[0]			; CHECK: MemRef_init_ptr[0]
	; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NOT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK: MemRef_init_ptr[0]			; CHECK-NOT: MemRef_init_ptr[0]
	; CHECK: }			; CHECK: }
	define void @f(i64* noalias %A, i64 %N, i64* noalias %init_ptr) #0 {			define void @f(i64* noalias %A, i64 %N, i64* noalias %init_ptr) #0 {
	entry:			entry:
	br label %for.i			br label %for.i

	for.i: ; preds = %for.i.end, %entry			for.i: ; preds = %for.i.end, %entry
	%indvar.i = phi i64 [ 0, %entry ], [ %indvar.i.next, %for.i.end ]			%indvar.i = phi i64 [ 0, %entry ], [ %indvar.i.next, %for.i.end ]
	%indvar.i.next = add nsw i64 %indvar.i, 1			%indvar.i.next = add nsw i64 %indvar.i, 1
	[31 lines not shown]

polly/trunk/test/ScopInfo/invariant_loads_complicated_dependences.ll

	; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s			; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
	;			;
	; CHECK: Invariant Accesses: {			; CHECK: Invariant Accesses: {
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: [tmp, tmp5] -> { Stmt_for_body[i0] -> MemRef_LB[0] };			; CHECK-NEXT: [LB, UB] -> { Stmt_for_body[i0] -> MemRef_LB[0] };
	; CHECK-NEXT: Execution Context: [tmp, tmp5] -> { : }			; CHECK-NEXT: Execution Context: [LB, UB] -> { : }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: [tmp, tmp5] -> { Stmt_do_cond[i0, i1] -> MemRef_UB[0] };			; CHECK-NEXT: [LB, UB] -> { Stmt_do_cond[i0, i1] -> MemRef_UB[0] };
	; CHECK-NEXT: Execution Context: [tmp, tmp5] -> { : }			; CHECK-NEXT: Execution Context: [LB, UB] -> { : }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: [tmp, tmp5] -> { Stmt_if_then[i0, i1] -> MemRef_V[0] };			; CHECK-NEXT: [LB, UB] -> { Stmt_if_then[i0, i1] -> MemRef_V[0] };
	; CHECK-NEXT: Execution Context: [tmp, tmp5] -> { : (tmp5 >= 1 + tmp and tmp5 >= 6) or tmp >= 6 }			; CHECK-NEXT: Execution Context: [LB, UB] -> { : (UB >= 1 + LB and UB >= 6) or LB >= 6 }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: [tmp, tmp5] -> { Stmt_if_else[i0, i1] -> MemRef_U[0] };			; CHECK-NEXT: [LB, UB] -> { Stmt_if_else[i0, i1] -> MemRef_U[0] };
	; CHECK-NEXT: Execution Context: [tmp, tmp5] -> { : tmp <= 5 }			; CHECK-NEXT: Execution Context: [LB, UB] -> { : LB <= 5 }
	; CHECK-NEXT: }			; CHECK-NEXT: }
	;			;
	; void f(int *restrict A, int *restrict V, int *restrict U, int *restrict UB,			; void f(int *restrict A, int *restrict V, int *restrict U, int *restrict UB,
	; int *restrict LB) {			; int *restrict LB) {
	; for (int i = 0; i < 100; i++) {			; for (int i = 0; i < 100; i++) {
	; int j = /* invariant load */ *LB;			; int j = /* invariant load */ *LB;
	; do {			; do {
	; if (j > 5)			; if (j > 5)
	[62 lines not shown]

polly/trunk/test/ScopInfo/invariant_loop_bounds.ll

	; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s			; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
	;			;
	; CHECK: Invariant Accesses: {			; CHECK: Invariant Accesses: {
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: MemRef_bounds[2]			; CHECK-NEXT: MemRef_bounds[2]
	; CHECK-NEXT: Execution Context: [tmp, tmp8, tmp10] -> { : }			; CHECK-NEXT: Execution Context: [p_0, p_1, bounds] -> { : }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: MemRef_bounds[1]			; CHECK-NEXT: MemRef_bounds[1]
	; CHECK-NEXT: Execution Context: [tmp, tmp8, tmp10] -> { : tmp >= 1 }			; CHECK-NEXT: Execution Context: [p_0, p_1, bounds] -> { : p_0 >= 1 }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: MemRef_bounds[0]			; CHECK-NEXT: MemRef_bounds[0]
	; CHECK-NEXT: Execution Context: [tmp, tmp8, tmp10] -> { : tmp8 >= 1 and tmp >= 1 }			; CHECK-NEXT: Execution Context: [p_0, p_1, bounds] -> { : p_1 >= 1 and p_0 >= 1 }
	; CHECK-NEXT: }			; CHECK-NEXT: }
	;			;
	; CHECK: p0: %tmp			; CHECK: p0: (8 + @bounds)<nsw>
	; CHECK: p1: %tmp8			; CHECK: p1: (4 + @bounds)<nsw>
	; CHECK: p2: %tmp10			; CHECK: p2: @bounds
	; CHECK: Statements {			; CHECK: Statements {
	; CHECK: Stmt_for_body_6			; CHECK: Stmt_for_body_6
	; CHECK: Domain :=			; CHECK: Domain :=
	; CHECK: [tmp, tmp8, tmp10] -> { Stmt_for_body_6[i0, i1, i2] : i0 >= 0 and i0 <= -1 + tmp and i1 >= 0 and i1 <= -1 + tmp8 and i2 >= 0 and i2 <= -1 + tmp10 };			; CHECK: [p_0, p_1, bounds] -> { Stmt_for_body_6[i0, i1, i2] : i0 >= 0 and i0 <= -1 + p_0 and i1 >= 0 and i1 <= -1 + p_1 and i2 >= 0 and i2 <= -1 + bounds };
	; CHECK: Schedule :=			; CHECK: Schedule :=
	; CHECK: [tmp, tmp8, tmp10] -> { Stmt_for_body_6[i0, i1, i2] -> [i0, i1, i2] };			; CHECK: [p_0, p_1, bounds] -> { Stmt_for_body_6[i0, i1, i2] -> [i0, i1, i2] };
	; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK: [tmp, tmp8, tmp10] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };			; CHECK: [p_0, p_1, bounds] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
	; CHECK: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK: [tmp, tmp8, tmp10] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };			; CHECK: [p_0, p_1, bounds] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
	; CHECK: }			; CHECK: }
	;			;
	; int bounds[3];			; int bounds[3];
	; double data[1024][1024][1024];			; double data[1024][1024][1024];
	;			;
	; void foo() {			; void foo() {
	; int i, j, k;			; int i, j, k;
	; for (k = 0; k < bounds[2]; k++)			; for (k = 0; k < bounds[2]; k++)
	[73 lines not shown]

polly/trunk/test/ScopInfo/invariant_same_loop_bound_multiple_times-1.ll

				; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
				;
				; Verify that we only have one parameter and one invariant load for all
				; three loads that occur in the region but actually access the same
				; location. Also check that the execution context is the most generic
				; one, e.g., here the universal set.
				;
				; CHECK: Invariant Accesses: {
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: MemRef_bounds[0]
				; CHECK-NEXT: Execution Context: [bounds] -> { : }
				; CHECK-NEXT: }
				;
				; CHECK: p0: @bounds
				; CHECK-NOT: p1
				; CHECK: Statements {
				; CHECK: Stmt_for_body_6
				; CHECK: Domain :=
				; CHECK: [bounds] -> { Stmt_for_body_6[i0, i1, i2] : i0 >= 0 and i0 <= -1 + bounds and i1 >= 0 and i1 <= -1 + bounds and i2 >= 0 and i2 <= -1 + bounds };
				; CHECK: Schedule :=
				; CHECK: [bounds] -> { Stmt_for_body_6[i0, i1, i2] -> [i0, i1, i2] };
				; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK: [bounds] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
				; CHECK: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK: [bounds] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
				; CHECK: }
				;
				; int bounds[1];
				; double data[1024][1024][1024];
				;
				; void foo() {
				; int i, j, k;
				; for (k = 0; k < bounds[0]; k++)
				; for (j = 0; j < bounds[0]; j++)
				; for (i = 0; i < bounds[0]; i++)
				; data[k][j][i] += i + j + k;
				; }
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				@bounds = common global [1 x i32] zeroinitializer, align 4
				@data = common global [1024 x [1024 x [1024 x double]]] zeroinitializer, align 16

				define void @foo() {
				entry:
				br label %for.cond

				for.cond: ; preds = %for.inc.16, %entry
				%indvars.iv5 = phi i64 [ %indvars.iv.next6, %for.inc.16 ], [ 0, %entry ]
				%tmp = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp7 = sext i32 %tmp to i64
				%cmp = icmp slt i64 %indvars.iv5, %tmp7
				br i1 %cmp, label %for.body, label %for.end.18

				for.body: ; preds = %for.cond
				br label %for.cond.1

				for.cond.1: ; preds = %for.inc.13, %for.body
				%indvars.iv3 = phi i64 [ %indvars.iv.next4, %for.inc.13 ], [ 0, %for.body ]
				%tmp8 = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp9 = sext i32 %tmp8 to i64
				%cmp2 = icmp slt i64 %indvars.iv3, %tmp9
				br i1 %cmp2, label %for.body.3, label %for.end.15

				for.body.3: ; preds = %for.cond.1
				br label %for.cond.4

				for.cond.4: ; preds = %for.inc, %for.body.3
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc ], [ 0, %for.body.3 ]
				%tmp10 = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp11 = sext i32 %tmp10 to i64
				%cmp5 = icmp slt i64 %indvars.iv, %tmp11
				br i1 %cmp5, label %for.body.6, label %for.end

				for.body.6: ; preds = %for.cond.4
				%tmp12 = add nsw i64 %indvars.iv, %indvars.iv3
				%tmp13 = add nsw i64 %tmp12, %indvars.iv5
				%tmp14 = trunc i64 %tmp13 to i32
				%conv = sitofp i32 %tmp14 to double
				%arrayidx11 = getelementptr inbounds [1024 x [1024 x [1024 x double]]], [1024 x [1024 x [1024 x double]]]* @data, i64 0, i64 %indvars.iv5, i64 %indvars.iv3, i64 %indvars.iv
				%tmp15 = load double, double* %arrayidx11, align 8
				%add12 = fadd double %tmp15, %conv
				store double %add12, double* %arrayidx11, align 8
				br label %for.inc

				for.inc: ; preds = %for.body.6
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %for.cond.4

				for.end: ; preds = %for.cond.4
				br label %for.inc.13

				for.inc.13: ; preds = %for.end
				%indvars.iv.next4 = add nuw nsw i64 %indvars.iv3, 1
				br label %for.cond.1

				for.end.15: ; preds = %for.cond.1
				br label %for.inc.16

				for.inc.16: ; preds = %for.end.15
				%indvars.iv.next6 = add nuw nsw i64 %indvars.iv5, 1
				br label %for.cond

				for.end.18: ; preds = %for.cond
				ret void
				}

polly/trunk/test/ScopInfo/invariant_same_loop_bound_multiple_times-2.ll

				; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
				;
				; Verify that we only have one parameter and one invariant load for all
				; three loads that occur in the region but actually access the same
				; location. Also check that the execution context is the most generic
				; one, e.g., here the universal set.
				;
				; CHECK: Invariant Accesses: {
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: MemRef_bounds[0]
				; CHECK-NEXT: Execution Context: [bounds, p] -> { : }
				; CHECK-NEXT: }
				;
				; CHECK: p0: @bounds
				; CHECK: p1: %p
				; CHECK-NOT: p2:
				; CHECK: Statements {
				; CHECK: Stmt_for_body_6
				; CHECK: Domain :=
				; CHECK: [bounds, p] -> { Stmt_for_body_6[i0, i1, i2] : p = 0 and i0 >= 0 and i0 <= -1 + bounds and i1 >= 0 and i1 <= -1 + bounds and i2 >= 0 and i2 <= -1 + bounds };
				; CHECK: Schedule :=
				; CHECK: [bounds, p] -> { Stmt_for_body_6[i0, i1, i2] -> [i0, i1, i2] };
				; CHECK: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK: [bounds, p] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
				; CHECK: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK: [bounds, p] -> { Stmt_for_body_6[i0, i1, i2] -> MemRef_data[i0, i1, i2] };
				; CHECK: }
				;
				; int bounds[1];
				; double data[1024][1024][1024];
				;
				; void foo(int p) {
				; int i, j, k;
				; for (k = 0; k < bounds[0]; k++)
				; if (p == 0)
				; for (j = 0; j < bounds[0]; j++)
				; for (i = 0; i < bounds[0]; i++)
				; data[k][j][i] += i + j + k;
				; }
				;
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				@bounds = common global [1 x i32] zeroinitializer, align 4
				@data = common global [1024 x [1024 x [1024 x double]]] zeroinitializer, align 16

				define void @foo(i32 %p) {
				entry:
				br label %for.cond

				for.cond: ; preds = %for.inc.16, %entry
				%indvars.iv5 = phi i64 [ %indvars.iv.next6, %for.inc.16 ], [ 0, %entry ]
				%tmp = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp7 = sext i32 %tmp to i64
				%cmp = icmp slt i64 %indvars.iv5, %tmp7
				br i1 %cmp, label %for.body, label %for.end.18

				for.body: ; preds = %for.cond
				%cmpp = icmp eq i32 %p, 0
				br i1 %cmpp, label %for.cond.1, label %for.inc.16

				for.cond.1: ; preds = %for.inc.13, %for.body
				%indvars.iv3 = phi i64 [ %indvars.iv.next4, %for.inc.13 ], [ 0, %for.body ]
				%tmp8 = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp9 = sext i32 %tmp8 to i64
				%cmp2 = icmp slt i64 %indvars.iv3, %tmp9
				br i1 %cmp2, label %for.body.3, label %for.end.15

				for.body.3: ; preds = %for.cond.1
				br label %for.cond.4

				for.cond.4: ; preds = %for.inc, %for.body.3
				%indvars.iv = phi i64 [ %indvars.iv.next, %for.inc ], [ 0, %for.body.3 ]
				%tmp10 = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @bounds, i64 0, i64 0), align 4
				%tmp11 = sext i32 %tmp10 to i64
				%cmp5 = icmp slt i64 %indvars.iv, %tmp11
				br i1 %cmp5, label %for.body.6, label %for.end

				for.body.6: ; preds = %for.cond.4
				%tmp12 = add nsw i64 %indvars.iv, %indvars.iv3
				%tmp13 = add nsw i64 %tmp12, %indvars.iv5
				%tmp14 = trunc i64 %tmp13 to i32
				%conv = sitofp i32 %tmp14 to double
				%arrayidx11 = getelementptr inbounds [1024 x [1024 x [1024 x double]]], [1024 x [1024 x [1024 x double]]]* @data, i64 0, i64 %indvars.iv5, i64 %indvars.iv3, i64 %indvars.iv
				%tmp15 = load double, double* %arrayidx11, align 8
				%add12 = fadd double %tmp15, %conv
				store double %add12, double* %arrayidx11, align 8
				br label %for.inc

				for.inc: ; preds = %for.body.6
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				br label %for.cond.4

				for.end: ; preds = %for.cond.4
				br label %for.inc.13

				for.inc.13: ; preds = %for.end
				%indvars.iv.next4 = add nuw nsw i64 %indvars.iv3, 1
				br label %for.cond.1

				for.end.15: ; preds = %for.cond.1
				br label %for.inc.16

				for.inc.16: ; preds = %for.end.15
				%indvars.iv.next6 = add nuw nsw i64 %indvars.iv5, 1
				br label %for.cond

				for.end.18: ; preds = %for.cond
				ret void
				}
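
In the test above, two of the three loads of `bounds[0]` only run when `p == 0`, yet the expected execution context is the universal set `[bounds, p] -> { : }`, because the load in the outermost loop header runs unconditionally and the contexts of a class are unioned. A toy sketch of that union, assuming a tiny finite universe for `p`: `ToyContext` and `unifyContexts` are hypothetical names, and `std::set<int>` only stands in for the isl sets Polly uses.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy execution context: the values of `p` (from a tiny universe) under
// which a load executes. This is only a stand-in for an isl set.
using ToyContext = std::set<int>;

// The unified context of a class is the union over all of its members:
// the location only needs to be accessed by at least one of them.
inline ToyContext unifyContexts(const std::vector<ToyContext> &Contexts) {
  ToyContext Union;
  for (const ToyContext &C : Contexts)
    Union.insert(C.begin(), C.end());
  return Union;
}
```

If one member executes for every value in the universe, the union is the whole universe regardless of how restricted the other members are.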

polly/trunk/test/ScopInfo/required-invariant-loop-bounds.ll

	; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s			; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
	;			;
	; CHECK: Invariant Accesses: {			; CHECK: Invariant Accesses: {
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: MemRef_bounds[0]			; CHECK-NEXT: MemRef_bounds[0]
	; CHECK-NEXT: Execution Context: [tmp, tmp1] -> { : }			; CHECK-NEXT: Execution Context: [bounds, p_1] -> { : }
	; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]			; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
	; CHECK-NEXT: MemRef_bounds[1]			; CHECK-NEXT: MemRef_bounds[1]
	; CHECK-NEXT: Execution Context: [tmp, tmp1] -> { : tmp >= 0 }			; CHECK-NEXT: Execution Context: [bounds, p_1] -> { : bounds >= 0 }
	; CHECK: }			; CHECK: }

	; double A[1000][1000];			; double A[1000][1000];
	; long bounds[2];			; long bounds[2];
	;			;
	; void foo() {			; void foo() {
	;			;
	; for (long i = 0; i <= bounds[0]; i++)			; for (long i = 0; i <= bounds[0]; i++)
	[51 lines not shown]