This is an archive of the discontinued LLVM Phabricator instance.

Differential D28518

[Polly] Canonicalize arrays according to base-ptr equivalence class
Closed, Public

Authored by grosser on Jan 10 2017, 8:02 AM.

Details

Reviewers

sebpop
Meinersbur
• zinob
gareevroman
pollydev
huihuiz
efriedma
jdoerfert

Commits

rGf3adab4c20c6: [Polly] Canonicalize arrays according to base-ptr equivalence class
rPLO302636: [Polly] Canonicalize arrays according to base-ptr equivalence class
rL302636: [Polly] Canonicalize arrays according to base-ptr equivalence class

Summary

In case two arrays share base pointers in the same invariant load equivalence
class, we canonicalize all memory accesses to the first of these arrays
(according to their order in the equivalence class).

This enables us to optimize kernels such as boost::ublas by ensuring that
different references to the C array are interpreted as accesses to the same
array. Before this change the runtime alias check for ublas would fail, as it
would assume models of the C array with differing (but identically valued) base
pointers would reference distinct regions of memory whereas the referenced
memory regions were indeed identical.
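
A minimal standalone sketch of why such a check cannot succeed, assuming each modelled array is just an address range. The `ToyRange` type and the `runtimeAliasCheckPasses` function are hypothetical and not part of Polly; they only illustrate that two models of the same C array occupy the same range, so a pairwise disjointness test always reports an overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of an array as the half-open address range [Begin, End).
// (Hypothetical type; Polly derives the ranges from access relations.)
struct ToyRange {
  std::uintptr_t Begin;
  std::uintptr_t End;
};

// Returns true if no two modelled arrays overlap, i.e. the optimized code
// may run. Returns false if the original code has to be used instead.
inline bool runtimeAliasCheckPasses(const std::vector<ToyRange> &Arrays) {
  for (std::size_t I = 0; I < Arrays.size(); ++I)
    for (std::size_t J = I + 1; J < Arrays.size(); ++J)
      if (Arrays[I].Begin < Arrays[J].End && Arrays[J].Begin < Arrays[I].End)
        return false;
  return true;
}
```

If the same array is modelled twice, the check fails regardless of the other arrays; once it is modelled only once, the remaining ranges can be checked meaningfully.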

As part of this change we remove most of the MemoryAccess::get*BaseAddr
interface. Previous commits already removed all references to get*BaseAddr
to ensure that no code relies on matching base pointers between
memory accesses and scop arrays -- except for three remaining uses where we
need the original base pointer. For these situations we document that
MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct
from the base pointer of the scop array referenced by this memory access.
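
The canonicalization rule from this summary can be sketched in isolation. This is a toy model under stated assumptions, not Polly's implementation: `ToyAccess`, `EquivClass`, and `canonicalizeArrays` are hypothetical names, arrays are identified by the name of their base pointer, and the compatibility and hoisting restrictions discussed in the review are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a memory access: it only records which array it refers to.
// (Hypothetical type; Polly's MemoryAccess carries an isl access relation.)
struct ToyAccess {
  std::string Array;
};

// Toy stand-in for an invariant load equivalence class: the names of base
// pointers known to hold the same value, in class order.
using EquivClass = std::vector<std::string>;

// Redirect every access whose array is a non-first member of a class to the
// first member of that class. Returns the number of accesses rewritten.
inline int canonicalizeArrays(std::vector<ToyAccess> &Accesses,
                              const std::vector<EquivClass> &Classes) {
  int Rewritten = 0;
  for (const EquivClass &Class : Classes) {
    if (Class.empty())
      continue;
    const std::string &Canonical = Class.front();
    for (ToyAccess &Access : Accesses) {
      if (Access.Array == Canonical)
        continue;
      for (const std::string &Member : Class) {
        if (Access.Array == Member) {
          Access.Array = Canonical;
          ++Rewritten;
          break;
        }
      }
    }
  }
  return Rewritten;
}
```

Accesses to arrays outside any equivalence class are left untouched, which mirrors the intent that only arrays with provably identical base pointers are merged.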

Diff Detail

Build Status

Buildable 3874
Build 3874: arc lint + arc unit

Event Timeline

grosser updated this revision to Diff 83810. Jan 10 2017, 8:02 AM

grosser retitled this revision from to Unify arrays according to base-ptr equivalence class.

grosser updated this object.

grosser added reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, • zinob, huihuiz, pollydev.

testcase?

Add test case

My suggestion is to modify getOrCreateScopArrayInfo which before creating a new ScopArrayInfo, looks up the canonical load from InvEquivClassVMap.

include/polly/ScopInfo.h
1847	`incorrect` may not be the correct word, as there is no guarantee that two base pointers are different anyway. In that sense a program with two base ptrs is only correct if AA says they are disjoint. This is why we have alias checks.
1849	I'd interpret the future tense (`will`) as something that is going to happen later in the pipeline. This patch, however, removes this possibility. Can you make it clearer that it is something that would happen without this step?
lib/Analysis/ScopInfo.cpp
3695	`unify` sounds like after this there will be only one indirect array. I am also not sure whether indirect array is the correct term as there is only one array whose pointer happens to be loaded twice. (The assertion you commented out thinks it is as soon as there is a dynamic load) Alternative name suggestion: canonicalizeDynamicBasePtrs
3696–3697	We might separate the analysis part and the transformation part of invariant load hoisting, such that `unifyIndirectArrays` can be activated independently of invariant load hoisting.
3705	Three possible alternatives:
	for (auto Pair : enumerate(EqClass.InvariantAccesses))
	  if (Pair.Index == 0)

	bool IsFirst = true;
	for (...)
	  if (IsFirst) { IsFirst = false; }

	if (InvAccess == EqClass.InvariantAccesses.front())
3708–3712	Isn't there `CanonicalAccess->getScopArrayInfo()`. If it hasn't been computed yet, could we either call `unifyIndirectArrays()` after it has been computed or compute it earlier?
3719–3727	Duplication. Refactor-out?
3735	This marks the access relation as having been replaced, as if done by e.g. JSONImporter, with the original still being accessible. The code generator has a different code path if a new access relation has been set. Is this intended?
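
The three alternatives suggested for line 3705 (treating the first element of the list specially) can be compared in a standalone sketch. This is an illustration only: the helper names are hypothetical, a `std::list<int>` stands in for the list of invariant accesses, and a manual counter replaces `llvm::enumerate` so that the example does not depend on LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Each helper returns the sum of all elements except the first one.

// Alternative 1: carry an explicit index (a manual counter stands in for
// llvm::enumerate here).
inline int skipFirstByIndex(const std::list<int> &Items) {
  int Sum = 0;
  std::size_t Index = 0;
  for (int Item : Items) {
    if (Index++ == 0)
      continue;
    Sum += Item;
  }
  return Sum;
}

// Alternative 2: a boolean flag that is cleared after the first iteration.
inline int skipFirstByFlag(const std::list<int> &Items) {
  int Sum = 0;
  bool IsFirst = true;
  for (int Item : Items) {
    if (IsFirst) {
      IsFirst = false;
      continue;
    }
    Sum += Item;
  }
  return Sum;
}

// Alternative 3: compare against front(). The comparison is by address so
// that equal values later in the list are not skipped by accident.
inline int skipFirstByFront(const std::list<int> &Items) {
  int Sum = 0;
  for (const int &Item : Items) {
    if (&Item == &Items.front())
      continue;
    Sum += Item;
  }
  return Sum;
}
```

All three variants iterate the list in place and avoid copying it, which is the cost the later review comments raise against the copy-and-pop approach.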

Hi Michael,

I updated this patch according to your comments and added some good test coverage. With r292213 this patch works as intended and it seems reasonably clean.

There are however two open points:

I did not yet check if the very same thing could be implemented better in getOrCreateScopArrayInfo()

I still want to check if we can set the original access function, instead of setting a new one. However, this will require a closer review of the basepointer handling in Polly.

@gareevroman : This patch should be good enough for your experiments, but it still needs some more thoughts before being able to be committed.

include/polly/ScopInfo.h
1847	I reformulated this and expanded the comment.
1849	Reformulated. Please see if this now reads better.
lib/Analysis/ScopInfo.cpp
3695	I like the new name -> changed.
3696–3697	Actually, this check is not needed. If invariant load hoisting is disabled, the InvariantEquivClasses array will be empty and consequently this function won't do anything. I dropped the check.
3705	Very nice. Thank you for these ideas. I especially like the first one. I changed my code to the first one, but then moved to even another solution: copying the list of accesses and dropping the first one explicitly. This has the minor cost of copying the full access list. However, this cost is likely very small and the resulting code looks very readable.
3708–3712	Not sure what you have in mind here. CanonicalSAI is not the ScopArrayInfo of the CanonicalAccess, but it is the SAI which has as base pointer the load in CanonicalSAI. I have now introduced a function getScopArrayInfoOrNull(), similar to the already existing getScopArrayInfo, and just use this instead of the loop above.
3719–3727	I also moved this to getScopArrayInfoOrNull()
3735	This was intended. If I just reset the original access relation, the following check starts to fail: `getOriginalScopArrayInfo()->getBasePtr() == BaseAddr`. We should probably review this invariant, but there are several parts of Polly that rely on this behavior. I still need to look at all of them to understand if their behavior is correct:
	lib/Analysis/ScopInfo.cpp: auto SAI = S.getOrCreateScopArrayInfo(Access->getBaseAddr(), ElementType,
	lib/Analysis/ScopInfo.cpp: PHINode *PHI = cast<PHINode>(Access->getBaseAddr());
	lib/Analysis/ScopInfo.cpp: auto BaseAddr = SE->getSCEV(MA->getBaseAddr());
	lib/Analysis/ScopInfo.cpp: auto BaseAddr = SE->getSCEV(MA->getBaseAddr());
	lib/Analysis/ScopInfo.cpp: HasWriteAccess.insert(MA->getBaseAddr());
	lib/Analysis/ScopInfo.cpp: Value *BaseAddr = Access->getBaseAddr();
	lib/Analysis/ScopInfo.cpp: auto &SAI = ScopArrayInfoMap[std::make_pair(Access->getBaseAddr(),
	lib/CodeGen/IRBuilder.cpp: BasePtrs.insert(MA->getBaseAddr());
	lib/CodeGen/IslAst.cpp: ", " + MA->getBaseAddr()->getName().str();
	lib/CodeGen/BlockGenerators.cpp: return getOrCreateScalarAlloca(Access.getBaseAddr());
	lib/CodeGen/BlockGenerators.cpp: return getOrCreatePHIAlloca(Access.getBaseAddr());
	lib/CodeGen/BlockGenerators.cpp: return getOrCreatePHIAlloca(Access.getBaseAddr());
	lib/CodeGen/BlockGenerators.cpp: return getOrCreateScalarAlloca(Access.getBaseAddr());
	lib/CodeGen/BlockGenerators.cpp: BBMap[MA->getBaseAddr()] =
	lib/CodeGen/BlockGenerators.cpp: VectorBlockMap[MA->getBaseAddr()] = VectorVal;
	lib/CodeGen/IslNodeBuilder.cpp: if (BasePtr == MA->getBaseAddr()) {

Addressed Michael's review comments.

grosser mentioned this in D28901: [Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap. Jan 19 2017, 7:09 AM

grosser mentioned this in rL293374: [Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap. Jan 27 2017, 11:53 PM

grosser retitled this revision from Unify arrays according to base-ptr equivalence class to [Polly] Unify arrays according to base-ptr equivalence class. Feb 2 2017, 2:18 AM

grosser added a project: Restricted Project.

Update to r294734.

We now also modify the original access relation rather than setting a new
access relation. To make this possible we committed a variety of changes that
removed Polly's dependence on MemoryAccess::getBaseAddr() in previous commits.
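
The difference between the two strategies discussed here (attaching a new access relation versus rewriting the original one) can be illustrated with a toy model. Everything in this sketch is hypothetical: `ToyRelAccess` and the two helper functions are not Polly APIs, and an access relation is reduced to the name of the array it targets.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-in for a memory access that tracks the array named by its
// original relation and, optionally, by a replacement ("new") relation.
class ToyRelAccess {
public:
  explicit ToyRelAccess(std::string Array) : OriginalArray(std::move(Array)) {}

  bool hasNewRelation() const { return HasNew; }
  const std::string &originalArray() const { return OriginalArray; }
  const std::string &latestArray() const {
    return HasNew ? NewArray : OriginalArray;
  }

  void setNewArray(const std::string &Array) {
    NewArray = Array;
    HasNew = true;
  }
  void setOriginalArray(const std::string &Array) { OriginalArray = Array; }

private:
  std::string OriginalArray;
  std::string NewArray;
  bool HasNew = false;
};

// Earlier revision of the patch: keep the original and attach a replacement,
// which makes the access look as if an importer had changed it.
inline void canonicalizeViaNewRelation(ToyRelAccess &A, const std::string &To) {
  A.setNewArray(To);
}

// Updated approach described above: rewrite the original relation itself.
inline void canonicalizeInPlace(ToyRelAccess &A, const std::string &To) {
  A.setOriginalArray(To);
}
```

With the in-place variant no "new relation" is recorded, so any consumer that branches on the presence of a replacement relation behaves as it would for an unmodified access.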

grosser edited the summary of this revision. Feb 10 2017, 3:38 AM

Can you rebase to trunk? There is a use of BasePtr left in IslNodeBuilder.cpp that I don't know how to resolve (and it looks like an ad hoc modification of the (Original)BasePtr, something you tried to avoid in this patch by using setAccessRelation).

Is it possible to remove getOriginalBaseAddr() as well? All remaining uses seem to be related to invariant load hoisting, i.e. should be internal to that algorithm, not public API.

lib/Analysis/ScopInfo.cpp
3701–3708	The loop seems to be mainly about finding a canonical ScopArrayInfo, but also modifies the `Accesses` array of elements that will be skipped afterwards anyway. I think there is no reason to do that: Execution time is worse because you need to copy the array first. More importantly, it makes the code less readable. Now we have two locations that effectively do the same thing: Lines 3705 and 3719. It takes some time to understand that. Maintainability suffers as well, as the two locations need to stay in sync. The probability of bugs increases as well. There are two locations that need to be implemented correctly.
test/Isl/CodeGen/invariant_load_unify_arrays.ll
1	The file's name still contains `unify`
5	How does this check whether `%baseA` and `%baseB` have been canonicalized? I would think we'd check whether both stores will access the canonical base pointer, `%baseB`.
test/ScopInfo/invariant_load_unify_arrays.ll
1	The IR looks identical to the one in CodeGen. I understand that one is testing `-polly-scops -analyze` , the other `-polly-codegen -S`. However, since they are both testing the same feature, and for the sake of maintainability, wouldn't it be better to put both into the same file?

Meinersbur mentioned this in D30695: [Polly][ScopDetection] Only allow SCoP-wide available base pointers. Mar 7 2017, 7:44 AM

Meinersbur mentioned this in rL297281: [ScopDetection] Only allow SCoP-wide available base pointers. Mar 8 2017, 7:26 AM

grosser retitled this revision from [Polly] Unify arrays according to base-ptr equivalence class to [Polly] Canonicalize arrays according to base-ptr equivalence class. May 8 2017, 12:54 AM

Rebased and addressed some final comments.

Hi Michael, hi Roman,

this patch works cleanly on trunk. I believe I addressed most, but not all open comments. Michael had some last ideas regarding how some code could be improved stylewise. I thought for a while about it, but could not come up with good ideas immediately. Hence, I just rebased the patch to make sure it runs on trunk. This hopefully makes it easy to read over it again and play with ideas of how to improve readability. While I think the patch is already well readable, I am glad to incorporate good ideas to make it even more readable.

Thanks for your feedback,
Tobias

lib/Analysis/ScopInfo.cpp
3701–3708	I am not fully sure how to improve the code, given that we may not always choose the very first load as canonical array. The cost of copying the arrays is likely minor, but readability and understandability are strong arguments. I wonder if -- after the refactoring -- you happen to have a good idea of how to simplify this code?
3735	This has been changed. We now directly change the access relation.
test/Isl/CodeGen/invariant_load_unify_arrays.ll
1	Renamed to invariant_load_canonicalize_array_baseptrs.ll
5	I added CHECK-NEXT lines.
test/ScopInfo/invariant_load_unify_arrays.ll
1	Right. The problem with just dropping this test case is that we then have -polly-scops failures reported in test/CodeGen. I think it is a lot nicer to see directly from failing test cases if something in scop modeling or in code generation broke. Obviously this comes at a price of redundant test inputs. It seems to be a question of preference / tradeoffs. I generally tried to follow the rule to run only passes with the name of the test directory in the directory. However, neither was I consistent and I think Johannes also followed a different approach. Not sure what to do.

Thanks for the hand-crafted test cases.

include/polly/ScopInfo.h
815–826	Is this change related?
lib/Analysis/ScopInfo.cpp
3701–3708	Suggestion:
	static const ScopArrayInfo *findCanonicalArray(Scop *S,
	                                               MemoryAccessList &Accesses) {
	  for (MemoryAccess *Access : Accesses) {
	    const ScopArrayInfo *CanonicalArray = S->getScopArrayInfoOrNull(
	        Access->getAccessInstruction(), MemoryKind::Array);
	    if (CanonicalArray)
	      return CanonicalArray;
	  }
	  return nullptr;
	}

	static bool isUsedForIndirectHoistedLoad(Scop *S, const ScopArrayInfo *Array) {
	  for (InvariantEquivClassTy &EqClass2 : S->getInvariantAccesses())
	    for (MemoryAccess *Access2 : EqClass2.InvariantAccesses)
	      if (Access2->getScopArrayInfo() == Array)
	        return true;
	  return false;
	}

	static void replaceBasePtrArrays(Scop *S, const ScopArrayInfo *Old,
	                                 const ScopArrayInfo *New) {
	  for (ScopStmt &Stmt : *S)
	    for (MemoryAccess *Access : Stmt) {
	      if (Access->getLatestScopArrayInfo() != Old)
	        continue;
	      isl_id *Id = New->getBasePtrId();
	      isl_map *Map = Access->getAccessRelation();
	      Map = isl_map_set_tuple_id(Map, isl_dim_out, Id);
	      Access->setAccessRelation(Map);
	    }
	}

	void Scop::canonicalizeDynamicBasePtrs() {
	  for (InvariantEquivClassTy &EqClass : InvariantEquivClasses) {
	    MemoryAccessList &BasePtrAccesses = EqClass.InvariantAccesses;
	    const ScopArrayInfo *CanonicalBasePtrSAI =
	        findCanonicalArray(this, BasePtrAccesses);
	    if (!CanonicalBasePtrSAI)
	      continue;
	    for (MemoryAccess *BasePtrAccess : BasePtrAccesses) {
	      const ScopArrayInfo *BasePtrSAI = getScopArrayInfoOrNull(
	          BasePtrAccess->getAccessInstruction(), MemoryKind::Array);
	      if (!BasePtrSAI || BasePtrSAI == CanonicalBasePtrSAI ||
	          !BasePtrSAI->isCompatibleWith(CanonicalBasePtrSAI))
	        continue;
	      if (isUsedForIndirectHoistedLoad(this, BasePtrSAI))
	        continue;
	      replaceBasePtrArrays(this, BasePtrSAI, CanonicalBasePtrSAI);
	    }
	  }
	}
3722–3731	A comment about what this is supposed to do would be nice. I know it is related to `test/ScopInfo/invariant_load_canonicalize_array_baseptrs_5.ll`, but I think an explanation here would be even more important than in the test case.
3734	There are two local variables with the name `Access`.
3788–3793	Prefer `lookup`. `operator[]` creates an entry every time this returns `Null`.
test/Isl/CodeGen/invariant_load_unify_arrays.ll
5	Where?
test/ScopInfo/invariant_load_canonicalize_array_baseptrs_4.ll
5 ↗	(On Diff #98133)	"coalesced" -> "canonicalized"?
test/ScopInfo/invariant_load_canonicalize_array_baseptrs_5.ll
6 ↗	(On Diff #98133)	unify -> canonicalize
7 ↗	(On Diff #98133)	Why only "probably"?
12 ↗	(On Diff #98133)	Could you mention that `%ptr` (a hoisted load, itself requiring that `%baseA2` is hoisted) is the culprit here?
58–61 ↗	(On Diff #98133)	To ensure a use of `%v0`, `%v1`, would `store float* undef, float** %ptr` and `store float* undef, float** %baseA2` be enough? In that case, one might peel one `*` off the types and use `store float 0.0, float* %ptr` and `store float 0.0, float* %baseA2`.
65 ↗	(On Diff #98133)	Why is `%baseB2` based on `%A`? Would eg. `%baseA3` (or `%baseA1`, which seems to be missing) be a better name?
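
The remark on lines 3788–3793 about preferring `lookup` over `operator[]` rests on a general property of map containers, sketched here with `std::map` so the example stays standalone. The helper names are hypothetical; `llvm::DenseMap::lookup` provides the non-inserting query in a single call, for which `find` is the standard-library counterpart.

```cpp
#include <cassert>
#include <map>
#include <string>

// operator[] default-constructs and inserts a value when the key is missing,
// so a failed query silently grows the map.
inline int queryWithSubscript(std::map<std::string, int> &Cache,
                              const std::string &Key) {
  return Cache[Key];
}

// find() leaves the map untouched on a miss and lets the caller return a
// default value instead.
inline int queryWithFind(const std::map<std::string, int> &Cache,
                         const std::string &Key) {
  auto It = Cache.find(Key);
  return It == Cache.end() ? 0 : It->second;
}
```

For a function such as getScopArrayInfoOrNull, which is expected to miss regularly, the non-inserting form avoids filling the map with null entries.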

This revision is now accepted and ready to land. May 9 2017, 6:25 AM

Closed by commit rL302636: [Polly] Canonicalize arrays according to base-ptr equivalence class (authored by grosser). May 10 2017, 4:13 AM

This revision was automatically updated to reflect the committed changes.

grosser marked 8 inline comments as done.

Meinersbur mentioned this in D33256: [Polly] [Fortran Support] Generate GPU kernels for Fortran arrays. May 26 2017, 4:23 AM

Revision Contents

Path	Size
include/polly/ScopInfo.h	68 lines
lib/Analysis/ScopInfo.cpp	73 lines
test/Isl/CodeGen/invariant_load_unify_arrays.ll	33 lines
test/ScopInfo/invariant_load_unify_arrays.ll	46 lines
test/ScopInfo/invariant_load_unify_arrays_2.ll	91 lines
test/ScopInfo/invariant_load_unify_arrays_3.ll	58 lines
test/ScopInfo/invariant_load_unify_arrays_4.ll	53 lines
test/ScopInfo/invariant_load_unify_arrays_4b.ll	55 lines
test/ScopInfo/invariant_load_unify_arrays_4c.ll	51 lines
test/ScopInfo/invariant_load_unify_arrays_5.ll	78 lines

Diff 87976

include/polly/ScopInfo.h

[... 358 lines not shown ...] public:
   static const ScopArrayInfo *getFromId(__isl_take isl_id *Id);

   /// Get the space of this array access.
   __isl_give isl_space *getSpace() const;

   /// If the array is read only
   bool isReadOnly();

+  /// Verify that @p Array is compatible to this ScopArrayInfo.
+  ///
+  /// Two arrays are compatible if their dimensionality, the sizes of their
+  /// dimensions, and their element sizes match.
+  ///
+  /// @param Array The array to compare against.
+  ///
+  /// @returns True, if the arrays are compatible, False otherwise.
+  bool isCompatibleWith(const ScopArrayInfo *Array) const;
+
 private:
   void addDerivedSAI(ScopArrayInfo *DerivedSAI) {
     DerivedSAIs.insert(DerivedSAI);
   }

   /// For indirect accesses this is the SAI of the BP origin.
   const ScopArrayInfo *BasePtrOriginSAI;
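
The compatibility rule documented for isCompatibleWith (matching dimensionality, dimension sizes, and element sizes) can be sketched without Polly's types. `ToyArrayInfo` is a hypothetical stand-in that stores plain integers; note that it compares element sizes, as the comment states, whereas the implementation in this diff compares element types via getElementType().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy array descriptor (hypothetical; Polly's ScopArrayInfo stores an
// llvm::Type* and symbolic dimension sizes instead of plain integers).
struct ToyArrayInfo {
  unsigned ElementSizeInBits;
  std::vector<long> DimensionSizes;

  // Two arrays are compatible if element size, number of dimensions, and
  // every dimension size agree.
  bool isCompatibleWith(const ToyArrayInfo &Other) const {
    if (Other.ElementSizeInBits != ElementSizeInBits)
      return false;
    if (Other.DimensionSizes.size() != DimensionSizes.size())
      return false;
    for (std::size_t I = 0; I < DimensionSizes.size(); ++I)
      if (Other.DimensionSizes[I] != DimensionSizes[I])
        return false;
    return true;
  }
};
```

Only arrays that pass such a check can share one model, because redirecting an access to a differently shaped array would change which elements its subscripts address.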

[... 422 lines not shown ...] public:
   applyScheduleToAccessRelation(__isl_take isl_union_map *Schedule) const;

   /// Get an isl string representing the access function read from IR.
   std::string getOriginalAccessRelationStr() const;

   /// Get an isl string representing a new access function, if available.
   std::string getNewAccessRelationStr() const;

-  /// Get the base address of this access (e.g. A for A[i+j]) when
+  /// Get the original base address of this access (e.g. A for A[i+j]) when
   /// detected.
+  ///
+  /// This address may differ from the base address referenced by the Original
+  /// ScopArrayInfo to which this array belongs, as this memory access may
+  /// have been unified to a ScopArray which has a different but identically
+  /// valued base pointer in case invariant load hoisting is enabled.
   Value *getOriginalBaseAddr() const {
-    assert(!getOriginalScopArrayInfo() /* may not yet be initialized */ ||
-           getOriginalScopArrayInfo()->getBasePtr() == BaseAddr);
     return BaseAddr;
   }

-  /// Get the base address of this access (e.g. A for A[i+j]) after a
-  /// potential change by setNewAccessRelation().
-  Value *getLatestBaseAddr() const {
-    return getLatestScopArrayInfo()->getBasePtr();
-  }
-
-  /// Old name for getOriginalBaseAddr().
-  Value *getBaseAddr() const { return getOriginalBaseAddr(); }
-
   /// Get the detection-time base array isl_id for this access.
   __isl_give isl_id *getOriginalArrayId() const;

   /// Get the base array isl_id for this access, modifiable through
   /// setNewAccessRelation().
   __isl_give isl_id *getLatestArrayId() const;

   /// Old name of getOriginalArrayId().
   __isl_give isl_id *getArrayId() const { return getOriginalArrayId(); }
[... 1,000 lines not shown ...] private:
   /// for (int j = 1; j < *LB[1]; j++)
   /// A[i][j] += A[0][0] + (*V);
   ///
   /// Common inv. loads: V, A[0][0], LB[0], LB[1]
   /// Required inv. loads: LB[0], LB[1], (V, if it may alias with A or LB)
   ///
   void hoistInvariantLoads();

+  /// Canonicalize arrays with base pointers from the same equivalence class.
+  ///
+  /// Some context: in our normal model we assume that each base pointer is
+  /// related to a single specific memory region, where memory regions
+  /// associated with different base pointers are disjoint. Consequently we do
+  /// not need to compute additional data dependences that model possible
+  /// overlaps of these memory regions. To verify our assumption we compute
+  /// alias checks that verify that modeled arrays indeed do not overlap. In
+  /// case an overlap is detected the runtime check fails and we fall back to
+  /// the original code.
+  ///
+  /// In case of arrays where the base pointers are known to be identical,
+  /// because they are dynamically loaded by accesses that are in the same
+  /// invariant load equivalence class, such run-time alias check would always
+  /// be false.
+  ///
+  /// This function makes sure that we do not generate consistently failing
+  /// run-time checks for code that contains distinct arrays with known
+  /// equivalent base pointers. It identifies for each invariant load
+  /// equivalence class a single canonical array and canonicalizes all memory
+  /// accesses that reference arrays that have base pointers that are known to
+  /// be equal to the base pointer of such a canonical array to this canonical
+  /// array.
+  ///
+  /// We currently do not canonicalize arrays for which certain memory accesses
+  /// have been hoisted as loop invariant.
+  void canonicalizeDynamicBasePtrs();
+
   /// Add invariant loads listed in @p InvMAs with the domain of @p Stmt.
   void addInvariantLoads(ScopStmt &Stmt, InvariantAccessesTy &InvMAs);

   /// Create an id for @p Param and store it in the ParameterIds map.
   void createParameterId(const SCEV *Param);

   /// Build the Context of the Scop.
   void buildContext();
[... 570 lines not shown ...] public:
   const ScopArrayInfo *createScopArrayInfo(Type *ElementType,
                                            const std::string &BaseName,
                                            const std::vector<unsigned> &Sizes);

   /// Return the cached ScopArrayInfo object for @p BasePtr.
   ///
   /// @param BasePtr The base pointer the object has been stored for.
   /// @param Kind The kind of array info object.
+  ///
+  /// @returns The ScopArrayInfo pointer or NULL if no such pointer is
+  /// available.
+  const ScopArrayInfo *getScopArrayInfoOrNull(Value *BasePtr, MemoryKind Kind);
+
+  /// Return the cached ScopArrayInfo object for @p BasePtr.
+  ///
+  /// @param BasePtr The base pointer the object has been stored for.
+  /// @param Kind The kind of array info object.
+  ///
+  /// @returns The ScopArrayInfo pointer (may assert if no such pointer is
+  /// available).
   const ScopArrayInfo *getScopArrayInfo(Value *BasePtr, MemoryKind Kind);

   /// Invalidate ScopArrayInfo object for base address.
   ///
   /// @param BasePtr The base pointer of the ScopArrayInfo object to invalidate.
   /// @param Kind The Kind of the ScopArrayInfo object.
   void invalidateScopArrayInfo(Value *BasePtr, MemoryKind Kind) {
     auto It = ScopArrayInfoMap.find(std::make_pair(BasePtr, Kind));
[... 234 lines not shown ...]

lib/Analysis/ScopInfo.cpp

[... 221 lines not shown ...] WriteSet = isl_union_set_intersect(
       WriteSet, isl_union_set_from_set(isl_set_universe(Space)));

   bool IsReadOnly = isl_union_set_is_empty(WriteSet);
   isl_union_set_free(WriteSet);

   return IsReadOnly;
 }

+bool ScopArrayInfo::isCompatibleWith(const ScopArrayInfo *Array) const {
+  if (Array->getElementType() != getElementType())
+    return false;
+
+  if (Array->getNumberOfDimensions() != getNumberOfDimensions())
+    return false;
+
+  for (unsigned i = 0; i < getNumberOfDimensions(); i++)
+    if (Array->getDimensionSize(i) != getDimensionSize(i))
+      return false;
+
+  return true;
+}
+
 void ScopArrayInfo::updateElementType(Type *NewElementType) {
   if (NewElementType == ElementType)
     return;

   auto OldElementSize = DL.getTypeAllocSizeInBits(ElementType);
   auto NewElementSize = DL.getTypeAllocSizeInBits(NewElementType);

   if (NewElementSize == OldElementSize || NewElementSize == 0)
[... 3,033 lines not shown ...] void Scop::init(AliasAnalysis &AA, DominatorTree &DT, LoopInfo &LI) {
   // assumed/invalid context.
   addRecordedAssumptions();

   simplifyContexts();
   if (!buildAliasChecks(AA))
     return;

   hoistInvariantLoads();
+  canonicalizeDynamicBasePtrs();
   verifyInvariantLoads();
   simplifySCoP(true);

   // Check late for a feasible runtime context because profitability did not
   // change.
   if (!hasFeasibleRuntimeContext())
     return;
 }
▲ Show 20 Lines • Show All 385 Lines • ▼ Show 20 Lines	for (ScopStmt &Stmt : *this) {
// Transfer the memory access from the statement to the SCoP.		// Transfer the memory access from the statement to the SCoP.
for (auto InvMA : InvariantAccesses)		for (auto InvMA : InvariantAccesses)
Stmt.removeMemoryAccess(InvMA.MA);		Stmt.removeMemoryAccess(InvMA.MA);
addInvariantLoads(Stmt, InvariantAccesses);		addInvariantLoads(Stmt, InvariantAccesses);
}		}
isl_union_map_free(Writes);		isl_union_map_free(Writes);
}		}

		void Scop::canonicalizeDynamicBasePtrs() {
		MeinersburUnsubmitted Done Reply Inline Actions `unify` sounds like after this there will be only one indirect array. I am also not sure whether indirect array is the correct term as there is only one array whose pointer happens to be loaded twice. (The assertion you commented out thinks it is as soon as there is a dynamic load) Alternative name suggestion: canonicalizeDynamicBasePtrs Meinersbur: `unify` sounds like after this there will be only one indirect array. I am also not sure…
		grosserAuthorUnsubmitted Not Done Reply Inline Actions I like the new name -> changed. grosser: I like the new name -> changed.
		for (InvariantEquivClassTy &EqClass : InvariantEquivClasses) {
		MemoryAccessList Accesses = EqClass.InvariantAccesses;
		MeinersburUnsubmitted Done Reply Inline Actions We might separate the analysis part and the transformation part of invariant load hoisting, such that `unifyIndirectArrays` can be activated independently of invariant load hoisting. Meinersbur: We might separate the analysis part and the transformation part of invariant load hoisting…
		grosserAuthorUnsubmitted Not Done Reply Inline Actions Actually, this check is not needed. If invariant load hoisting is disabled, the InvariantEquivClasses array will be empty and consequently this function won't do anything. I dropped the check. grosser: Actually, this check is not needed. If invariant load hoisting is disabled, the…

		const ScopArrayInfo *CanonicalArray = nullptr;

		while (!Accesses.empty()) {
		MemoryAccess *CanonicalAccess = Accesses.front();
		Value *CanonicalLoad = CanonicalAccess->getAccessInstruction();
		CanonicalArray = getScopArrayInfoOrNull(CanonicalLoad, MemoryKind::Array);
		Accesses.pop_front();
		MeinersburUnsubmitted Done Reply Inline Actions Three possible alternatives: for (auto Pair : enumerate(EqClass.InvariantAccesses)) if (Pair.Index == 0) bool IsFirst = true; for (...) if (IsFirst) { IsFirst = False; } if (InvAccess == EqClass.InvariantAccesses.front()) Meinersbur: Three possible alternatives: ``` for (auto Pair : enumerate(EqClass.InvariantAccesses))…
		grosserAuthorUnsubmitted Not Done Reply Inline Actions Very nice. Thank you for these ideas. I especially like the first one. I changed my code to the first one, but then moved to even another solution: copying the list of accesses and dropping the first one explicitly. This has the minor cost of copying the full access list. However, this cost is likely very small and the resulting code looks very readable. grosser: Very nice. Thank you for these ideas. I especially like the first one. I changed my code to the…
		if (CanonicalArray)
		break;
		}
		Meinersbur (Not Done): The loop seems to be mainly about finding a canonical ScopArrayInfo, but also modifies the `Accesses` array of elements that will be skipped afterwards anyway. I think there is no reason to do that:
		- Execution time is worse because you need to copy the array first.
		- More importantly, it makes the code less readable. Now we have two locations that effectively do the same thing: lines 3705 and 3719. It takes some time to understand that.
		- Maintainability suffers as well, as the two locations need to stay in sync.
		- The probability of bugs increases as well. There are two locations that need to be implemented correctly.
		grosser (Author, Not Done): I am not fully sure how to improve the code, given that we may not always choose the very first load as canonical array. The cost of copying the arrays is likely minor, but readability and understandability are strong arguments. I wonder if, after the refactoring, you happen to have a good idea of how to simplify this code?
		Meinersbur (Not Done): Suggestion:
		    static const ScopArrayInfo *findCanonicalArray(Scop *S,
		                                                   MemoryAccessList &Accesses) {
		      for (MemoryAccess *Access : Accesses) {
		        const ScopArrayInfo *CanonicalArray = S->getScopArrayInfoOrNull(
		            Access->getAccessInstruction(), MemoryKind::Array);
		        if (CanonicalArray)
		          return CanonicalArray;
		      }
		      return nullptr;
		    }

		    static bool isUsedForIndirectHoistedLoad(Scop *S, const ScopArrayInfo *Array) {
		      for (InvariantEquivClassTy &EqClass2 : S->getInvariantAccesses())
		        for (MemoryAccess *Access2 : EqClass2.InvariantAccesses)
		          if (Access2->getScopArrayInfo() == Array)
		            return true;
		      return false;
		    }

		    static void replaceBasePtrArrays(Scop *S, const ScopArrayInfo *Old,
		                                     const ScopArrayInfo *New) {
		      for (ScopStmt &Stmt : *S)
		        for (MemoryAccess *Access : Stmt) {
		          if (Access->getLatestScopArrayInfo() != Old)
		            continue;
		          isl_id *Id = New->getBasePtrId();
		          isl_map *Map = Access->getAccessRelation();
		          Map = isl_map_set_tuple_id(Map, isl_dim_out, Id);
		          Access->setAccessRelation(Map);
		        }
		    }

		    void Scop::canonicalizeDynamicBasePtrs() {
		      for (InvariantEquivClassTy &EqClass : InvariantEquivClasses) {
		        MemoryAccessList &BasePtrAccesses = EqClass.InvariantAccesses;
		        const ScopArrayInfo *CanonicalBasePtrSAI =
		            findCanonicalArray(this, BasePtrAccesses);
		        if (!CanonicalBasePtrSAI)
		          continue;
		        for (MemoryAccess *BasePtrAccess : BasePtrAccesses) {
		          const ScopArrayInfo *BasePtrSAI = getScopArrayInfoOrNull(
		              BasePtrAccess->getAccessInstruction(), MemoryKind::Array);
		          if (!BasePtrSAI || BasePtrSAI == CanonicalBasePtrSAI ||
		              !BasePtrSAI->isCompatibleWith(CanonicalBasePtrSAI))
		            continue;
		          if (isUsedForIndirectHoistedLoad(this, BasePtrSAI))
		            continue;
		          replaceBasePtrArrays(this, BasePtrSAI, CanonicalBasePtrSAI);
		        }
		      }
		    }

		if (!CanonicalArray)
		continue;

		Meinersbur (Done): Isn't there `CanonicalAccess->getScopArrayInfo()`? If it hasn't been computed yet, could we either call `unifyIndirectArrays()` after it has been computed or compute it earlier?
		grosser (Author, Not Done): Not sure what you have in mind here. CanonicalSAI is not the ScopArrayInfo of the CanonicalAccess, but it's the SAI which has as base pointer the load in CanonicalSAI. I have now introduced a function getScopArrayInfoOrNull(), similar to the already existing getScopArrayInfo, and just use this instead of the loop above.
		for (MemoryAccess *Access : Accesses) {
		Value *Load = Access->getAccessInstruction();

		const ScopArrayInfo *Array =
		getScopArrayInfoOrNull(Load, MemoryKind::Array);

		if (!Array || !Array->isCompatibleWith(CanonicalArray))
		continue;

		bool HasInvariantAccess = false;
		for (InvariantEquivClassTy &EqClass2 : InvariantEquivClasses)
		for (MemoryAccess *Access2 : EqClass2.InvariantAccesses)
		if (Access2->getScopArrayInfo() == Array) {
		HasInvariantAccess = true;
		break;
		Meinersbur (Done): Duplication. Refactor-out?
		grosser (Author, Not Done): I also moved this to getScopArrayInfoOrNull()
		}

		if (HasInvariantAccess)
		break;
		Meinersbur (Done): A comment about what this is supposed to do would be nice. I know it is related to `test/ScopInfo/invariant_load_canonicalize_array_baseptrs_5.ll`, but I think an explanation here would be even more important than in the test case.

		for (ScopStmt &Stmt : *this)
		for (MemoryAccess *Access : Stmt)
		Meinersbur (Done): There are two local variables with the name `Access`.
		if (Access->getScopArrayInfo() == Array) {
		Meinersbur (Done): This marks the access relation as having been replaced, as if done by e.g. JSONImporter, with the original still being accessible. The code generator has a different code path if a new access relation has been set. Is this intended?
		grosser (Author, Done): This was intended. If I just reset the original access relation, the following check starts to fail: `getOriginalScopArrayInfo()->getBasePtr() == BaseAddr`. We should probably review this invariant, but there are several parts of Polly that rely on this behavior. I still need to look at all of them to understand if their behavior is correct:
		    lib/Analysis/ScopInfo.cpp: auto SAI = S.getOrCreateScopArrayInfo(Access->getBaseAddr(), ElementType,
		    lib/Analysis/ScopInfo.cpp: PHINode PHI = cast<PHINode>(Access->getBaseAddr());
		    lib/Analysis/ScopInfo.cpp: auto BaseAddr = SE->getSCEV(MA->getBaseAddr());
		    lib/Analysis/ScopInfo.cpp: auto BaseAddr = SE->getSCEV(MA->getBaseAddr());
		    lib/Analysis/ScopInfo.cpp: HasWriteAccess.insert(MA->getBaseAddr());
		    lib/Analysis/ScopInfo.cpp: Value BaseAddr = Access->getBaseAddr();
		    lib/Analysis/ScopInfo.cpp: auto &SAI = ScopArrayInfoMap[std::make_pair(Access->getBaseAddr(),
		    lib/CodeGen/IRBuilder.cpp: BasePtrs.insert(MA->getBaseAddr());
		    lib/CodeGen/IslAst.cpp: ", " + MA->getBaseAddr()->getName().str();
		    lib/CodeGen/BlockGenerators.cpp: return getOrCreateScalarAlloca(Access.getBaseAddr());
		    lib/CodeGen/BlockGenerators.cpp: return getOrCreatePHIAlloca(Access.getBaseAddr());
		    lib/CodeGen/BlockGenerators.cpp: return getOrCreatePHIAlloca(Access.getBaseAddr());
		    lib/CodeGen/BlockGenerators.cpp: return getOrCreateScalarAlloca(Access.getBaseAddr());
		    lib/CodeGen/BlockGenerators.cpp: BBMap[MA->getBaseAddr()] =
		    lib/CodeGen/BlockGenerators.cpp: VectorBlockMap[MA->getBaseAddr()] = VectorVal;
		    lib/CodeGen/IslNodeBuilder.cpp: if (BasePtr == MA->getBaseAddr()) {
		grosser (Author, Not Done): This has been changed. We now directly change the access relation.
		isl_id *Id = CanonicalArray->getBasePtrId();
		isl_map *Map = Access->getAccessRelation();
		Map = isl_map_set_tuple_id(Map, isl_dim_out, Id);
		Access->setAccessRelation(Map);
		}
		}
		}
		}
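
The canonicalization step above is easier to follow without the review comments interleaved. What follows is a minimal, self-contained sketch of the idea in plain C++, not Polly's implementation: `ToyArray`, `ToyAccess`, `ToyEquivClass`, `findCanonical` and `canonicalize` are hypothetical names, and the compatibility and hoisted-load checks from the patch are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for Polly's classes; every name here is hypothetical.
struct ToyArray {
  std::string Name; // e.g. "MemRef_baseA"
};

struct ToyAccess {
  const ToyArray *Array; // array this access currently refers to
};

// One equivalence class: for each invariant load known to yield the same
// value, the array that uses the load as its base pointer. An entry is
// nullptr when the load is not the base pointer of any array.
using ToyEquivClass = std::vector<const ToyArray *>;

// Return the first array in the class, or nullptr if the class has none.
const ToyArray *findCanonical(const ToyEquivClass &Class) {
  for (const ToyArray *A : Class)
    if (A)
      return A;
  return nullptr;
}

// Redirect every access that targets a non-canonical member of a class
// to the canonical array of that class.
void canonicalize(const std::vector<ToyEquivClass> &Classes,
                  std::vector<ToyAccess> &Accesses) {
  for (const ToyEquivClass &Class : Classes) {
    const ToyArray *Canonical = findCanonical(Class);
    if (!Canonical)
      continue;
    for (const ToyArray *Member : Class) {
      if (!Member || Member == Canonical)
        continue;
      for (ToyAccess &Acc : Accesses)
        if (Acc.Array == Member)
          Acc.Array = Canonical;
    }
  }
}
```

In this sketch the canonical array is found by a read-only scan, so the access list never needs to be copied or popped, which is the direction the review discussion moves toward.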

const ScopArrayInfo *		const ScopArrayInfo *
Scop::getOrCreateScopArrayInfo(Value *BasePtr, Type *ElementType,		Scop::getOrCreateScopArrayInfo(Value *BasePtr, Type *ElementType,
ArrayRef<const SCEV *> Sizes, MemoryKind Kind,		ArrayRef<const SCEV *> Sizes, MemoryKind Kind,
const char *BaseName) {		const char *BaseName) {
assert((BasePtr || BaseName) &&		assert((BasePtr || BaseName) &&
"BasePtr and BaseName can not be nullptr at the same time.");		"BasePtr and BaseName can not be nullptr at the same time.");
assert(!(BasePtr && BaseName) && "BaseName is redundant.");		assert(!(BasePtr && BaseName) && "BaseName is redundant.");
auto &SAI = BasePtr ? ScopArrayInfoMap[std::make_pair(BasePtr, Kind)]		auto &SAI = BasePtr ? ScopArrayInfoMap[std::make_pair(BasePtr, Kind)]
Show All 25 Lines	for (auto size : Sizes)
else		else
SCEVSizes.push_back(nullptr);		SCEVSizes.push_back(nullptr);

auto *SAI = getOrCreateScopArrayInfo(nullptr, ElementType, SCEVSizes,		auto *SAI = getOrCreateScopArrayInfo(nullptr, ElementType, SCEVSizes,
MemoryKind::Array, BaseName.c_str());		MemoryKind::Array, BaseName.c_str());
return SAI;		return SAI;
}		}

const ScopArrayInfo *Scop::getScopArrayInfo(Value *BasePtr, MemoryKind Kind) {		const ScopArrayInfo *Scop::getScopArrayInfoOrNull(Value *BasePtr,
		MemoryKind Kind) {
auto *SAI = ScopArrayInfoMap[std::make_pair(BasePtr, Kind)].get();		auto *SAI = ScopArrayInfoMap[std::make_pair(BasePtr, Kind)].get();
		return SAI;
		}

		const ScopArrayInfo *Scop::getScopArrayInfo(Value *BasePtr, MemoryKind Kind) {
		auto *SAI = getScopArrayInfoOrNull(BasePtr, Kind);
		Meinersbur (Not Done): Prefer `lookup`. `operator[]` creates an entry every time this returns `Null`.
assert(SAI && "No ScopArrayInfo available for this base pointer");		assert(SAI && "No ScopArrayInfo available for this base pointer");
return SAI;		return SAI;
}		}
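
The reviewer's concern about `operator[]` can be reproduced outside of LLVM. The sketch below uses `std::map` as a stand-in, which is an assumption since the real type of `ScopArrayInfoMap` is not shown in this hunk, and `ToyArrayInfo`, `ToyMap`, `getOrNullWithSubscript` and `getOrNullWithFind` are hypothetical names. It only illustrates the general container behaviour: a subscript query on a missing key inserts a null entry, while a `find`-based query leaves the map untouched.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for ScopArrayInfo.
struct ToyArrayInfo {
  std::string Name;
};

using ToyMap = std::map<std::string, std::unique_ptr<ToyArrayInfo>>;

// Subscript-based query, as in the patch: operator[] default-constructs a
// null entry for a missing key, so a failed query still grows the map.
const ToyArrayInfo *getOrNullWithSubscript(ToyMap &M, const std::string &Key) {
  return M[Key].get();
}

// Non-inserting query: a miss returns nullptr and leaves the map unchanged.
const ToyArrayInfo *getOrNullWithFind(const ToyMap &M, const std::string &Key) {
  auto It = M.find(Key);
  return It == M.end() ? nullptr : It->second.get();
}
```

Because the non-inserting variant takes the map by const reference, a query function built on it could itself be declared `const`.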

std::string Scop::getContextStr() const { return stringFromIslObj(Context); }		std::string Scop::getContextStr() const { return stringFromIslObj(Context); }

std::string Scop::getAssumedContextStr() const {		std::string Scop::getAssumedContextStr() const {
assert(AssumedContext && "Assumed context not yet built");		assert(AssumedContext && "Assumed context not yet built");

test/Isl/CodeGen/invariant_load_unify_arrays.ll

This file was added.

				; RUN: opt %loadPolly -polly-codegen -S < %s \
				Meinersbur (Done): The file's name still contains `unify`
				grosser (Author, Not Done): Renamed to invariant_load_canonicalize_array_baseptrs.ll
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; CHECK: polly
				Meinersbur (Done): How does this check whether `%baseA` and `%baseB` have been canonicalized? I would think we'd check whether both stores will access the canonical base pointer, `%baseB`.
				grosser (Author, Not Done): I added CHECK-NEXT lines.
				Meinersbur (Not Done): Where?

				define void @foo(float** %A) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseA = load float*, float** %A
				store float 42.0, float* %baseA
				br label %body2

				body2:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}

test/ScopInfo/invariant_load_unify_arrays.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				Meinersbur (Not Done): The IR looks identical to the one in CodeGen. I understand that one is testing `-polly-scops -analyze`, the other `-polly-codegen -S`. However, since they are both testing the same feature, and for the sake of maintainability, wouldn't it be better to put both into the same file?
				grosser (Author, Not Done): Right. The problem with just dropping this test case is that we then have -polly-scops failures reported in test/CodeGen. I think it is a lot nicer to see directly from failing test cases whether something in scop modeling or in code generation broke. Obviously this comes at the price of redundant test inputs. It seems to be a question of preference and tradeoffs. I generally tried to follow the rule of running only passes that match the name of the test directory. However, I was not consistent, and I think Johannes also followed a different approach. Not sure what to do.
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; CHECK: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body2[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_baseB[0] };

				define void @foo(float** %A) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseA = load float*, float** %A
				store float 42.0, float* %baseA
				br label %body2

				body2:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}
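
For readers less used to LLVM IR, the kernel under test corresponds roughly to the C++ below. This is a hypothetical source-level rendering, not something taken from the patch; the names `baseA` and `baseB` simply mirror the IR values, and the loop bound is chosen to reproduce the `0 <= i0 <= 1022` domain from the CHECK lines.

```cpp
#include <cassert>

// Hypothetical source-level equivalent of @foo.
void foo(float **A) {
  // The IR enters the body while %indvar.next < 1024 with %indvar starting
  // at 0, which gives 1023 iterations (i = 0 .. 1022).
  for (long i = 0; i + 1 < 1024; ++i) {
    float *baseA = *A; // body1: first load of the base pointer
    *baseA = 42.0f;
    float *baseB = *A; // body2: second load of the same base pointer
    *baseB = 42.0f;
  }
}
```

Both stores write through a pointer loaded from the same location `*A`, which is why the expected output names a single array, `MemRef_baseB`, for both statements.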

test/ScopInfo/invariant_load_unify_arrays_2.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Make sure we choose a canonical element that is not the first invariant load,
				; but the first that is an array base pointer.

				; CHECK: Statements {
				; CHECK-NEXT: Stmt_body0
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body0[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body0[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body0[i0] -> MemRef_X[0] };
				; CHECK-NEXT: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body1[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body2[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body2[i0] -> [i0, 2] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_X[0] };
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 1]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_ptr[] };
				; CHECK-NEXT: Stmt_body3
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body3[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body3[i0] -> [i0, 3] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body3[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body4
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body4[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body4[i0] -> [i0, 4] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body4[i0] -> MemRef_X[0] };
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 1]
				; CHECK-NEXT: { Stmt_body4[i0] -> MemRef_ptr[] };
				; CHECK-NEXT: }

				define void @foo(float** %A, float** %X) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body0, label %exit

				body0:
				%ptr = load float*, float** %A
				store float* %ptr, float** %X
				br label %body1

				body1:
				%baseA = load float*, float** %A
				store float 42.0, float* %baseA
				br label %body2

				body2:
				%ptr2 = load float*, float** %A
				store float* %ptr, float** %X
				br label %body3

				body3:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %body4

				body4:
				%ptr3 = load float*, float** %A
				store float* %ptr, float** %X
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}

test/ScopInfo/invariant_load_unify_arrays_3.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Verify that we canonicalize accesses even though one of the accesses (even
				; the canonical base) has a partial execution context. This is correct as
				; the combined execution context still covers both accesses.

				; CHECK: Invariant Accesses: {
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_A[0] };
				; CHECK-NEXT: Execution Context: { : }
				; CHECK-NEXT: }

				; CHECK: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body2[i0] : 0 <= i0 <= 510 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_baseB[0] };


				define void @foo(float** %A) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseA = load float*, float** %A
				store float 42.0, float* %baseA
				%cmp = icmp slt i64 %indvar.next, 512
				br i1 %cmp, label %body2, label %latch

				body2:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}

test/ScopInfo/invariant_load_unify_arrays_4.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Verify that a delinearized and a not delinearized access are not coalesced.

				; CHECK: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: [n] -> { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: [n] -> { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: [n] -> { Stmt_body1[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: [n] -> { Stmt_body2[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: [n] -> { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: [n] -> { Stmt_body2[i0] -> MemRef_baseA[i0, i0] };
				; CHECK-NEXT: }


				define void @foo(float** %A, i64 %n, i64 %m) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %body2

				body2:
				%baseA = load float*, float** %A
				%offsetA = mul i64 %indvar, %n
				%offsetA2 = add i64 %offsetA, %indvar
				%ptrA = getelementptr float, float* %baseA, i64 %offsetA2
				store float 42.0, float* %ptrA
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}

test/ScopInfo/invariant_load_unify_arrays_4b.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Verify that two arrays delinearized with different sizes are not coalesced.

				; CHECK: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: [m, n] -> { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: [m, n] -> { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: [m, n] -> { Stmt_body1[i0] -> MemRef_baseB[i0, i0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: [m, n] -> { Stmt_body2[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: [m, n] -> { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: [m, n] -> { Stmt_body2[i0] -> MemRef_baseA[i0, i0] };
				; CHECK-NEXT: }

				define void @foo(float** %A, i64 %n, i64 %m) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseB = load float*, float** %A
				%offsetB = mul i64 %indvar, %m
				%offsetB2 = add i64 %offsetB, %indvar
				%ptrB = getelementptr float, float* %baseB, i64 %offsetB2
				store float 42.0, float* %ptrB
				br label %body2

				body2:
				%baseA = load float*, float** %A
				%offsetA = mul i64 %indvar, %n
				%offsetA2 = add i64 %offsetA, %indvar
				%ptrA = getelementptr float, float* %baseA, i64 %offsetA2
				store float 42.0, float* %ptrA
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}

test/ScopInfo/invariant_load_unify_arrays_4c.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Verify that arrays with different element types are not coalesced.

				; CHECK: Statements {
				; CHECK-NEXT: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body1[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseB[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body2[i0] : 0 <= i0 <= 1022 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_baseA[0] };
				; CHECK-NEXT: }

				define void @foo(float** %A, i64 %n, i64 %m) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [0, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseB = load float*, float** %A
				store float 42.0, float* %baseB
				br label %body2

				body2:
				%baseA = load float*, float** %A
				%ptrcast = bitcast float* %baseA to i64*
				store i64 42, i64* %ptrcast
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}
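
The `_4`, `_4b` and `_4c` tests above each pin down one reason why two arrays with equal base pointers must stay separate: a delinearized versus a non-delinearized access, different delinearized sizes, and different element types. A compatibility predicate with that behaviour might be sketched as follows. This is an assumption about the shape of the check, not the body of `isCompatibleWith`, and `ToyShape` and `isCompatible` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of how an array is modeled.
struct ToyShape {
  std::string ElementType;           // e.g. "float" or "i64"
  std::vector<std::string> DimSizes; // symbolic sizes of delinearized dims
};

// Two arrays may be merged only if they agree on element type, on the
// number of delinearized dimensions, and on the size of each dimension.
bool isCompatible(const ToyShape &A, const ToyShape &B) {
  return A.ElementType == B.ElementType && A.DimSizes == B.DimSizes;
}
```

Comparing the size vectors covers both the dimension-count case (`_4`) and the differing-size case (`_4b`) in one expression.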

test/ScopInfo/invariant_load_unify_arrays_5.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s \
				; RUN: -polly-invariant-load-hoisting \
				; RUN: | FileCheck %s

				; Verify that nested arrays with invariant base pointers are handled correctly.
				; Specifically, we currently do not unify arrays where some accesses are hoisted
				; as invariant loads. If we did, we would probably need to update the access
				; function of the invariant loads as well. However, as this is not a very common
				; situation, we leave this for now to avoid further complexity increases.
				;
				; In this test case the arrays baseA2 and baseB2 could be canonicalized to a
				; single array, but there is also an invariant access to baseA2[0] which
				; prevents the canonicalization.

				; CHECK: Invariant Accesses: {
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_A[0] };
				; CHECK-NEXT: Execution Context: { : }
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseA2[0] };
				; CHECK-NEXT: Execution Context: { : }
				; CHECK-NEXT: }

				; CHECK: Statements {
				; CHECK-NEXT: Stmt_body1
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body1[i0] : 0 <= i0 <= 1021 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body1[i0] -> [i0, 0] };
				; CHECK-NEXT: ReadAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_baseA2[1 + i0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_B[0] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body1[i0] -> MemRef_B[0] };
				; CHECK-NEXT: Stmt_body2
				; CHECK-NEXT: Domain :=
				; CHECK-NEXT: { Stmt_body2[i0] : 0 <= i0 <= 1021 };
				; CHECK-NEXT: Schedule :=
				; CHECK-NEXT: { Stmt_body2[i0] -> [i0, 1] };
				; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
				; CHECK-NEXT: { Stmt_body2[i0] -> MemRef_baseB2[0] };
				; CHECK-NEXT: }

				define void @foo(float*** %A, float** %B) {
				start:
				br label %loop

				loop:
				%indvar = phi i64 [1, %start], [%indvar.next, %latch]
				%indvar.next = add nsw i64 %indvar, 1
				%icmp = icmp slt i64 %indvar.next, 1024
				br i1 %icmp, label %body1, label %exit

				body1:
				%baseA2 = load float**, float*** %A
				%ptr = getelementptr inbounds float*, float** %baseA2, i64 %indvar
				%v0 = load float*, float** %ptr
				%v1 = load float*, float** %baseA2
				store float* %v0, float** %B
				store float* %v1, float** %B
				br label %body2

				body2:
				%baseB2 = load float**, float*** %A
				store float* undef, float** %baseB2
				br label %body3

				body3:
				br label %latch

				latch:
				br label %loop

				exit:
				ret void

				}
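
The guard this test exercises can be sketched in isolation: an array is left alone if any hoisted invariant load reads from it. The code below is a toy model under that assumption, with hypothetical names (`ToyArray`, `ToyInvariantLoad`, `isUsedForHoistedLoad`), and is not the code from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an array known to the SCoP.
struct ToyArray {
  int Id;
};

// A hoisted invariant load, recorded with the array it reads from.
struct ToyInvariantLoad {
  const ToyArray *ReadsFrom;
};

// True if some hoisted load reads from Array. Such an array is skipped
// during canonicalization, because redirecting its accesses would also
// require rewriting the access function of the hoisted load.
bool isUsedForHoistedLoad(const std::vector<ToyInvariantLoad> &Loads,
                          const ToyArray *Array) {
  for (const ToyInvariantLoad &L : Loads)
    if (L.ReadsFrom == Array)
      return true;
  return false;
}
```

With the two invariant accesses listed in the CHECK lines (one reading `MemRef_A`, one reading `MemRef_baseA2`), the predicate would hold for `baseA2` but not for `baseB2`, so the two arrays stay distinct in the expected output.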