This is an archive of the discontinued LLVM Phabricator instance.


Differential D34982

[Polly][WIP] Fully-Indexed static expansion
Closed, Public

Authored by niosega on Jul 4 2017, 7:18 AM.

Details

Reviewers

simbuerg
Meinersbur
bollu

Commits

rG81fb6b3e4085: [Polly] Fully-Indexed static expansion
rPLO310304: [Polly] Fully-Indexed static expansion
rL310304: [Polly] Fully-Indexed static expansion

Summary

This patch implements a mechanism for fully-indexed static expansion.

The goal is to be able to expand every memory access to a fully indexed one. For example, starting from this original source code:

  for (int i = 0; i < Ni; i++) {
    for (int j = 0; j < Ni; j++)
S:    B[j] = j;
T:  A[i] = B[i];
  }

After the pass, we want this:

  for (int i = 0; i < Ni; i++) {
    for (int j = 0; j < Ni; j++)
S:    B[i, j] = j;
T:  A[i] = B[i, i];
  }
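
As a rough illustration of why this rewrite preserves the result, the two loop nests can be written in plain C++ and compared. This is only a sketch outside of Polly: the function names `runOriginal` and `runExpanded` are made up for this example, and the expanded `B[i, j]` is modelled as a vector of vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop nest: every iteration of i overwrites the same B[j].
std::vector<int> runOriginal(std::size_t Ni) {
  std::vector<int> A(Ni), B(Ni);
  for (std::size_t i = 0; i < Ni; i++) {
    for (std::size_t j = 0; j < Ni; j++)
      B[j] = static_cast<int>(j); // S
    A[i] = B[i];                  // T
  }
  return A;
}

// Expanded loop nest: B gains one dimension per surrounding loop, so each
// statement instance S[i, j] writes its own element B[i][j].
std::vector<int> runExpanded(std::size_t Ni) {
  std::vector<int> A(Ni);
  std::vector<std::vector<int>> B(Ni, std::vector<int>(Ni));
  for (std::size_t i = 0; i < Ni; i++) {
    for (std::size_t j = 0; j < Ni; j++)
      B[i][j] = static_cast<int>(j); // S
    A[i] = B[i][i];                  // T
  }
  return A;
}
```

Both functions should return the same `A` for any `Ni`; the expanded version simply no longer reuses memory across iterations of `i`.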

For now, expansion cannot be applied in the following cases:

Scalar access
Multiple writes per SAI
MayWrite Access
Expansion that leads to an access to the original array

Event Timeline

niosega created this revision. Jul 4 2017, 7:18 AM

niosega created this object with visibility "Custom Policy".

niosega created this object with edit policy "Custom Policy".

niosega edited the summary of this revision. Jul 5 2017, 2:04 AM

niosega added a reviewer: Meinersbur.

niosega changed the visibility from "Custom Policy" to "Public (No Login Required)".

niosega changed the edit policy from "Custom Policy" to "All Users".

niosega added subscribers: pollydev, llvm-commits.

niosega added inline comments. Jul 5 2017, 2:10 AM

lib/Transform/MaximalStaticExpansion.cpp
68	By reading and trying to understand the work done in DeLICM and JSONImporter::importAccesses, I have begun the implementation of my pass. But I am stuck on building the new map. For example, if we have the following C code:

  for (int i = 0; i < Ni; i++)
    for (int j = 0; j < Ni; j++)
S:    tmp = A[N+i]*tmp;

After the pass, we want this:

  for (int i = 0; i < Ni; i++)
    for (int j = 0; j < Ni; j++)
T:    tmp[i, j] = A[N+i]*tmp;

This means that I want to transform the map of S, [N] -> {S[i,j] -> tmp}, into the map of T, [N] -> {S[i,j] -> tmp[i, j]}. But I was not able to find a way to do that. Does anybody have an idea on how to proceed?
test/MaximalStaticExpansion/no_optim.ll
1 ↗	(On Diff #105173)	I tried to create a test case with the following C code:

  #define Ni 2000
  double mse(double A[Ni], int N) {
    int i;
    double tmp = 2;
    for (i = 0; i < Ni; i++) {
      for (int j = 0; j<Ni; j++) {
        tmp = A[N+i]*tmp;
      }
    }
    return tmp;
  }

I use the following commands to generate the IR:

  clang -O2 -S -emit-llvm pure_c_main.c
  opt -S -mem2reg -polly-scops -polly-export-jscop pure_c_main.ll

and this command to detect the SCoP:

  opt -polly-process-unprofitable -polly-remarks-minimal -polly-use-llvm-names -polly-scops -analyze -polly-export-jscop no_optim.ll

With these commands, Polly detects a SCoP, but the IR is optimized because of the -O2 passed to clang. I would like a non-optimized version of the IR so that it maps directly to the C code. But when I remove the -O2, Polly does not detect any SCoP. Does anybody have an idea why?

Fully-Indexed expansion of the write accesses :

Build the new access map from the current access map
Create a new SAI for the expanded version of the access array or scalar
Modify the memory access to the new SAI

Seems to be working for both scalar and non-scalar accesses.

For now, the test is only based on dump comparison because the code is broken: the reads still go to the old SAI.

Herald added a reviewer: bollu. Jul 12 2017, 12:28 PM

Herald added a subscriber: mgorny.

Meinersbur added inline comments. Jul 13 2017, 3:40 AM

lib/Support/RegisterPasses.cpp
319–320	Suggestion: Put it behind DeadCodeElimination? It uses DependenceInfo and should have the same results even before MSE.
lib/Transform/MaximalStaticExpansion.cpp
2	Remove `- Expand the memory access`. No other header does this.
13	To be consistent with the other files, add an empty line before `//===------`

Meinersbur added inline comments. Jul 13 2017, 3:40 AM

lib/Transform/MaximalStaticExpansion.cpp
41	Typo: transformations
86	Check whether this returns `nullptr`. At least an assertion. Suggestion: Get the domain sizes from the writing statement's domain.
106–109	`isl_dim_in` and `isl_dim_out` dimensions have no dim_id, so this is not necessary.
112	At this point, in the regression test, we have

  { Stmt_for_cond1_preheader[i0] -> MemRef_tmp_11__phi1_expanded[o0] }

(i0 and o0 are unconnected). That is, every statement instance accesses all array elements. It should be something like:

  { Stmt_for_cond1_preheader[i0] -> MemRef_tmp_11__phi1_expanded[i0] }
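
To make the difference between the two maps concrete, here is a small sketch that enumerates both relations over a bounded domain. It does not use isl; a relation is modelled as a plain set of (i0, o0) pairs, and the names `unconstrainedAccess` and `equatedAccess` are hypothetical helpers for this illustration only.

```cpp
#include <cassert>
#include <set>
#include <utility>

// A relation between statement instances i0 and array elements o0,
// modelled as an explicit set of (i0, o0) pairs.
using Relation = std::set<std::pair<int, int>>;

// { Stmt[i0] -> Array[o0] } with nothing tying o0 to i0:
// every instance is related to every element.
Relation unconstrainedAccess(int N) {
  Relation R;
  for (int i0 = 0; i0 < N; i0++)
    for (int o0 = 0; o0 < N; o0++)
      R.insert({i0, o0});
  return R;
}

// { Stmt[i0] -> Array[o0] : o0 = i0 }:
// every instance is related to exactly one element.
Relation equatedAccess(int N) {
  Relation R;
  for (int i0 = 0; i0 < N; i0++)
    R.insert({i0, i0});
  return R;
}
```

For a domain of size N, the unconstrained relation contains N*N pairs while the equated one contains N, which is the "every statement instance accesses all array elements" problem described above.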
136	Use inline declarations: bool CorrectWrite = expandWrite(S, MA);
138	We cannot return early with transformations only partially applied. The SCoP representation will be inconsistent and at best passes after this will crash, at worst we miscompile. Please check in advance if a transformation can be applied before trying to apply it. In this case, you may want to check for each `ScopArrayInfo` whether it is applicable. The class `ScalarDefUseChains` in DeLICM.cpp might be helpful to get the accesses of a SAI. I think we will sooner-or-later integrate it into ScopInfo.
148	`runOnScop` returns true if and only if the IR has been modified. We only modify the SCoP representation, therefore we only return `false`.
152	In `printScop`, you must print to `OS`. It will then print to `stdout` instead of `stderr`. This should also make the regression test simpler.
156	You will need to add `AU.addRequired<DependenceInfo>();` here at some point.
test/MaximalStaticExpansion/mse___%for.cond1.preheader---%for.end8.jscop
1 ↗	(On Diff #106279)	What is this file needed for?

In this revision, we have done:

Use of the isl C++ bindings instead of the isl C API
Implementation of the reads expansion
Add constraints to the write map so that the in dims are linked to the out dims
Remove useless JSCoP file
Create a new expanded SAI for each statement
Add a check method before doing the expansion to avoid partial expansion

Problems:

Do I have to expand an already fully-indexed array (for example, the first write of B in the test case "too_many_writes.ll")?

For now, we cannot expand:

Scalar access
MayWrite access
SAI with more than one write
SAI with a read that can cause a partial read access (because Polly cannot handle partial read accesses)

niosega marked 6 inline comments as done. Jul 20 2017, 7:52 AM

[Suggestion] Add a regression test where generated LLVM-IR is tested.

include/polly/ScopInfo.h
840–842	[Style] Why was this moved?
lib/Transform/MaximalStaticExpansion.cpp
47	[Style] Instead of a set of not expandable arrays, why not use a set of expandable arrays? It feels safer because if an array is just missing in the set for whatever reason, it would not default to expanding it. [Style] `std::set` is rarely used in LLVM. There are alternatives: See http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallset-h .
51	[Typo] checkExpandability
59–61	[Style] Please make them doxygen comments (`///`) The doxygen style for parameters we usually use is. @param Dependences The RAW dependences of the SCoP @p S.
77	[Style] Negation in variable name and double negation. Why not `bool Expandable = true`?
124–129	[Style] `NumerElementMap = isl_union_map_n_map(CurrentReadWriteDependences.get())`
143–144	[Serious] In normal operation, do not print anything. Users expect only clang warnings and errors. Alternatives are:

`DEBUG(dbgs() << "")` (discouraged for regression tests, this is meant for debugging)
Print remarks using `-pass-remarks-missed`
Print information at `-analysis`
`STATISTIC`
214–218	[Serious] This is not a sufficient condition for full expansion. E.g. one dimension can be a static `0`. There is also more than one possibility for full expansion (e.g. one starting at 0, another at 1). The reads must access the correct element. So if you want to implement this as a heuristic that expansion is not worth it in this case, implement it in `checkExpandability` such that reads are also not modified.
249–250	[Serious] I think `getMaxBackedgeTakenCount` can fail in cases where Polly is still able to detect affine loops. Please add at least an `assert` that the SCEV is not `SCEVCouldNotCompute`.
255	[Suggestion] Should this be `getLatestScopArrayInfo()` ?
297–298	[Remark] The structure I had in mind was

  for (array : S.arrays()) {
    if (!checkExpandability(array))
      continue;
    for (ScopStmt &Stmt : S)
      for (MemoryAccess *MA : Stmt) {
        if (MA->isRead())
          expandRead(MA);
        if (MA->isWrite())
          expandWrite(MA);
      }
  }

This does not need a `NotExpandable` set, but has worse asymptotic runtime. So I guess your version has an advantage.
302–304	[Style] We usually do not use braces around single statements: if (isExpandable(SAI)) continue;
313	[Style] In Polly's coding style, all sentences end with a dot (but I personally don't care).

Thank you for this mostly working version. I hope my comments are not too daunting.

lib/Transform/MaximalStaticExpansion.cpp
90	[Suggestion] One could skip the check for the current array once it is known to be unexpandable and continue with the next one.
188–199	[Suggestion] Instead of searching for the correct id, you could derive the name as in `expandWrite` and look it up in `ScopArrayNameMap`. [Serious] What if the `Id` is not found? Please add an assertion for that case.
250	[Suggestion] Or use `ScalarEvolution::getAddExpr(SCEV, ScalarEvolution::getConstant(1))` [Serious] I don't see where +1 is added to the ISL version.
test/MaximalStaticExpansion/working_expansion.ll
24	Shouldn't this be MemRef_B_Stmt_for_body3_expanded[2000][3000] ?

niosega added inline comments. Jul 21 2017, 9:18 AM

include/polly/ScopInfo.h
840–842	This was moved from the private to the public section because I need this method to find the expanded SAI name during read expansion. But if I use the solution you suggest, I will not need it anymore.
lib/Transform/MaximalStaticExpansion.cpp
90	That is what I wanted to do in the beginning. But I didn't find a clean solution to "escape" from the two innermost loops and go to the next iteration of the loop that iterates over the SAIs.
250	During the last phone call, we discussed another solution that involves methods from FlattenAlgo.cpp to get the bounds of the loop iteration variables. That's what I meant by "ISL version". But for now, I cannot access the methods from FlattenAlgo.cpp because they are defined inside an unnamed namespace.
297–298	If I am not wrong, we must first expand all writes before expanding the reads. Otherwise, problems can happen when trying to find the expanded SAI during read expansion.

Meinersbur added inline comments. Jul 21 2017, 2:03 PM

include/polly/ScopInfo.h
840–842	ok.
lib/Transform/MaximalStaticExpansion.cpp
90	A possible solution is to refactor the inner two loops into a function, which can `return true/false` for a `ScopArrayInfo` at any point.

  bool isExpandable(SAI) {
    for (ScopStmt &Stmt : S)
      for (MemoryAccess *MA : Stmt) {
        if (MA->isMayWrite())
          return false;
        ...
      }
    return true;
  }

  for (auto &SAI : S.arrays()) {
    if (!isExpandable(SAI))
      NotExpandables.insert(SAI);
  }
250	You may need to modify them anyway, so don't hesitate to copy them over, especially for a prototype. If later we find they share a significant amount of code, we can find a common file for them.
297–298	Let me refine what I had in mind some time ago.

  for (array : S.arrays()) {
    MemoryAccess *TheWrite = nullptr;
    List<MemoryAccess *> AllReads;
    if (!isExpandable(array, TheWrite, AllReads))
      continue;
    assert(TheWrite);
    assert(AllReads.size() > 0);
    ScopArrayInfo *ExpandedArray = expandWrite(TheWrite);
    for (MemoryAccess *MA : AllReads)
      expandRead(MA, ExpandedArray);
  }
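
The check-first, transform-afterwards structure sketched in this comment can be tried out in isolation with stand-in types. The sketch below is not Polly code: `Access`, `Kind`, `isExpandable` and `expandAll` are hypothetical simplifications, and "expanding" an access just means pointing it at a renamed array. It assumes, as the assertions in the comment suggest, that an expandable array has exactly one write and at least one read.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Read, MustWrite, MayWrite };

// Hypothetical stand-in for a memory access in a SCoP.
struct Access {
  std::string Array;  // array the access originally refers to
  Kind K;
  std::string Target; // array the access currently refers to
};

// Collects the single write and all reads of Array. Returns false, without
// modifying any access, if the array cannot be expanded.
bool isExpandable(std::vector<Access> &Accesses, const std::string &Array,
                  Access *&TheWrite, std::vector<Access *> &AllReads) {
  TheWrite = nullptr;
  AllReads.clear();
  for (Access &A : Accesses) {
    if (A.Array != Array)
      continue;
    if (A.K == Kind::MayWrite)
      return false;
    if (A.K == Kind::MustWrite) {
      if (TheWrite)
        return false; // more than one write
      TheWrite = &A;
    } else {
      AllReads.push_back(&A);
    }
  }
  return TheWrite != nullptr && !AllReads.empty();
}

// Expands every expandable array: first the write, then all reads.
// Returns the number of arrays that were expanded.
int expandAll(std::vector<Access> &Accesses,
              const std::vector<std::string> &Arrays) {
  int NumExpanded = 0;
  for (const std::string &Array : Arrays) {
    Access *TheWrite = nullptr;
    std::vector<Access *> AllReads;
    if (!isExpandable(Accesses, Array, TheWrite, AllReads))
      continue; // nothing has been touched for this array
    const std::string Expanded = Array + "_expanded";
    TheWrite->Target = Expanded;
    for (Access *Read : AllReads)
      Read->Target = Expanded;
    ++NumExpanded;
  }
  return NumExpanded;
}
```

Because all checks happen before the first modification, an array is either fully redirected to its expanded version or left completely untouched, which avoids the partially transformed state warned about earlier in the review.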

niosega marked 23 inline comments as done. Jul 24 2017, 12:18 PM

Take into account remarks from Michael.

Change the structure of the expansion. Now iterate over SAI of the Scop.
Get the boundaries of the loop iterations variables with ISL.
Style modifications.

Implementation for remarks is in place but not working. One test case is broken due to change in structure. But the output seems to be correct. I will correct it as soon as possible.

Remove debug if condition.

Meinersbur added inline comments. Jul 25 2017, 8:49 AM

lib/Transform/MaximalStaticExpansion.cpp
194	I get a compile error here. `MA->getAccessRelation()` has been updated to use C++ objects. Please update to Polly trunk.
340	[Suggestion] Pass string as `llvm::StringRef` (or `const std::string &` to avoid a copy)
test/MaximalStaticExpansion/partial_access.ll
2	[Style] Please remove trailing whitespace.

simbuerg added inline comments. Jul 25 2017, 9:17 AM

lib/Transform/MaximalStaticExpansion.cpp
182	Reminder: These should/will become diagnostics
253	As far as I remember, this will fail if there is more than one map in the union_map. I would check that with at least an assert.
259	Where do you use this name?
315	This takes a const string &, no need to go over the c_str().
343	Why 'AssumpRestrict'?

niosega marked 7 inline comments as done. Jul 26 2017, 4:01 AM

Take into account Michael's and Andreas' comments.

Diagnostic still does not work.

Emit remarks instead of stderr printing. Test case works.

To have any effect, we need to clear the DependenceInfo, otherwise -polly-opt-isl will still use the unexpanded dependence info and the MSE was useless.

DependenceInfo has no facility yet to reset a previous analysis. You might want to add one into this patch or a follow-up one.

lib/Support/RegisterPasses.cpp
144	[Suggestion] To be consistent with other switches that add passes, name this `-polly-enable-mse` and the pass itself `-polly-mse` (instead of `-polly-opt-mse`)?
lib/Transform/MaximalStaticExpansion.cpp
68	[Style] Use `SmallPtrSetImpl<MemoryAccess *>` to not require the small size in the parameter.
188	[Typo] extand -> expand
219	The consequence would not be a partial read access, but it would need to read the original value the element had before entering the SCoP. That's a special case similar to having more than one write.
308	[Nit] The UpperBound could overflow a long. Add an assertion for that?
329–332	[Style] This could be simpler using NewAccessMap = NewAccessMap->equate(isl::dim::in, dim, isl::dim::out, dim); or, even better, use `basic_map::equal`.
348	[Nit] `OptimizationRemarkEmitterWrapperPass`
358–360	[Style] No braces around single statements. Also possible: SmallPtrSet<ScopArrayInfo *, 4> CurrentSAI(S.array_begin(), S.array_end());
test/MaximalStaticExpansion/partial_access.ll
2	The interleaving of stdout (`-analyze`) and stderr (`-pass-remarks-analysis`) is undefined. It is better to have two separate RUN lines, one checking `analyze`, the other `-pass-remarks-analysis`.

This revision is now accepted and ready to land. Jul 26 2017, 7:10 AM

niosega marked 8 inline comments as done. Jul 26 2017, 8:01 AM

niosega added inline comments.

lib/Transform/MaximalStaticExpansion.cpp
219	Are you sure? If I remember well, let's say that we are analyzing this code:

  for (i = 0; i < Ni; i++) {
    B[j] = i;
    for (int j = 0; j<Nj; j++) {
      B[j] = j;
    }
    A[i] = B[i];
  }

When I try to set the new access relation for the B read, the setNewAccessRelation method of class MemoryAccess fails with the assert "Partial READ accesses not supported".
308	How can I efficiently check whether there is an overflow?
329–332	I'd like to use isl_basic_map_equal, but I did not find documentation for it in the online isl docs. There is also no example of its use in Polly. Can you explain to me how it works?

Meinersbur added inline comments. Jul 27 2017, 4:44 AM

lib/Transform/MaximalStaticExpansion.cpp
219	How were you able to do that? setNewAccessRelation accepts only an isl_map, not isl_union_map. Let me explain in more detail.

  for (int k = 0; k < M; k+=1) {
    for (int i = 0; i<= N/2; i+=1) {
S:    B[i] = i;
    }
    for (int i = 0; i<N; i+=1) {
T:    = ... B[i]
    }
  }

The flow dependence would look something like:

  { T[k, i] -> S[k, i] : 0 < i <= N/2 }

We could naively expand B to B_expanded:

  for (int k = 0; k < M; k+=1) {
    for (int i = 0; i<= N/2; i+=1) {
S:    B_expanded[k][i] = i;
    }
    for (int i = 0; i<N; i+=1) {
T:    = ... B_expanded[k][i]
    }
  }

The problem here is that `B_expanded[k][i]` for i > N/2 never gets written (and T would read uninitialized memory). The flow dependence doesn't tell which instance of S wrote it in the first place! If you try to apply it naively anyway using setNewAccessRelation, we need a source of the value for all instances of S, but we don't have one for `i > N/2`! This is why partial read accesses are unsupported. The correct thing to do would be to read the value from the original array B (which then becomes read-only).

  { T[k,i] -> B_expanded[k,i] : i < N/2; T[k,i] -> B[i] : i>=N/2 }

This again is an isl_union_map (NOT a partial access since it is defined for all instances of T), which we currently do not support. Please try to understand what the problem with partial read accesses is. The partial read accesses themselves are not the problem, but the reason why you would want to use one.
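
The problem can be reproduced outside of Polly with a small simulation of a single outer iteration k. This is a sketch under assumptions: `std::optional` is used to mark elements of the expanded row that no instance of S has written, and the helper names `naiveReads` and `piecewiseReads` are invented for this example. The piecewise variant falls back to the original array exactly where the expanded element was never written, which is the intent of the two-piece map above (the exact boundary there is only "something like").

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Naive expansion: T reads B_expanded[k][i] for all 0 <= i < N, although S
// only writes 0 <= i <= N/2. std::nullopt marks a read of an element that
// was never written.
std::vector<std::optional<int>> naiveReads(int N) {
  std::vector<std::optional<int>> BExpanded(N); // row k of B_expanded
  for (int i = 0; i <= N / 2 && i < N; i++)
    BExpanded[i] = i;                           // S
  std::vector<std::optional<int>> Reads;
  for (int i = 0; i < N; i++)
    Reads.push_back(BExpanded[i]);              // T
  return Reads;
}

// Piecewise read: use the expanded element where S wrote it, otherwise fall
// back to the value the original array B had before entering the loops.
std::vector<int> piecewiseReads(const std::vector<int> &B) {
  const int N = static_cast<int>(B.size());
  std::vector<std::optional<int>> BExpanded(N);
  for (int i = 0; i <= N / 2 && i < N; i++)
    BExpanded[i] = i;                                        // S
  std::vector<int> Reads;
  for (int i = 0; i < N; i++)
    Reads.push_back(BExpanded[i] ? *BExpanded[i] : B[i]);    // T
  return Reads;
}
```

With N = 6, the naive version yields defined values only for i = 0..3, while the piecewise version returns the original contents of B for i = 4 and i = 5.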
308	`UpperBound.le(INT_MAX)` (I think there is no implicit conversion from int to isl::val, but you get the idea)
329–332	isl_map_space(SpaceMap.copy(), SpaceMap.dim(isl::in)) should get you a basic_map of that space where the `n_equal = SpaceMap.dim(isl::in)` in- and out- dimensions are equal. Something like:

  { Stmt[i0, i1] -> MemRef[o0, o1] : i0 = o0 and i1 = o1 }

However, no documentation could mean that the function was not intended to be public.

Take Michael's comments into account.

Meinersbur added inline comments. Jul 27 2017, 8:57 AM

lib/Support/RegisterPasses.cpp
144	I'm ok with the switch name, but doesn't the e in "mse" already stand for "expansion" (therefore expand-mse is short for "expand-maximal-array-expansion")
lib/Transform/MaximalStaticExpansion.cpp
308–309	This assertion fails on Windows (and 32 bit platforms): The `isl::val` constructor takes a `long`, which is 32 bit on these platforms. UINT_MAX exceeds its range.
310	It has been tested for the range of an `unsigned int`.
311	If `UpperBound.get_num_si()` is `UINT_MAX`, you get an overflow when adding +1.
test/MaximalStaticExpansion/read_from_original.ll
1 ↗	(On Diff #108467)	Nice new testcase!

niosega added inline comments. Jul 29 2017, 2:03 PM

lib/Transform/MaximalStaticExpansion.cpp
308–309	It's a mistake from my side to compare it with UINT_MAX. If I replace std::numeric_limits<unsigned>::max() with std::numeric_limits<long>::max() it should work on every platform, right ?

Meinersbur added inline comments. Jul 30 2017, 11:02 AM

lib/Transform/MaximalStaticExpansion.cpp
308–309	Except that it is stored into a `std::vector<unsigned>`. Storing a `long` as an `unsigned int` may get you another overflow. My suggestion is to stick with the lowest common maximum: `std::numeric_limits<int>::max()`. I don't think you would want to allocate memory larger than that anyway.
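
A minimal sketch of such a range check, independent of isl and Polly: the helper `extentFromUpperBound` is hypothetical and takes the inclusive upper bound as a `long long` rather than an `isl::val`. It rejects any bound for which the extent (upper bound + 1) would exceed `std::numeric_limits<int>::max()`, so the result always fits in an `unsigned` on every platform.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Converts the inclusive upper bound of a loop variable into an array extent
// (upper bound + 1). Returns std::nullopt if the bound is negative or if the
// extent would exceed std::numeric_limits<int>::max().
std::optional<unsigned> extentFromUpperBound(long long UpperBound) {
  constexpr long long Max = std::numeric_limits<int>::max();
  if (UpperBound < 0 || UpperBound >= Max)
    return std::nullopt;
  // UpperBound + 1 <= Max here, so neither the addition nor the cast overflows.
  return static_cast<unsigned>(UpperBound + 1);
}
```

Comparing against the maximum before adding 1 is the important part; checking after the addition would already have performed the overflowing operation.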

Take into account Michael's comments.
Update setNewAccessRelation call (isl::map as parameter instead of isl_map *)

niosega edited the summary of this revision. Aug 3 2017, 3:20 PM

niosega edited the summary of this revision. Aug 3 2017, 3:23 PM

LGTM.

Andreas, do you want to commit?

lib/Transform/MaximalStaticExpansion.cpp
309	Why `- 1`?

Closed by commit rL310304: [Polly] Fully-Indexed static expansion (authored by simbuerg). Aug 7 2017, 1:55 PM

This revision was automatically updated to reflect the committed changes.

niosega mentioned this in D36647: [Polly][WIP] Scalar fully indexed expansion. Aug 12 2017, 1:50 PM

simbuerg mentioned this in rL311619: [Polly][WIP] Scalar fully indexed expansion. Aug 23 2017, 5:05 PM

Revision Contents

Path                                               Size
include/polly/LinkAllPasses.h                      3 lines
include/polly/ScopInfo.h                           6 lines
lib/CMakeLists.txt                                 1 line
lib/Support/RegisterPasses.cpp                     9 lines
lib/Transform/MaximalStaticExpansion.cpp           349 lines
test/MaximalStaticExpansion/partial_access.ll      106 lines
test/MaximalStaticExpansion/too_many_writes.ll     113 lines
test/MaximalStaticExpansion/working_expansion.ll   101 lines

Diff 107505

include/polly/LinkAllPasses.h

[... 49 lines not shown ...]
 llvm::Pass *createCodeGenerationPass();
 #ifdef GPU_CODEGEN
 llvm::Pass *createPPCGCodeGenerationPass(GPUArch Arch = GPUArch::NVPTX64,
 GPURuntime Runtime = GPURuntime::CUDA);
 #endif
 llvm::Pass *createIslScheduleOptimizerPass();
 llvm::Pass *createFlattenSchedulePass();
 llvm::Pass *createDeLICMPass();
+llvm::Pass *createMaximalStaticExpansionPass();

 extern char &CodePreparationID;
 } // namespace polly

 namespace {
 struct PollyForcePassLinking {
 PollyForcePassLinking() {
 // We must reference the passes in such a way that compilers will not
[... 17 lines not shown; context: PollyForcePassLinking() { ...]
 polly::createPollyCanonicalizePass();
 polly::createPolyhedralInfoPass();
 polly::createIslAstInfoWrapperPassPass();
 polly::createCodeGenerationPass();
 #ifdef GPU_CODEGEN
 polly::createPPCGCodeGenerationPass();
 #endif
 polly::createIslScheduleOptimizerPass();
+polly::createMaximalStaticExpansionPass();
 polly::createFlattenSchedulePass();
 polly::createDeLICMPass();
 polly::createDumpModulePass("", true);
 polly::createSimplifyPass();
 polly::createPruneUnprofitablePass();
 }
 } PollyForcePassLinking; // Force link by creating a global definition.
 } // namespace

 namespace llvm {
 class PassRegistry;
 void initializeCodePreparationPass(llvm::PassRegistry &);
 void initializeDeadCodeElimPass(llvm::PassRegistry &);
 void initializeJSONExporterPass(llvm::PassRegistry &);
 void initializeJSONImporterPass(llvm::PassRegistry &);
 void initializeIslAstInfoWrapperPassPass(llvm::PassRegistry &);
 void initializeCodeGenerationPass(llvm::PassRegistry &);
 #ifdef GPU_CODEGEN
 void initializePPCGCodeGenerationPass(llvm::PassRegistry &);
 #endif
 void initializeIslScheduleOptimizerPass(llvm::PassRegistry &);
+void initializeMaximalStaticExpanderPass(llvm::PassRegistry &);
 void initializePollyCanonicalizePass(llvm::PassRegistry &);
 void initializeFlattenSchedulePass(llvm::PassRegistry &);
 void initializeDeLICMPass(llvm::PassRegistry &);
 } // namespace llvm

 #endif

include/polly/ScopInfo.h

[... 648 lines not shown; context: private: ...]

 void assumeNoOutOfBound();

 /// Compute bounds on an over approximated access relation.
 ///
 /// @param ElementSize The size of one element accessed.
 void computeBoundsOnAccessRelation(unsigned ElementSize);

-/// Get the original access function as read from IR.
-__isl_give isl_map *getOriginalAccessRelation() const;

 /// Return the space in which the access relation lives in.
 __isl_give isl_space *getOriginalAccessRelationSpace() const;

 /// Get the new access function imported or set by a pass
 __isl_give isl_map *getNewAccessRelation() const;

 /// Fold the memory access to consider parametric offsets
 ///
[... 167 lines not shown; context: public: ...]

 /// Return the access relation after the schedule was applied.
 __isl_give isl_pw_multi_aff *
 applyScheduleToAccessRelation(__isl_take isl_union_map *Schedule) const;

 /// Get an isl string representing the access function read from IR.
 std::string getOriginalAccessRelationStr() const;

+/// Get the original access function as read from IR.
+__isl_give isl_map *getOriginalAccessRelation() const;

 /// Get an isl string representing a new access function, if available.
 std::string getNewAccessRelationStr() const;

 /// Get an isl string representing the latest access relation.
 std::string getAccessRelationStr() const;

 /// Get the original base address of this access (e.g. A for A[i+j]) when
 /// detected.
[... 2,186 lines not shown ...]

lib/CMakeLists.txt

[... 54 lines not shown; context: add_library(PollyCore OBJECT ...]
 Transform/Canonicalization.cpp
 Transform/CodePreparation.cpp
 Transform/DeadCodeElimination.cpp
 Transform/ScheduleOptimizer.cpp
 Transform/FlattenSchedule.cpp
 Transform/FlattenAlgo.cpp
 Transform/DeLICM.cpp
 Transform/Simplify.cpp
+Transform/MaximalStaticExpansion.cpp
 ${POLLY_HEADER_FILES}
 )
 set_target_properties(PollyCore PROPERTIES FOLDER "Polly")

 # Create the library that can be linked into LLVM's tools and Polly's unittests.
 # It depends on all library it needs, such that with
 # LLVM_POLLY_LINK_INTO_TOOLS=ON, its dependencies like PollyISL are linked as
 # well.
[... 85 lines not shown ...]

lib/Support/RegisterPasses.cpp

Show First 20 Lines • Show All 134 Lines • ▼ Show 20 Lines	static cl::opt<polly::VectorizerChoice, true> Vectorizer(
cl::location(PollyVectorizerChoice), cl::init(polly::VECTORIZER_NONE),		cl::location(PollyVectorizerChoice), cl::init(polly::VECTORIZER_NONE),
cl::ZeroOrMore, cl::cat(PollyCategory));		cl::ZeroOrMore, cl::cat(PollyCategory));

static cl::opt<bool> ImportJScop(		static cl::opt<bool> ImportJScop(
"polly-import",		"polly-import",
cl::desc("Import the polyhedral description of the detected Scops"),		cl::desc("Import the polyhedral description of the detected Scops"),
cl::Hidden, cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));		cl::Hidden, cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));

		static cl::opt<bool> FullyIndexedStaticExpansion(
		"polly-mse",
		MeinersburUnsubmitted Done Reply Inline Actions [Suggestion] To be consistent with other switches that add passes, name this `-polly-enable-mse` and the pass itself `-polly-mse` (instead of `-polly-opt-mse`)? Meinersbur: [Suggestion] To be consistent with other switches that add passes, name this `-polly-enable…
		MeinersburUnsubmitted Not Done Reply Inline Actions I'm ok with the switch name, but doesn't the e in "mse" already stand for "expansion" (therefore expand-mse is short for "expand-maximal-array-expansion") Meinersbur: I'm ok with the switch name, but doesn't the e in "mse" already stand for "expansion"…
		cl::desc("Fully expand the memory accesses of the detected Scops"),
		cl::Hidden, cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));

static cl::opt<bool> ExportJScop(		static cl::opt<bool> ExportJScop(
"polly-export",		"polly-export",
cl::desc("Export the polyhedral description of the detected Scops"),		cl::desc("Export the polyhedral description of the detected Scops"),
cl::Hidden, cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));		cl::Hidden, cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));

static cl::opt<bool> DeadCodeElim("polly-run-dce",		static cl::opt<bool> DeadCodeElim("polly-run-dce",
cl::desc("Run the dead code elimination"),		cl::desc("Run the dead code elimination"),
cl::Hidden, cl::init(false), cl::ZeroOrMore,		cl::Hidden, cl::init(false), cl::ZeroOrMore,
▲ Show 20 Lines • Show All 81 Lines • ▼ Show 20 Lines	#ifdef GPU_CODEGEN
LLVMInitializeNVPTXAsmPrinter();		LLVMInitializeNVPTXAsmPrinter();
#endif		#endif
initializeCodePreparationPass(Registry);		initializeCodePreparationPass(Registry);
initializeDeadCodeElimPass(Registry);		initializeDeadCodeElimPass(Registry);
initializeDependenceInfoPass(Registry);		initializeDependenceInfoPass(Registry);
initializeDependenceInfoWrapperPassPass(Registry);		initializeDependenceInfoWrapperPassPass(Registry);
initializeJSONExporterPass(Registry);		initializeJSONExporterPass(Registry);
initializeJSONImporterPass(Registry);		initializeJSONImporterPass(Registry);
		initializeMaximalStaticExpanderPass(Registry);
initializeIslAstInfoWrapperPassPass(Registry);		initializeIslAstInfoWrapperPassPass(Registry);
initializeIslScheduleOptimizerPass(Registry);		initializeIslScheduleOptimizerPass(Registry);
initializePollyCanonicalizePass(Registry);		initializePollyCanonicalizePass(Registry);
initializePolyhedralInfoPass(Registry);		initializePolyhedralInfoPass(Registry);
initializeScopDetectionWrapperPassPass(Registry);		initializeScopDetectionWrapperPassPass(Registry);
initializeScopInfoRegionPassPass(Registry);		initializeScopInfoRegionPassPass(Registry);
initializeScopInfoWrapperPassPass(Registry);		initializeScopInfoWrapperPassPass(Registry);
initializeCodegenCleanupPass(Registry);		initializeCodegenCleanupPass(Registry);
[57 lines not shown]	void registerPollyPasses(llvm::legacy::PassManagerBase &PM) {
if (EnableDeLICM)		if (EnableDeLICM)
PM.add(polly::createDeLICMPass());		PM.add(polly::createDeLICMPass());
if (EnableSimplify)		if (EnableSimplify)
PM.add(polly::createSimplifyPass());		PM.add(polly::createSimplifyPass());

if (ImportJScop)		if (ImportJScop)
PM.add(polly::createJSONImporterPass());		PM.add(polly::createJSONImporterPass());

if (DeadCodeElim)		if (DeadCodeElim)
PM.add(polly::createDeadCodeElimPass());		PM.add(polly::createDeadCodeElimPass());
		Meinersbur (Done): Suggestion: Put it behind DeadCodeElimination? It uses DependenceInfo and should have the same results even before MSE.

		if (FullyIndexedStaticExpansion)
		PM.add(polly::createMaximalStaticExpansionPass());

if (EnablePruneUnprofitable)		if (EnablePruneUnprofitable)
PM.add(polly::createPruneUnprofitablePass());		PM.add(polly::createPruneUnprofitablePass());

#ifdef GPU_CODEGEN		#ifdef GPU_CODEGEN
if (Target == TARGET_HYBRID)		if (Target == TARGET_HYBRID)
PM.add(		PM.add(
polly::createPPCGCodeGenerationPass(GPUArchChoice, GPURuntimeChoice));		polly::createPPCGCodeGenerationPass(GPUArchChoice, GPURuntimeChoice));
#endif		#endif
[145 lines not shown]

lib/Transform/MaximalStaticExpansion.cpp

This file was added.

				//===---------------- MaximalStaticExpansion.cpp -------------------------===//
				//
				Meinersbur (Done): Remove `- Expand the memory access`. No other header does this.
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass fully expand the memory accesses of a Scop to get rid of
				// dependencies.
				//
				//===----------------------------------------------------------------------===//
				Meinersbur (Done): To be consistent with the other files, add an empty line before `//===------`.

				#include "polly/DependenceInfo.h"
				#include "polly/FlattenAlgo.h"
				#include "polly/LinkAllPasses.h"
				#include "polly/Options.h"
				#include "polly/ScopInfo.h"
				#include "polly/Support/GICHelper.h"
				#include "polly/Support/ISLOStream.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/Support/Debug.h"

				using namespace llvm;
				using namespace polly;

				namespace {
				class MaximalStaticExpander : public ScopPass {
				public:
				static char ID;
				explicit MaximalStaticExpander() : ScopPass(ID) {}

				~MaximalStaticExpander() {}

				/// Expand the accesses of the SCoP @p S.
				bool runOnScop(Scop &S) override;

				/// Print the SCoP @p S.
				void printScop(raw_ostream &OS, Scop &S) const override;

				Meinersbur (Done): Typo: transformations
				/// Register all analyses and transformations required.
				void getAnalysisUsage(AnalysisUsage &AU) const override;

				private:
				// The set of not expandable SAI.
				std::set<const ScopArrayInfo *> NotExpandables;
				Meinersbur (Done): [Style] Instead of a set of not expandable arrays, why not use a set of expandable arrays? It feels safer because if an array is just missing in the set for whatever reason, it would not default to expanding it. [Style] `std::set` is rarely used in LLVM. There are alternatives: see http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallset-h .

				// Check which SAI from SCoP @p S is expandable.
				// The parameter @p Dependences is the RAW dependences of the SCoP @p S.
				void checkExpendability(Scop &S, isl::union_map &Dependences);
				Meinersbur (Done): [Typo] checkExpandability

				// Return true if the @p SAI in parameter is expandable.
				bool isExpandable(const ScopArrayInfo *SAI);

				// Expand the write memory access @p MA belonging to the SCoP @p S.
				void expandWrite(Scop &S, MemoryAccess *MA);

				// Expand the read memory access @p MAP belonging to the SCoP @p S.
				// The parameter @p Writes is all the write memory accesses of the SCoP @p S.
				// The parameter @p Dependences is the RAW dependences of the SCoP @p S.
				Meinersbur (Done): [Style] Please make them doxygen comments (`///`). The doxygen style we usually use for parameters is: `@param Dependences The RAW dependences of the SCoP @p S.`
				void expandRead(Scop &S, MemoryAccess *MA, std::set<MemoryAccess *> &Writes,
				isl::union_map &Dependences);
				};
				} // namespace

				char MaximalStaticExpander::ID = 0;

				niosega (Author, Not Done): By reading and trying to understand the work done in DeLICM and JSONImporter::importAccesses, I began the implementation of my pass. But I am stuck on building the new map. For example, if we have the following C code: `for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) S: tmp = A[N+i]*tmp;` After the pass, we want this: `for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) T: tmp[i, j] = A[N+i]*tmp;` This means that I want to transform the map of S, `[N] -> {S[i,j] -> tmp}`, into the map of T, `[N] -> {S[i,j] -> tmp[i, j]}`. But I was not able to find a way to do that. Does anybody have an idea on how to proceed?
				Meinersbur (Done): [Style] Use `SmallPtrSetImpl<MemoryAccess *>` to not require the small size in the parameter.
				bool MaximalStaticExpander::isExpandable(const ScopArrayInfo *SAI) {
				return (NotExpandables.find(SAI) == NotExpandables.end());
				}

				void MaximalStaticExpander::checkExpendability(Scop &S,
				isl::union_map &Dependences) {
				for (auto &SAI : S.arrays()) {
				int NumberWrites = 0;
				bool NotExpandable = false;
				Meinersbur (Done): [Style] Negation in variable name and double negation. Why not `bool Expandable = true`?
				for (ScopStmt &Stmt : S) {
				for (MemoryAccess *MA : Stmt) {

				// Check if the current MemoryAccess involved the current SAI
				if (SAI != MA->getLatestScopArrayInfo()) {
				continue;
				}

				// For now, we are not able to expand Scalar
				Meinersbur (Done): Check whether this returns `nullptr`. At least add an assertion. Suggestion: Get the domain sizes from the writing statement's domain.
				if (MA->isLatestScalarKind() and 0) {
				errs() << "MSE ERROR : " << SAI->getName()
				<< " is a Scalar access. \n";
				NotExpandable = true;
				Meinersbur (Done): [Suggestion] One could skip the check for the current array once it is known to be unexpandable and continue with the next one.
				niosega (Author, Done): That is what I wanted to do in the beginning. But I didn't find a clean solution to "escape" from the two innermost loops and go to the next iteration of the loop that iterates over the SAI.
				Meinersbur (Done): A possible solution is to refactor the inner two loops into a function, which can `return true/false` for a `ScopArrayInfo` at any point: `bool isExpandable(SAI) { for (ScopStmt &Stmt : S) for (MemoryAccess *MA : Stmt) { if (MA->isMayWrite()) return false; ... } }` `for (auto &SAI : S.arrays()) { if (!isExpandable()) NotExpandables.insert(SAI); }`
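The refactoring suggested in this thread can be sketched in isolation. This is only a minimal illustration, assuming hypothetical stand-in types: `Access`, `isArrayExpandable` and `collectNotExpandable` are not Polly names, and only the MayWrite and write-count checks are modelled.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a memory access; not Polly's MemoryAccess.
struct Access {
  int ArrayId;    // which array the access touches
  bool MayWrite;  // models MA->isMayWrite()
  bool MustWrite; // models MA->isMustWrite()
};

// Returns false as soon as one disqualifying access is found, so no flag is
// needed to leave the two nested loops early.
bool isArrayExpandable(int ArrayId,
                       const std::vector<std::vector<Access>> &Stmts) {
  int NumberWrites = 0;
  for (const auto &Stmt : Stmts)
    for (const Access &MA : Stmt) {
      if (MA.ArrayId != ArrayId)
        continue; // access belongs to another array
      if (MA.MayWrite)
        return false; // MayWrite accesses are not handled
      if (MA.MustWrite && ++NumberWrites > 1)
        return false; // more than one write is not handled
    }
  return NumberWrites == 1; // arrays without any write are not expanded
}

// The outer loop over arrays only records the verdict of the helper.
std::set<int>
collectNotExpandable(const std::vector<int> &Arrays,
                     const std::vector<std::vector<Access>> &Stmts) {
  std::set<int> NotExpandables;
  for (int ArrayId : Arrays)
    if (!isArrayExpandable(ArrayId, Stmts))
      NotExpandables.insert(ArrayId);
  return NotExpandables;
}
```

Moving the two inner loops into their own function turns "continue with the next array" into a plain `return`, which is the point of the suggestion.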
				}

				// For now, we are not able to expand MayWrite
				if (MA->isMayWrite()) {
				errs() << "MSE ERROR : " << SAI->getName()
				<< " has a maywrite access. \n";
				NotExpandable = true;
				}

				// For now, we are not able to expand SAI with more than one write
				if (MA->isMustWrite()) {
				NumberWrites++;
				if (NumberWrites > 1) {
				errs() << "MSE ERROR : " << SAI->getName()
				<< " has more than 1 write access. \n";
				NotExpandable = true;
				}
				}

				Meinersbur (Done): `isl_dim_in` and `isl_dim_out` dimensions have no dim_id, so this is not necessary.
				// Check if it is possible to extand this read
				if (MA->isRead()) {

				Meinersbur (Done): At this point, in the regression test, we have `{ Stmt_for_cond1_preheader[i0] -> MemRef_tmp_11__phi1_expanded[o0] }` (i0 and o0 are unconnected). That is, every statement instance accesses all array elements. It should be something like: `{ Stmt_for_cond1_preheader[i0] -> MemRef_tmp_11__phi1_expanded[i0] }`
				// Get the domain of the current ScopStmt
				auto StmtDomain = isl::give(Stmt.getDomain());

				// Get the domain of the future Read access
				auto ReadDomainSet =
				isl::give(isl_map_domain(MA->getAccessRelation()));
				auto ReadDomain = isl::union_set(ReadDomainSet);
				auto CurrentReadWriteDependences =
				Dependences.reverse().intersect_domain(ReadDomain);
				auto DepsDomain = CurrentReadWriteDependences.domain();

				unsigned NumberElementMap = 0;
				CurrentReadWriteDependences.foreach_map(
				[=, &NumberElementMap](isl::map Map) -> isl::stat {
				NumberElementMap++;
				return isl::stat::ok;
				});
				Meinersbur (Done): [Style] `NumberElementMap = isl_union_map_n_map(CurrentReadWriteDependences.get())`

				// If there are multiple maps in the Deps, we cannot handle this case
				// for now
				if (NumberElementMap != 1) {
				errs() << "MSE ERROR : " << SAI->getName()
				<< " has too many dependences to be handle for now. \n";
				NotExpandable = true;
				Meinersbur (Done): Use inline declarations: `bool CorrectWrite = expandWrite(S, MA);`
				}

				Meinersbur (Not Done): We cannot return early with transformations only partially applied. The SCoP representation will be inconsistent; at best, passes after this will crash, at worst we miscompile. Please check in advance whether a transformation can be applied before trying to apply it. In this case, you may want to check for each `ScopArrayInfo` whether it is applicable. The class `ScalarDefUseChains` in DeLICM.cpp might be helpful to get the accesses of a SAI. I think we will sooner or later integrate it into ScopInfo.
				auto DepsDomainSet = isl::set(DepsDomain);

				// Partial read accesses are not handled by Polly
				if (!StmtDomain.is_subset(DepsDomainSet)) {
				errs() << "MSE ERROR : " << SAI->getName()
				<< " expansion leads to a partial read access. \n";
				Meinersbur (Done): [Serious] In normal operation, do not print anything. Users expect only clang warnings and errors. Alternatives are: `DEBUG(dbgs() << "")` (discouraged for regression tests, this is meant for debugging); print remarks using `-pass-remarks-missed`; print information at `-analysis`; `STATISTIC`.
				NotExpandable = true;
				}
				}
				}
				Meinersbur (Done): `runOnScop` returns true if and only if the IR has been modified. We only modify the SCoP representation, therefore we only return `false`.
				}

				// No need to expand SAI with no write
				if (NumberWrites == 0) {
				Meinersbur (Done): In `printScop`, you must print to `OS`. It will print to `stdout` instead of `stderr`. This should also make the regression test simpler.
				errs() << "MSE ERROR : " << SAI->getName() << " has 0 access. \n";
				NotExpandable = true;
				}
				if (NotExpandable) {
				Meinersbur (Done): You will need to add `AU.addRequired<DependenceInfo>();` here at some point.
				NotExpandables.insert(SAI);
				}
				}
				}

				void MaximalStaticExpander::expandRead(Scop &S, MemoryAccess *MA,
				std::set<MemoryAccess *> &Writes,
				isl::union_map &Dependences) {

				// Get RAW dependences for the current WA
				auto WriteDomainSet = isl::give(isl_map_domain(MA->getAccessRelation()));
				auto WriteDomain = isl::union_set(WriteDomainSet);

				auto CurrentReadWriteDependences =
				Dependences.reverse().intersect_domain(WriteDomain);

				// If no dependences, no need to modify anything
				if (CurrentReadWriteDependences.is_empty()) {
				return;
				}

				auto NewAccessMap = isl::map::from_union_map(CurrentReadWriteDependences);

				isl::id Id;

				// Get the in and out ID of the read relation we want to expand
				simbuerg (Done): Reminder: These should/will become diagnostics.
				auto ReadAccessRelation = isl::give(MA->getAccessRelation());
				auto ReadOutId = ReadAccessRelation.get_tuple_id(isl::dim::out);
				auto NewOutId = NewAccessMap.get_tuple_id(isl::dim::out);

				// Find the name of the expanded corresponding SAI
				for (auto Write : Writes) {
				Meinersbur (Done): [Typo] extand -> expand
				auto WriteAccessRelation = isl::give(Write->getOriginalAccessRelation());
				auto WriteOutId = WriteAccessRelation.get_tuple_id(isl::dim::out);
				auto WriteInId = WriteAccessRelation.get_tuple_id(isl::dim::in);

				if (ReadOutId.keep() == WriteOutId.keep() &&
				NewOutId.keep() == WriteInId.keep()) {
				Meinersbur (Done): I get a compile error here. `MA->getAccessRelation()` has been updated to use a C++ object. Please update to Polly trunk.
				auto NewWriteAccessRelation = isl::give(Write->getLatestAccessRelation());
				Id = NewWriteAccessRelation.get_tuple_id(isl::dim::out);
				break;
				}
				}
				Meinersbur (Done): [Suggestion] Instead of searching for the correct id, you could derive the name as in `expandWrite` and look it up in `ScopArrayNameMap`. [Serious] What if the `Id` is not found? Please add an assertion for that case.

				// Replace the out tuple id with the one of the access array
				NewAccessMap = NewAccessMap.set_tuple_id(isl::dim::out, Id);

				// Set the new access relation
				MA->setNewAccessRelation(NewAccessMap.copy());
				}

				void MaximalStaticExpander::expandWrite(Scop &S, MemoryAccess *MA) {

				// Get the current AM
				auto CurrentAccessMap = isl::give(MA->getAccessRelation());

				// If the access is already fully expanded, do nothing
				unsigned in_dimensions = CurrentAccessMap.dim(isl::dim::in);
				unsigned out_dimensions = CurrentAccessMap.dim(isl::dim::out);
				if (in_dimensions == out_dimensions) {
				return;
				}
				Meinersbur (Not Done): [Serious] This is not a sufficient condition for full expansion. E.g. one dimension can be a static `0`. There is also more than one possibility for full expansion (e.g. one starting at 0, another at 1). The reads must access the correct element. So if you want to implement this as a heuristic that expansion is not worth it in this case, implement it in `checkExpandability` such that reads are also not modified.

				Meinersbur (Done): The consequence would not be a partial read access, but it would need to read the original value the element had before entering the SCoP. That's a special case similar to having more than one write.
				niosega (Author, Done): Are you sure? If I remember correctly, say we are analyzing this code: `for (i = 0; i < Ni; i++) { B[j] = i; for (int j = 0; j<Nj; j++) { B[j] = j; } A[i] = B[i];` When I try to set the new access relation for the B read, the `setNewAccessRelation` method of class `MemoryAccess` fails with the assert "Partial READ accesses not supported".
				Meinersbur (Done): How were you able to do that? `setNewAccessRelation` accepts only an isl_map, not an isl_union_map. Let me explain in more detail.
				`for (int k = 0; k < M; k+=1) { for (int i = 0; i<= N/2; i+=1) { S: B[i] = i; } for (int i = 0; i<N; i+=1) { T: = ... B[i] } }`
				The flow dependence would look something like: `{ T[k, i] -> S[k, i] : 0 < i <= N/2 }`. We could naively expand B to B_expanded:
				`for (int k = 0; k < M; k+=1) { for (int i = 0; i<= N/2; i+=1) { S: B_expanded[k][i] = i; } for (int i = 0; i<N; i+=1) { T: = ... B_expanded[k][i] } }`
				The problem here is that `B_expanded[k][i]` for i > N/2 never gets written (and T would read uninitialized memory). The flow dependence doesn't tell which instance of S wrote it in the first place! If you try to apply it naively anyway using setNewAccessRelation, we need a source of the value for all instances of S, but we don't have one for `i > N/2`! This is why partial read accesses are unsupported.
				The correct thing to do would be to read the value from the original array B (which then becomes read-only): `{ T[k,i] -> B_expanded[k,i] : i < N/2; T[k,i] -> B[i] : i>=N/2 }`. This again is an isl_union_map (NOT a partial access, since it is defined for all instances of T), which we currently do not support.
				Please try to understand what the problem with partial read accesses is. The partial read accesses themselves are not the problem; the problem is the reason why you would want to use one.
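The uninitialized-read problem described in this thread can be made concrete with a small simulation. This is a sketch under stated assumptions, not Polly code: `unwrittenReads` is a hypothetical helper that models one iteration `k` of the outer loop, with S writing `B_expanded[k][i]` for `0 <= i <= N/2` and T reading it for `0 <= i < N`, exactly as in the loop nest quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper modelling one iteration k of the outer loop.
// Returns the indices i that statement T reads from B_expanded[k][i]
// although statement S never wrote them.
std::vector<int> unwrittenReads(int N) {
  std::vector<bool> Written(static_cast<std::size_t>(N > 0 ? N : 0), false);

  // Statement S: B_expanded[k][i] = i for 0 <= i <= N/2.
  for (int i = 0; i <= N / 2 && i < N; ++i)
    Written[i] = true;

  // Statement T: reads B_expanded[k][i] for 0 <= i < N.
  std::vector<int> Missing;
  for (int i = 0; i < N; ++i)
    if (!Written[i])
      Missing.push_back(i);
  return Missing;
}
```

Every index this returns is a read that has no writer inside the SCoP, so its value would have to come from the original array `B`.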
				// Get domain from the current AM
				auto Domain = CurrentAccessMap.domain();

				// Create a new AM from the domain
				auto NewAccessMap = isl::map::from_domain(Domain);

				// Add dimensions to the new AM according to the current in_dim
				// Fully indexed expansion
				NewAccessMap = NewAccessMap.add_dims(isl::dim::out, in_dimensions);

				// Create the string representing the name of the new SAI
				// One new SAI for each statement so that each write go to a different memory
				// cell
				auto CurrentStmtDomain = isl::give(MA->getStatement()->getDomain());
				auto CurrentStmtName = CurrentStmtDomain.get_tuple_name();
				auto CurrentOutId = CurrentAccessMap.get_tuple_id(isl::dim::out);
				std::string CurrentOutIdString =
				MA->getScopArrayInfo()->getName() + "_" + CurrentStmtName + "_expanded";

				// Set the tuple id for the out dimension
				NewAccessMap = NewAccessMap.set_tuple_id(isl::dim::out, CurrentOutId);

				// Create the size vector
				// For now, use getSE but will use ISL in a future revision
				// Waiting for methods from FlattenAlgo to be available
				std::vector<const SCEV *> SCEVSizes;
				auto ScopStmt = MA->getStatement();
				for (unsigned i = 0; i < ScopStmt->getNumIterators(); i++) {
				auto Loop = ScopStmt->getLoopForDimension(i);
				auto SCEV = S.getSE()->getMaxBackedgeTakenCount(
				Loop); // +1 but this will change later to the ISL version
				Meinersbur (Done): [Serious] I think `getMaxBackedgeTakenCount` can fail in cases where Polly is still able to detect affine loops. Please add at least an `assert` that SCEV is not `SCEVCouldNotCompute`.
				Meinersbur (Done): [Suggestion] Or use `ScalarEvolution::getAddExpr(SCEV, ScalarEvolution::getConstant(1))`. [Serious] I don't see where +1 is added to the ISL version.
				niosega (Author, Done): During the last phone call we discussed another solution that uses methods from FlattenAlgo.cpp to get the bounds of the loop iteration variables. That's what I meant by "ISL version". But for now, I cannot access the methods from FlattenAlgo.cpp because they are defined inside an unnamed namespace.
				Meinersbur (Done): You may need to modify them anyway, so don't hesitate to copy them over, especially for a prototype. If later we find they share a significant amount of code, we can find a common file for them.
				SCEVSizes.push_back(SCEV);
				}

				simbuerg (Done): As far as I remember, this will fail if there is more than one map in the union_map. I would check that with at least an assert.
				// Get the ElementType of the current SAI
				auto ElementType = MA->getOriginalScopArrayInfo()->getElementType();
				Meinersbur (Done): [Suggestion] Should this be `getLatestScopArrayInfo()`?

				// Create (or get if already existing) the new expanded SAI
				auto ExpandedSAI =
				S.getOrCreateScopArrayInfo(nullptr, ElementType, SCEVSizes,
				simbuerg (Done): Where do you use this name?
				MemoryKind::Array, CurrentOutIdString.c_str());
				ExpandedSAI->setIsOnHeap(true);

				// Get the out Id of the expanded Array
				auto NewOutId = isl::give(ExpandedSAI->getBasePtrId());

				// Set the out id of the new AM to the new SAI id
				NewAccessMap = NewAccessMap.set_tuple_id(isl::dim::out, NewOutId);

				// Add constraints to linked output with input id
				auto SpaceMap = NewAccessMap.get_space();
				auto ls = isl::local_space(SpaceMap);
				for (unsigned dim = 0; dim < in_dimensions; dim++) {
				auto Constraints = isl::constraint::alloc_equality(ls);
				Constraints = Constraints.set_coefficient_si(isl::dim::out, dim, 1);
				Constraints = Constraints.set_coefficient_si(isl::dim::in, dim, -1);
				NewAccessMap = NewAccessMap.add_constraint(Constraints);
				}

				// Set the new access relation map
				MA->setNewAccessRelation(NewAccessMap.copy());
				}

				bool MaximalStaticExpander::runOnScop(Scop &S) {

				// Get the RAW Dependences
				auto &DI = getAnalysis<DependenceInfo>();
				auto &D = DI.getDependences(Dependences::AL_Statement);
				auto Dependences = isl::give(D.getDependences(Dependences::TYPE_RAW));

				// Check for each SAI if we can expand it
				checkExpendability(S, Dependences);

				// Writes MemoryAccess
				std::set<MemoryAccess *> Writes;

				// Expand all expandable write MemoryAccesses
				for (ScopStmt &Stmt : S) {
				for (MemoryAccess *MA : Stmt) {
				Meinersbur (Done): [Remark] The structure I had in mind was `for (array : S.arrays()) { if (!checkExpendability(array)) continue; for (ScopStmt &Stmt : S) for (MemoryAccess *MA : Stmt) { if (MA->isRead()) expandRead(MA); if (MA->isWrite()) expandRead(MA); } }` This does not need a `NotExpandable` set, but has worse asymptotic runtime. So I guess your version has an advantage.
				niosega (Author, Done): If I am not wrong, we must first expand all writes before expanding the reads. Otherwise, problems can occur when trying to find the expanded SAI during read expansion.
				Meinersbur (Done): Let me refine what I had in mind some time ago. `for (array : S.arrays()) { MemoryAccess *TheWrite = nullptr; List<MemoryAccess *> AllReads; if (!isExpandable(array, TheWrite, AllReads)) continue; assert(TheWrite); assert(AllReads.size() > 0); ScopArrayInfo *ExpandedArray = expandWrite(TheWrite); for (MemoryAccess *MA : AllReads) expandRead(MA, ExpandedArray); }`

				// Check if we can expand this MemoryAccess
				auto SAI = MA->getLatestScopArrayInfo();
				if (!isExpandable(SAI)) {
				continue;
				}
				Meinersbur (Done): [Style] We usually do not use braces around single statements: `if (isExpandable(SAI)) continue;`

				if (MA->isWrite()) {
				expandWrite(S, MA);
				Writes.insert(MA);
				Meinersbur (Done): [Nit] The UpperBound could overflow a long. Add an assertion for that?
				niosega (Author, Done): How can I efficiently check that there is an overflow?
				Meinersbur (Done): `UpperBound.le(INT_MAX)` (I think there is no implicit conversion from int to isl::val, but you get the idea).
				}
				Meinersbur (Not Done): This assertion fails on Windows (and 32-bit platforms): the `isl::val` constructor takes a `long`, which is 32 bits on these platforms. UINT_MAX exceeds its range.
				niosega (Author, Not Done): It was a mistake on my side to compare it with UINT_MAX. If I replace `std::numeric_limits<unsigned>::max()` with `std::numeric_limits<long>::max()`, it should work on every platform, right?
				Meinersbur (Not Done): Except that it is stored into a `std::vector<unsigned>`. Storing a `long` as an `unsigned int` may get you another overflow. My suggestion is to stick with the lowest common maximum: `std::numeric_limits<int>::max()`. I don't think you would want to allocate memory larger than that anyway.
				Meinersbur (Not Done): Why `- 1`?
				}
				Meinersbur (Not Done): It has been tested for the range of an `unsigned int`.
				}
				Meinersbur (Not Done): If `UpperBound.get_num_si()` is `UINT_MAX`, you get an overflow when adding +1.
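The range concerns raised in this thread (a 32-bit `long`, storage in a `std::vector<unsigned>`, and the `+ 1`) can be combined in one guarded conversion. The following is a sketch only: `extentFromUpperBound` is a hypothetical helper, not part of the patch, and it takes a plain `long long` instead of an `isl::val` so that it stays self-contained. It follows the suggestion to use `std::numeric_limits<int>::max()` as the common limit.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: converts the inclusive upper bound of a loop iterator
// into an array extent (UpperBound + 1). Returns false, leaving Extent
// untouched, if the bound is negative or if the extent would exceed the
// range of int, the lowest common maximum of int, long and unsigned.
bool extentFromUpperBound(long long UpperBound, unsigned &Extent) {
  const long long Max = std::numeric_limits<int>::max();
  if (UpperBound < 0 || UpperBound >= Max)
    return false; // reject before adding 1, so the addition cannot overflow
  Extent = static_cast<unsigned>(UpperBound + 1);
  return true;
}
```

Checking the bound before the addition addresses the last remark above: the `+ 1` is only computed once it is known to fit.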

				// Expand all expandable read MemoryAccesses
				Meinersbur (Done): [Style] In Polly's coding style, all sentences end with a dot (but I personally don't care).
				for (ScopStmt &Stmt : S) {
				for (MemoryAccess *MA : Stmt) {
				simbuerg (Done): This takes a const string &, no need to go over the c_str().

				// Check if we can expand this MemoryAccess
				auto SAI = MA->getLatestScopArrayInfo();
				if (!isExpandable(SAI)) {
				continue;
				}

				if (MA->isRead()) {
				expandRead(S, MA, Writes, Dependences);
				}
				}
				}

				return false;
				}

				void MaximalStaticExpander::printScop(raw_ostream &OS, Scop &S) const {
				Meinersbur (Done): [Style] This could be simpler using `NewAccessMap = NewAccessMap->equate(isl::dim::in, dim, isl::dim::out, dim);` or, even better, use `basic_map::equal`.
				niosega (Author, Done): I'd like to use isl_basic_map_equal, but I did not find the documentation of this method in the online isl doc. There is also no example of its use in Polly. Can you explain to me how it works?
				Meinersbur (Done): `isl_map_space(SpaceMap.copy(), SpaceMap.dim(isl::in))` should get you a basic_map of that space where the `n_equal = SpaceMap.dim(isl::in)` in- and out-dimensions are equal. Something like: `{ Stmt[i0, i1] -> MemRef[o0, o1] : i0 = o0 and i1 = o1 }`. However, no documentation could mean that the function was not intended to be public.
				S.print(OS);
				}

				void MaximalStaticExpander::getAnalysisUsage(AnalysisUsage &AU) const {
				ScopPass::getAnalysisUsage(AU);
				AU.addRequired<DependenceInfo>();
				}

				Meinersbur (Done): [Suggestion] Pass string as `llvm::StringRef` (or `const std::string &` to avoid a copy).
				Pass *polly::createMaximalStaticExpansionPass() {
				return new MaximalStaticExpander();
				}
				simbuerg: Why 'AssumpRestrict'?

				INITIALIZE_PASS_BEGIN(MaximalStaticExpander, "polly-opt-mse",
				"Polly - Maximal static expansion of SCoP", false, false);
				INITIALIZE_PASS_DEPENDENCY(DependenceInfo);
				INITIALIZE_PASS_END(MaximalStaticExpander, "polly-opt-mse",
				Meinersbur: [Nit] `OptimizationRemarkEmitterWrapperPass`
				"Polly - Maximal static expansion of SCoP", false, false)
				Meinersbur: [Style] No braces around single statements. Also possible: `SmallPtrSet<ScopArrayInfo *, 4> CurrentSAI(S.array_begin(), array_end());`
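
The identity access that Meinersbur describes in the review thread above, `{ Stmt[i0, i1] -> MemRef[o0, o1] : i0 = o0 and i1 = o1 }`, can be pictured without isl. The following is only a minimal sketch in plain C++: `Index` and `fullyIndexedAccess` are hypothetical names chosen for illustration, not part of Polly or isl, and the loop merely mimics what equating each input dimension with the matching output dimension would achieve.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one point of an access relation: the loop
// indices of a statement instance, or the subscripts of an array element.
using Index = std::vector<long>;

// Equating input dimension Dim with output dimension Dim, for every Dim,
// gives the identity access
//   { Stmt[i0, i1] -> MemRef[o0, o1] : o0 = i0 and o1 = i1 }.
Index fullyIndexedAccess(const Index &Instance) {
  Index Subscripts(Instance.size());
  for (std::size_t Dim = 0; Dim < Instance.size(); ++Dim)
    Subscripts[Dim] = Instance[Dim]; // models one "equate(in, Dim, out, Dim)"
  return Subscripts;
}
```

Calling `fullyIndexedAccess` on the instance `[3, 7]` yields the subscripts `[3, 7]`, i.e. each statement instance touches its own private array cell.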

test/MaximalStaticExpansion/partial_access.ll

This file was added.

				; RUN: opt -polly-canonicalize %loadPolly -analyze -polly-opt-mse < %s 2>&1 | FileCheck %s
				;
				Meinersbur: [Style] Please remove trailing whitespace.
				Meinersbur: The interleaving of stdout (`-analyze`) and stderr (`-pass-remarks-analysis`) is undefined. It is better to have two separate RUN lines, one checking `-analyze`, the other `-pass-remarks-analysis`.
				; Verify that Polly detects problems and does not expand the array
				;
				; Original source code :
				;
				; #define Ni 2000
				; #define Nj 3000
				;
				; double mse(double A[Ni], double B[Nj]) {
				; int i;
				; double tmp = 6;
				; for (i = 0; i < Ni; i++) {
				; for (int j = 2; j<Nj; j++) {
				; B[j-1] = j;
				; }
				; A[i] = B[i];
				; }
				; return tmp;
				; }
				;
				; Check that the pass detects the problem of partial read
				;
				; CHECK: MSE ERROR : MemRef_B expansion leads to a partial read access.
				;
				; Check that the SAI is not expanded
				;
				; CHECK-NOT: double MemRef_B2_expanded[1999][2999]; // Element size 8
				;
				; Check that the memory accesses are not modified
				;
				; CHECK-NOT: new: { Stmt_for_body3[i0, i1] -> MemRef_B_Stmt_for_body3_expanded[i0, i1] };
				; CHECK-NOT: new: { Stmt_for_end[i0] -> MemRef_B_Stmt_for_body3_expanded

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: noinline nounwind uwtable
				define double @mse(double* %A, double* %B) {
				entry:
				%A.addr = alloca double*, align 8
				%B.addr = alloca double*, align 8
				%i = alloca i32, align 4
				%tmp = alloca double, align 8
				%j = alloca i32, align 4
				store double* %A, double** %A.addr, align 8
				store double* %B, double** %B.addr, align 8
				store double 6.000000e+00, double* %tmp, align 8
				store i32 0, i32* %i, align 4
				br label %for.cond

				for.cond: ; preds = %for.inc8, %entry
				%0 = load i32, i32* %i, align 4
				%cmp = icmp slt i32 %0, 2000
				br i1 %cmp, label %for.body, label %for.end10

				for.body: ; preds = %for.cond
				store i32 2, i32* %j, align 4
				br label %for.cond1

				for.cond1: ; preds = %for.inc, %for.body
				%1 = load i32, i32* %j, align 4
				%cmp2 = icmp slt i32 %1, 3000
				br i1 %cmp2, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				%2 = load i32, i32* %j, align 4
				%conv = sitofp i32 %2 to double
				%3 = load double, double* %B.addr, align 8
				%4 = load i32, i32* %j, align 4
				%sub = sub nsw i32 %4, 1
				%idxprom = sext i32 %sub to i64
				%arrayidx = getelementptr inbounds double, double* %3, i64 %idxprom
				store double %conv, double* %arrayidx, align 8
				br label %for.inc

				for.inc: ; preds = %for.body3
				%5 = load i32, i32* %j, align 4
				%inc = add nsw i32 %5, 1
				store i32 %inc, i32* %j, align 4
				br label %for.cond1

				for.end: ; preds = %for.cond1
				%6 = load double, double* %B.addr, align 8
				%7 = load i32, i32* %i, align 4
				%idxprom4 = sext i32 %7 to i64
				%arrayidx5 = getelementptr inbounds double, double* %6, i64 %idxprom4
				%8 = load double, double* %arrayidx5, align 8
				%9 = load double, double* %A.addr, align 8
				%10 = load i32, i32* %i, align 4
				%idxprom6 = sext i32 %10 to i64
				%arrayidx7 = getelementptr inbounds double, double* %9, i64 %idxprom6
				store double %8, double* %arrayidx7, align 8
				br label %for.inc8

				for.inc8: ; preds = %for.end
				%11 = load i32, i32* %i, align 4
				%inc9 = add nsw i32 %11, 1
				store i32 %inc9, i32* %i, align 4
				br label %for.cond

				for.end10: ; preds = %for.cond
				%12 = load double, double* %tmp, align 8
				ret double %12
				}
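
One plausible reading of the "partial read" reported by this test is that the inner loop does not write every element that `A[i] = B[i]` later reads. The sketch below checks that on the loop bounds from the source comment. It is an illustration under that assumption only: `writtenIndices` and `readsFullyCovered` are hypothetical helpers, and the pass itself reasons on isl maps and dependences rather than on enumerated sets.

```cpp
#include <cassert>
#include <set>

// Indices of B written by "for (int j = 2; j < Nj; j++) B[j-1] = j;".
std::set<int> writtenIndices(int Nj) {
  std::set<int> Written;
  for (int j = 2; j < Nj; ++j)
    Written.insert(j - 1);
  return Written;
}

// True if every read "B[i]" with 0 <= i < Ni hits an index written above.
bool readsFullyCovered(int Ni, int Nj) {
  const std::set<int> Written = writtenIndices(Nj);
  for (int i = 0; i < Ni; ++i)
    if (Written.count(i) == 0)
      return false; // for example, B[0] is read but never written
  return true;
}
```

With `Ni = 2000` and `Nj = 3000` the writes cover `B[1]` through `B[2998]`, so the read of `B[0]` in the first outer iteration has no matching write and `readsFullyCovered(2000, 3000)` is false.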

test/MaximalStaticExpansion/too_many_writes.ll

This file was added.

				; RUN: opt -polly-canonicalize %loadPolly -analyze -polly-opt-mse < %s 2>&1 | FileCheck %s
				;
				; Verify that Polly detects problems and does not expand the array
				;
				; Original source code :
				;
				; #define Ni 2000
				; #define Nj 3000
				;
				; double mse(double A[Ni], double B[Nj]) {
				; int i;
				; double tmp = 6;
				; for (i = 0; i < Ni; i++) {
				; B[i] += 2;
				; for (int j = 2; j<Nj; j++) {
				; B[j-1] = j;
				; }
				; A[i] = B[i];
				; }
				; return tmp;
				; }
				;
				; Check that the pass detects the problem of partial read and too many writes
				;
				; CHECK: MSE ERROR : MemRef_B expansion leads to a partial read access.
				; CHECK: MSE ERROR : MemRef_B has more than 1 write access.
				;
				; Check that the SAI is not expanded
				;
				; CHECK-NOT: double MemRef_B2_expanded[1999][2999]; // Element size 8
				;
				; Check that the memory accesses are not modified
				;
				; CHECK-NOT: new: { Stmt_for_body3[i0, i1] -> MemRef_B_Stmt_for_body3_expanded[i0, i1] };
				; CHECK-NOT: new: { Stmt_for_end[i0] -> MemRef_B_Stmt_for_body3_expanded

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: noinline nounwind uwtable
				define double @mse(double* %A, double* %B) {
				entry:
				%A.addr = alloca double*, align 8
				%B.addr = alloca double*, align 8
				%i = alloca i32, align 4
				%tmp = alloca double, align 8
				%j = alloca i32, align 4
				store double* %A, double** %A.addr, align 8
				store double* %B, double** %B.addr, align 8
				store double 6.000000e+00, double* %tmp, align 8
				store i32 0, i32* %i, align 4
				br label %for.cond

				for.cond: ; preds = %for.inc10, %entry
				%0 = load i32, i32* %i, align 4
				%cmp = icmp slt i32 %0, 2000
				br i1 %cmp, label %for.body, label %for.end12

				for.body: ; preds = %for.cond
				%1 = load double, double* %B.addr, align 8
				%2 = load i32, i32* %i, align 4
				%idxprom = sext i32 %2 to i64
				%arrayidx = getelementptr inbounds double, double* %1, i64 %idxprom
				%3 = load double, double* %arrayidx, align 8
				%add = fadd double %3, 2.000000e+00
				store double %add, double* %arrayidx, align 8
				store i32 0, i32* %j, align 4
				br label %for.cond1

				for.cond1: ; preds = %for.inc, %for.body
				%4 = load i32, i32* %j, align 4
				%cmp2 = icmp slt i32 %4, 3000
				br i1 %cmp2, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				%5 = load i32, i32* %j, align 4
				%conv = sitofp i32 %5 to double
				%6 = load double, double* %B.addr, align 8
				%7 = load i32, i32* %j, align 4
				%idxprom4 = sext i32 %7 to i64
				%arrayidx5 = getelementptr inbounds double, double* %6, i64 %idxprom4
				store double %conv, double* %arrayidx5, align 8
				br label %for.inc

				for.inc: ; preds = %for.body3
				%8 = load i32, i32* %j, align 4
				%inc = add nsw i32 %8, 1
				store i32 %inc, i32* %j, align 4
				br label %for.cond1

				for.end: ; preds = %for.cond1
				%9 = load double, double* %B.addr, align 8
				%10 = load i32, i32* %i, align 4
				%idxprom6 = sext i32 %10 to i64
				%arrayidx7 = getelementptr inbounds double, double* %9, i64 %idxprom6
				%11 = load double, double* %arrayidx7, align 8
				%12 = load double, double* %A.addr, align 8
				%13 = load i32, i32* %i, align 4
				%idxprom8 = sext i32 %13 to i64
				%arrayidx9 = getelementptr inbounds double, double* %12, i64 %idxprom8
				store double %11, double* %arrayidx9, align 8
				br label %for.inc10

				for.inc10: ; preds = %for.end
				%14 = load i32, i32* %i, align 4
				%inc11 = add nsw i32 %14, 1
				store i32 %inc11, i32* %i, align 4
				br label %for.cond

				for.end12: ; preds = %for.cond
				%15 = load double, double* %tmp, align 8
				ret double %15
				}
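
The "more than 1 write access" diagnostic can be illustrated by tallying the array accesses listed in the source comment of this test. This is a minimal sketch, not the pass's implementation: `Access`, `sourceCommentAccesses`, and `countWrites` are hypothetical names, and the access list is transcribed by hand from the C code in the comment.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of a memory access: which array, read or write.
struct Access {
  std::string Array;
  bool IsWrite;
};

// Accesses of the loop nest shown in the test's source comment.
std::vector<Access> sourceCommentAccesses() {
  return {
      {"B", false}, // B[i] += 2 reads B[i] ...
      {"B", true},  // ... and writes it back
      {"B", true},  // B[j-1] = j
      {"B", false}, // A[i] = B[i] reads B[i]
      {"A", true},  // A[i] = B[i] writes A[i]
  };
}

// Number of write accesses per array.
std::map<std::string, int> countWrites(const std::vector<Access> &Accesses) {
  std::map<std::string, int> Writes;
  for (const Access &Acc : Accesses)
    if (Acc.IsWrite)
      ++Writes[Acc.Array];
  return Writes;
}
```

Here `B` ends up with two writes and `A` with one, which matches the restriction stated in the summary that arrays with multiple writes are not expanded for now.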

test/MaximalStaticExpansion/working_expansion.ll

This file was added.

				; RUN: opt -polly-canonicalize %loadPolly -analyze -polly-opt-mse < %s | FileCheck %s
				;
				; Verify that the accesses are correctly expanded
				;
				; Original source code :
				;
				; #define Ni 2000
				; #define Nj 3000
				;
				; double mse(double A[Ni], double B[Nj]) {
				; int i;
				; double tmp = 6;
				; for (i = 0; i < Ni; i++) {
				; for (int j = 0; j<Nj; j++) {
				; B[j] = j;
				; }
				; A[i] = B[i];
				; }
				; return tmp;
				; }
				;
				; Check if the expanded SAI are created
				;
				; CHECK: double MemRef_B_Stmt_for_body3_expanded[1999][2999]; // Element size 8
				Meinersbur: Shouldn't this be `MemRef_B_Stmt_for_body3_expanded[2000][3000]`?
				;
				; Check if the memory accesses are modified
				;
				; CHECK: new: { Stmt_for_body3[i0, i1] -> MemRef_B_Stmt_for_body3_expanded[i0, i1] };
				; CHECK: new: { Stmt_for_end[i0] -> MemRef_B_Stmt_for_body3_expanded[i0, i0] };

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: noinline nounwind uwtable
				define double @mse(double* %A, double* %B) {
				entry:
				%A.addr = alloca double*, align 8
				%B.addr = alloca double*, align 8
				%i = alloca i32, align 4
				%tmp = alloca double, align 8
				%j = alloca i32, align 4
				store double* %A, double** %A.addr, align 8
				store double* %B, double** %B.addr, align 8
				store double 6.000000e+00, double* %tmp, align 8
				store i32 0, i32* %i, align 4
				br label %for.cond

				for.cond: ; preds = %for.inc8, %entry
				%0 = load i32, i32* %i, align 4
				%cmp = icmp slt i32 %0, 2000
				br i1 %cmp, label %for.body, label %for.end10

				for.body: ; preds = %for.cond
				store i32 0, i32* %j, align 4
				br label %for.cond1

				for.cond1: ; preds = %for.inc, %for.body
				%1 = load i32, i32* %j, align 4
				%cmp2 = icmp slt i32 %1, 3000
				br i1 %cmp2, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				%2 = load i32, i32* %j, align 4
				%conv = sitofp i32 %2 to double
				%3 = load double, double* %B.addr, align 8
				%4 = load i32, i32* %j, align 4
				%idxprom = sext i32 %4 to i64
				%arrayidx = getelementptr inbounds double, double* %3, i64 %idxprom
				store double %conv, double* %arrayidx, align 8
				br label %for.inc

				for.inc: ; preds = %for.body3
				%5 = load i32, i32* %j, align 4
				%inc = add nsw i32 %5, 1
				store i32 %inc, i32* %j, align 4
				br label %for.cond1

				for.end: ; preds = %for.cond1
				%6 = load double, double* %B.addr, align 8
				%7 = load i32, i32* %i, align 4
				%idxprom4 = sext i32 %7 to i64
				%arrayidx5 = getelementptr inbounds double, double* %6, i64 %idxprom4
				%8 = load double, double* %arrayidx5, align 8
				%9 = load double, double* %A.addr, align 8
				%10 = load i32, i32* %i, align 4
				%idxprom6 = sext i32 %10 to i64
				%arrayidx7 = getelementptr inbounds double, double* %9, i64 %idxprom6
				store double %8, double* %arrayidx7, align 8
				br label %for.inc8

				for.inc8: ; preds = %for.end
				%11 = load i32, i32* %i, align 4
				%inc9 = add nsw i32 %11, 1
				store i32 %inc9, i32* %i, align 4
				br label %for.cond

				for.end10: ; preds = %for.cond
				%12 = load double, double* %tmp, align 8
				ret double %12
				}
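
The effect that the CHECK lines of this test look for can be reproduced at source level. The sketch below runs the loop nest from the source comment once as written and once with `B` expanded to one cell per `(i, j)` instance, then lets the caller compare the resulting `A`. It is an illustration under stated assumptions: `runOriginal` and `runExpanded` are hypothetical names, the sizes are parameters rather than the test's 2000 and 3000, and `Ni <= Nj` is assumed so that `B[i]` stays in bounds.

```cpp
#include <cassert>
#include <vector>

// Original loop nest: every outer iteration overwrites the same cells B[j].
// Assumes Ni <= Nj so that the read B[i] is in bounds.
std::vector<double> runOriginal(int Ni, int Nj) {
  std::vector<double> A(Ni), B(Nj);
  for (int i = 0; i < Ni; ++i) {
    for (int j = 0; j < Nj; ++j)
      B[j] = j;
    A[i] = B[i];
  }
  return A;
}

// Fully indexed version: the write becomes B[i][j], the read becomes B[i][i].
// Subscripts run over 0..Ni-1 and 0..Nj-1, so Ni x Nj cells are allocated.
std::vector<double> runExpanded(int Ni, int Nj) {
  std::vector<double> A(Ni);
  std::vector<std::vector<double>> BExpanded(Ni, std::vector<double>(Nj));
  for (int i = 0; i < Ni; ++i) {
    for (int j = 0; j < Nj; ++j)
      BExpanded[i][j] = j;
    A[i] = BExpanded[i][i];
  }
  return A;
}
```

For example, `runOriginal(4, 6)` and `runExpanded(4, 6)` return the same vector, since both set `A[i]` to `i`. The allocation in `runExpanded` also shows the arithmetic behind the extent question raised in the review: the largest subscripts used are `Ni - 1` and `Nj - 1`, while the number of cells per dimension is `Ni` and `Nj`.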