This is an archive of the discontinued LLVM Phabricator instance.

Differential D68789

[LoopNest]: Analysis to discover properties of a loop nest.
ClosedPublic

Authored by etiotto on Oct 10 2019, 7:38 AM.

Details

Reviewers

Meinersbur
bmahjour
kbarton
Whitney
dmgreen
fhahn
reames
hfinkel
jdoerfert
ppc-slack

Commits

rGc84532a70aa4: [LoopNest]: Analysis to discover properties of a loop nest.
rG3a063d68e3c9: [LoopNest]: Analysis to discover properties of a loop nest.

Summary

This patch adds an analysis pass to collect loop nests and summarize properties of the nest (e.g. the nest depth, whether the nest is perfect, what the innermost loop is, etc.).

The motivation for this patch was discussed at the latest meeting of the LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis), where we discussed the unimodular loop transformation framework (“A Loop Transformation Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework provides a convenient way to unify legality checking and code generation for several loop nest transformations (e.g. loop reversal, loop interchange, loop skewing) and their compositions. Given that the unimodular framework is applicable to perfect loop nests, this is one property of interest we expose in this analysis. Several other utility functions are also provided. In the future, other properties of interest can be added in a centralized place.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

etiotto created this revision. Oct 10 2019, 7:38 AM

Herald added subscribers: llvm-commits, hiraditya, mgorny. Oct 10 2019, 7:38 AM

Herald added a project: Restricted Project. Oct 10 2019, 7:38 AM

etiotto edited the summary of this revision. Oct 10 2019, 7:39 AM

The motivation for this patch was discussed at the latest meeting of the LLVM loop group (https://ibm.box.com/s/xn1sfd80jkpfssiqhal6t88yvzxebcex).

I think it would be great to summarize the motivation in the summary as well in a few sentences.

The motivation for this patch was discussed at the latest meeting of the LLVM loop group (https://ibm.box.com/s/xn1sfd80jkpfssiqhal6t88yvzxebcex).

Please change the link to: https://ibm.box.com/v/llvm-loop-nest-analysis

typo in title: [LoopNext] -> [LoopNest]

etiotto added a reviewer: ppc-slack. Oct 10 2019, 1:24 PM

etiotto edited the summary of this revision.

etiotto retitled this revision from [LoopNext]: Analysis to discover properties of a loop nest. to [LoopNest]: Analysis to discover properties of a loop nest.

etiotto edited the summary of this revision. Oct 10 2019, 1:52 PM

[serious] I think this pass should determine the maximum set of loops that are perfectly nested, as getMaxPerfectDepth returns. This would make

for (int i)
  for (int j) {
    preStmt();
    for (int k) ...
    postStmt();
  }

a perfectly nested loop nest of depth 2 with a body that contains a loop. In fact, any loop would be perfectly nested, but possibly just with depth one. This would avoid different analysis results for the above loop nest and

for (int i)
  for (int j) {
    OutlinedBody(i,j);
  }
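
To make the "maximum perfectly nested depth" reading above concrete, here is a minimal sketch on a toy loop tree. `ToyLoop`, `HasOwnStatements` and `maxPerfectDepth` are hypothetical names invented for this illustration; they are not the LLVM `Loop` API and not the patch's implementation, which works on IR and also has to reason about guards and safe-to-move instructions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for llvm::Loop (an assumption for illustration only).
struct ToyLoop {
  std::vector<ToyLoop *> SubLoops; // loops nested directly inside this one
  bool HasOwnStatements = false;   // body code outside of any subloop
};

// Count how many loops, starting at Root, form a perfectly nested chain:
// every loop on the chain except the last has exactly one subloop and no
// statements of its own. What the last loop's body contains is irrelevant.
unsigned maxPerfectDepth(const ToyLoop &Root) {
  unsigned Depth = 1;
  const ToyLoop *Cur = &Root;
  while (Cur->SubLoops.size() == 1 && !Cur->HasOwnStatements) {
    Cur = Cur->SubLoops.front();
    ++Depth;
  }
  return Depth;
}
```

Under these assumptions both nests above get the same answer: whether the `j` body contains a `k` loop surrounded by statements or just a call, the walk stops at `j` and reports a depth of 2.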

[serious] There could be irreducible control flow between two loops, which needs to be ruled out.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
66	[nit] outermost is a proper English word, so I suggest: `getOutermostLoop()`
78	[typo] breadth
125	[discussion] I understand "perfectly nested" as a property of a set of loops. That is, in `for (int i) for (int j) { preStmt(); for (int k) ... postStmt(); }` I'd understand the loops `i` and `j` to be a perfect loop nest. What code is in the body does not matter.
llvm/lib/Analysis/LoopNestAnalysis.cpp
24	Is this an established pattern? Never seen this before, but looks practical. The other pattern is `VerboseFusionDebugging`. Maybe we should agree on one pattern.
71	[unrelated to this patch] Why isn't this prefixed with `LLVM`? [suggestion] We could `#define LLVM_DEBUG_VERBOSE(...) DEBUG_WITH_TYPE(DEBUG_TYPE "-verbose", __VA_ARGS__)`
93–95	[serious] Should this just use `llvm::isSafeToSpeculativelyExecute`? It basically contains anything that could just be moved into the innermost loop such that it is executed multiple times.
178–181	[serious] This looks inconsistent with `getMaxPerfectDepth()`, which understands "perfectly nested" as a property of a set of loops. That is, in `for (int i) for (int j) { preStmt(); for (int k) ... postStmt(); }` I'd understand the loops `i` and `j` to be a perfect loop nest. What code is in the body does not matter. It doesn't seem to be used, so I'd just remove this function.

I am very much not convinced of either this analysis or the motivation.

First, restricting transformations to perfectly nested loops when there are known techniques to handle imperfect loop nests does not seem worthwhile. As such, I strongly reject the notion that we should make perfect nesting a requirement for all of our loop transformations.

Secondly, even if I accept the premise that perfect nesting is useful, I do not see the value of this analysis pass. We have an existing description of a loop and loop nest in loop info. How is this anything more than a set of helper function implemented over the existing analysis? (i.e. LoopNestUtils.hpp/cpp?)

In D68789#1723852, @reames wrote:

First, restricting transformations to perfectly nested loops when there are known techniques to handle imperfect loop nests does not seem worthwhile. As such, I strongly reject the notion that we should make perfect nesting a requirement for all of our loop transformations.

I'd see it as a tool for transformations that need perfect loop nests (interchange, tiling, unrollandjam, ...) rather than as a new requirement.

Secondly, even if I accept the premise that perfect nesting is useful, I do not see the value of this analysis pass. We have an existing description of a loop and loop nest in loop info. How is this anything more than a set of helper function implemented over the existing analysis? (i.e. LoopNestUtils.hpp/cpp?)

We already had this discussion in D63459. In this case, I don't have a strong opinion on whether it should be a LoopAnalysis or a utility. An advantage of an analysis could be that passes may mark it as preserved (e.g. when they don't change the loop nest) rather than it being recomputed.

etiotto marked 11 inline comments as done. Oct 29 2019, 1:48 PM

etiotto added inline comments.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
66	ok, I will also rename getInnerMost() to getInnermost() for consistency.
125	Correct. In your example loops i and j are perfectly nested with respect to each other. Loops j and k, on the other hand, are not perfectly nested because of the presence of stmts between the loops. The entire loop nest is imperfect because of that. So the isPerfect member function returns true if and only if every adjacent pair of loops in the nest is perfectly nested with respect to each other (not only the outermost 2 loops). That is, the only loop containing user code is the innermost loop.
llvm/lib/Analysis/LoopNestAnalysis.cpp
24	I personally like this one because it is shorter than --debug-only=loopnest-verbose-debug.... but if ppl object then I can change it.
71	Yep I like it. Would be useful to have the LLVM_DEBUG_VERBOSE macro to drive consistency in how debug traces could be toggled. It would be even better to have a way to pass a verbosity level to the debug/tracing infrastructure. Users would then be able to select which level of verbosity to associate with a particular debug stmt: #define LLVM_DEBUG_VERBOSE(DbgLvl, ...) DEBUG_WITH_TYPE(DEBUG_TYPE "-verbose-" + DbgLvl, __VA_ARGS__)
178–181	This function would return false for the `j` loop in the example because the loop nest it 'roots' is imperfect, given that there is code between loops `j` and `k`. It would also return false for loop `i`, given that its containing loop nest is imperfect. Does that answer your question?
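
As a side note on the verbosity-level macro proposed for line 71: in C++, `"literal" + DbgLvl` does not build a new string at preprocessing time, so one possible way to get a per-level debug type is to stringify the level and rely on adjacent string literal concatenation. The sketch below is an assumption-laden illustration, not LLVM code: `TOY_DEBUG_WITH_TYPE`, `TOY_DEBUG_VERBOSE`, `TOY_STR` and `EnabledDebugType` are hypothetical stand-ins so that the example compiles on its own without `llvm/Support/Debug.h`.

```cpp
#include <cassert>
#include <string>

#define DEBUG_TYPE "loopnest"

// Hypothetical stand-in for the currently enabled -debug-only type.
static std::string EnabledDebugType;

// Hypothetical stand-in for DEBUG_WITH_TYPE: run the statement only when
// the requested debug type is the enabled one.
#define TOY_DEBUG_WITH_TYPE(TYPE, ...)                                         \
  do {                                                                         \
    if (EnabledDebugType == (TYPE)) {                                          \
      __VA_ARGS__;                                                             \
    }                                                                          \
  } while (false)

// Two-step stringification so that a macro passed as the level is expanded
// before it is turned into a string literal.
#define TOY_STR_IMPL(X) #X
#define TOY_STR(X) TOY_STR_IMPL(X)

// Adjacent string literals are concatenated by the compiler, which yields
// e.g. "loopnest-verbose-2" for level 2.
#define TOY_DEBUG_VERBOSE(DbgLvl, ...)                                         \
  TOY_DEBUG_WITH_TYPE(DEBUG_TYPE "-verbose-" TOY_STR(DbgLvl), __VA_ARGS__)
```

With this shape, each verbosity level is a separate debug type, so enabling level 2 does not implicitly enable level 1; whether levels should be cumulative is a design choice the thread leaves open.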

Partially address code review comments from @Meinersbur.

Meinersbur added inline comments. Oct 30 2019, 9:18 AM

llvm/include/llvm/Analysis/LoopNestAnalysis.h
125	I am arguing that we should remove `isPerfect` entirely and replace it with something that returns the maximum perfect loop nest depth. This avoids a difference depending on whether, in `for (int i) for (int j) { OutlinedBody(i,j); }`, the function `OutlinedBody` is inlined or not. Also, since this is a loop analysis, there should be a function determining whether the currently analyzed loop is the outermost loop of the nest. A loop pass could use this to skip optimizing a loop nest of depth 2 when, by including another outer loop, it could optimize a loop nest of depth 3 (since the order of optimization is determined by the LoopPassManager and can be arbitrary).

bmahjour added inline comments. Nov 1 2019, 11:18 AM

llvm/lib/Analysis/LoopNestAnalysis.cpp
93–95	`isSafeToSpeculativelyExecute` returns false for phi nodes, branches and most integer divisions. We need to allow phis and branches due to inner loop induction and control. As far as I know integer divisions are safe to speculate most of the time, except on some platforms where division by zero generates a trap. It puzzles me why `isSafeToSpeculativelyExecute` checks for integer division but completely ignores floating-point division, which would have visible side effects on more platforms!

bmahjour added a subscriber: ppc-slack. Nov 1 2019, 11:20 AM

[serious] Could you add a test with early exits, e.g.

for (i = 0; i<ni; ++i) {
    for (j=0; j<nj; j+=1) {
       if (c)
        return i+j;
    }
}

llvm/lib/Analysis/LoopNestAnalysis.cpp
93–95	I guess control flow instructions are out of the scope of `isSafeToSpeculativelyExecute`, but could be checked explicitly. `isSafeToSpeculativelyExecute` should be the reference for whether a non-control-flow instruction can be executed speculatively/redundantly. If it returns true for something that has side effects, it's a bug, not just when called here. If it can be improved, then we can improve it, for instance by passing the target to determine whether division by zero is undefined or poison (by the LangRef, it's always undefined). Another question is whether `isSafeToSpeculativelyExecute` is what we need. It may return true on loads, but I'd exclude loads here (which may give us problems since LICM likes to move loads out of the innermost loop). If something can be moved into the innermost loop, then it does not matter whether it might be undefined behavior, since execution of the innermost loop is required for anything in the innermost loop to execute. On the other side, hoisting out of the outermost loop might be required if needed to compute the iteration count of the innermost loop (e.g. for loop interchange). We cannot risk a division by zero occurring if it did not occur in the original loop nest. I think `isSafeToSpeculativelyExecute` is a safe default and we may think about loosening the requirements once we know what should happen with instructions of this kind for loop nest transformations. It's also better to have a more central place for the property we need.
llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll
225	Does the guard condition have to be consistent with the loop exit condition? I.e., is `if (5<nj)` a guard in the following?
    for (i = 0; i<ni; ++i) {
      if (5<nj) { // is this a guard?
        for (j=0; j<nj; j+=1)
          x+=(i+j);
      }
    }

bryanpkc added a subscriber: bryanpkc. Nov 6 2019, 8:22 AM

etiotto marked 8 inline comments as done. Nov 20 2019, 9:30 AM

etiotto added inline comments.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
125	Hi Michael, the isPerfect() member function is IMO convenient for consumers that want to know whether the entire loop nest is perfect (from a syntactic standpoint). However, it is not strictly necessary, so I'll remove it and add this member function:
    /// Return true if the given loop \p L is the outermost loop in the nest.
    bool isOutermost(const Loop &L) const;
llvm/lib/Analysis/LoopNestAnalysis.cpp
93–95	In the next patch I will use isSafeToSpeculativelyExecute as suggested, and augment it for PHI nodes and cmp instructions.
llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll
225	I rely on the LoopInfo getLoopGuardBranch() member function to determine whether (5<nj) is a guard for the inner loop (it is currently considered to be a guard). So the answer is yes, if (5<nj) would be considered a guard and therefore the nest considered perfect.

Addressing code review comments from Michael.

[suggestion] Add a method that returns/fills a vector with all the Loop*s that are part of the perfect loop nest.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
125	In my domain it is very common to have additional loops inside otherwise perfectly nested loops (e.g. stencils). Here is a simple example of a grayscale conversion:
    for (int y = 0; y < Y; ++y)
      for (int x = 0; x < X; ++x) {
        double luminance = 0;
        for (int c = 0; c < 3; ++c)
          luminance += colored[x][y][c] / 3.0;
        grayscale[x][y] = luminance;
      }
For optimization, we would e.g. interchange the x and y loops. If the interchange uses `isPerfect` to determine whether the loop can be transformed, it will have to bail out (let's ignore that the inner loop could be unrolled before loop interchange; many stencils in HPC have inner loops with dynamic trip count or are too large). That is, using `isPerfect` would stop many HPC applications, maybe even a majority of our programs in production, from being optimized. For the syntactic property of two loops having no in-between code (with side effects), what's inside the innermost body should be irrelevant and it would be surprising to me if it were not. This is why I'd not even include `isPerfect` in the API. If a pass really cannot handle nested loops, it can still ask LoopInfo's innermost `Loop` whether it contains another loop. This would be an explicit restriction by the pass.

In D68789#1753777, @Meinersbur wrote:

[suggestion] Add a method that returns/fills a vector with all the Loop*s that are part of the perfect loop nest.

That sounds like a good idea. I'd actually suggest returning a vector of vectors in case there are multiple perfect sub-nests in a loop nest, i.e.:

L1
  L2
   <code>
    L3
      L4
        L5

return:
{{L1,L2},{L3,L4,L5}}

In D68789#1753797, @bmahjour wrote:
In D68789#1753777, @Meinersbur wrote:

[suggestion] Add a method that returns/fills a vector with all the Loop*s that are part of the perfect loop nest.

That sounds like a good idea. I'd actually suggest returning a vector of vectors in case there are multiple perfect sub-nests in a loop nest, i.e.:
L1
  L2
   <code>
    L3
      L4
        L5
return:
{{L1,L2},{L3,L4,L5}}

I see how that would be useful. Assuming data dependencies are preserved, such a vector could be used to interchange loops in the 2 groups ({L1,L2} and {L3,L4,L5}). I will add this functionality in the next patch.

Based on the feedback received during code review I have added a new public member function called 'getPerfectLoops' which can be used to retrieve a list of loops that are perfect with respect to each other.
For example, given the following loop nest containing 4 loops, 'getPerfectLoops' would return {{L1,L2},{L3,L4}}.

for(i) // L1
  for(j) // L2
    <code>
    for(k) // L3
       for(l) // L4
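
The grouping described above can be sketched on a toy loop tree as follows. This is only an illustration of the idea under stated assumptions: `ToyLoop`, `HasOwnStatements`, `LoopGroup` and `collectPerfectGroups` are hypothetical names, and the real `getPerfectLoops` in the patch operates on `llvm::Loop` with `ScalarEvolution` and may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Loop (an assumption for illustration only).
struct ToyLoop {
  std::string Name;
  std::vector<ToyLoop *> SubLoops; // loops nested directly inside this one
  bool HasOwnStatements = false;   // body code outside of any subloop
};

using LoopGroup = std::vector<const ToyLoop *>;

// Starting at Root, extend the current group while the chain stays perfect
// (exactly one subloop and no intervening statements). When the chain
// breaks, close the group and start a new one at each remaining subloop.
void collectPerfectGroups(const ToyLoop &Root, std::vector<LoopGroup> &Groups) {
  LoopGroup Group{&Root};
  const ToyLoop *Cur = &Root;
  while (Cur->SubLoops.size() == 1 && !Cur->HasOwnStatements) {
    Cur = Cur->SubLoops.front();
    Group.push_back(Cur);
  }
  Groups.push_back(Group);
  for (const ToyLoop *Sub : Cur->SubLoops)
    collectPerfectGroups(*Sub, Groups);
}
```

For the four-loop nest above, with `HasOwnStatements` set on L2 to model the `<code>` between L2 and L3, this sketch produces the groups {L1,L2} and {L3,L4}.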

etiotto marked 2 inline comments as done. Dec 10 2019, 1:43 PM

etiotto added inline comments.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
125	@Meinersbur I have uploaded a patch that adds a function which can be used to retrieve the lists of loops that are perfect with respect to each other in the nest. In your example it would return loop-y and loop-x in one list (and loop-c in another list by itself). Please take a look, it should address your concern.

I'd only store the list of perfectly nested loops that have the analysis loop as outermost loop in the analysis results. This should be the most expensive analysis that might be worth caching/preserving between passes.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
35–36	Could you clarify whether this allows other perfectly nested loops between outer and inner? E.g.
    for outer:
      for middle:
        for inner:
37–38	[suggestion] Did you consider moving the static methods to LoopUtils?
63	[serious] As already mentioned, I don't see a use case for this. Transformations should not have more requirements than necessary, including the content of the innermost loop of a perfect loop nest, and I don't see how the presence of a nested loop should be a requirement of any transformation. Please either try to convince me otherwise or remove it to avoid it being used accidentally. Same applies to `getNestDepth`.
68–69	[subjective] I'd prefer `*Loops.front()` over `const_cast`.
103	[suggestion] Return a `ArrayRef<Loop*>`. Returning `LoopVector` exposes more implementation details than necessary.
117	[typo] "i.e." (that is) instead of "e.g." (for example)
llvm/lib/Analysis/LoopNestAnalysis.cpp
31–35	[discussion] The analysis looks like a cache for the perfect loop depth and the loops in breadth-first order. Everything else is computed on-the-fly. Does this justify being a pass? [serious] I also don't see the advantage of storing the loop list over iterating the LoopInfo structure itself when needed. Is it for speed? Also note that loop analysis might be instantiated for every loop, each storing its own list.
92	[name] `containsOnlySafeInstructions`
218–222	[style] The LLVM code base does not use `const` for function-local variables.
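
For readers unfamiliar with the "loops in breadth-first order" list mentioned in the comment on lines 31–35, the sketch below shows one way such a list could be built and how the depth comparison used by `getInnermostLoop` in the diff reads on it. `ToyLoop`, `collectBreadthFirst` and `uniqueInnermost` are hypothetical names chosen for this illustration; this is not the code under review.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::Loop (an assumption for illustration only).
struct ToyLoop {
  unsigned Depth = 1;              // nesting depth, 1 for the root
  std::vector<ToyLoop *> SubLoops; // loops nested directly inside this one
};

// Collect Root and every loop nested in it, level by level.
std::vector<ToyLoop *> collectBreadthFirst(ToyLoop &Root) {
  std::vector<ToyLoop *> Loops{&Root};
  for (std::size_t I = 0; I < Loops.size(); ++I)
    for (ToyLoop *Sub : Loops[I]->SubLoops)
      Loops.push_back(Sub);
  return Loops;
}

// Mirror of the depth comparison in the diff: with a breadth-first list,
// the last entry is a deepest loop; if the entry before it has the same
// depth, the deepest loop is not unique and nullptr is returned.
// Assumes Loops is non-empty.
ToyLoop *uniqueInnermost(const std::vector<ToyLoop *> &Loops) {
  if (Loops.size() == 1)
    return Loops.back();
  ToyLoop *Last = Loops.back();
  ToyLoop *SecondLast = Loops[Loops.size() - 2];
  return Last->Depth == SecondLast->Depth ? nullptr : Last;
}
```

The list itself is cheap to rebuild from the loop tree, which is the substance of the reviewer's question about whether caching it justifies a separate analysis pass.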

etiotto marked 17 inline comments as done. Dec 18 2019, 1:47 PM

etiotto added inline comments.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
35–36	Added example in a comment.
37–38	Yes, I think it is nice to keep all the code related to the loop nest analysis grouped together in this class.
63	OK removing these 2 static functions
68–69	ok, done
llvm/lib/Analysis/LoopNestAnalysis.cpp
31–35	OK I will change this from an analysis pass to a utility class in the next revision. Perhaps over time, as we find the need to add more functionality to this class, we can revisit this point and determine whether it makes more sense to have an analysis pass.
218–222	I am trying to ensure the basic blocks are not mutated in the function. I am not sure what the rationale was (in the LLVM code base) for not wanting to const qualify function-local variables if they are intended to remain immutable ... perhaps the thinking was that the original author of the code would not make a mistake and modify something that was intended to stay immutable, but I think it is safer to const qualify variables to prevent unintended changes down the road.

Addressing code review comments.

ping

You may have thought about `ArrayRef<const Loop*> getLoops() const` instead of `const ArrayRef<Loop*> getLoops() const`.

However, I think the const qualifier on the method applies to the analysis result, not the loops it holds (which are the analysis result of LoopInfo). The loops are identity objects: modifying them does not change which loops they represent. To create a LoopNest, one requires a non-const Loop* object anyway.

A language lawyer might say that when requiring a non-const Loop* at construction, we already know that the Loop* is stored in modifiable memory (not a const global), hence can be modified.
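
The point about const-ness can be illustrated with a small stand-alone sketch. `ToyLoop` and `ToyLoopNest` are hypothetical types made up for this example (using `std::vector` in place of LLVM's containers); they only demonstrate the C++ rule that `const` on a container of pointers does not make the pointees const.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::Loop (an assumption for illustration only).
struct ToyLoop {
  int NumBlocks = 0;
};

// Hypothetical stand-in for LoopNest: it stores, but does not own, the loops.
class ToyLoopNest {
public:
  explicit ToyLoopNest(ToyLoop *Root) : Loops{Root} {}

  // `const` here refers to the nest (the stored list), not to the loops the
  // list points to, so a single const accessor returning a non-const
  // pointer is sufficient.
  ToyLoop *getLoop(std::size_t Index) const {
    assert(Index < Loops.size() && "Index is out of bounds");
    return Loops[Index];
  }

  std::size_t getNumLoops() const { return Loops.size(); }

private:
  std::vector<ToyLoop *> Loops;
};
```

Given a `const ToyLoopNest`, a caller can still write `Nest.getLoop(0)->NumBlocks = 3;`, which is why a separate accessor returning `const ToyLoop *` adds no real protection.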

llvm/include/llvm/Analysis/LoopNestAnalysis.h
64	[suggestion] Add a remark that the outermost and the returned are not necessarily perfectly nested.
84–87	[serious] This accessor makes no sense. The same Loop* (non-const) can be obtained from a `const LoopNest` using `getLoops()[Index]`. `LoopNest` does not own the `Loop`s and therefore has no control over its const-ness. Just use a single `Loop *getLoop(unsigned) const` accessor. `const` on `LoopNest` refers to the constness of the analysis result. The `Loop`s themselves are identity objects: after e.g. removing a BB out of its lists (e.g. because it is dead), it still represents the same loop. Same applies to `getOutermostLoop()`. To make the accessor useful, also provide a `getNumLoops()` member (or just remove all of it and require users to use `getLoops()`).
94	[style] `const ArrayRef` is useless. It's an r-value/temporary.
140–141	[style] Since it is private, move it as a free function into the .cpp file. No need to expose it in a header file.
145	[typo] breadth
llvm/lib/Analysis/LoopNestAnalysis.cpp
32–33	I still think storing a list of all loops doesn't have a lot of benefits.

Addressed review comments from @Meinersbur

etiotto added inline comments. Feb 12 2020, 7:05 AM

llvm/include/llvm/Analysis/LoopNestAnalysis.h
84–87	Removed the member functions that return a const Loop. Added `getNumLoops()`.
140–141	ok
llvm/lib/Analysis/LoopNestAnalysis.cpp
32–33	I think it is useful for the transformation to know which loops are in the loop nest. It is used in several places. For example, it is handy to have a list of loops in the nest to check whether they are all in simplified form. It is also used to determine the nest depth, etc.

LGTM

llvm/include/llvm/Analysis/LoopNestAnalysis.h
35–36	What I meant is what does it return for:
    for i
      for j
        for k
`arePerfectlyNested(loop_i, loop_k, SE) == ?`
87	[style] `SmallVector::size()` is of type `size_t`, why not also return `size_t`?
114	[suggestion] Add an assertion ensuring that the result would not be negative.

This revision is now accepted and ready to land. Feb 17 2020, 9:25 AM

Closed by commit rG3a063d68e3c9: [LoopNest]: Analysis to discover properties of a loop nest. (authored by Whitney). Mar 3 2020, 5:29 AM

This revision was automatically updated to reflect the committed changes.

Looks like this broke the build with modules enabled: http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .

Looks like this broke the build with modules enabled: http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .

Reverted for now
https://reviews.llvm.org/rG613f791131ee6911f3cbb0c52245335ecfd791af

ychen added a subscriber: ychen. Jul 29 2020, 12:50 PM

I have trouble finding the definition of LoopNestAnalysis::run.

In D68789#2216266, @ychen wrote:

I have trouble finding the definition of LoopNestAnalysis::run.

The reason is that this code is not an analysis pass (it was at the start of code review but ppl felt that it would be better to make it a utility rather than an analysis proper).

In D68789#2216803, @etiotto wrote:

In D68789#2216266, @ychen wrote:

I have trouble finding the definition of LoopNestAnalysis::run.

The reason is that this code is not an analysis pass (it was at the start of code review but ppl felt that it would be better to make it a utility rather than an analysis proper).

I see. So the declaration is dead code?

Revision Contents

Path                                                    Size
llvm/include/llvm/Analysis/LoopNestAnalysis.h           174 lines
llvm/lib/Analysis/CMakeLists.txt                        1 line
llvm/lib/Analysis/LoopNestAnalysis.cpp                  284 lines
llvm/lib/Passes/PassBuilder.cpp                         1 line
llvm/lib/Passes/PassRegistry.def                        1 line
llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll    493 lines
llvm/test/Analysis/LoopNestAnalysis/infinite.ll         35 lines
llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll      275 lines
llvm/unittests/Analysis/CMakeLists.txt                  1 line
llvm/unittests/Analysis/LoopNestTest.cpp                194 lines

Diff 234611

llvm/include/llvm/Analysis/LoopNestAnalysis.h

This file was added.

				//===- llvm/Analysis/LoopNestAnalysis.h ------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// This file defines the interface for the loop nest analysis.
				///
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ANALYSIS_LOOPNESTANALYSIS_H
				#define LLVM_ANALYSIS_LOOPNESTANALYSIS_H

				#include "llvm/Analysis/LoopPass.h"
				#include "llvm/Analysis/ScalarEvolution.h"
				#include "llvm/Transforms/Scalar/LoopPassManager.h"

				namespace llvm {

				using LoopVectorTy = SmallVector<Loop *, 8>;

				/// This class represents a loop nest and can be used to query its properties.
				class LoopNest {
				public:
				/// Construct a loop nest rooted by loop \p Root.
				LoopNest(Loop &Root, ScalarEvolution &SE);

				LoopNest() = delete;
				LoopNest &operator=(const LoopNest &) = delete;

				/// Construct a LoopNest object.
				static std::unique_ptr<LoopNest> getLoopNest(Loop &Root, ScalarEvolution &SE);

				/// Return true if the given loops \p OuterLoop and \p InnerLoop are
				/// perfectly nested with respect to each other, and false otherwise.
				/// Example:
				/// \code
				/// for(i)
				/// for(j)
				/// \endcode
				/// arePerfectlyNested(loop_i, loop_j, SE) would return true.
				static bool arePerfectlyNested(const Loop &OuterLoop, const Loop &InnerLoop,
				ScalarEvolution &SE);

				/// Return the maximum nesting depth of the loop nest rooted by loop \p Root.
				/// For example given the loop nest:
				/// \code
				/// for(i) // loop at level 1 and Root of the nest
				/// for(j) // loop at level 2
				/// <code>
				/// for(k) // loop at level 3
				/// \endcode
				/// getMaxPerfectDepth(Loop_i) would return 2.
				static unsigned getMaxPerfectDepth(const Loop &Root, ScalarEvolution &SE);

				/// Return the outermost loop in the loop nest.
				const Loop &getOutermostLoop() const { return *Loops.front(); }
				Loop &getOutermostLoop() { return *Loops.front(); }

				/// Return the innermost loop in the loop nest if the nest has only one
				/// innermost loop, and a nullptr otherwise.
				const Loop *getInnermostLoop() const {
				if (Loops.size() == 1)
				return Loops.back();

				// The loops in the 'Loops' vector have been collected in breadth first
				// order, therefore if the last 2 loops in it have the same nesting depth
				// there isn't a unique innermost loop in the nest.
				const Loop *LastLoop = Loops.back();
				auto SecondLastLoopIter = ++Loops.rbegin();
				return (LastLoop->getLoopDepth() == (*SecondLastLoopIter)->getLoopDepth())
				? nullptr
				: LastLoop;
				}
				Loop *getInnermostLoop() {
				return const_cast<Loop *>(
				static_cast<const LoopNest &>(*this).getInnermostLoop());
				}

				/// Return the loop at the given \p Index.
				const Loop *getLoop(unsigned Index) const {
				assert(Index < Loops.size() && "Index is out of bounds");
				return Loops[Index];
				}
				Loop *getLoop(unsigned Index) {
				return const_cast<Loop *>(
				static_cast<const LoopNest &>(*this).getLoop(Index));
				}

				/// Get the loops in the nest.
				const ArrayRef<Loop*> getLoops() const { return Loops; }

				/// Retrieve a vector of perfect loop nests contained in the current loop
				/// nest. For example, given the following nest containing 4 loops, this
				/// member function would return {{L1,L2},{L3,L4}}.
				/// \code
				/// for(i) // L1
				/// for(j) // L2
				/// <code>
				/// for(k) // L3
				/// for(l) // L4
				/// \endcode
				SmallVector<LoopVectorTy, 4> getPerfectLoops(ScalarEvolution &SE) const;

				/// Return the loop nest depth (i.e. the loop depth of the 'deepest' loop)
				/// For example given the loop nest:
				/// \code
				/// for(i) // loop at level 1 and Root of the nest
				/// for(j1) // loop at level 2
				/// for(k) // loop at level 3
				/// for(j2) // loop at level 2
				Meinersbur (Not Done): [suggestion] Add an assertion ensuring that the result would not be negative.
				/// \endcode
				/// getNestDepth() would return 3.
				unsigned getNestDepth() const {
				Meinersbur (Done): [typo] "i.e." (that is) instead of "e.g." (for example)
				return (Loops.back()->getLoopDepth() - Loops.front()->getLoopDepth() + 1);
				}
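
As a rough illustration of the depth computation and of the assertion suggested in the review, here is a standalone sketch. `ToyLoop` and `toyNestDepth` are hypothetical names, and the list is assumed to be in breadth-first order (root first, a deepest loop last), as the member comment on `Loops` states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a loop; Depth plays the role of getLoopDepth(),
// where 1 denotes a top-level loop.
struct ToyLoop {
  unsigned Depth;
};

// Nest depth = depth of the deepest loop relative to the root, counting the
// root itself. Assumes Loops is in breadth-first order.
std::size_t toyNestDepth(const std::vector<const ToyLoop *> &Loops) {
  assert(!Loops.empty() && "a nest contains at least its root loop");
  unsigned Root = Loops.front()->Depth;
  unsigned Deepest = Loops.back()->Depth;
  // Guards the unsigned subtraction below against wrapping around.
  assert(Deepest >= Root && "breadth-first order puts a deepest loop last");
  return static_cast<std::size_t>(Deepest - Root) + 1;
}
```

For the nest in the comment above (`i` at level 1, `j1` and `j2` at level 2, `k` at level 3), the breadth-first list is `i, j1, j2, k` and the sketch yields 3.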

				/// Return the maximum perfect nesting depth.
				unsigned getMaxPerfectDepth() const { return MaxPerfectDepth; }

				/// Return true if all loops in the loop nest are in simplify form.
				bool areAllLoopsSimplifyForm() const {
				Meinersbur (Done): [discussion] I understand "perfectly nested" as a property of a set of loops. That is, in `for (int i) for (int j) { preStmt(); for (int k) ... postStmt(); }` I'd understand the loops `i` and `j` to be a perfect loop nest. What code is in the body does not matter.
				etiotto (Author, Done): Correct. In your example loops i and j are perfectly nested with respect to each other. Loops j and k, on the other hand, are not perfectly nested because of the presence of statements between the loops. The entire loop nest is imperfect because of that. So the isPerfect member function returns true if and only if every adjacent pair of loops in the nest is perfectly nested with respect to each other (not only the outermost 2 loops). That is, the only loop containing user code is the innermost loop.
				Meinersbur (Done): I am arguing that we should remove `isPerfect` entirely and replace it with something that returns the maximum perfect loop nest depth, to avoid a difference depending on whether, in `for (int i) for (int j) { OutlinedBody(i,j); }`, the function `OutlinedBody` is inlined or not. Also, since this is a loop analysis, there should be a function determining whether the currently analyzed loop is the outermost loop of the nest. A loop pass could use this to skip optimizing a loop nest of depth 2 when, by including another outer loop, it could optimize a loop nest of depth 3 (since the order of optimization is determined by the LoopPassManager and can be arbitrary).
				etiotto (Author, Done): Hi Michael, the isPerfect() member function is IMO convenient for consumers that want to know whether the entire loop nest is perfect (from a syntactic standpoint). However, it is not strictly necessary, so I'll remove it and add this member function: `/// Return true if the given loop \p L is the outermost loop in the nest.` `bool isOutermost(const Loop &L) const;`
				Meinersbur (Done): In my domain it is very common to have additional loops inside otherwise perfectly nested loops (e.g. stencils). Here is a simple example of a grayscale conversion: `for (int y = 0; y < Y; ++y) for (int x = 0; x < X; ++x) { double luminance = 0; for (int c = 0; c < 3; ++c) luminance += colored[x][y][c] / 3.0; grayscale[x][y] = luminance; }` For optimization, we would e.g. interchange the x and y loops. If the interchange uses `isPerfect` to determine whether the loop can be transformed, it will have to bail out (let's ignore that the inner loop could be unrolled before loop interchange; many stencils in HPC have inner loops with dynamic trip count or are too large). That is, using `isPerfect` would stop many HPC applications, maybe even a majority of our programs in production, from being optimized. For the syntactic property of two loops having no in-between code (with side effects), what's inside the innermost body should be irrelevant, and it would be surprising to me if it were not. This is why I'd prefer not to include `isPerfect` in the API at all. If a pass really cannot handle nested loops, it can still ask LoopInfo's innermost `Loop` whether it contains another loop. This would be an explicit restriction by the pass.
				etiotto (Author, Done): @Meinersbur I have uploaded a patch that adds a function which can be used to retrieve the lists of loops that are perfectly nested within each other in the nest. In your example it would return loop-y and loop-x in one list (and loop-c in another list by itself). Please take a look, it should address your concern.
				return llvm::all_of(Loops,
				[](const Loop *L) { return L->isLoopSimplifyForm(); });
				}

				private:
				/// Determine whether the loops structure violates basic requirements for
				/// perfect nesting:
				/// - the inner loop should be the outer loop's only child
				/// - the outer loop header should 'flow' into the inner loop preheader
				/// or jump around the inner loop to the outer loop latch
				/// - if the inner loop latch exits the inner loop, it should 'flow' into
				/// the outer loop latch.
				/// Returns true if the loop structure satisfies the basic requirements and
				/// false otherwise.
				static bool checkLoopsStructure(const Loop &OuterLoop, const Loop &InnerLoop,
				ScalarEvolution &SE);
				Meinersbur (Done): [style] Since it is private, move it as a free function into the .cpp file. No need to expose it in a header file.
				etiotto (Author, Done): ok

				protected:
				const unsigned MaxPerfectDepth; // maximum perfect nesting depth level.
				LoopVectorTy Loops; // the loops in the nest (in breath first order).
				Meinersbur (Done): [typo] breadth
				};

				raw_ostream &operator<<(raw_ostream &, const LoopNest &);

				/// This analysis provides information for a loop nest. The analysis runs on
				/// demand and can be initiated via AM.getResult<LoopNestAnalysis>.
				class LoopNestAnalysis : public AnalysisInfoMixin<LoopNestAnalysis> {
				friend AnalysisInfoMixin<LoopNestAnalysis>;
				static AnalysisKey Key;

				public:
				using Result = LoopNest;
				Result run(Loop &L, LoopAnalysisManager &AM, LoopStandardAnalysisResults &AR);
				};

				/// Printer pass for the \c LoopNest results.
				class LoopNestPrinterPass : public PassInfoMixin<LoopNestPrinterPass> {
				raw_ostream &OS;

				public:
				explicit LoopNestPrinterPass(raw_ostream &OS) : OS(OS) {}

				PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,
				LoopStandardAnalysisResults &AR, LPMUpdater &U);
				};

				} // namespace llvm

				#endif // LLVM_ANALYSIS_LOOPNESTANALYSIS_H

llvm/lib/Analysis/CMakeLists.txt

add_llvm_library(LLVMAnalysis
LazyCallGraph.cpp
LazyValueInfo.cpp
LegacyDivergenceAnalysis.cpp
Lint.cpp
Loads.cpp
LoopAccessAnalysis.cpp
LoopAnalysisManager.cpp
LoopCacheAnalysis.cpp
		LoopNestAnalysis.cpp
LoopUnrollAnalyzer.cpp
LoopInfo.cpp
LoopPass.cpp
MemDepPrinter.cpp
MemDerefPrinter.cpp
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp
MemoryLocation.cpp

llvm/lib/Analysis/LoopNestAnalysis.cpp

This file was added.

				//===- LoopNestAnalysis.cpp - Loop Nest Analysis --------------------------==//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// The implementation for the loop nest analysis.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/Analysis/LoopNestAnalysis.h"
				#include "llvm/ADT/BreadthFirstIterator.h"
				#include "llvm/ADT/DepthFirstIterator.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/PostDominators.h"
				#include "llvm/Analysis/ValueTracking.h"

				using namespace llvm;

				#define DEBUG_TYPE "loopnest"
				static const char *VerboseDebug = DEBUG_TYPE "-verbose";
				Meinersbur (Done): Is this an established pattern? Never seen this before, but it looks practical. The other pattern is `VerboseFusionDebugging`. Maybe we should agree on one pattern.
				etiotto (Author, Done): I personally like this one because it is shorter than --debug-only=loopnest-verbose-debug.... but if people object then I can change it.
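
The `VerboseDebug` line relies on adjacent string literals being concatenated at compile time. The sketch below shows that mechanism together with a wrapper macro of the kind proposed later in this review. Every `TOY_`/`toy` name is hypothetical, and the filter is a stand-in for LLVM's debug-type selection, not its implementation.

```cpp
#include <cassert>
#include <cstring>

#define TOY_DEBUG_TYPE "loopnest"

// Adjacent string literals are concatenated by the compiler, so this is the
// single literal "loopnest-verbose".
static const char *ToyVerboseDebug = TOY_DEBUG_TYPE "-verbose";

// Hypothetical stand-in for the debug-type filter: exactly one type is
// enabled at a time, and the empty string enables nothing.
static const char *ToyEnabledType = "";

inline bool toyIsEnabled(const char *Type) {
  return std::strcmp(ToyEnabledType, Type) == 0;
}

// Runs the given statement(s) only when TYPE is the enabled debug type.
#define TOY_DEBUG_WITH_TYPE(TYPE, ...)                                         \
  do {                                                                         \
    if (toyIsEnabled(TYPE)) {                                                  \
      __VA_ARGS__;                                                             \
    }                                                                          \
  } while (false)

// Wrapper that bakes the "-verbose" suffix into the type string.
#define TOY_DEBUG_VERBOSE(...)                                                 \
  TOY_DEBUG_WITH_TYPE(TOY_DEBUG_TYPE "-verbose", __VA_ARGS__)
```

Note that literal concatenation only works with literals: a suffix built at run time (for example with `+`) would need a different mechanism, such as stringizing a macro argument.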

				//===----------------------------------------------------------------------===//
				// LoopNest implementation
				//

				LoopNest::LoopNest(Loop &Root, ScalarEvolution &SE)
				: MaxPerfectDepth(getMaxPerfectDepth(Root, SE)) {
				for (Loop *L : breadth_first(&Root))
				Loops.push_back(L);
				Meinersbur (Done): I still think storing a list of all loops doesn't have a lot of benefits.
				etiotto (Author, Done): I think it is useful for the transformation to know which loops are in the loop nest. It is used in several places. For example, it is handy to have a list of loops in the nest to check whether they are all in simplified form. It is also used to determine the nest depth, etc.
				}

				Meinersbur (Done): [discussion] The analysis looks like a cache for the perfect loop depth and the loops in breadth-first order. Everything else is computed on-the-fly. Does this justify being a pass? [serious] I also don't see the advantage of storing the loop list over iterating the LoopInfo structure itself when needed. Is it for speed? Also note that a loop analysis might be instantiated for every loop, each storing its own list.
				etiotto (Author, Done): OK, I will change this from an analysis pass to a utility class in the next revision. Perhaps over time, as we find the need to add more functionality to this class, we can revisit this point and determine whether it makes more sense to have an analysis pass.
				std::unique_ptr<LoopNest> LoopNest::getLoopNest(Loop &Root,
				ScalarEvolution &SE) {
				return std::make_unique<LoopNest>(Root, SE);
				}

				bool LoopNest::arePerfectlyNested(const Loop &OuterLoop, const Loop &InnerLoop,
				ScalarEvolution &SE) {
				assert(!OuterLoop.getSubLoops().empty() && "Outer loop should have subloops");
				assert(InnerLoop.getParentLoop() && "Inner loop should have a parent");
				LLVM_DEBUG(dbgs() << "Checking whether loop '" << OuterLoop.getName()
				<< "' and '" << InnerLoop.getName()
				<< "' are perfectly nested.\n");

				// Determine whether the loops structure satisfies the following requirements:
				// - the inner loop should be the outer loop's only child
				// - the outer loop header should 'flow' into the inner loop preheader
				// or jump around the inner loop to the outer loop latch
				// - if the inner loop latch exits the inner loop, it should 'flow' into
				// the outer loop latch.
				if (!checkLoopsStructure(OuterLoop, InnerLoop, SE)) {
				LLVM_DEBUG(dbgs() << "Not perfectly nested: invalid loop structure.\n");
				return false;
				}

				// Bail out if we cannot retrieve the outer loop bounds.
				auto OuterLoopLB = OuterLoop.getBounds(SE);
				if (OuterLoopLB == None) {
				LLVM_DEBUG(dbgs() << "Cannot compute loop bounds of OuterLoop: "
				<< OuterLoop << "\n";);
				return false;
				}

				// Identify the outer loop latch comparison instruction.
				const BasicBlock *Latch = OuterLoop.getLoopLatch();
				assert(Latch && "Expecting a valid loop latch");
				const BranchInst *BI = dyn_cast<BranchInst>(Latch->getTerminator());
				Meinersbur (Done): [unrelated to this patch] Why isn't this prefixed with `LLVM`? [suggestion] We could `#define LLVM_DEBUG_VERBOSE(...) DEBUG_WITH_TYPE(DEBUG_TYPE "-verbose", __VA_ARGS__)`
				etiotto (Author, Done): Yep, I like it. It would be useful to have the LLVM_DEBUG_VERBOSE macro to drive consistency in how debug traces can be toggled. It would be even better to have a way to pass a verbosity level to the debug/tracing infrastructure. Users would then be able to select which level of verbosity to associate with a particular debug statement: `#define LLVM_DEBUG_VERBOSE(DbgLvl, ...) DEBUG_WITH_TYPE(DEBUG_TYPE "-verbose-" + DbgLvl, __VA_ARGS__)`
				assert(BI && BI->isConditional() &&
				"Expecting loop latch terminator to be a branch instruction");

				const CmpInst *OuterLoopLatchCmp = dyn_cast<CmpInst>(BI->getCondition());
				DEBUG_WITH_TYPE(VerboseDebug, if (OuterLoopLatchCmp) {
				dbgs() << "Outer loop latch compare instruction: " << *OuterLoopLatchCmp
				<< "\n";
				});

				// Identify the inner loop guard instruction.
				BranchInst *InnerGuard = InnerLoop.getLoopGuardBranch();
				const CmpInst *InnerLoopGuardCmp =
				(InnerGuard) ? dyn_cast<CmpInst>(InnerGuard->getCondition()) : nullptr;

				DEBUG_WITH_TYPE(VerboseDebug, if (InnerLoopGuardCmp) {
				dbgs() << "Inner loop guard compare instruction: " << *InnerLoopGuardCmp
				<< "\n";
				});

				// Determine whether instructions in a basic block are one of:
				// - the inner loop guard comparison
				Meinersbur (Done): [name] `containsOnlySafeInstructions`
				// - the outer loop latch comparison
				// - the outer loop induction variable increment
				// - a phi node, a cast or a branch
				Meinersbur (Done): [serious] Should this just use `llvm::isSafeToSpeculativelyExecute`? It basically contains anything that could just be moved into the innermost loop such that it is executed multiple times.
				bmahjour (Done): `isSafeToSpeculativelyExecute` returns false for phi nodes, branches and most integer divisions. We need to allow phi nodes and branches due to inner loop induction and control. As far as I know, integer divisions are safe to speculate most of the time, except on some platforms where division by zero generates a trap. It puzzles me why `isSafeToSpeculativelyExecute` checks for integer division but completely ignores floating-point division, which would have visible side effects on more platforms!
				Meinersbur (Done): I guess control flow instructions are out of the scope of `isSafeToSpeculativelyExecute`, but could be checked explicitly. `isSafeToSpeculativelyExecute` should be the reference for whether a non-control-flow instruction can be executed speculatively/redundantly. If it returns true for something that has side effects, it's a bug, not just when called here. If it can be improved, then we can improve it, for instance by passing the target to determine whether division by zero is undefined or poison (by the LangRef, it's always undefined). Another question is whether `isSafeToSpeculativelyExecute` is what we need. It may return true on loads, but I'd exclude loads here (which may give us problems since LICM likes to move loads out of the innermost loop). If something can be moved into the innermost loop, then it does not matter whether it might be undefined behavior, since execution of the innermost loop is required for anything in the innermost loop to execute. On the other hand, hoisting out of the outermost loop might be required to compute the iteration count of the innermost loop (e.g. for loop interchange). We cannot risk a division by zero occurring if it did not occur in the original loop nest. I think `isSafeToSpeculativelyExecute` is a safe default, and we may think about loosening the requirements once we know what should happen with instructions of this kind for loop nest transformations. It's also better to have a more central place for the property we need.
				etiotto (Author, Done): In the next patch I will use isSafeToSpeculativelyExecute as suggested, and augment it for PHI nodes and cmp instructions.
				auto containsOnlySafeInstructions = [&](const BasicBlock &BB) {
				return llvm::all_of(BB, [&](const Instruction &I) {
				bool isAllowed = isSafeToSpeculativelyExecute(&I) || isa<PHINode>(I) ||
				isa<BranchInst>(I);
				if (!isAllowed) {
				DEBUG_WITH_TYPE(VerboseDebug, {
				dbgs() << "Instruction: " << I << "\nin basic block: " << BB
				<< " is considered unsafe.\n";
				});
				return false;
				}

				// The only binary instruction allowed is the outer loop step instruction,
				// the only comparison instructions allowed are the inner loop guard
				// compare instruction and the outer loop latch compare instruction.
				if ((isa<BinaryOperator>(I) && &I != &OuterLoopLB->getStepInst()) ||
				(isa<CmpInst>(I) && &I != OuterLoopLatchCmp &&
				&I != InnerLoopGuardCmp)) {
				DEBUG_WITH_TYPE(VerboseDebug, {
				dbgs() << "Instruction: " << I << "\nin basic block:" << BB
				<< "is unsafe.\n";
				});
				return false;
				}
				return true;
				});
				};

				// Check the code surrounding the inner loop for instructions that are deemed
				// unsafe.
				const BasicBlock *OuterLoopHeader = OuterLoop.getHeader();
				const BasicBlock *OuterLoopLatch = OuterLoop.getLoopLatch();
				const BasicBlock *InnerLoopPreHeader = InnerLoop.getLoopPreheader();

				if (!containsOnlySafeInstructions(*OuterLoopHeader) ||
				!containsOnlySafeInstructions(*OuterLoopLatch) ||
				(InnerLoopPreHeader != OuterLoopHeader &&
				!containsOnlySafeInstructions(*InnerLoopPreHeader)) ||
				!containsOnlySafeInstructions(*InnerLoop.getExitBlock())) {
				LLVM_DEBUG(dbgs() << "Not perfectly nested: code surrounding inner loop is "
				"unsafe\n";);
				return false;
				}

				LLVM_DEBUG(dbgs() << "Loop '" << OuterLoop.getName() << "' and '"
				<< InnerLoop.getName() << "' are perfectly nested.\n");

				return true;
				}
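
The filtering done by `containsOnlySafeInstructions` can be sketched on a toy instruction model. This is only an approximation under stated assumptions: `ToyKind`, `ToyInst` and `ToyContext` are hypothetical, and the enum replaces the real `isSafeToSpeculativelyExecute` query by simply declaring casts speculatable and stores not.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical instruction kinds; a stand-in for LLVM's instruction classes.
enum class ToyKind { Phi, Branch, Cast, Add, Compare, Store };

struct ToyInst {
  ToyKind Kind;
  int Id; // identity, standing in for the instruction's address
};

// The three instructions that get special treatment between the loops.
struct ToyContext {
  int OuterStepId;     // outer loop induction variable increment
  int OuterLatchCmpId; // outer loop latch comparison
  int InnerGuardCmpId; // inner loop guard comparison, or -1 if absent
};

bool toyIsAllowed(const ToyInst &I, const ToyContext &C) {
  switch (I.Kind) {
  case ToyKind::Phi:
  case ToyKind::Branch:
  case ToyKind::Cast:
    return true; // always tolerated between the loops
  case ToyKind::Add:
    return I.Id == C.OuterStepId; // only the outer loop step
  case ToyKind::Compare:
    return I.Id == C.OuterLatchCmpId || I.Id == C.InnerGuardCmpId;
  case ToyKind::Store:
    return false; // side effect: assumed not speculatable
  }
  return false;
}

// A block is acceptable only if every instruction in it is allowed.
bool toyContainsOnlySafeInstructions(const std::vector<ToyInst> &Block,
                                     const ToyContext &C) {
  return std::all_of(Block.begin(), Block.end(),
                     [&](const ToyInst &I) { return toyIsAllowed(I, C); });
}
```

A latch holding only the step, the latch compare and a branch passes. Adding a store, or a second unrelated add, makes the block fail, which corresponds to the two rejection paths in the lambda above.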

				SmallVector<LoopVectorTy, 4>
				LoopNest::getPerfectLoops(ScalarEvolution &SE) const {
				SmallVector<LoopVectorTy, 4> LV;
				LoopVectorTy PerfectNest;

				for (Loop *L : depth_first(const_cast<Loop *>(Loops.front()))) {
				if (PerfectNest.empty())
				PerfectNest.push_back(L);

				auto &SubLoops = L->getSubLoops();
				if (SubLoops.size() == 1 &&
				arePerfectlyNested(*L, *SubLoops.front(), SE)) {
				PerfectNest.push_back(SubLoops.front());
				}
				else {
				LV.push_back(PerfectNest);
				PerfectNest.clear();
				}
				}

				return LV;
				}
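
To make the grouping concrete, the following standalone sketch mirrors the traversal above on a toy loop tree. `ToyLoop`, `toyCollect` and `toyPerfectLoops` are hypothetical, and the `PerfectWithOnlyChild` flag is an assumed, precomputed answer to the pairwise check that the real code obtains from `arePerfectlyNested`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop; not LLVM's Loop class.
struct ToyLoop {
  const char *Name;
  std::vector<ToyLoop *> SubLoops;
  bool PerfectWithOnlyChild; // assumed result of the pairwise nesting check
};

using ToyLoopVector = std::vector<const ToyLoop *>;

// Pre-order walk: extend the current group while the loop is perfectly
// nested with its only child, otherwise close the group and start anew.
static void toyCollect(const ToyLoop &L, ToyLoopVector &Current,
                       std::vector<ToyLoopVector> &Out) {
  if (Current.empty())
    Current.push_back(&L);

  if (L.SubLoops.size() == 1 && L.PerfectWithOnlyChild) {
    Current.push_back(L.SubLoops.front());
  } else {
    Out.push_back(Current);
    Current.clear();
  }

  for (const ToyLoop *Sub : L.SubLoops)
    toyCollect(*Sub, Current, Out);
}

std::vector<ToyLoopVector> toyPerfectLoops(const ToyLoop &Root) {
  std::vector<ToyLoopVector> Out;
  ToyLoopVector Current;
  toyCollect(Root, Current, Out);
  // The last loop visited is always a leaf, which closes its group.
  assert(Current.empty() && "every group is flushed by the end of the walk");
  return Out;
}
```

For the grayscale example from the review (`y` containing `x` containing `c`, with code between `x` and `c`), the sketch produces two groups: `{y, x}` and `{c}`.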

				unsigned LoopNest::getMaxPerfectDepth(const Loop &Root, ScalarEvolution &SE) {
				LLVM_DEBUG(dbgs() << "Get maximum perfect depth of loop nest rooted by loop '"
				<< Root.getName() << "'\n");

				const Loop *CurrentLoop = &Root;
				const auto *SubLoops = &CurrentLoop->getSubLoops();
				unsigned CurrentDepth = 1;

				while (SubLoops->size() == 1) {
				const Loop *InnerLoop = SubLoops->front();
				if (!arePerfectlyNested(*CurrentLoop, *InnerLoop, SE)) {
				LLVM_DEBUG({
				dbgs() << "Not a perfect nest: loop '" << CurrentLoop->getName()
				Meinersbur (Done): [serious] This looks inconsistent with `getMaxPerfectDepth()`, which understands "perfectly nested" as a property of a set of loops. That is, in `for (int i) for (int j) { preStmt(); for (int k) ... postStmt(); }` I'd understand the loops `i` and `j` to be a perfect loop nest. What code is in the body does not matter. It doesn't seem to be used, so I'd just remove this function.
				etiotto (Author, Done): This function would return false for the `j` loop in the example because the loop nest it 'roots' is imperfect, given that there is code between loops `j` and `k`. It would also return false for loop `i`, given that its containing loop nest is imperfect. Does that answer your question?
				<< "' is not perfectly nested with loop '"
				<< InnerLoop->getName() << "'\n";
				});
				break;
				}

				CurrentLoop = InnerLoop;
				SubLoops = &CurrentLoop->getSubLoops();
				++CurrentDepth;
				}

				return CurrentDepth;
				}
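
The walk in `getMaxPerfectDepth` stops at the first adjacent pair that is not perfectly nested. A minimal standalone sketch of that idea, assuming a hypothetical `ToyLoop` whose `PerfectWithOnlyChild` flag stands in for the result of `arePerfectlyNested`:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop; not LLVM's Loop class.
struct ToyLoop {
  std::vector<const ToyLoop *> SubLoops;
  bool PerfectWithOnlyChild; // assumed result of the pairwise nesting check
};

// Count how many loops, starting at Root, form a chain in which each loop
// has exactly one child and is perfectly nested with it.
unsigned toyMaxPerfectDepth(const ToyLoop &Root) {
  const ToyLoop *Current = &Root;
  unsigned Depth = 1; // the root alone is a perfect nest of depth 1

  while (Current->SubLoops.size() == 1 && Current->PerfectWithOnlyChild) {
    Current = Current->SubLoops.front();
    ++Depth;
  }

  assert(Depth >= 1 && "the root always counts");
  return Depth;
}
```

In the `i`/`j`/`k` example discussed in the review, with statements between `j` and `k`, this sketch returns 2 for the nest rooted at `i`, regardless of what the body of `j` contains.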

				bool LoopNest::checkLoopsStructure(const Loop &OuterLoop, const Loop &InnerLoop,
				ScalarEvolution &SE) {
				// The inner loop must be the only outer loop's child.
				if ((OuterLoop.getSubLoops().size() != 1) ||
				(InnerLoop.getParentLoop() != &OuterLoop))
				return false;

				// We expect loops in normal form which have a preheader, header, latch...
				if (!OuterLoop.isLoopSimplifyForm() || !InnerLoop.isLoopSimplifyForm())
				return false;

				const BasicBlock *OuterLoopHeader = OuterLoop.getHeader();
				const BasicBlock *OuterLoopLatch = OuterLoop.getLoopLatch();
				const BasicBlock *InnerLoopPreHeader = InnerLoop.getLoopPreheader();
				const BasicBlock *InnerLoopLatch = InnerLoop.getLoopLatch();
				const BasicBlock *InnerLoopExit = InnerLoop.getExitBlock();

				// We expect rotated loops. The inner loop should have a single exit block.
				if (OuterLoop.getExitingBlock() != OuterLoopLatch ||
				InnerLoop.getExitingBlock() != InnerLoopLatch ||
				!InnerLoopExit)
				return false;

				// Ensure the only branch that may exist between the loops is the inner loop
				// guard.
				if (OuterLoopHeader != InnerLoopPreHeader) {
				const BranchInst *BI =
				Meinersbur (Done): [style] The LLVM code base does not use `const` for function-local variables.
				etiotto (Author, Done): I am trying to ensure the basic blocks are not mutated in the function. I am not sure what the rationale was (in the LLVM code base) for not const-qualifying function-local variables that are intended to remain immutable. Perhaps the thinking was that the original author of the code would not make a mistake and modify something that was intended to stay immutable, but I think it is safer to const-qualify variables to prevent unintended changes down the road.
				dyn_cast<BranchInst>(OuterLoopHeader->getTerminator());

				if (!BI || BI != InnerLoop.getLoopGuardBranch())
				return false;

				// The successors of the inner loop guard should be the inner loop
				// preheader and the outer loop latch.
				for (const BasicBlock *Succ : BI->successors()) {
				if (Succ == InnerLoopPreHeader)
				continue;
				if (Succ == OuterLoopLatch)
				continue;

				DEBUG_WITH_TYPE(VerboseDebug, {
				dbgs() << "Inner loop guard successor " << Succ->getName()
				<< " doesn't lead to inner loop preheader or "
				"outer loop latch.\n";
				});
				return false;
				}
				}

				// Ensure the inner loop exit block leads to the outer loop latch.
				if (InnerLoopExit->getSingleSuccessor() != OuterLoopLatch) {
				DEBUG_WITH_TYPE(
				VerboseDebug,
				dbgs() << "Inner loop exit block " << *InnerLoopExit
				<< " does not directly lead to the outer loop latch.\n";);
				return false;
				}

				return true;
				}

				raw_ostream &llvm::operator<<(raw_ostream &OS, const LoopNest &LN) {
				OS << "IsPerfect=";
				if (LN.getMaxPerfectDepth() == LN.getNestDepth())
				OS << "true";
				else
				OS << "false";
				OS << ", Depth=" << LN.getNestDepth();
				OS << ", OutermostLoop: " << LN.getOutermostLoop().getName();
				OS << ", Loops: ( ";
				for (const Loop *L : LN.getLoops())
				OS << L->getName() << " ";
				OS << ")";

				return OS;
				}

				//===----------------------------------------------------------------------===//
				// LoopNestPrinterPass implementation
				//

				PreservedAnalyses LoopNestPrinterPass::run(Loop &L, LoopAnalysisManager &AM,
				LoopStandardAnalysisResults &AR,
				LPMUpdater &U) {
				if (auto LN = LoopNest::getLoopNest(L, AR.SE))
				OS << *LN << "\n";

				return PreservedAnalyses::all();
				}

llvm/lib/Passes/PassBuilder.cpp

	#include "llvm/Analysis/DominanceFrontier.h"
	#include "llvm/Analysis/GlobalsModRef.h"
	#include "llvm/Analysis/IVUsers.h"
	#include "llvm/Analysis/LazyCallGraph.h"
	#include "llvm/Analysis/LazyValueInfo.h"
	#include "llvm/Analysis/LoopAccessAnalysis.h"
	#include "llvm/Analysis/LoopCacheAnalysis.h"
	#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/LoopNestAnalysis.h"
	#include "llvm/Analysis/MemoryDependenceAnalysis.h"
	#include "llvm/Analysis/MemorySSA.h"
	#include "llvm/Analysis/ModuleSummaryAnalysis.h"
	#include "llvm/Analysis/OptimizationRemarkEmitter.h"
	#include "llvm/Analysis/PhiValues.h"
	#include "llvm/Analysis/PostDominators.h"
	#include "llvm/Analysis/ProfileSummaryInfo.h"
	#include "llvm/Analysis/RegionInfo.h"

llvm/lib/Passes/PassRegistry.def

	LOOP_PASS("strength-reduce", LoopStrengthReducePass())
	LOOP_PASS("indvars", IndVarSimplifyPass())
	LOOP_PASS("irce", IRCEPass())
	LOOP_PASS("unroll-and-jam", LoopUnrollAndJamPass())
	LOOP_PASS("unroll-full", LoopFullUnrollPass())
	LOOP_PASS("print-access-info", LoopAccessInfoPrinterPass(dbgs()))
	LOOP_PASS("print<ddg>", DDGAnalysisPrinterPass(dbgs()))
	LOOP_PASS("print<ivusers>", IVUsersPrinterPass(dbgs()))
				LOOP_PASS("print<loopnest>", LoopNestPrinterPass(dbgs()))
	LOOP_PASS("print<loop-cache-cost>", LoopCachePrinterPass(dbgs()))
	LOOP_PASS("loop-predication", LoopPredicationPass())
	LOOP_PASS("guard-widening", GuardWideningPass())
	#undef LOOP_PASS

	#ifndef LOOP_PASS_WITH_PARAMS
	#define LOOP_PASS_WITH_PARAMS(NAME, CREATE_PASS, PARSER)
	#endif
	LOOP_PASS_WITH_PARAMS("unswitch",
	[](bool NonTrivial) {
	return SimpleLoopUnswitchPass(NonTrivial);
	},
	parseLoopUnswitchOptions)
	#undef LOOP_PASS_WITH_PARAMS

llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll

This file was added.

				; RUN: opt < %s -passes='print<loopnest>' -disable-output 2>&1 | FileCheck %s

				; Test an imperfect 2-dim loop nest of the form:
				; for (int i = 0; i < nx; ++i) {
				; x[i] = i;
				; for (int j = 0; j < ny; ++j)
				; y[j][i] = x[i] + j;
				; }

				define void @imperf_nest_1(i32 signext %nx, i32 signext %ny) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_1_loop_i, Loops: ( imperf_nest_1_loop_i imperf_nest_1_loop_j )
				entry:
				%0 = zext i32 %ny to i64
				%1 = zext i32 %nx to i64
				%2 = mul nuw i64 %0, %1
				%vla = alloca double, i64 %2, align 8
				%3 = zext i32 %ny to i64
				%vla1 = alloca double, i64 %3, align 8
				br label %imperf_nest_1_loop_i

				imperf_nest_1_loop_i:
				%i2.0 = phi i32 [ 0, %entry ], [ %inc16, %for.inc15 ]
				%cmp = icmp slt i32 %i2.0, %nx
				br i1 %cmp, label %for.body, label %for.end17

				for.body:
				%conv = sitofp i32 %i2.0 to double
				%idxprom = sext i32 %i2.0 to i64
				%arrayidx = getelementptr inbounds double, double* %vla1, i64 %idxprom
				store double %conv, double* %arrayidx, align 8
				br label %imperf_nest_1_loop_j

				imperf_nest_1_loop_j:
				%j3.0 = phi i32 [ 0, %for.body ], [ %inc, %for.inc ]
				%cmp5 = icmp slt i32 %j3.0, %ny
				br i1 %cmp5, label %for.body7, label %for.end

				for.body7:
				%idxprom8 = sext i32 %i2.0 to i64
				%arrayidx9 = getelementptr inbounds double, double* %vla1, i64 %idxprom8
				%4 = load double, double* %arrayidx9, align 8
				%conv10 = sitofp i32 %j3.0 to double
				%add = fadd double %4, %conv10
				%idxprom11 = sext i32 %j3.0 to i64
				%5 = mul nsw i64 %idxprom11, %1
				%arrayidx12 = getelementptr inbounds double, double* %vla, i64 %5
				%idxprom13 = sext i32 %i2.0 to i64
				%arrayidx14 = getelementptr inbounds double, double* %arrayidx12, i64 %idxprom13
				store double %add, double* %arrayidx14, align 8
				br label %for.inc

				for.inc:
				%inc = add nsw i32 %j3.0, 1
				br label %imperf_nest_1_loop_j

				for.end:
				br label %for.inc15

				for.inc15:
				%inc16 = add nsw i32 %i2.0, 1
				br label %imperf_nest_1_loop_i

				for.end17:
				ret void
				}

				; Test an imperfect 2-dim loop nest of the form:
				; for (int i = 0; i < nx; ++i) {
				; for (int j = 0; j < ny; ++j)
				; y[j][i] = x[i] + j;
				; y[0][i] += i;
				; }

				define void @imperf_nest_2(i32 signext %nx, i32 signext %ny) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_2_loop_i, Loops: ( imperf_nest_2_loop_i imperf_nest_2_loop_j )
				entry:
				%0 = zext i32 %ny to i64
				%1 = zext i32 %nx to i64
				%2 = mul nuw i64 %0, %1
				%vla = alloca double, i64 %2, align 8
				%3 = zext i32 %ny to i64
				%vla1 = alloca double, i64 %3, align 8
				br label %imperf_nest_2_loop_i

				imperf_nest_2_loop_i:
				%i2.0 = phi i32 [ 0, %entry ], [ %inc17, %for.inc16 ]
				%cmp = icmp slt i32 %i2.0, %nx
				br i1 %cmp, label %for.body, label %for.end18

				for.body:
				br label %imperf_nest_2_loop_j

				imperf_nest_2_loop_j:
				%j3.0 = phi i32 [ 0, %for.body ], [ %inc, %for.inc ]
				%cmp5 = icmp slt i32 %j3.0, %ny
				br i1 %cmp5, label %for.body6, label %for.end

				for.body6:
				%idxprom = sext i32 %i2.0 to i64
				%arrayidx = getelementptr inbounds double, double* %vla1, i64 %idxprom
				%4 = load double, double* %arrayidx, align 8
				%conv = sitofp i32 %j3.0 to double
				%add = fadd double %4, %conv
				%idxprom7 = sext i32 %j3.0 to i64
				%5 = mul nsw i64 %idxprom7, %1
				%arrayidx8 = getelementptr inbounds double, double* %vla, i64 %5
				%idxprom9 = sext i32 %i2.0 to i64
				%arrayidx10 = getelementptr inbounds double, double* %arrayidx8, i64 %idxprom9
				store double %add, double* %arrayidx10, align 8
				br label %for.inc

				for.inc:
				%inc = add nsw i32 %j3.0, 1
				br label %imperf_nest_2_loop_j

				for.end:
				%conv11 = sitofp i32 %i2.0 to double
				%6 = mul nsw i64 0, %1
				%arrayidx12 = getelementptr inbounds double, double* %vla, i64 %6
				%idxprom13 = sext i32 %i2.0 to i64
				%arrayidx14 = getelementptr inbounds double, double* %arrayidx12, i64 %idxprom13
				%7 = load double, double* %arrayidx14, align 8
				%add15 = fadd double %7, %conv11
				store double %add15, double* %arrayidx14, align 8
				br label %for.inc16

				for.inc16:
				%inc17 = add nsw i32 %i2.0, 1
				br label %imperf_nest_2_loop_i

				for.end18:
				ret void
				}

				; Test an imperfect 2-dim loop nest of the form:
				; for (i = 0; i < nx; ++i) {
				; for (j = 0; j < ny-nk; ++j)
				; y[i][j] = x[i] + j;
				; for (j = ny-nk; j < ny; ++j)
				; y[i][j] = x[i] - j;
				; }

				define void @imperf_nest_3(i32 signext %nx, i32 signext %ny, i32 signext %nk) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_3_loop_i, Loops: ( imperf_nest_3_loop_i imperf_nest_3_loop_j imperf_nest_3_loop_k )
				entry:
				%0 = zext i32 %nx to i64
				%1 = zext i32 %ny to i64
				%2 = mul nuw i64 %0, %1
				%vla = alloca double, i64 %2, align 8
				%3 = zext i32 %ny to i64
				%vla1 = alloca double, i64 %3, align 8
				br label %imperf_nest_3_loop_i

				imperf_nest_3_loop_i: ; preds = %for.inc25, %entry
				%i.0 = phi i32 [ 0, %entry ], [ %inc26, %for.inc25 ]
				%cmp = icmp slt i32 %i.0, %nx
				br i1 %cmp, label %for.body, label %for.end27

				for.body: ; preds = %for.cond
				br label %imperf_nest_3_loop_j

				imperf_nest_3_loop_j: ; preds = %for.inc, %for.body
				%j.0 = phi i32 [ 0, %for.body ], [ %inc, %for.inc ]
				%sub = sub nsw i32 %ny, %nk
				%cmp3 = icmp slt i32 %j.0, %sub
				br i1 %cmp3, label %for.body4, label %for.end

				for.body4: ; preds = %imperf_nest_3_loop_j
				%idxprom = sext i32 %i.0 to i64
				%arrayidx = getelementptr inbounds double, double* %vla1, i64 %idxprom
				%4 = load double, double* %arrayidx, align 8
				%conv = sitofp i32 %j.0 to double
				%add = fadd double %4, %conv
				%idxprom5 = sext i32 %i.0 to i64
				%5 = mul nsw i64 %idxprom5, %1
				%arrayidx6 = getelementptr inbounds double, double* %vla, i64 %5
				%idxprom7 = sext i32 %j.0 to i64
				%arrayidx8 = getelementptr inbounds double, double* %arrayidx6, i64 %idxprom7
				store double %add, double* %arrayidx8, align 8
				br label %for.inc

				for.inc: ; preds = %for.body4
				%inc = add nsw i32 %j.0, 1
				br label %imperf_nest_3_loop_j

				for.end: ; preds = %imperf_nest_3_loop_j
				%sub9 = sub nsw i32 %ny, %nk
				br label %imperf_nest_3_loop_k

				imperf_nest_3_loop_k: ; preds = %for.inc22, %for.end
				%j.1 = phi i32 [ %sub9, %for.end ], [ %inc23, %for.inc22 ]
				%cmp11 = icmp slt i32 %j.1, %ny
				br i1 %cmp11, label %for.body13, label %for.end24

				for.body13: ; preds = %imperf_nest_3_loop_k
				%idxprom14 = sext i32 %i.0 to i64
				%arrayidx15 = getelementptr inbounds double, double* %vla1, i64 %idxprom14
				%6 = load double, double* %arrayidx15, align 8
				%conv16 = sitofp i32 %j.1 to double
				%sub17 = fsub double %6, %conv16
				%idxprom18 = sext i32 %i.0 to i64
				%7 = mul nsw i64 %idxprom18, %1
				%arrayidx19 = getelementptr inbounds double, double* %vla, i64 %7
				%idxprom20 = sext i32 %j.1 to i64
				%arrayidx21 = getelementptr inbounds double, double* %arrayidx19, i64 %idxprom20
				store double %sub17, double* %arrayidx21, align 8
				br label %for.inc22

				for.inc22: ; preds = %for.body13
				%inc23 = add nsw i32 %j.1, 1
				br label %imperf_nest_3_loop_k

				for.end24: ; preds = %imperf_nest_3_loop_k
				br label %for.inc25

				for.inc25: ; preds = %for.end24
				%inc26 = add nsw i32 %i.0, 1
				br label %imperf_nest_3_loop_i

				for.end27: ; preds = %for.cond
				ret void
				}

				; Test an imperfect loop nest of the form:
				; for (i = 0; i < nx; ++i) {
				; for (j = 0; j < ny-nk; ++j)
				; for (k = 0; k < nk; ++k)
				; y[i][j][k] = x[i+j] + k;
				; for (j = ny-nk; j < ny; ++j)
				; y[i][j][0] = x[i] - j;
				; }

				define void @imperf_nest_4(i32 signext %nx, i32 signext %ny, i32 signext %nk) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_4_loop_j, Loops: ( imperf_nest_4_loop_j imperf_nest_4_loop_k )
				; CHECK-LABEL: IsPerfect=false, Depth=3, OutermostLoop: imperf_nest_4_loop_i, Loops: ( imperf_nest_4_loop_i imperf_nest_4_loop_j imperf_nest_4_loop_j2 imperf_nest_4_loop_k )
				entry:
				%0 = zext i32 %nx to i64
				%1 = zext i32 %ny to i64
				%2 = zext i32 %nk to i64
				%3 = mul nuw i64 %0, %1
				%4 = mul nuw i64 %3, %2
				%vla = alloca double, i64 %4, align 8
				%5 = zext i32 %ny to i64
				%vla1 = alloca double, i64 %5, align 8
				%cmp5 = icmp slt i32 0, %nx
				br i1 %cmp5, label %imperf_nest_4_loop_i.lr.ph, label %for.end37

				imperf_nest_4_loop_i.lr.ph:
				br label %imperf_nest_4_loop_i

				imperf_nest_4_loop_i:
				%i.0 = phi i32 [ 0, %imperf_nest_4_loop_i.lr.ph ], [ %inc36, %for.inc35 ]
				%sub2 = sub nsw i32 %ny, %nk
				%cmp33 = icmp slt i32 0, %sub2
				br i1 %cmp33, label %imperf_nest_4_loop_j.lr.ph, label %for.end17

				imperf_nest_4_loop_j.lr.ph:
				br label %imperf_nest_4_loop_j

				imperf_nest_4_loop_j:
				%j.0 = phi i32 [ 0, %imperf_nest_4_loop_j.lr.ph ], [ %inc16, %for.inc15 ]
				%cmp61 = icmp slt i32 0, %nk
				br i1 %cmp61, label %imperf_nest_4_loop_k.lr.ph, label %for.end

				imperf_nest_4_loop_k.lr.ph:
				br label %imperf_nest_4_loop_k

				imperf_nest_4_loop_k:
				%k.0 = phi i32 [ 0, %imperf_nest_4_loop_k.lr.ph ], [ %inc, %for.inc ]
				%add = add nsw i32 %i.0, %j.0
				%idxprom = sext i32 %add to i64
				%arrayidx = getelementptr inbounds double, double* %vla1, i64 %idxprom
				%6 = load double, double* %arrayidx, align 8
				%conv = sitofp i32 %k.0 to double
				%add8 = fadd double %6, %conv
				%idxprom9 = sext i32 %i.0 to i64
				%7 = mul nuw i64 %1, %2
				%8 = mul nsw i64 %idxprom9, %7
				%arrayidx10 = getelementptr inbounds double, double* %vla, i64 %8
				%idxprom11 = sext i32 %j.0 to i64
				%9 = mul nsw i64 %idxprom11, %2
				%arrayidx12 = getelementptr inbounds double, double* %arrayidx10, i64 %9
				%idxprom13 = sext i32 %k.0 to i64
				%arrayidx14 = getelementptr inbounds double, double* %arrayidx12, i64 %idxprom13
				store double %add8, double* %arrayidx14, align 8
				br label %for.inc

				for.inc:
				%inc = add nsw i32 %k.0, 1
				%cmp6 = icmp slt i32 %inc, %nk
				br i1 %cmp6, label %imperf_nest_4_loop_k, label %for.cond5.for.end_crit_edge

				for.cond5.for.end_crit_edge:
				br label %for.end

				for.end:
				br label %for.inc15

				for.inc15:
				%inc16 = add nsw i32 %j.0, 1
				%sub = sub nsw i32 %ny, %nk
				%cmp3 = icmp slt i32 %inc16, %sub
				br i1 %cmp3, label %imperf_nest_4_loop_j, label %for.cond2.for.end17_crit_edge

				for.cond2.for.end17_crit_edge:
				br label %for.end17

				for.end17:
				%sub18 = sub nsw i32 %ny, %nk
				%cmp204 = icmp slt i32 %sub18, %ny
				br i1 %cmp204, label %imperf_nest_4_loop_j2.lr.ph, label %for.end34

				imperf_nest_4_loop_j2.lr.ph:
				br label %imperf_nest_4_loop_j2

				imperf_nest_4_loop_j2:
				%j.1 = phi i32 [ %sub18, %imperf_nest_4_loop_j2.lr.ph ], [ %inc33, %for.inc32 ]
				%idxprom23 = sext i32 %i.0 to i64
				%arrayidx24 = getelementptr inbounds double, double* %vla1, i64 %idxprom23
				%10 = load double, double* %arrayidx24, align 8
				%conv25 = sitofp i32 %j.1 to double
				%sub26 = fsub double %10, %conv25
				%idxprom27 = sext i32 %i.0 to i64
				%idxprom29 = sext i32 %j.1 to i64
				%11 = mul nsw i64 %idxprom29, %2
				%12 = mul nuw i64 %1, %2
				%13 = mul nsw i64 %idxprom27, %12
				%arrayidx28 = getelementptr inbounds double, double* %vla, i64 %13
				%arrayidx30 = getelementptr inbounds double, double* %arrayidx28, i64 %11
				%arrayidx31 = getelementptr inbounds double, double* %arrayidx30, i64 0
				store double %sub26, double* %arrayidx31, align 8
				br label %for.inc32

				for.inc32:
				%inc33 = add nsw i32 %j.1, 1
				%cmp20 = icmp slt i32 %inc33, %ny
				br i1 %cmp20, label %imperf_nest_4_loop_j2, label %for.cond19.for.end34_crit_edge

				for.cond19.for.end34_crit_edge:
				br label %for.end34

				for.end34:
				br label %for.inc35

				for.inc35:
				%inc36 = add nsw i32 %i.0, 1
				%cmp = icmp slt i32 %inc36, %nx
				br i1 %cmp, label %imperf_nest_4_loop_i, label %for.cond.for.end37_crit_edge

				for.cond.for.end37_crit_edge:
				br label %for.end37

				for.end37:
				ret void
				}

				; Test an imperfect loop nest of the form:
				; for (int i = 0; i < nx; ++i)
				; if (i > 5) {
				; for (int j = 0; j < ny; ++j)
				; y[j][i] = x[i][j] + j;
				; }

				define void @imperf_nest_5(i32** %y, i32** %x, i32 signext %nx, i32 signext %ny) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_5_loop_i, Loops: ( imperf_nest_5_loop_i imperf_nest_5_loop_j )
				entry:
				%cmp2 = icmp slt i32 0, %nx
				br i1 %cmp2, label %imperf_nest_5_loop_i.lr.ph, label %for.end13

				imperf_nest_5_loop_i.lr.ph:
				br label %imperf_nest_5_loop_i

				imperf_nest_5_loop_i:
				%i.0 = phi i32 [ 0, %imperf_nest_5_loop_i.lr.ph ], [ %inc12, %for.inc11 ]
				%cmp1 = icmp sgt i32 %i.0, 5
				br i1 %cmp1, label %if.then, label %if.end

				if.then:
				%cmp31 = icmp slt i32 0, %ny
				br i1 %cmp31, label %imperf_nest_5_loop_j.lr.ph, label %for.end

				imperf_nest_5_loop_j.lr.ph:
				br label %imperf_nest_5_loop_j

				imperf_nest_5_loop_j:
				%j.0 = phi i32 [ 0, %imperf_nest_5_loop_j.lr.ph ], [ %inc, %for.inc ]
				%idxprom = sext i32 %i.0 to i64
				%arrayidx = getelementptr inbounds i32*, i32** %x, i64 %idxprom
				%0 = load i32*, i32** %arrayidx, align 8
				%idxprom5 = sext i32 %j.0 to i64
				%arrayidx6 = getelementptr inbounds i32, i32* %0, i64 %idxprom5
				%1 = load i32, i32* %arrayidx6, align 4
				%add = add nsw i32 %1, %j.0
				%idxprom7 = sext i32 %j.0 to i64
				%arrayidx8 = getelementptr inbounds i32*, i32** %y, i64 %idxprom7
				%2 = load i32*, i32** %arrayidx8, align 8
				%idxprom9 = sext i32 %i.0 to i64
				%arrayidx10 = getelementptr inbounds i32, i32* %2, i64 %idxprom9
				store i32 %add, i32* %arrayidx10, align 4
				br label %for.inc

				for.inc:
				%inc = add nsw i32 %j.0, 1
				%cmp3 = icmp slt i32 %inc, %ny
				br i1 %cmp3, label %imperf_nest_5_loop_j, label %for.cond2.for.end_crit_edge

				for.cond2.for.end_crit_edge:
				br label %for.end

				for.end:
				br label %if.end

				if.end:
				br label %for.inc11

				for.inc11:
				%inc12 = add nsw i32 %i.0, 1
				%cmp = icmp slt i32 %inc12, %nx
				br i1 %cmp, label %imperf_nest_5_loop_i, label %for.cond.for.end13_crit_edge

				for.cond.for.end13_crit_edge:
				br label %for.end13

				for.end13:
				ret void
				}

				; Test an imperfect loop nest of the form:
				; for (int i = 0; i < nx; ++i)
				; if (i > 5) { // user branch
				; for (int j = 1; j <= 5; j+=2)
				; y[j][i] = x[i][j] + j;
				; }

				define void @imperf_nest_6(i32** %y, i32** %x, i32 signext %nx, i32 signext %ny) {
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: imperf_nest_6_loop_i, Loops: ( imperf_nest_6_loop_i imperf_nest_6_loop_j )
				entry:
				%cmp2 = icmp slt i32 0, %nx
				br i1 %cmp2, label %imperf_nest_6_loop_i.lr.ph, label %for.end13

				imperf_nest_6_loop_i.lr.ph:
				br label %imperf_nest_6_loop_i

				imperf_nest_6_loop_i:
				%i.0 = phi i32 [ 0, %imperf_nest_6_loop_i.lr.ph ], [ %inc12, %for.inc11 ]
				%cmp1 = icmp sgt i32 %i.0, 5
				br i1 %cmp1, label %imperf_nest_6_loop_j.lr.ph, label %if.end

				imperf_nest_6_loop_j.lr.ph:
				br label %imperf_nest_6_loop_j

				imperf_nest_6_loop_j:
				%j.0 = phi i32 [ 1, %imperf_nest_6_loop_j.lr.ph ], [ %inc, %for.inc ]
				%idxprom = sext i32 %i.0 to i64
				%arrayidx = getelementptr inbounds i32*, i32** %x, i64 %idxprom
				%0 = load i32*, i32** %arrayidx, align 8
				%idxprom5 = sext i32 %j.0 to i64
				%arrayidx6 = getelementptr inbounds i32, i32* %0, i64 %idxprom5
				%1 = load i32, i32* %arrayidx6, align 4
				%add = add nsw i32 %1, %j.0
				%idxprom7 = sext i32 %j.0 to i64
				%arrayidx8 = getelementptr inbounds i32*, i32** %y, i64 %idxprom7
				%2 = load i32*, i32** %arrayidx8, align 8
				%idxprom9 = sext i32 %i.0 to i64
				%arrayidx10 = getelementptr inbounds i32, i32* %2, i64 %idxprom9
				store i32 %add, i32* %arrayidx10, align 4
				br label %for.inc

				for.inc:
				%inc = add nsw i32 %j.0, 2
				%cmp3 = icmp sle i32 %inc, 5
				br i1 %cmp3, label %imperf_nest_6_loop_j, label %for.cond2.for.end_crit_edge

				for.cond2.for.end_crit_edge:
				br label %for.end

				for.end:
				br label %if.end

				if.end:
				br label %for.inc11

				for.inc11:
				%inc12 = add nsw i32 %i.0, 1
				%cmp = icmp slt i32 %inc12, %nx
				br i1 %cmp, label %imperf_nest_6_loop_i, label %for.cond.for.end13_crit_edge

				for.cond.for.end13_crit_edge:
				br label %for.end13

				for.end13:
				ret void
				}
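
Every CHECK line in this test file uses the same summary format (IsPerfect, Depth, OutermostLoop, Loops). As a rough illustration of what those fields mean, the following is a toy model in plain C++ that does not use LLVM at all. The names `ToyLoop`, `makeLoop`, `collectLoops`, `nestDepth`, `isPerfect`, `summarize` and `exampleSummary` are hypothetical and are not part of this patch. The perfection rule is a deliberate simplification (exactly one sub-loop and no other work in the enclosing body); the real analysis inspects the IR. The breadth-first listing is an assumption based on the order that the expected output for imperf_nest_4 appears to follow.

```cpp
#include <algorithm>
#include <deque>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a loop. This is NOT llvm::Loop.
struct ToyLoop {
  std::string Name;
  bool HasOwnStatements = false; // work in the body outside any sub-loop
  std::vector<std::unique_ptr<ToyLoop>> SubLoops;
};

// Hypothetical helper to build a ToyLoop.
std::unique_ptr<ToyLoop> makeLoop(std::string Name,
                                  bool HasOwnStatements = false) {
  auto L = std::make_unique<ToyLoop>();
  L->Name = std::move(Name);
  L->HasOwnStatements = HasOwnStatements;
  return L;
}

// Collect all loops of the nest rooted at Root in breadth-first order.
std::vector<const ToyLoop *> collectLoops(const ToyLoop &Root) {
  std::vector<const ToyLoop *> Loops;
  std::deque<const ToyLoop *> Worklist{&Root};
  while (!Worklist.empty()) {
    const ToyLoop *L = Worklist.front();
    Worklist.pop_front();
    Loops.push_back(L);
    for (const auto &Sub : L->SubLoops)
      Worklist.push_back(Sub.get());
  }
  return Loops;
}

// Depth of the deepest chain of nested loops, counting Root itself.
unsigned nestDepth(const ToyLoop &L) {
  unsigned Max = 0;
  for (const auto &Sub : L.SubLoops)
    Max = std::max(Max, nestDepth(*Sub));
  return Max + 1;
}

// Simplified rule: an innermost loop is trivially perfect; any other loop
// must have exactly one sub-loop and no statements of its own.
bool isPerfect(const ToyLoop &L) {
  if (L.SubLoops.empty())
    return true;
  if (L.SubLoops.size() != 1 || L.HasOwnStatements)
    return false;
  return isPerfect(*L.SubLoops.front());
}

// Render a summary in the same shape as the CHECK lines.
std::string summarize(const ToyLoop &Root) {
  std::ostringstream OS;
  OS << "IsPerfect=" << (isPerfect(Root) ? "true" : "false")
     << ", Depth=" << nestDepth(Root)
     << ", OutermostLoop: " << Root.Name << ", Loops: ( ";
  for (const ToyLoop *L : collectLoops(Root))
    OS << L->Name << " ";
  OS << ")";
  return OS.str();
}

// Shape of imperf_nest_4: loop_i contains loop_j and loop_j2, and loop_j
// contains loop_k.
std::string exampleSummary() {
  auto I = makeLoop("loop_i");
  auto J = makeLoop("loop_j");
  J->SubLoops.push_back(makeLoop("loop_k", true));
  I->SubLoops.push_back(std::move(J));
  I->SubLoops.push_back(makeLoop("loop_j2", true));
  return summarize(*I);
}
```

Calling `exampleSummary()` yields `IsPerfect=false, Depth=3, OutermostLoop: loop_i, Loops: ( loop_i loop_j loop_j2 loop_k )`, which mirrors the second expected line of imperf_nest_4 apart from the shortened loop names.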

llvm/test/Analysis/LoopNestAnalysis/infinite.ll

This file was added.

				; RUN: opt < %s -passes='print<loopnest>' -disable-output 2>&1 | FileCheck %s

				; Test that the loop nest analysis is able to analyze an infinite loop in a loop nest.
				define void @test1(i32** %A, i1 %cond) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: for.inner, Loops: ( for.inner )
				; CHECK-LABEL: IsPerfect=false, Depth=2, OutermostLoop: for.outer, Loops: ( for.outer for.inner )
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: for.infinite, Loops: ( for.infinite )
				entry:
				br label %for.outer

				for.outer:
				%i = phi i64 [ 0, %entry ], [ %inc_i, %for.outer.latch ]
				br i1 %cond, label %for.inner, label %for.infinite

				for.inner:
				%j = phi i64 [ 0, %for.outer ], [ %inc_j, %for.inner ]
				%arrayidx_i = getelementptr inbounds i32*, i32** %A, i64 %i
				%0 = load i32*, i32** %arrayidx_i, align 8
				%arrayidx_j = getelementptr inbounds i32, i32* %0, i64 %j
				store i32 0, i32* %arrayidx_j, align 4
				%inc_j = add nsw i64 %j, 1
				%cmp_j = icmp slt i64 %inc_j, 100
				br i1 %cmp_j, label %for.inner, label %for.outer.latch

				for.infinite:
				br label %for.infinite

				for.outer.latch:
				%inc_i = add nsw i64 %i, 1
				%cmp_i = icmp slt i64 %inc_i, 100
				br i1 %cmp_i, label %for.outer, label %for.end

				for.end:
				ret void
				}

llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll

This file was added.

				; RUN: opt < %s -passes='print<loopnest>' -disable-output 2>&1 | FileCheck %s

				; Test a perfect 2-dim loop nest of the form:
				; for(i=0; i<nx; ++i)
				; for(j=0; j<ny; ++j)
				; y[i][j] = x[i][j];

				define void @perf_nest_2D_1(i32** %y, i32** %x, i64 signext %nx, i64 signext %ny) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: perf_nest_2D_1_loop_j, Loops: ( perf_nest_2D_1_loop_j )
				; CHECK-LABEL: IsPerfect=true, Depth=2, OutermostLoop: perf_nest_2D_1_loop_i, Loops: ( perf_nest_2D_1_loop_i perf_nest_2D_1_loop_j )
				entry:
				br label %perf_nest_2D_1_loop_i

				perf_nest_2D_1_loop_i:
				%i = phi i64 [ 0, %entry ], [ %inc13, %inc_i ]
				%cmp21 = icmp slt i64 0, %ny
				br i1 %cmp21, label %perf_nest_2D_1_loop_j, label %inc_i

				perf_nest_2D_1_loop_j:
				%j = phi i64 [ 0, %perf_nest_2D_1_loop_i ], [ %inc, %inc_j ]
				%arrayidx = getelementptr inbounds i32*, i32** %x, i64 %j
				%0 = load i32*, i32** %arrayidx, align 8
				%arrayidx6 = getelementptr inbounds i32, i32* %0, i64 %j
				%1 = load i32, i32* %arrayidx6, align 4
				%arrayidx8 = getelementptr inbounds i32*, i32** %y, i64 %j
				%2 = load i32*, i32** %arrayidx8, align 8
				%arrayidx11 = getelementptr inbounds i32, i32* %2, i64 %i
				store i32 %1, i32* %arrayidx11, align 4
				br label %inc_j

				inc_j:
				%inc = add nsw i64 %j, 1
				%cmp2 = icmp slt i64 %inc, %ny
				br i1 %cmp2, label %perf_nest_2D_1_loop_j, label %inc_i

				inc_i:
				%inc13 = add nsw i64 %i, 1
				%cmp = icmp slt i64 %inc13, %nx
				br i1 %cmp, label %perf_nest_2D_1_loop_i, label %perf_nest_2D_1_loop_i_end

				perf_nest_2D_1_loop_i_end:
				ret void
				}

				; Test a perfect 2-dim loop nest of the form:
				; for (i=0; i<100; ++i)
				; for (j=0; j<100; ++j)
				; y[i][j] = x[i][j];
				define void @perf_nest_2D_2(i32** %y, i32** %x) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: perf_nest_2D_2_loop_j, Loops: ( perf_nest_2D_2_loop_j )
				; CHECK-LABEL: IsPerfect=true, Depth=2, OutermostLoop: perf_nest_2D_2_loop_i, Loops: ( perf_nest_2D_2_loop_i perf_nest_2D_2_loop_j )
				entry:
				br label %perf_nest_2D_2_loop_i

				perf_nest_2D_2_loop_i:
				%i = phi i64 [ 0, %entry ], [ %inc13, %inc_i ]
				br label %perf_nest_2D_2_loop_j

				perf_nest_2D_2_loop_j:
				%j = phi i64 [ 0, %perf_nest_2D_2_loop_i ], [ %inc, %inc_j ]
				%arrayidx = getelementptr inbounds i32*, i32** %x, i64 %j
				%0 = load i32*, i32** %arrayidx, align 8
				%arrayidx6 = getelementptr inbounds i32, i32* %0, i64 %j
				%1 = load i32, i32* %arrayidx6, align 4
				%arrayidx8 = getelementptr inbounds i32*, i32** %y, i64 %j
				%2 = load i32*, i32** %arrayidx8, align 8
				%arrayidx11 = getelementptr inbounds i32, i32* %2, i64 %i
				store i32 %1, i32* %arrayidx11, align 4
				br label %inc_j

				inc_j:
				%inc = add nsw i64 %j, 1
				%cmp2 = icmp slt i64 %inc, 100
				br i1 %cmp2, label %perf_nest_2D_2_loop_j, label %loop_j_end

				loop_j_end:
				br label %inc_i

				inc_i:
				%inc13 = add nsw i64 %i, 1
				%cmp = icmp slt i64 %inc13, 100
				br i1 %cmp, label %perf_nest_2D_2_loop_i, label %perf_nest_2D_2_loop_i_end

				perf_nest_2D_2_loop_i_end:
				ret void
				}

				; Test a perfect 3-dim loop nest of the form:
				; for (i=0; i<nx; ++i)
				; for (j=0; j<ny; ++j)
				; for (k=0; k<nk; ++k)
				; y[j][j][k] = x[i][j][k];
				;

				define void @perf_nest_3D_1(i32*** %y, i32*** %x, i32 signext %nx, i32 signext %ny, i32 signext %nk) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: perf_nest_3D_1_loop_k, Loops: ( perf_nest_3D_1_loop_k )
				; CHECK-NEXT: IsPerfect=true, Depth=2, OutermostLoop: perf_nest_3D_1_loop_j, Loops: ( perf_nest_3D_1_loop_j perf_nest_3D_1_loop_k )
				; CHECK-NEXT: IsPerfect=true, Depth=3, OutermostLoop: perf_nest_3D_1_loop_i, Loops: ( perf_nest_3D_1_loop_i perf_nest_3D_1_loop_j perf_nest_3D_1_loop_k )
				entry:
				br label %perf_nest_3D_1_loop_i

				perf_nest_3D_1_loop_i:
				%i = phi i32 [ 0, %entry ], [ %inci, %for.inci ]
				%cmp21 = icmp slt i32 0, %ny
				br i1 %cmp21, label %perf_nest_3D_1_loop_j, label %for.inci

				perf_nest_3D_1_loop_j:
				%j = phi i32 [ 0, %perf_nest_3D_1_loop_i ], [ %incj, %for.incj ]
				%cmp22 = icmp slt i32 0, %nk
				br i1 %cmp22, label %perf_nest_3D_1_loop_k, label %for.incj

				perf_nest_3D_1_loop_k:
				%k = phi i32 [ 0, %perf_nest_3D_1_loop_j ], [ %inck, %for.inck ]
				%idxprom = sext i32 %i to i64
				%arrayidx = getelementptr inbounds i32**, i32*** %x, i64 %idxprom
				%0 = load i32**, i32*** %arrayidx, align 8
				%idxprom7 = sext i32 %j to i64
				%arrayidx8 = getelementptr inbounds i32*, i32** %0, i64 %idxprom7
				%1 = load i32*, i32** %arrayidx8, align 8
				%idxprom9 = sext i32 %k to i64
				%arrayidx10 = getelementptr inbounds i32, i32* %1, i64 %idxprom9
				%2 = load i32, i32* %arrayidx10, align 4
				%idxprom11 = sext i32 %j to i64
				%arrayidx12 = getelementptr inbounds i32**, i32*** %y, i64 %idxprom11
				%3 = load i32**, i32*** %arrayidx12, align 8
				%idxprom13 = sext i32 %j to i64
				%arrayidx14 = getelementptr inbounds i32*, i32** %3, i64 %idxprom13
				%4 = load i32*, i32** %arrayidx14, align 8
				%idxprom15 = sext i32 %k to i64
				%arrayidx16 = getelementptr inbounds i32, i32* %4, i64 %idxprom15
				store i32 %2, i32* %arrayidx16, align 4
				br label %for.inck

				for.inck:
				%inck = add nsw i32 %k, 1
				%cmp5 = icmp slt i32 %inck, %nk
				br i1 %cmp5, label %perf_nest_3D_1_loop_k, label %for.incj

				for.incj:
				%incj = add nsw i32 %j, 1
				%cmp2 = icmp slt i32 %incj, %ny
				br i1 %cmp2, label %perf_nest_3D_1_loop_j, label %for.inci

				for.inci:
				%inci = add nsw i32 %i, 1
				%cmp = icmp slt i32 %inci, %nx
				br i1 %cmp, label %perf_nest_3D_1_loop_i, label %perf_nest_3D_1_loop_i_end

				perf_nest_3D_1_loop_i_end:
				ret void
				}

				; Test a perfect 3-dim loop nest of the form:
				; for (i=0; i<100; ++i)
				; for (j=0; j<100; ++j)
				; for (k=0; k<100; ++k)
				; y[j][j][k] = x[i][j][k];
				;

				define void @perf_nest_3D_2(i32*** %y, i32*** %x) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: perf_nest_3D_2_loop_k, Loops: ( perf_nest_3D_2_loop_k )
				; CHECK-NEXT: IsPerfect=true, Depth=2, OutermostLoop: perf_nest_3D_2_loop_j, Loops: ( perf_nest_3D_2_loop_j perf_nest_3D_2_loop_k )
				; CHECK-NEXT: IsPerfect=true, Depth=3, OutermostLoop: perf_nest_3D_2_loop_i, Loops: ( perf_nest_3D_2_loop_i perf_nest_3D_2_loop_j perf_nest_3D_2_loop_k )
				entry:
				br label %perf_nest_3D_2_loop_i

				perf_nest_3D_2_loop_i:
				%i = phi i32 [ 0, %entry ], [ %inci, %for.inci ]
				br label %perf_nest_3D_2_loop_j

				perf_nest_3D_2_loop_j:
				%j = phi i32 [ 0, %perf_nest_3D_2_loop_i ], [ %incj, %for.incj ]
				br label %perf_nest_3D_2_loop_k

				perf_nest_3D_2_loop_k:
				%k = phi i32 [ 0, %perf_nest_3D_2_loop_j ], [ %inck, %for.inck ]
				%idxprom = sext i32 %i to i64
				%arrayidx = getelementptr inbounds i32**, i32*** %x, i64 %idxprom
				%0 = load i32**, i32*** %arrayidx, align 8
				%idxprom7 = sext i32 %j to i64
				%arrayidx8 = getelementptr inbounds i32*, i32** %0, i64 %idxprom7
				%1 = load i32*, i32** %arrayidx8, align 8
				%idxprom9 = sext i32 %k to i64
				%arrayidx10 = getelementptr inbounds i32, i32* %1, i64 %idxprom9
				%2 = load i32, i32* %arrayidx10, align 4
				%idxprom11 = sext i32 %j to i64
				%arrayidx12 = getelementptr inbounds i32**, i32*** %y, i64 %idxprom11
				%3 = load i32**, i32*** %arrayidx12, align 8
				%idxprom13 = sext i32 %j to i64
				%arrayidx14 = getelementptr inbounds i32*, i32** %3, i64 %idxprom13
				%4 = load i32*, i32** %arrayidx14, align 8
				%idxprom15 = sext i32 %k to i64
				%arrayidx16 = getelementptr inbounds i32, i32* %4, i64 %idxprom15
				store i32 %2, i32* %arrayidx16, align 4
				br label %for.inck

				for.inck:
				%inck = add nsw i32 %k, 1
				%cmp5 = icmp slt i32 %inck, 100
				br i1 %cmp5, label %perf_nest_3D_2_loop_k, label %loop_k_end

				loop_k_end:
				br label %for.incj

				for.incj:
				%incj = add nsw i32 %j, 1
				%cmp2 = icmp slt i32 %incj, 100
				br i1 %cmp2, label %perf_nest_3D_2_loop_j, label %loop_j_end

				loop_j_end:
				br label %for.inci

				for.inci:
				%inci = add nsw i32 %i, 1
				%cmp = icmp slt i32 %inci, 100
				br i1 %cmp, label %perf_nest_3D_2_loop_i, label %perf_nest_3D_2_loop_i_end

				perf_nest_3D_2_loop_i_end:
				ret void
				}

				; Test a perfect loop nest with a live out reduction:
				; for (i = 0; i<ni; ++i)
				; if (0<nj) { // guard branch for the j-loop
				; for (j=0; j<nj; j+=1)
				Meinersbur (inline comment): Does the guard condition need to be consistent with the loop exit condition? I.e., in for (i = 0; i<ni; ++i) { if (5<nj) { for (j=0; j<nj; j+=1) x+=(i+j); } }, is (5<nj) a guard?
				etiotto (author, inline reply): I rely on the LoopInfo getLoopGuardBranch() member function to determine whether (5<nj) is a guard for the inner loop (it is currently considered to be a guard). So the answer is yes: if (5<nj) would be considered a guard, and the nest would therefore be considered perfect.
				; x+=(i+j);
				; }
				; return x;

				define signext i32 @perf_nest_live_out(i32 signext %x, i32 signext %ni, i32 signext %nj) {
				; CHECK-LABEL: IsPerfect=true, Depth=1, OutermostLoop: perf_nest_live_out_loop_j, Loops: ( perf_nest_live_out_loop_j )
				; CHECK-LABEL: IsPerfect=true, Depth=2, OutermostLoop: perf_nest_live_out_loop_i, Loops: ( perf_nest_live_out_loop_i perf_nest_live_out_loop_j )
				entry:
				%cmp4 = icmp slt i32 0, %ni
				br i1 %cmp4, label %perf_nest_live_out_loop_i.lr.ph, label %for.end7

				perf_nest_live_out_loop_i.lr.ph:
				br label %perf_nest_live_out_loop_i

				perf_nest_live_out_loop_i:
				%x.addr.06 = phi i32 [ %x, %perf_nest_live_out_loop_i.lr.ph ], [ %x.addr.1.lcssa, %for.inc5 ]
				%i.05 = phi i32 [ 0, %perf_nest_live_out_loop_i.lr.ph ], [ %inc6, %for.inc5 ]
				%cmp21 = icmp slt i32 0, %nj
				br i1 %cmp21, label %perf_nest_live_out_loop_j.lr.ph, label %for.inc5

				perf_nest_live_out_loop_j.lr.ph:
				br label %perf_nest_live_out_loop_j

				perf_nest_live_out_loop_j:
				%x.addr.13 = phi i32 [ %x.addr.06, %perf_nest_live_out_loop_j.lr.ph ], [ %add4, %perf_nest_live_out_loop_j ]
				%j.02 = phi i32 [ 0, %perf_nest_live_out_loop_j.lr.ph ], [ %inc, %perf_nest_live_out_loop_j ]
				%add = add nsw i32 %i.05, %j.02
				%add4 = add nsw i32 %x.addr.13, %add
				%inc = add nsw i32 %j.02, 1
				%cmp2 = icmp slt i32 %inc, %nj
				br i1 %cmp2, label %perf_nest_live_out_loop_j, label %for.cond1.for.inc5_crit_edge

				for.cond1.for.inc5_crit_edge:
				%split = phi i32 [ %add4, %perf_nest_live_out_loop_j ]
				br label %for.inc5

				for.inc5:
				%x.addr.1.lcssa = phi i32 [ %split, %for.cond1.for.inc5_crit_edge ], [ %x.addr.06, %perf_nest_live_out_loop_i ]
				%inc6 = add nsw i32 %i.05, 1
				%cmp = icmp slt i32 %inc6, %ni
				br i1 %cmp, label %perf_nest_live_out_loop_i, label %for.cond.for.end7_crit_edge

				for.cond.for.end7_crit_edge:
				%split7 = phi i32 [ %x.addr.1.lcssa, %for.inc5 ]
				br label %for.end7

				for.end7:
				%x.addr.0.lcssa = phi i32 [ %split7, %for.cond.for.end7_crit_edge ], [ %x, %entry ]
				ret i32 %x.addr.0.lcssa
				}
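
The guard discussed in the inline comments on perf_nest_live_out can be looked at from the source side. The sketch below is plain C++ written from the source form given in the test comment; the function names `liveOutWithGuard`, `liveOutNoGuard` and `liveOutStricterCondition` are hypothetical. It only illustrates source-level behaviour: a `0<nj` guard repeats the inner loop's own entry test and so never changes the result, whereas the `5<nj` condition from the review question skips iterations the unguarded loop would run. It says nothing about how getLoopGuardBranch() classifies either branch.

```cpp
// Source form of perf_nest_live_out, with the guard written explicitly.
int liveOutWithGuard(int x, int ni, int nj) {
  for (int i = 0; i < ni; ++i) {
    if (0 < nj) { // guard branch for the j-loop
      for (int j = 0; j < nj; j += 1)
        x += (i + j);
    }
  }
  return x;
}

// The same nest with the guard removed. The inner loop's own test (j < nj
// with j == 0) already covers the case the guard checks.
int liveOutNoGuard(int x, int ni, int nj) {
  for (int i = 0; i < ni; ++i)
    for (int j = 0; j < nj; j += 1)
      x += (i + j);
  return x;
}

// Variant from the review question: the condition (5 < nj) is stricter than
// the inner loop's entry test, so for 0 < nj <= 5 the inner loop is skipped.
int liveOutStricterCondition(int x, int ni, int nj) {
  for (int i = 0; i < ni; ++i) {
    if (5 < nj) {
      for (int j = 0; j < nj; j += 1)
        x += (i + j);
    }
  }
  return x;
}
```

For example, with `x = 0`, `ni = 2` and `nj = 3`, the first two functions both return 9 while the third returns 0; with `nj = 6` all three agree.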

llvm/unittests/Analysis/CMakeLists.txt

add_llvm_unittest(AnalysisTests		add_llvm_unittest(AnalysisTests
CFGTest.cpp		CFGTest.cpp
CGSCCPassManagerTest.cpp		CGSCCPassManagerTest.cpp
DivergenceAnalysisTest.cpp		DivergenceAnalysisTest.cpp
DomTreeUpdaterTest.cpp		DomTreeUpdaterTest.cpp
GlobalsModRefTest.cpp		GlobalsModRefTest.cpp
IVDescriptorsTest.cpp		IVDescriptorsTest.cpp
LazyCallGraphTest.cpp		LazyCallGraphTest.cpp
LoopInfoTest.cpp		LoopInfoTest.cpp
		LoopNestTest.cpp
MemoryBuiltinsTest.cpp		MemoryBuiltinsTest.cpp
MemorySSATest.cpp		MemorySSATest.cpp
OrderedBasicBlockTest.cpp		OrderedBasicBlockTest.cpp
OrderedInstructionsTest.cpp		OrderedInstructionsTest.cpp
PhiValuesTest.cpp		PhiValuesTest.cpp
ProfileSummaryInfoTest.cpp		ProfileSummaryInfoTest.cpp
ScalarEvolutionTest.cpp		ScalarEvolutionTest.cpp
VectorFunctionABITest.cpp		VectorFunctionABITest.cpp
SparsePropagation.cpp		SparsePropagation.cpp
TargetLibraryInfoTest.cpp		TargetLibraryInfoTest.cpp
TBAATest.cpp		TBAATest.cpp
UnrollAnalyzerTest.cpp		UnrollAnalyzerTest.cpp
ValueLatticeTest.cpp		ValueLatticeTest.cpp
ValueTrackingTest.cpp		ValueTrackingTest.cpp
VectorUtilsTest.cpp		VectorUtilsTest.cpp
)		)

llvm/unittests/Analysis/LoopNestTest.cpp

This file was added.

				//===- LoopNestTest.cpp - LoopNestAnalysis unit tests ---------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Analysis/LoopNestAnalysis.h"
				#include "llvm/Analysis/ScalarEvolution.h"
				#include "llvm/Analysis/TargetLibraryInfo.h"
				#include "llvm/AsmParser/Parser.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/Support/SourceMgr.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				/// Build the loop nest analysis for a loop nest and run the given test \p Test.
				static void runTest(
				Module &M, StringRef FuncName,
				function_ref<void(Function &F, LoopInfo &LI, ScalarEvolution &SE)> Test) {
				auto *F = M.getFunction(FuncName);
				ASSERT_NE(F, nullptr) << "Could not find " << FuncName;

				TargetLibraryInfoImpl TLII;
				TargetLibraryInfo TLI(TLII);
				AssumptionCache AC(*F);
				DominatorTree DT(*F);
				LoopInfo LI(DT);
				ScalarEvolution SE(*F, TLI, AC, DT, LI);

				Test(*F, LI, SE);
				}

				static std::unique_ptr<Module> makeLLVMModule(LLVMContext &Context,
				const char *ModuleStr) {
				SMDiagnostic Err;
				return parseAssemblyString(ModuleStr, Err, Context);
				}

				TEST(LoopNestTest, PerfectLoopNest) {
				const char *ModuleStr =
				"target datalayout = \"e-m:o-i64:64-f80:128-n8:16:32:64-S128\"\n"
				"define void @foo(i64 signext %nx, i64 signext %ny) {\n"
				"entry:\n"
				" br label %for.outer\n"
				"for.outer:\n"
				" %i = phi i64 [ 0, %entry ], [ %inc13, %for.outer.latch ]\n"
				" %cmp21 = icmp slt i64 0, %ny\n"
				" br i1 %cmp21, label %for.inner.preheader, label %for.outer.latch\n"
				"for.inner.preheader:\n"
				" br label %for.inner\n"
				"for.inner:\n"
				" %j = phi i64 [ 0, %for.inner.preheader ], [ %inc, %for.inner.latch ]\n"
				" br label %for.inner.latch\n"
				"for.inner.latch:\n"
				" %inc = add nsw i64 %j, 1\n"
				" %cmp2 = icmp slt i64 %inc, %ny\n"
				" br i1 %cmp2, label %for.inner, label %for.inner.exit\n"
				"for.inner.exit:\n"
				" br label %for.outer.latch\n"
				"for.outer.latch:\n"
				" %inc13 = add nsw i64 %i, 1\n"
				" %cmp = icmp slt i64 %inc13, %nx\n"
				" br i1 %cmp, label %for.outer, label %for.outer.exit\n"
				"for.outer.exit:\n"
				" br label %for.end\n"
				"for.end:\n"
				" ret void\n"
				"}\n";

				LLVMContext Context;
				std::unique_ptr<Module> M = makeLLVMModule(Context, ModuleStr);

				runTest(*M, "foo", [&](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
				Function::iterator FI = F.begin();
				// Skip the first basic block (entry), get to the outer loop header.
				BasicBlock *Header = &*(++FI);
				assert(Header->getName() == "for.outer");
				Loop *L = LI.getLoopFor(Header);
				EXPECT_NE(L, nullptr);

				LoopNest LN(*L, SE);
				EXPECT_TRUE(LN.areAllLoopsSimplifyForm());

				// Ensure that we can identify the outermost loop in the nest.
				const Loop &OL = LN.getOutermostLoop();
				EXPECT_EQ(OL.getName(), "for.outer");

				// Ensure that we can identify the innermost loop in the nest.
				const Loop *IL = LN.getInnermostLoop();
				EXPECT_NE(IL, nullptr);
				EXPECT_EQ(IL->getName(), "for.inner");

				// Ensure the loop nest is recognized as having 2 loops.
				const ArrayRef<Loop*> Loops = LN.getLoops();
				EXPECT_EQ(Loops.size(), 2ull);

				// Ensure the loop nest is recognized as perfect in its entirety.
				const SmallVector<LoopVectorTy, 4> &PLV = LN.getPerfectLoops(SE);
				EXPECT_EQ(PLV.size(), 1ull);
				EXPECT_EQ(PLV.front().size(), 2ull);

				// Ensure the nest depth and perfect nest depth are computed correctly.
				EXPECT_EQ(LN.getNestDepth(), 2u);
				EXPECT_EQ(LN.getMaxPerfectDepth(), 2u);
				});
				}

				TEST(LoopNestTest, ImperfectLoopNest) {
				const char *ModuleStr =
				"target datalayout = \"e-m:o-i64:64-f80:128-n8:16:32:64-S128\"\n"
				"define void @foo(i32 signext %nx, i32 signext %ny, i32 signext %nk) {\n"
				"entry:\n"
				" br label %loop.i\n"
				"loop.i:\n"
				" %i = phi i32 [ 0, %entry ], [ %inci, %for.inci ]\n"
				" %cmp21 = icmp slt i32 0, %ny\n"
				" br i1 %cmp21, label %loop.j.preheader, label %for.inci\n"
				"loop.j.preheader:\n"
				" br label %loop.j\n"
				"loop.j:\n"
				" %j = phi i32 [ %incj, %for.incj ], [ 0, %loop.j.preheader ]\n"
				" %cmp22 = icmp slt i32 0, %nk\n"
				" br i1 %cmp22, label %loop.k.preheader, label %for.incj\n"
				"loop.k.preheader:\n"
				" call void @bar()\n"
				" br label %loop.k\n"
				"loop.k:\n"
				" %k = phi i32 [ %inck, %for.inck ], [ 0, %loop.k.preheader ]\n"
				" br label %for.inck\n"
				"for.inck:\n"
				" %inck = add nsw i32 %k, 1\n"
				" %cmp5 = icmp slt i32 %inck, %nk\n"
				" br i1 %cmp5, label %loop.k, label %for.incj.loopexit\n"
				"for.incj.loopexit:\n"
				" br label %for.incj\n"
				"for.incj:\n"
				" %incj = add nsw i32 %j, 1\n"
				" %cmp2 = icmp slt i32 %incj, %ny\n"
				" br i1 %cmp2, label %loop.j, label %for.inci.loopexit\n"
				"for.inci.loopexit:\n"
				" br label %for.inci\n"
				"for.inci:\n"
				" %inci = add nsw i32 %i, 1\n"
				" %cmp = icmp slt i32 %inci, %nx\n"
				" br i1 %cmp, label %loop.i, label %loop.i.end\n"
				"loop.i.end:\n"
				" ret void\n"
				"}\n"
				"declare void @bar()\n";

				LLVMContext Context;
				std::unique_ptr<Module> M = makeLLVMModule(Context, ModuleStr);

				runTest(*M, "foo", [&](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
				Function::iterator FI = F.begin();
				// Skip the first basic block (entry), get to the outermost loop header.
				BasicBlock *Header = &*(++FI);
				assert(Header->getName() == "loop.i");
				Loop *L = LI.getLoopFor(Header);
				EXPECT_NE(L, nullptr);

				LoopNest LN(*L, SE);
				EXPECT_TRUE(LN.areAllLoopsSimplifyForm());

				dbgs() << "LN: " << LN << "\n";

				// Ensure that we can identify the outermost loop in the nest.
				const Loop &OL = LN.getOutermostLoop();
				EXPECT_EQ(OL.getName(), "loop.i");

				// Ensure that we can identify the innermost loop in the nest.
				const Loop *IL = LN.getInnermostLoop();
				EXPECT_NE(IL, nullptr);
				EXPECT_EQ(IL->getName(), "loop.k");

				// Ensure the loop nest is recognized as having 3 loops.
				const ArrayRef<Loop*> Loops = LN.getLoops();
				EXPECT_EQ(Loops.size(), 3ull);

				// Ensure the loop nest is recognized as having 2 separate groups of perfectly nested loops.
				const SmallVector<LoopVectorTy, 4> &PLV = LN.getPerfectLoops(SE);
				EXPECT_EQ(PLV.size(), 2ull);
				EXPECT_EQ(PLV.front().size(), 2ull);
				EXPECT_EQ(PLV.back().size(), 1ull);

				// Ensure the nest depth and perfect nest depth are computed correctly.
				EXPECT_EQ(LN.getNestDepth(), 3u);
				EXPECT_EQ(LN.getMaxPerfectDepth(), 2u);
				});
				}
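
The two tests above expect getPerfectLoops to return one group of two loops for the perfect nest, and two groups (sizes 2 and 1) for the imperfect nest, where loop.k.preheader contains a call to @bar. The grouping those expectations describe can be sketched as a toy model. This is an illustration only, not the patch's implementation: ToyLoop, PerfectlyNestedInParent and groupPerfectLoops are hypothetical names, and the real analysis decides perfect nesting by inspecting the IR rather than reading a precomputed flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one loop in a single-chain nest, listed
// outermost first. Not an LLVM type.
struct ToyLoop {
  const char *Name;
  // Assumed to be precomputed: true when this loop is perfectly nested in
  // its parent. The value is ignored for the outermost loop.
  bool PerfectlyNestedInParent;
};

// Split the chain into maximal runs of perfectly nested loops. A new group
// starts at the outermost loop and at every loop that is not perfectly
// nested in its parent.
std::vector<std::vector<const char *>>
groupPerfectLoops(const std::vector<ToyLoop> &Chain) {
  std::vector<std::vector<const char *>> Groups;
  for (std::size_t I = 0; I < Chain.size(); ++I) {
    if (I == 0 || !Chain[I].PerfectlyNestedInParent)
      Groups.emplace_back();
    Groups.back().push_back(Chain[I].Name);
  }
  return Groups;
}
```

Under this model, the chain {loop.i, loop.j, loop.k} with only loop.k flagged as not perfectly nested yields the groups {loop.i, loop.j} and {loop.k}, matching the sizes checked in ImperfectLoopNest.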

Revision Contents

Diff 234611

llvm/include/llvm/Analysis/LoopNestAnalysis.h

llvm/lib/Analysis/CMakeLists.txt

llvm/lib/Analysis/LoopNestAnalysis.cpp

llvm/lib/Passes/PassBuilder.cpp

llvm/lib/Passes/PassRegistry.def

llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll

llvm/test/Analysis/LoopNestAnalysis/infinite.ll

llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll

llvm/unittests/Analysis/CMakeLists.txt

llvm/unittests/Analysis/LoopNestTest.cpp
