This is an archive of the discontinued LLVM Phabricator instance.

[DA] Handle mismatching loop levels by considering them non-linear
ClosedPublic

Authored by bmahjour on Oct 1 2021, 1:58 PM.

Details

Summary

To represent various loop levels within a nest, DA implements a special numbering scheme (see comment atop establishNestingLevels). The goal of this numbering scheme appears to be representing each unique loop distinctively by using as little memory as possible. This numbering scheme is simple when the source and destination of the dependence are in the same loop. In such cases the level is simply the depth of the loop in which src and dst reside. When the src and dst are not in the same loop, we could run into the following situation exposed by https://reviews.llvm.org/D71539.

Suppose we have the following loop structure and memory access:

1: L1
2:    L2
3:    A[s1] 
4:    L3
5:      A[s2]

Further assume that the subscript of the first access (ie s1) is an LCSSA phi with a single incoming value from L2. With the change in D71539, the SCEV expression for s1 (queried at the scope of L1 where the access on line 3 occurs) will be an AddRec whose loop is L2.

Since the access A[s1] occurs at a loop depth of 1, but its subscript references a loop at depth 2, the current numbering scheme gets confused and assigns the same level value to both s1 and s2, which further causes classifyPair to incorrectly classify the subscript pair. In an assert enabled build, this would trigger the following assertion:

bool llvm::DependenceInfo::testSIV(const llvm::SCEV *, const llvm::SCEV *, unsigned int &, llvm::FullDependence &, llvm::DependenceInfo::Constraint &, const llvm::SCEV *&) const: Assertion `CurLoop == DstAddRec->getLoop() && "both loops in SIV should be same"' failed.

This patch proposes a simpler but much more conservative approach than D110972. It simply treats such cases as non-linear and therefore not analyzable.

Diff Detail

Event Timeline

bmahjour created this revision.Oct 1 2021, 1:58 PM
bmahjour requested review of this revision.Oct 1 2021, 1:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2021, 1:58 PM
bmahjour edited the summary of this revision. (Show Details)Oct 1 2021, 2:20 PM
bmahjour added reviewers: Meinersbur, Florian, Restricted Project.Apr 20 2022, 8:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2022, 8:49 AM
bmahjour edited reviewers, added: fhahn; removed: Florian.Apr 20 2022, 8:54 AM
Meinersbur added inline comments.Apr 21 2022, 8:46 AM
llvm/lib/Analysis/DependenceAnalysis.cpp
918–919

Consider using ScalarEvolution::computeLoopDisposition()

Meinersbur added inline comments.Apr 21 2022, 9:32 AM
llvm/lib/Analysis/DependenceAnalysis.cpp
900–901

isLoopInvariant is already using computeLoopDisposition(). We should use it even when Expr is a SCEVAddRecExpr.

bmahjour updated this revision to Diff 426861.EditedMay 3 2022, 3:19 PM

Used getLoopDisposition and improved tests.

bmahjour marked 2 inline comments as done.May 3 2022, 3:19 PM
congzhe added inline comments.
llvm/lib/Analysis/DependenceAnalysis.cpp
920

Thanks for the change, It looks good to me. I don't know if Micheal has further comments?

[nit] I'm wondering if it would be better to simplify the code like this?

if (SE->getLoopDisposition(AddRec, LoopNest) == ScalarEvolution::LoopDisposition::LoopVariant )
  return false;
congzhe added inline comments.May 4 2022, 10:33 AM
llvm/lib/Analysis/DependenceAnalysis.cpp
920

Thanks for the change, It looks good to me. I don't know if Micheal has further comments?

[nit] I'm wondering if it would be better to simplify the code like this?

if (SE->getLoopDisposition(AddRec, LoopNest) == ScalarEvolution::LoopDisposition::LoopVariant )
  return false;

Actually could be futher simplified to

if (SE->getLoopDisposition(AddRec, LoopNest) == ScalarEvolution::LoopVariant )
  return false;
bmahjour added inline comments.May 4 2022, 12:06 PM
llvm/lib/Analysis/DependenceAnalysis.cpp
920

I thought about that too, but decided to check for LoopComputable and LoopInvariant specifically, in case new enums are added to LoopDisposition in the future. The logic also makes a bit more sense when I think about it as neither invariant nor computable. LoopVariant isn't the best wording for what it really means.

congzhe added inline comments.May 4 2022, 12:29 PM
llvm/lib/Analysis/DependenceAnalysis.cpp
920

Sure, makes sense to me.

I was expecting that we only need to call getLoopDisposition once at the beginning of checkSubscript and then decide depending on whether it is a SCEVAddRecExpr, instead of

if (!AddRec)
  return isLoopInvariant(Expr, LoopNest);

Is this possible?

Also, this patch only checks the innermost loop while isLoopInvariant checks all loops in the nest. Would we need as well? Would this fix miss the following case?

for (int i = ) {
   int j = ...
   for (j < ...; ++j ) {
   }
   for (int k = ) {
      S[i][j][k];
  }
}

That is, j is invariant to the k-loop, put still contains a SCEVAddRecExpr of a loop outside the loop nest.

An alternative check:

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 61bc6750277f7..4a3c5c5e4b202 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -891,6 +893,20 @@ void DependenceInfo::removeMatchingExtensions(Subscript *Pair) {
 // Collect any loops mentioned in the set of "Loops".
 bool DependenceInfo::checkSubscript(const SCEV *Expr, const Loop *LoopNest,
                                     SmallBitVector &Loops, bool IsSrc) {
+  DenseSet<const Loop *> LoopsInNest;
+  const Loop *L = LoopNest;
+  while (L) {
+    LoopsInNest.insert(L);
+    L = L->getParentLoop();
+  }
+
+  if (SCEVExprContains(Expr, [&](const SCEV *E) -> bool {
+        if (auto *AddRec = dyn_cast<SCEVAddRecExpr>(E))
+          return !LoopsInNest.contains(AddRec->getLoop());
+        return false;
+      }))
+    return false;
+
   const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(Expr);
   if (!AddRec)
     return isLoopInvariant(Expr, LoopNest);

This change passes all the DA tests, inlcuding MismatchingNestLevels
I think (hope?) that such a loop will result in the SCEV loop disposition to be LoopDisposition::LoopVariant, so the getLoopDisposition solution should be better. However, it might not suffice if the loop is not in loop all:

int j = ...
for (j < ...; ++j ) {
}
S[j];

Here, we don't have a loop that we could check the accesses' disposition for.

bmahjour added a comment.EditedMay 13 2022, 12:42 PM

I was expecting that we only need to call getLoopDisposition once at the beginning of checkSubscript and then decide depending on whether it is a SCEVAddRecExpr, instead of

if (!AddRec)
  return isLoopInvariant(Expr, LoopNest);

Is this possible?

I don't think we can do that, because if input is an AddRec that is invariant at the "LoopNest" level but variant at an outer level, we don't have a good way to detect it. This is related to your question about checking "all loops in the nest" (similar to isLoopInvariant), which we cannot do as explained below.

Also, this patch only checks the innermost loop while isLoopInvariant checks all loops in the nest. Would we need as well?

When computing LoopDisposition of an AddRec at an outer loop that contains the AddRec's loop, the algorithm considers it LoopVariant due to the following logic in computeLoopDisposition:

// Everything that is not defined at loop entry is variant.
if (DT.dominates(L->getHeader(), AR->getLoop()->getHeader()))
  return LoopVariant;

For this reason it would not be correct to check LoopDisposition all the way from innermost to outermost. Doing so, would pessimistically classify subscripts that we care about the most. isLoopInvariant on the other hand is only called for non-AddRecs (which are less analyzable anyway) so it conservatively checks for all loops to make sure the subscript is considered linear only if it's invariant at each loop level.

Would this fix miss the following case?

for (int i = ) {
   int j = ...
   for (j < ...; ++j ) {
   }
   for (int k = ) {
      S[i][j][k];
  }
}

That is, j is invariant to the k-loop, put still contains a SCEVAddRecExpr of a loop outside the loop nest.

I don't think this is a problem, because for the level-mismatch problem to occur there must be another aliasing access in a different level outside of loop j and loop k (for example at i level). If we have an access at i level that has a recurrence with loop j, the disposition will be LoopVariant at i level. See test2 in MismatchingNestLevels.ll below.

An alternative check:

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 61bc6750277f7..4a3c5c5e4b202 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -891,6 +893,20 @@ void DependenceInfo::removeMatchingExtensions(Subscript *Pair) {
 // Collect any loops mentioned in the set of "Loops".
 bool DependenceInfo::checkSubscript(const SCEV *Expr, const Loop *LoopNest,
                                     SmallBitVector &Loops, bool IsSrc) {
+  DenseSet<const Loop *> LoopsInNest;
+  const Loop *L = LoopNest;
+  while (L) {
+    LoopsInNest.insert(L);
+    L = L->getParentLoop();
+  }
+
+  if (SCEVExprContains(Expr, [&](const SCEV *E) -> bool {
+        if (auto *AddRec = dyn_cast<SCEVAddRecExpr>(E))
+          return !LoopsInNest.contains(AddRec->getLoop());
+        return false;
+      }))
+    return false;
+
   const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(Expr);
   if (!AddRec)
     return isLoopInvariant(Expr, LoopNest);

This change passes all the DA tests, inlcuding MismatchingNestLevels
I think (hope?) that such a loop will result in the SCEV loop disposition to be LoopDisposition::LoopVariant, so the getLoopDisposition solution should be better. However, it might not suffice if the loop is not in loop all:

int j = ...
for (j < ...; ++j ) {
}
S[j];

Here, we don't have a loop that we could check the accesses' disposition for.

computeLoopDisposition returns LoopVariant for cases where the loop is nullptr (ie function scope), so it should be sufficient for this case too. However, your alternative makes it more obvious and easier to understand, so I'd be ok with that too.

bmahjour updated this revision to Diff 429351.May 13 2022, 2:15 PM

Fix test2 to make sure when D71539 is applied, it fails without this patch and passes with it.

As mentioned I would experimented a bit more. Unfortunately I found more problems.

The following crashes even with this patch:

void foo(int n, double * restrict A, double * restrict B) {
    for (int i = 0; i < n; ++i) {
        int s = 0;
        for (; s*s < n*n; ++s) { } // incomputable AddRec
         for (int k = 0; k < n; ++k) 
            A[s] = 0; // Invariant in innermost loop
         
        A[i] = 1; // do create a dependency pair
    }
}

because the the index of A[s] is invariant to its innermost surrounding loop k is invariant and hence passes the additional test.

Even more concerning is something like this, even though it does not crash directly:

   int s = 0;
   for (; s*s < n*n; ++s) ; // incomputable AddRec

for (int i = 0; i < n; ++i) {
     for (int k = 0; k < n; ++k) 
         for (int j = 0; j < n; ++j) 
            A[s + k] = 0; // Invariant: j; Computable: k; Variant: s
     
    A[i] = 1; // do create a dependency pair
}

ScalarEvolution's LoopDisposition test skips s because it is nested inside an AddRec that is invariant because its of an outer loop (((8 * (sext i32 {{0,+,1}<nuw><nsw><%for.cond>,+,1}<nuw><%for.cond5> to i64))<nsw> + %A)) and hence returns the entire thing as invariant even though it contains an SCEVAddRecExpr outside the loop nest. I again tend towards to scanning the SCEVExpr for illegal SCEVAddRecExpr. See [the only important remark].

llvm/lib/Analysis/DependenceAnalysis.cpp
801–802

This seems in conflict with handling within ScalarEvolution:

// Instructions are never considered invariant in the function body
// (null loop) because they are defined within the "loop".

Suggested alternative comments/non-recursive implementation:

 // Outside any loop, an expression will not vary between executions (there is only one) and we consider it invariant. In contrast to ScalarEvolution::isLoopInvariant, we only consider expression evalution at a specific position, the array access, not accross the function.
 if (!LoopNest)
   return true;

 // If the expression is invariant in the outermost loop of the loop nest, it is invariant anywhere in the loop nest. 
const  Loop * OutermostLoop = LoopNest;
 while (OutermostLoop->getParentLoop())
     OutermostLoop = OutermostLoop->getParentLoop();
 return SE->isLoopInvariant(Expression, OutermostLoop);
895

We allow for non-zero Start value, hence it is affine, not linear.

903

[suggestion] SCEVAddRecExpr has a member isAffine that could be queried directly.

920

[the only important remark] I suggest to use this check instead to fix the additional crash:

// The AddRec must depend on one of the surrounding loops. Otherwise, mapSrcLoop and mapDstLoop return indices outside the intended range. This can happen when a subscript in one loop references an IV from a sibling loop that could not be replaced with a concrete exit value by getSCEVAtScope. 
const Loop* L = LoopNest;
while (L && AddRec->getLoop() != L) 
    L = L->getParentLoop();
if (!L)
    return false;

Currently I am too unsure whether we can make computeLoopDisposition work reliably.

This ignores that there could be SCEVAddRecExpr in other SCEV subexpressions, but I think DA doesn't try to reach them anyway. Maybe add a comment about this somewhere.

921–924

mapSrcLoop/mapDstLoop may already be out-of-range because AddRec->getLoop() is not surrounding the SCEVExpr.

One could include a sanity check inside mapSrcLoop/mapDstLoop:

const Loop* L = LoopNest;
while (L && AddRec->getLoop() != L) 
    L = L->getParentLoop();
assert (L && "Must be in loop nest, otherwise mapSrcLoop/mapDstLoop will be out-of-range");
925

This recursive call may also return false, in which case Loops.set has already been executed. (This may be irrelevant because when returning false, Loops is probably be ignored anyway).

The following crashes even with this patch:

I actually don't see any crashes with the examples provided, but I do see that A[s] is being treated as invariant, so it does seem to bypass the getLoopDisposition checks. I'm changing checkSubscript to look for the addrec loop in the nest as you've suggested.

llvm/lib/Analysis/DependenceAnalysis.cpp
903

I think this change should be pursued as a separate patche. isAffine just checks the number of operands, and for example doesn't check that the step is invariant, so there could be some semantic differences.

925

Yes, classifyPair returns immediately if checkSubscript returns false, and the loop lists get destructed without being used.

bmahjour updated this revision to Diff 432128.May 25 2022, 2:13 PM
Meinersbur accepted this revision.Jun 1 2022, 8:54 AM

LGTM

llvm/lib/Analysis/DependenceAnalysis.cpp
914

[nit] whitespace

This revision is now accepted and ready to land.Jun 1 2022, 8:54 AM
bmahjour updated this revision to Diff 433865.Jun 2 2022, 1:34 PM

Rebased and added Michael's test case.

llvm/lib/Analysis/DependenceAnalysis.cpp
921–924

I tried adding this assert inside checkSubscripts right before the calls to mapSrcLoop and mapDstLoop and was able to get your test case to assert with the previous version of this patch. I've now added your test into MismatchingNestLevels.ll.

I also looked into adding this assert, but decided not to do it for the following reasons:

  1. Except for checkSubscripts, all callers of mapSrcLoop and mapDstLoop, get their loops from an AddRec. Checking that the AddRec's loop exists in the loop nest will always satisfy the assertion, so it's pointless to add in those cases. Inside checkSubscript with this patch we already check for the condition being asserted and exit early, so no point in adding the assert inside checkSubscript either.
  2. I tried adding assert(D > 0 && D <= MaxLevels) to mapSrcLoop and mapDstLoop, but it didn't get triggered with the previous version of this patch. That's actually expected because with the numbering system a level will never be outside of this range. The consideration for a level to be out of range depends on the specific dependence test (eg. while exactSIVtest asserts that Level <= CommonLevels, weakZeroSrcSIVtest asserts that Level <= MaxLevels), so there is no way to add such assertion to mapSrcLoop and mapDstLoop because they are used by various kinds of dependence tests.
bmahjour marked 9 inline comments as done.Jun 2 2022, 1:36 PM
This revision was landed with ongoing or failed builds.Jun 8 2022, 8:16 AM
This revision was automatically updated to reflect the committed changes.
fhahn added a comment.Jun 8 2022, 11:50 AM

Thanks for pushing the fix through! I think this should unblock D71539 now.

bmahjour added a comment.EditedJun 8 2022, 1:40 PM

Thanks for pushing the fix through! I think this should unblock D71539 now.

Yes, D71539 should be safe to re-commit now. I've also closed https://github.com/llvm/llvm-project/issues/50612. Thanks!