This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Loop bounds inference in linalg.generic op improved to support bounds for convolution
ClosedPublic

Authored by limo1996 on Jul 6 2020, 12:59 AM.

Details

Summary

Loop bound inference is right now very limited as it supports only permutation maps and thus
it is impossible to implement convolution with linalg.generic as it requires more advanced
loop bound inference. This commits solves it for the convolution case.

Depends On D83158

Diff Detail

Event Timeline

limo1996 created this revision.Jul 6 2020, 12:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2020, 1:00 AM
ftynse requested changes to this revision.Jul 6 2020, 5:59 AM
ftynse added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
80

Nit: Top-level functions should use /// for comments.

84

Please explain the semantics of the argument in the top-level comment instead.

85

Prefer ValueRange unless it is missing some functionality

86

We tend to use assert(condition && "textual description")

96

Prefer early-return to decrease horizontal alignment. E.g.,

for (...) {
  if (auto x = dyn_cast<..>) {
    // code
  }
}

can be transformed into

for (...) {

auto x = dyn_cast<..>;
if (!x) continue;
// code

}

98

cast assume the operand has the expected type and fails an assertion if it does not. If there is no other check, e.g. in the op verifier, this may trigger an assertion on user-entered IR, which we must avoid

100

MLIR uses camelBack for identifiers

504

Please use a more descriptive variable name. Also avoid C-style casts.

507

No need for a cast here, but I'd check or assert somewhere (depending on whether it can happen with user-entered IR or not) that numIn > numDims, one can have a map of the shape (i) -> (i,i,i).

This revision now requires changes to proceed.Jul 6 2020, 5:59 AM
limo1996 updated this revision to Diff 275931.Jul 7 2020, 1:07 AM
limo1996 marked an inline comment as done.

All comments of ftynse incorporated

ftynse accepted this revision.Jul 7 2020, 1:38 AM

Deferring final approval to @nicolasvasilache

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
109

This is an anti-pattern in LLVM, if you intend to cast, you should use dyn_cast and check that the result is non-null instead.

auto llhs = lhs.getLHS().dyn_cast<AffineDimExpr>();
if (!llhs)
  continue;

int dimPosition = llhs.getPosition();

(This will not work for "add" and "mul" above because they don't have a dedicated type)

Also, avoid single-character variable names.

This revision is now accepted and ready to land.Jul 7 2020, 1:38 AM
limo1996 marked 9 inline comments as done.Jul 7 2020, 1:57 AM
limo1996 added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
507

numInputs = numDims + numSymbols so it should never happen

limo1996 marked 2 inline comments as done.Jul 7 2020, 3:03 AM
limo1996 added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
517

@ftynse inversePermutation needs to "support symbols by ignoring them" as verification uses the function to check whether concatination of indexing maps is invertible.. Another option is to ignore the check if map has symbols which can cause problems..

limo1996 updated this revision to Diff 275972.Jul 7 2020, 3:11 AM

Proposed change to circumvent call to inversePermutation during verification of linalg.generic

limo1996 marked an inline comment as done.Jul 7 2020, 3:13 AM
limo1996 added inline comments.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
298 ↗(On Diff #275972)

@ftynse Here is the proposed change which requires the least amount of code..

ftynse requested changes to this revision.Jul 7 2020, 3:43 AM
ftynse added inline comments.
mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
298 ↗(On Diff #275972)

I am interested in a change that keeps the verifier correct and useful (that is, not allowing operations we cannot handle elsewhere), not in the change with the minimal number of LoC. This change should be included in the previous commit instead, I suppose.

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
504

mlir uses camelBack

This revision now requires changes to proceed.Jul 7 2020, 3:43 AM
limo1996 updated this revision to Diff 276013.Jul 7 2020, 5:20 AM

2 of ftynse's comments addressed

limo1996 marked 2 inline comments as done.Jul 7 2020, 5:21 AM
limo1996 updated this revision to Diff 276402.Jul 8 2020, 6:22 AM

getSymbolSource: llvm::Optional instead of -1

ftynse added inline comments.Jul 16 2020, 6:56 AM
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
95

This does not look correct. Assume the concatenated map looks like (d0,d1,d2)[s0] -> (d2,d0,d1) (the symbol is unused for simplicity). The expected ranges should be the same as produced by the inversion method, i.e. (0..allViewSizes[1], 0..allViewSizes[2], 0..allViewSizes[0]), but this code will produce (0..allViewSizes[0], 0..allViewSizes[1], 0..allViewSizes[2]).

I think you need allViewSizes[ <index-of-result-in-map> ] instead of allViewSizes[d.getPosition()], but please double check.

100

Could you please add a comment that explains the math behind this? Affine expressions are not super-easy to follow.

103

Shouldn't this also check that LHS and RHS are binary expressions of a specific kind, like "add" or "floordiv"?

Like: "if the expression has the shape di + dj - sk floordiv CST, the bounds are ... because ... ".

Also explain that it always overwrite the existing range because the range is always smaller in this case. Imagine having a map like (d0,d1)[s0] -> (d0,d1,d0+d1-s0 floordiv 3), you will have computed the bounds for both loops when you hit this case.

<disregard for now>
Normally, we should have computed the maximum across all found lower bounds and the minimum across all found upper bounds.
</disregard for now>

500

Since the parent commit introduced symbol_source as an ODS attribute, it should be possible to just call linalgOp.symbol_source() here and avoid using a hardcoded name.

506–507

unsigned diff = map.getNumSymbols(); ? Also, numSymb sounds like a better name for this?

513

I'm not convinced diff * symbolSource + idx is the correct expression. You actually want the sizes that correspond to symbolSource's operand, which should be computed as the sum of ranks of all operands before symbolSource. (diff is the current number of symbols).

ftynse requested changes to this revision.Jul 16 2020, 6:56 AM
This revision now requires changes to proceed.Jul 16 2020, 6:56 AM
limo1996 marked 10 inline comments as done.Jul 17 2020, 8:50 AM

All comments resolved

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
95

Yes. I moved the logic from the next commit here..

500

This code was removed and simplified so no need for that

506–507

This code was also removed

513

Yes this was also fixed in the previous commit and evolved here..

limo1996 updated this revision to Diff 278790.Jul 17 2020, 8:50 AM
limo1996 marked 4 inline comments as done.

All comments resolved

limo1996 updated this revision to Diff 278796.Jul 17 2020, 9:00 AM

comment moved

ftynse added inline comments.Jul 20 2020, 1:57 AM
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
100

The comment I see (m + n - s floordiv 2) is overly concise to qualify as explanation. I was expecting something like: "if the access pattern is (m,n)[s] -> (m + n - s floordiv C)", then the bounds are s floordiv C <= n <= max m - s floordiv C because reason1, reason2.

Actually, I seem to have said the exact same thing in the comment below, which is marked "done". Did you forget to reupload the diff after changes?

103

I don't see checks for binary expression kinds on lhs and rhs. In the current state, this can match something like d0-d1*s0 floordiv 42, in which case the computed bounds will be wrong. Did you forget to upload a new version?

limo1996 updated this revision to Diff 279275.Jul 20 2020, 9:23 AM

Evolved changes from parent commit which included changes to emitting loop ranges with symbols

limo1996 updated this revision to Diff 279283.Jul 20 2020, 9:35 AM
limo1996 marked 2 inline comments as done.

last comment of nicolas resolved

ftynse requested changes to this revision.Jul 21 2020, 3:04 AM
ftynse added inline comments.
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
123 ↗(On Diff #279283)

Let's be more explicit: "Reserve is mandatory to avoid a potential undefined behavior with pushing back to smallvector from itself."

Also, please terminate sentences with a dot in comments. Always.

mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
100

Thanks for providing more details!

Unfortunately, this confirmed my understanding that the code went too far with eager simplification and produces off-by-one results.

Given

0 <= n < A
0 <= n + m - s floordiv 2 < B

the correct upper bound on m is

m < B - max n + s floordiv 2

We can substitute max n with (A-1) from the first inequality, thus obtaining

m < B - A + 1 + s floordiv 2

Now, the code makes the assumption that s = A, which can be reasonable for the use case. This gives

m < B - A + 1 + A floordiv 2

and then it tries to simplify to

m < B - A floordiv 2

which only holds when A = 1 mod 2, because in this case A = 2 * (A floordiv 2) + 1 and cancels out the +/-1. Otherwise, it should have been m < B - A floordiv 2 + 1. This does not produce out-of-bounds accesses, but ignores the last iteration for even values of A.

Let's not be too eager with simplification and just compute the upper bound on m as m < B - A + 1 + s floordiv 2 without any assumption on the divisibility of A by 2.

103

I still don't see the checks I expected. To be precise,
auto lhs = binOp.getLHS().dyn_cast<AffineBinaryOpExpr>(); does _not_ guarantee that lhs is a sum. I expect to see
if (lhs.getKind != AffineExprKind::Add) continue; and similarly for rhs.

This revision now requires changes to proceed.Jul 21 2020, 3:04 AM
limo1996 updated this revision to Diff 279499.Jul 21 2020, 6:22 AM
limo1996 marked 3 inline comments as done.

All comments of Alex resolved

ftynse accepted this revision.Jul 22 2020, 4:35 AM
ftynse added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Loops.cpp
124

Actually, it shouldn't matter anymore whether you do floordiv 2 or floordiv any-other-value.

126

This could use a better name. Please try really hard to avoid short semantically meaningless names. I recall making this same comment. The purpose of the name may seem obvious today for you, but it won't be when you'll have to debug this code two month from now. Does c2 stand for c * 2, c squared? Given that there is only one use of this variable, just fold it to its use if (minusOne.getValue() != 1) continue;.

143

Can't this be done as part of affine apply? viewSizes is at least partially contained by values. Also, nobody prevents you from adding more inputs to one of affine apply's.

This revision is now accepted and ready to land.Jul 22 2020, 4:35 AM
limo1996 updated this revision to Diff 279823.Jul 22 2020, 7:32 AM

comments of Alex incorporated

limo1996 marked 3 inline comments as done.Jul 22 2020, 7:32 AM

All comments of Alex done

This revision was automatically updated to reflect the committed changes.