This is an archive of the discontinued LLVM Phabricator instance.

[mlir][memref] Fix ExpandShapeOp verifier
ClosedPublic

Authored by springerm on Mar 29 2022, 2:26 AM.

Details

Summary
  • Complete rewrite of the verifier.
  • CollapseShapeOp verifier will be updated in a subsequent commit.
  • Update and expand op documentation.
  • Add a new builder that infers the result type based on the source type, result shape and reassociation indices. In essence, only the result layout map is inferred.

Diff Detail

Event Timeline

springerm created this revision.Mar 29 2022, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 29 2022, 2:26 AM
springerm requested review of this revision.Mar 29 2022, 2:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 29 2022, 2:26 AM
nicolasvasilache requested changes to this revision.Mar 29 2022, 6:00 AM
nicolasvasilache added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1817

I find the logic unwieldy, and I can't tell offhand whether it is correct or not.
You should be able to rewrite it with something resembling:

unsigned shapeIndex = resultShape.size() - 1;
// 1-1 mapping between srcStrides and reassociation packs.
// Each srcStride starts with the given value and gets expanded according to
// the proper entries in resultShape.
for (auto it : llvm::zip(llvm::reverse(reassociation),
                         llvm::reverse(srcStrides))) {
  ReassociationIndices reassoc = std::get<0>(it);
  int64_t currentStrideToExpand = std::get<1>(it);
  // Note that a stride does not get expanded along the first entry of each
  // shape pack.
  // Example:
  //   strides = [10000, 100, 1],
  //   reassociations = [[0], [1, 2, 3], [4]],
  //   resultSizes = [2, 5, 4, 3, 2] = [[2], [5, 4, 3], [2]]
  // For the purpose of stride calculation, the useful sizes are:
  //   [x, x, 4, 3, x] = [[x], [x, 4, 3], [x]].
  //   resultStrides = [10000, 1200, 300, 100, 1]
  for (unsigned idx = 0, e = reassoc.size(); idx < e; ++idx) {
    using namespace saturated_arith;
    reverseResultStrides.push_back(currentStrideToExpand);
    currentStrideToExpand =
        Wrapper(currentStrideToExpand) * resultShape[shapeIndex--];
  }
}
return makeStridedLinearLayoutMap(
    llvm::to_vector<8>(llvm::reverse(reverseResultStrides)), srcOffset,
    srcType.getContext());
1853

I forget how no-layout / identity layout are both represented, please double check you have both cases working.

1925

typo

1955

Please extract the code below in a helper function that you can also reuse in the builder.
If you need the intermediate state to return different errors in the verifier/builder, consider returning an error status enum.

This revision now requires changes to proceed.Mar 29 2022, 6:00 AM
springerm updated this revision to Diff 418870.Mar 29 2022, 7:13 AM
springerm marked 4 inline comments as done.

update

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1817

The saturated_arith thing didn't help much because I also have to deal with dynamic dim sizes (not just strides). There are different constants for dynamic strides and dynamic dim sizes.

But good idea with the zip. This made things already much simpler.

Also added an example. Is this easier to follow now?

1853

From my understanding, "no layout map specified" = "standard layout" = "isIdentity()". Both cases are covered by unit tests.

mravishankar requested changes to this revision.Mar 29 2022, 12:32 PM
mravishankar added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

This is where things become interesting. The ordering of strides might not be the same as the ordering of dimensions. Affine maps (unfortunately?) allow you to specify a non-inner dimension as the fastest varying. I am OK if the decision is to say that we aren't going to allow that case. But this patch seems to be making an implicit assumption about striding (unless I am mistaken), and I'd like it to be explicit.

1908

This should be logic that is common to collapse and expand for tensor and memrefs. Could we put this in a utility ?

This revision now requires changes to proceed.Mar 29 2022, 12:32 PM
springerm added inline comments.Mar 29 2022, 12:58 PM
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

You mean for the expanded dims, right? The strides of the other dims are already specified in the source memref. But for the expanded dims within a reassociation group, yes, I currently make that assumption. (I didn't think about this much, but when you don't specify any custom layout, this is what you get.)

This commit adds a builder that infers the layout map. The function that we are looking at here is doing that. The strides of the expanded dims have to be decided during bufferization. Either as we do it here or in a different way.

When it comes to the verifier, we could make it more permissive, also allowing memref result types where the expanded strides have a different ordering. That would make the logic a bit more complex though, because we could not reuse computeExpandedLayoutMap for both the builder and the verifier that easily anymore.

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

@mravishankar could you please elaborate a bit?

The algorithm I described does not care about stride value ordering; i.e. we could (and we should) replace the doc and test by e.g. [1, ?, 1000].

1817

Yes, the fact that dynamic dims use -1 as a sentinel is an unfortunate accident of history; the use of the -1 magic constant has crept in very deeply.

Still, this is easily solvable in the saturated_arith case with an explicit constructor for sizes and strides and a saturated bit.
Then we would also drop the binary operators that work with int to avoid surprises:

/// Helpers to write more idiomatic operations.
namespace saturated_arith {
struct Wrapper {
  static Wrapper stride(int64_t v) {
    if (ShapedType::isDynamicStrideOrOffset(v))
      return Wrapper{true, 0}; // the value does not matter, use 0 for more defensive programming
    return Wrapper{false, v};
  }
  static Wrapper offset(int64_t v) {...}
  static Wrapper size(int64_t v) { ... }
  int64_t asOffset() { return saturated ? ShapedType::kDynamicStrideOrOffset : v; }
  int64_t asSize() { return saturated ? ShapedType::kDynamicSize : v; }
  int64_t asStride() { return saturated ? ShapedType::kDynamicStrideOrOffset : v; }
  bool saturated;
  int64_t v;
};
Wrapper operator+(Wrapper a, Wrapper b) {
  if (a.saturated || b.saturated)
    return Wrapper{true, 0};
  return Wrapper{false, a.v + b.v};
}
Wrapper operator*(Wrapper a, Wrapper b) {
  if (a.saturated || b.saturated)
    return Wrapper{true, 0};
  return Wrapper{false, a.v * b.v};
}
} // namespace saturated_arith

Confining the branchy logic to this helper greatly simplifies the logic and the reader's ability to follow the comments.
Even in the way it is currently written with the zip + reverse, it is unclear to me that lines 1728-1763 are correct.
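The proposed `Wrapper` can be fleshed out into a compilable sketch. The dynamic-sentinel constants below are hypothetical local stand-ins so the example builds outside of MLIR (the real `ShapedType` values differ); the point is that a single saturated bit replaces scattered comparisons against magic constants:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for ShapedType's dynamic sentinels so this sketch
// compiles standalone; the real MLIR constants are different.
constexpr int64_t kDynamicSize = -1;
constexpr int64_t kDynamicStrideOrOffset = -2;

namespace saturated_arith {
// Once a value is dynamic ("saturated"), any arithmetic involving it stays
// dynamic; the payload is forced to 0 for defensive programming.
struct Wrapper {
  static Wrapper stride(int64_t v) {
    if (v == kDynamicStrideOrOffset)
      return Wrapper{true, 0};
    return Wrapper{false, v};
  }
  static Wrapper size(int64_t v) {
    if (v == kDynamicSize)
      return Wrapper{true, 0};
    return Wrapper{false, v};
  }
  int64_t asSize() const { return saturated ? kDynamicSize : v; }
  int64_t asStride() const { return saturated ? kDynamicStrideOrOffset : v; }
  Wrapper operator*(Wrapper other) const {
    if (saturated || other.saturated)
      return Wrapper{true, 0};
    return Wrapper{false, v * other.v};
  }
  bool saturated;
  int64_t v;
};
} // namespace saturated_arith
```

With this, `stride(100) * size(5)` yields a stride of 500, while multiplying by a dynamic size yields the dynamic-stride sentinel without any branch at the call site.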

springerm added inline comments.Mar 30 2022, 12:37 AM
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

I think what Mahesh said is that there are multiple ways to compute the result strides. E.g., the middle dim of memref<20x10x5xf32, offset: 0, strides: [50, 5, 1]> could be expanded into:

  • memref<20x2x5x5xf32, offset: 0, strides: [50, 25, 5, 1]>
  • memref<20x2x5x5xf32, offset: 0, strides: [50, 5, 25, 1]>

Is this correct? I currently choose the first one.

When calling the builder that infers the layout map (which we need for bufferization), we have to make a decision. This decision is not currently documented. When it comes to the verifier, we could generalize it to allow both of the above options. At the moment only the first one is allowed.

springerm updated this revision to Diff 419066.Mar 30 2022, 1:20 AM

address comments

springerm marked an inline comment as done.Mar 30 2022, 1:23 AM
springerm added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1817

Added a helper function computeNextStride that contains the branching logic and is reused during the CollapseShapeOp computation.

springerm updated this revision to Diff 419069.Mar 30 2022, 1:37 AM
springerm marked an inline comment as done.

simplify code

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

I see, we can reduce it to just 10xf32 with stride = [1] and split that into 2x5xf32.
The data looks like this: 0, 1 ... 9.

Now if we reshape to 2x5 with stride = [5, 1], the data is:

0, 1, 2, 3, 4,
5, 6, 7, 8, 9

An (i, j) iterator with 0 <= i < 2, 0 <= j < 5 would access the data in the expected order.
In particular, (i, j) = (1, 3) accesses 8 = [1, 3] * [5, 1].

However, if we reshape to 2x5 with stride = [2, 1], the data is:

0, 1, 
2, 3, 
4, 5, 
6, 7, 
8, 9

This is incorrect because only a subset of the data would be accessed and some elements would be accessed twice.
I.e., when size > stride_increment along a given dimension, the memref has internal aliasing.
In particular, (i, j) = (1, 3) accesses 5 = [1, 3] * [2, 1].

So this internal permutation of strides is generally not correct, and one should just use memref.transpose, which does the right thing.
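The aliasing argument above is easy to check mechanically. `touchedOffsets` below is a hypothetical helper (not MLIR code) that enumerates the linear offsets a 2x5 view touches under given strides: with strides [5, 1] the ten accesses hit ten distinct offsets covering 0..9, while with strides [2, 1] they hit only seven distinct offsets, so some elements alias and some are never reached:

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Enumerate the distinct linear offsets i*stride0 + j*stride1 touched by a
// 2x5 view. Fewer than 10 distinct offsets means internal aliasing.
std::set<int64_t> touchedOffsets(int64_t stride0, int64_t stride1) {
  std::set<int64_t> offsets;
  for (int64_t i = 0; i < 2; ++i)
    for (int64_t j = 0; j < 5; ++j)
      offsets.insert(i * stride0 + j * stride1);
  return offsets;
}
```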

springerm marked an inline comment as done.Mar 30 2022, 10:51 AM
springerm added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1908

This can't be used for the tensor op verification yet because it's still in MemRefOps.cpp, but I extracted it into a separate function that will be used in the CollapseShapeOp verification in the next commit.

mravishankar accepted this revision.Mar 30 2022, 12:38 PM

Removing my blocker because I was mistaken. Please add tests for non-row-major striding.

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

Actually, my bad. I missed that the code does not make assumptions about the stride layout ordering. For an expanded dimension, the stride increasing from right to left across the expanded dims is fine. I was looking for tests and didn't find any. Please do add a test as Nicolas suggested.

nicolasvasilache accepted this revision.Mar 30 2022, 2:23 PM
nicolasvasilache added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1841

Ok, I see this is now the same logic as I had written down, so we can go ahead.
I still want to clean up and unify the saturated arith path for consistency; I can do that myself as a follow-up.

This revision is now accepted and ready to land.Mar 30 2022, 2:23 PM
springerm marked an inline comment as done.Mar 30 2022, 2:46 PM
springerm added inline comments.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

It's not clear to me yet what additional test I should add. Expansions into memrefs such as memref<20x2x5x5xf32, offset: 0, strides: [50, 5, 25, 1]> will correctly fail verification. Do we need to extend this?

springerm added inline comments.Mar 31 2022, 12:30 AM
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

I just did another experiment. This implementation only works if the inner dimensions are the fastest-varying ones when it comes to strides. Otherwise, we can get faulty stride computations like this one:

%r6 = memref.expand_shape %arg6 [[0], [1], [2, 3], [4]] :
    memref<20x2x10x5xf32, offset: 0, strides: [100, 10, 50, 1]>
    into memref<20x2x2x5x5xf32, offset : 0, strides : [100, 10, 250, 50, 1]>

This op passes the verifier. (All other result types will be rejected.) I think we should just limit the verifier/op to source memrefs where the strides come in the same order as the dims. And then extend the op if we actually need to support other layout maps.

springerm added inline comments.Mar 31 2022, 12:56 AM
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

Added a check and expanded the op docs to disallow such memrefs. Can be extended in a follow-up revision if necessary.

springerm updated this revision to Diff 419355.Mar 31 2022, 1:04 AM

fix warning

This revision was landed with ongoing or failed builds.Mar 31 2022, 1:08 AM
This revision was automatically updated to reflect the committed changes.
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
1806

I don't see any faulty stride computation, but your source memref has internal aliasing. The restriction seems artificial and based on incorrect assumptions.

I addressed this in https://reviews.llvm.org/D122845.

1806

Ordering strides looks ad hoc and undesirable.