This is an archive of the discontinued LLVM Phabricator instance.

[mlir][tensor] Add transform to make tensor.pad/empty loop-independent
ClosedPublic

Authored by springerm on Feb 13 2023, 6:21 AM.

Details

Summary

Add a transform to make tensor.pad and tensor.empty ops independent of SCF loop IVs. Such ops can then be hoisted.

E.g.:

scf.for %iv = %lb to %ub step %step {
  %high = affine.apply affine_map<(d0)[s0] -> (s0 - d0)> (%i)[%ub]
  %p = tensor.pad %t low[5] high[%high] ...
  ...
}

Is transformed to:

%high_new = affine.apply affine_map<()[s0, s1] -> (-s0 + s1)> ()[%lb, %ub]
%p_hoistable = tensor.pad %t low[5] high[%high_new]
%dim = tensor.dim %t, %c0
%size = affine.apply affine_map<(d0)[s0, s1] -> (-d0 + s0 + s1 + 5)>(%iv)[%ub, %dim]
%slice = tensor.extract_slice %p_hoistable [0] [%size] [1]

Depends On: D146524

Diff Detail

Event Timeline

springerm created this revision.Feb 13 2023, 6:21 AM
springerm requested review of this revision.Feb 13 2023, 6:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2023, 6:21 AM

@dcaballe The same infrastructure can be used for computing upper bounds for memref allocations.

add op documentation

mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h
17 ↗(On Diff #497222)

Isn't the best practice here to just forward-declare ?
I am unclear why you reverted to a mix of forward-declaration for OpBuilder and include for AffineMap/Value/ValueRange ?

mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt
7

This fails to build for me with:

CMake Error at /usr/local/google/home/ntv/github/llvm-project/mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt:6 (add_mlir_doc):
  add_mlir_doc Function invoked with incorrect arguments for function named:
  add_mlir_doc
mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
27

Please explain a little more what this entails because it involves increasing the tensor size/dimensionality.

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

The footprint in this example can be too high: you'd want to take the ceildiv by %step.

mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp
87 ↗(On Diff #497222)

Can we use step 1 instead here and mention that this is conservative but less precise in the context of this method?

springerm marked 4 inline comments as done.Feb 27 2023, 1:24 AM
springerm added inline comments.
mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h
17 ↗(On Diff #497222)

I thought I needed the definition so that it can be used in FailureOr<...>. But it actually works without.

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

I tried to do that but FlatAffineConstraints was unable to compute an upper bound in that case. It's unclear why this is happening, I'm looking into this. Maybe the system of inequalities is getting too complex (with various "semi-affine exprs"...). It looks like FlatAffineConstraints must be extended.

springerm updated this revision to Diff 500698.Feb 27 2023, 1:24 AM
springerm marked an inline comment as done.

address comments

springerm added inline comments.Feb 27 2023, 2:16 AM
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

This is indeed due to a shortcoming in FlatAffineValueConstraints:

// TODO: Whenever there are local variables in the dependence
// constraints, we'll conservatively over-approximate, since we don't
// always explicitly compute them above (in the while loop).

It is worth fixing this? Will probably take a while to understand and rewrite a 200 LOC function.

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

Yes, this is the same rationale as using step 1 below.

Can we update the doc to make this explicit?
I.e. divide by %step and mention that in the case of a symbol, we over-approximate with setting step to 1 to circumvent the limitation of the analysis.

mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp
86 ↗(On Diff #500698)

if we use step 1, this is not an optional anymore and there is a bit of simplification in your code down the line

mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
23

Can we rename this transform to transform.loop_privatize or something similar and have it take the op to privatize and the number of enclosing loops above which we want to hoist ?

This should be kept in sync and compose with the recently landed hoist_pad.
We can later evolve the syntax from num_loops to something better, informed by how we want these things to compose.

mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp
21 ↗(On Diff #500698)

The filename is misleading here, this is not performing hoisting but privatization that will later enable hoisting.
Can we move this functionality to a LoopPrivatization.cpp file ?

As a followup, we should integrate the usage of privatization its usage into HoistPadding.cpp.

We also now have SubsetHoisting.cpp for mechanical parts related to actual hoisting of loop-independent quantities that will also come in handy..

springerm marked 4 inline comments as done.Mar 1 2023, 1:58 AM
springerm added inline comments.
mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
23

This does not privatize the tensor/op though. It simply changes the size of the tensor.

Added the number of loops to the op.

mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp
21 ↗(On Diff #500698)

How about LoopTransforms.cpp? This transformation does not do any privatization. (Assuming "privatization" = "making a private copy of a tensor for each loop iteration".)

springerm updated this revision to Diff 501427.Mar 1 2023, 1:58 AM
springerm marked 2 inline comments as done.

address comments

springerm updated this revision to Diff 506957.Mar 21 2023, 6:36 AM

reimplement with ValueBoundsOpInterface

springerm retitled this revision from [mlir][tensor] Add transform to make tensor.pad loop-independent to [mlir][tensor] Add transform to make tensor.pad/empty loop-independent.Mar 21 2023, 6:36 AM
springerm edited the summary of this revision. (Show Details)

Does this update change the way this is supposed to interact with HoistPadding ?
Or in other words, how do you see this interacting with HoistPadding ?

Does this update change the way this is supposed to interact with HoistPadding ?
Or in other words, how do you see this interacting with HoistPadding ?

HoistPadding does not require this functionality, it just clones the entire loop nest. So no interaction with HoistPadding.

This revision is already a month old and was never landed, but is useful for Diego as an example how to make ops (memref.alloca in his example) hoistable. So I reimplemented it so that it takes advantage of ValueBoundsOpInterface.

Looking at the ValueBounds part only for now.

mlir/include/mlir/Dialect/Affine/Transforms/Transforms.h
91

I can't follow from the description... This is converting an Affine-based value bound into an Arith-based value bound?

mlir/include/mlir/Interfaces/ValueBoundsOpInterface.h
138

typos

Could you please elaborate a bit more on what "independent of the values in independencies" mean?

mlir/lib/Interfaces/ValueBoundsOpInterface.cpp
377

Something like this in the header doc is what I was hoping for :)

springerm updated this revision to Diff 515554.Apr 20 2023, 6:33 PM
springerm marked 4 inline comments as done.

address comments

springerm added inline comments.Apr 20 2023, 6:42 PM
mlir/include/mlir/Dialect/Affine/Transforms/Transforms.h
91

The name was confusing. I renamed the function and added some more documentation.

dcaballe accepted this revision.Apr 26 2023, 10:06 PM

Thanks!

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

What happened with this? Was it addressed?

mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp
67

nit: ub to var

mlir/lib/Dialect/Tensor/Transforms/IndependenceTransforms.cpp
80

nit: ub to var

This revision is now accepted and ready to land.Apr 26 2023, 10:06 PM
This revision was landed with ongoing or failed builds.Apr 27 2023, 7:47 PM
This revision was automatically updated to reflect the committed changes.
springerm marked 3 inline comments as done.
springerm added inline comments.Apr 27 2023, 7:52 PM
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
39

There is a TODO in SCF/IR/ValueBoundsOpInterfaceImpl.cpp. We can't do any better at the moment.