This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp.
ClosedPublic

Authored by okkwon on Feb 11 2022, 4:21 PM.

Details

Summary

Fold ExtractSliceOp when the source is a constant.

Diff Detail

Event Timeline

okkwon created this revision.Feb 11 2022, 4:21 PM
okkwon requested review of this revision.Feb 11 2022, 4:21 PM
rriddle added inline comments.Feb 11 2022, 4:26 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1364

Drop trivial braces here.

1366

Drop trivial braces here.

1377

Can you use longer names here? I and E aren't clear to me here.

1409

Drop the llvm:: here. Also do we need the hard coded 6?

1418

Please use m_Constant instead of hardcoding constant operations.

1422–1435

Can you format here?

okkwon updated this revision to Diff 408125.EditedFeb 11 2022, 5:23 PM

Addressed the comments.

okkwon updated this revision to Diff 408131.Feb 11 2022, 6:24 PM
okkwon marked 2 inline comments as done.

[mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp. │

Fold ExtractSliceOp when the source is a constant.

okkwon updated this revision to Diff 408192.Feb 12 2022, 10:13 AM
okkwon marked 4 inline comments as done.

apply clang-format

mravishankar requested changes to this revision.Feb 12 2022, 10:20 PM

Actually, I was under the impression that you were handling the splat case first before moving on to the non-splat case. In reality, we probably do not want to do such a transformation for the non-splat case. Typically these constants are very large (several MBs). This would end up reading these values element by element in the compiler. That leads to a huge increase in compilation time. Still there is a value of doing that as a preprocessing step within the compiler to (a) reduce runtime costs, and (b) also reduce the binary size of the compiled model.
So one way this has been done in IREE is to extract out such computations as a pre-processing step, compile it, and run it using IREE, and use the result of the run as an input to the rest of the computation.
Long story short, we need to do such transformations only for either splat constants, or "small constants".

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1394

Probably better to move this out of here and check much earlier. Then you can make the function just take the offsets, strides, and sizes directly as arguments.

1426

I have a broader comment on the constants that need to be folded this way on the patch itself.
In addition to that, I'd consider doing this only if the constant op has a single use. If the whole constant is used anyway, then taking slice of it will eventually just be doing a pointer offset. The value of this transformation is that if the constant has a single use, then you could just discard the unused values.

1431

Nit: Instead of templating on the iterator type, can we template on the Attribute type directly , i.e. DenseIntElementsAttr/DenseFPElementsAttr?

This revision now requires changes to proceed.Feb 12 2022, 10:20 PM
okkwon updated this revision to Diff 408586.Feb 14 2022, 1:07 PM
okkwon marked an inline comment as done.

Addressed Mahesh's comment by folding the single use case.

okkwon marked 2 inline comments as done.Feb 14 2022, 1:11 PM
okkwon added inline comments.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1394

This will make the code in foldConstant() lengthy so I would keep this as is.

1426

This is done. Thank you for your thoughtful comments!

1431

I tried to use the Attribute only, but there is a difficulty to get the underlying concrete element type solely from the attribute since it is an implementation detail. Let's keep it as is.

mravishankar requested changes to this revision.Feb 14 2022, 1:19 PM
mravishankar added inline comments.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1350–1351

You also need to have a size control here to avoid compilation time explosion.

1350–1351

Related to the comment below, if this method took the offsets, size and strides directly, say

static Attribute sliceDenseElementsAttr(Value constant, ArrayRef<int64_t> offsets, ArrayRef<int64_t> sizes, ArrayRef<int64_t> strides) {
  ...
}

then the logic here is agnostic to the fact that these came from a tensor.extract_slice. That separation of concerns seems cleaner to me.

1394

See comment about. Doing this processing in the matchAndRewrite method will separate the actual slice handling of the constant and getting the values of offsets, sizes and strides from the extract_slice op. You could just check for the fold conditions and also retrieve the values directly in the ExtractSliceOp::fold(...) method makes it clear when the folding is done.

This revision now requires changes to proceed.Feb 14 2022, 1:19 PM
okkwon updated this revision to Diff 408592.Feb 14 2022, 1:23 PM
okkwon marked 2 inline comments as done.

For some unknown reason, arg diff --update only pushed one commit instead of two.
Now the change for the single use case is correctly updated.

okkwon updated this revision to Diff 408602.Feb 14 2022, 1:47 PM

Control the size of constant folding. When the size > 1024 (hard-coded now),
the constant folding won't be applied.

Can you re-upload with the full diff?

okkwon updated this revision to Diff 408639.Feb 14 2022, 3:11 PM

hoist various checks to fold() to clean up the code

rriddle added inline comments.Feb 14 2022, 3:16 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1351

When is the result not a ShapedType?

1355–1356
1356–1357
okkwon updated this revision to Diff 408643.Feb 14 2022, 3:23 PM
okkwon marked an inline comment as done.

Updated to use cast to get ShapedType.

okkwon updated this revision to Diff 408644.Feb 14 2022, 3:26 PM
okkwon marked an inline comment as done.

reuse resultType

okkwon marked 4 inline comments as done.Feb 14 2022, 3:28 PM
okkwon added inline comments.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1350–1351

Hoisted the offsets, sizes, and strides calculation to fold(). Now the fold function directly calls sliceElements().

1351

I will use cast. The type should always be ShapedType. Thanks!

okkwon updated this revision to Diff 408646.Feb 14 2022, 3:30 PM
okkwon marked an inline comment as done.

Resolve the patch error

okkwon updated this revision to Diff 408651.Feb 14 2022, 3:40 PM

rebase onto main

okkwon updated this revision to Diff 408654.Feb 14 2022, 3:45 PM

Fix a missing change for IterTy and ElemTy

mravishankar requested changes to this revision.Feb 14 2022, 4:46 PM

Looks good overall. Just a nit, and maybe some more lit tests?

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1401

Nit : s/Sice/Since

1403

A better TODO is "If the size of the constant folded is to be controlled, move this out of folding and make it a separate pattern which can accept an option to control the size"

mlir/test/Dialect/Tensor/canonicalize.mlir
394 ↗(On Diff #408654)

Sorry should have mentioned earlier. Might be worth adding more tests, like having a larger constant and taking an interior out of it, etc.

This revision now requires changes to proceed.Feb 14 2022, 4:46 PM
rriddle added inline comments.Feb 14 2022, 5:20 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1363

Spell out auto here.

1384

attr is guaranteed to be non-null here. Also, you can use SplatElementsAttr in place of DenseElementsAttr above and drop this call to isSplat.

1389

Why go to the defining op instead of op.source().hasOneUse()?

1408

nit: Spell out auto here.

1425
1427

Spell out auto here.

mehdi_amini added inline comments.Feb 14 2022, 5:43 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

I'm not thrilled by this heuristic, I don't think it the right thing to handle this kind of thing in such an ad-hoc way here.

In particular because if there are two uses for this constant and both of them can be folded but both of them are guarded the same way, then we don't fold anything...

1403

Right, but a corollary of the TODO is also that we shouldn't have an ad-hoc threshold like this here, because why this operation and no other? How is the threshold suitable in general?

1426

This is unusual for the folder to look

okkwon added inline comments.Feb 14 2022, 6:55 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

Thanks for the comment! Even though the pattern caught by the current code looks very limited, it is still incremental enhancement. I am going to add more patterns. I am also very interested in seeing what is the most common pattern regarding the constant folding for the real applications.

mehdi_amini added inline comments.Feb 14 2022, 8:20 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

I commenting on the guard itself, I'm fine with adding the pattern, I object to the way it is guarded though: both by this "hasOneUse" and the arbitrary "kConstantFoldingMaxNumElements".

okkwon updated this revision to Diff 409085.Feb 15 2022, 3:59 PM

add two more unit tests

okkwon marked 8 inline comments as done.Feb 15 2022, 4:04 PM

I have added two more unit tests.

I will revisit the change to address Mehdi's comments. Thanks River, Mehdi, and Mahesh for the comments!

mravishankar requested changes to this revision.Feb 15 2022, 9:16 PM

I see that you changed the condition to check for splat constants, but the same handling does not apply for splat constants. There is no need to take slices, etc (actually that might even be wrong, not sure)

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

There is a difference between splat constants and non-splat constants. The reasoning for non-splat constants is, say you have a constant of type tensor<4x2xf32>. If there are multiple uses of this constant (potentially accessing different slices of the constant, or the whole constant), you will need to store the constant values in the binary somewhere. Having a new constant of smaller shape is just adding to binary size without any value. You might as well leave the tensor.extract_slice as is. Eventually it will just become a memref.subview followed by load. So when there are multiple uses it makes sense to leave the constant as is. if there is a single use, it means that the values outside of the slice are dead and you can just remove those values. It saves binary size. I agree with you about the kConstantFoldingMaxNumElements is fairly arbitrary.

The same reasoning does not hold for splat constants though. So now that this change is targeting splat constants, i'd also suggest dropping both these heuristics.

1393

IIUC, constants cannot have a dynamic shape.

1403

Replied above with more context on this.

1426

I am not sure I understand the comment here.

1435

All of this is not necessary for a splat constant. There is only a single value. You just need to change the type of the constant as dictated by the offsets, sizes and strides.

This revision now requires changes to proceed.Feb 15 2022, 9:16 PM
mehdi_amini added inline comments.Feb 15 2022, 9:32 PM
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

So now that this change is targeting splat constants

Right now the code is targeting only non-splat unless I am mis-reading?

Right. It does not handle the splat case yet.

My bad. Saw the patch too late in the night for me and misread the conditions. Will sync with Okwan to figure out next steps.

okkwon updated this revision to Diff 411432.Feb 25 2022, 9:11 AM

Add control function support

okkwon marked 7 inline comments as done.Feb 25 2022, 9:16 AM

The code is updated with the control function approach. Please take a look.

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1389

I used the control function approach to make the decision user-controllable.

1393

I believe this comment is outdated and the code shows a

mravishankar accepted this revision.Feb 25 2022, 10:09 AM

Looks good to me.

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1215

Nit: Combine this and next checks into one

if (!sourceType.hasStaticShape() && !resultType.hasStaticShape()) return failure();
1227

I am not sure we really need this case?

mlir/test/lib/Dialect/Tensor/TestTensorTransforms.cpp
67

This will assert if result type is not static. In this case the constant verification enforces that the result of a constant is statically shaped. So maybe its fine, but FYI.

This revision is now accepted and ready to land.Feb 25 2022, 10:09 AM
okkwon updated this revision to Diff 411476.Feb 25 2022, 11:28 AM
okkwon marked 6 inline comments as done.

Address Mahesh's comments

okkwon marked an inline comment as done.Feb 25 2022, 11:30 AM
okkwon added inline comments.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1215

nit: the logic should be ||. Updated.

1227

Are you referring to if (count == 0)? shape itself is used below. I will move it down.

mlir/test/lib/Dialect/Tensor/TestTensorTransforms.cpp
67

Yeah, the control function is just another control over whether the folding should happen or not.
The sanity check for the folding itself is done in the folding function.

bondhugula requested changes to this revision.Feb 26 2022, 2:45 AM
bondhugula added a subscriber: bondhugula.
bondhugula added inline comments.
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
12

Please use forward declaration instead of including headers: https://llvm.org/docs/CodingStandards.html#include-as-little-as-possible

32

Is it disabled -> Disable ...

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
19–22

Trim headers -- you don't need all of these.

1167

Missing doc comment.

1169–1193

Code comments are completely missing.

1250–1262

Likewise.

1273

Doc comment.

mlir/test/Dialect/Tensor/fold-constant-extractslice.mlir
1

Nit: extractslice -> extract-slice

This revision now requires changes to proceed.Feb 26 2022, 2:45 AM
okkwon marked 8 inline comments as done.Feb 26 2022, 1:51 PM

Thanks Uday for the comments. Please take another look.

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
12

Unfortunately, ExtractSliceOp is used in the default value and it needs its concrete type, so it is not avoidable.

mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1250–1262

Please elaborate. I put a simple comment over newAttr.

okkwon updated this revision to Diff 411642.Feb 26 2022, 1:51 PM
okkwon marked an inline comment as done.

Addressed Uday's comments.

bondhugula added inline comments.
mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
1254

All comments are to be terminated with a full stop.

This revision is now accepted and ready to land.Feb 27 2022, 9:20 AM
okkwon closed this revision.Feb 28 2022, 9:48 AM

Merged to origin/main.

Try to keep the last line of the commit message with Differential Revision: https://reviews.llvm.org/D119605 ; not only will this close the revision, but also it'll allow folks in the future to find the discussion in the review from a git log.

(also the title of the revision / commit could have been updated since this evolved into not hooking up into the fold of the operations)

okkwon reopened this revision.EditedFeb 28 2022, 11:20 AM

Reopening: The push to origin/main caused a build break due to a cyclic header inclusion between Tensor.h and Transforms.h. The issue didn't occur in my local machine, but it broke the testing system.

This revision is now accepted and ready to land.Feb 28 2022, 11:20 AM
okkwon updated this revision to Diff 411883.Feb 28 2022, 1:43 PM

Fix the bazel build error and rewrite the commit message.

This revision was landed with ongoing or failed builds.Feb 28 2022, 3:09 PM
This revision was automatically updated to reflect the committed changes.