This is an archive of the discontinued LLVM Phabricator instance.

[mlir][shape] add outline-shape-computation pass
ClosedPublic

Authored by yaochengji on Aug 12 2022, 2:42 PM.

Details

Summary
  • add outline-shape-computation pass

Diff Detail

Event Timeline

yaochengji created this revision.Aug 12 2022, 2:42 PM
yaochengji requested review of this revision.Aug 12 2022, 2:42 PM
yaochengji edited the summary of this revision. (Show Details)Aug 12 2022, 2:44 PM
jpienaar edited the summary of this revision. (Show Details)Aug 19 2022, 1:31 PM
jpienaar added inline comments.Aug 19 2022, 1:44 PM
mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
15

No punctuation at the end (https://mlir.llvm.org/docs/OpDefinitions/#operation-documentation)

Could you also add a description of the pass?

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
52
60

\p notation isn't generally used here. shape is more common.

64

This looks like an assert? (assert(!cluster.empty() && ...))

70

If you use ValueRange, you can do getTypes() and can also skip creating the vector here.
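
A minimal sketch of the suggestion; the helper name and surrounding code are illustrative assumptions, not code from the patch:

#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/TypeRange.h"
#include "mlir/IR/ValueRange.h"

// Sketch: derive the outlined function's type straight from the live-in
// values; ValueRange::getTypes() is a lazy view over the operand types, so
// no SmallVector<Type> has to be materialized.
static mlir::FunctionType getShapeFuncType(mlir::MLIRContext *ctx,
                                           mlir::ValueRange inputs,
                                           mlir::TypeRange results) {
  return mlir::FunctionType::get(ctx, inputs.getTypes(), results);
}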

76

Is there a better location that could be given? Perhaps a NameLoc related to the function.

83

Same re single instruction {}

94

typo: inconvenient

95

Could you expand here on how this orders instructions?

124

Do symbol names need to be unique at Module level? Can the symbol table behavior of Module be used to handle this automatically for you?

136

Let's use some constant rather than -1 (-1 is also used in ShapedType::kDynamic, and if the goal is to return the equivalent of a dynamic dim, then that constant could be used here)

319

Please add newline at end

mlir/test/Dialect/Shape/outline-shape-computation.mlir
4

Could you add a description of what is being tested?

yaochengji marked 8 inline comments as done.Aug 22 2022, 11:49 AM

Note that this diff should be finalized after https://reviews.llvm.org/D131977 is merged.

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
52

It is fixed, as are the other corresponding places.

76

Here I use the loc of the 'value' of shape.with_shape, do you think it's OK?

95

Added some explanation, hope it helps.

124
  1. It does, if the arguments of the shape computation function come from outside the current original function.
  2. The symbol table cannot be used, since it only handles operations with the Symbol trait, while here it is a symbol attribute.
136

Used llvm::Optional instead.
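
A small sketch of the direction, using a hypothetical helper rather than the actual function from the patch:

#include "llvm/ADT/Optional.h"
#include <cstdint>

// Instead of returning -1 as a sentinel for "not found", return an empty
// Optional so callers cannot confuse the sentinel with ShapedType::kDynamic.
static llvm::Optional<int64_t> findDimIndex(bool found, int64_t index) {
  if (!found)
    return llvm::None;
  return index;
}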

mlir/test/Dialect/Shape/outline-shape-computation.mlir
4

Added a concise description here, as well as more tests.

The detailed description can be found in Passes.td.

yaochengji marked an inline comment as done.

rebase on new main branch

Nice, thanks

mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
17

A bit more context would be good. E.g., do we start with a function with reified shape computations? At what level is this expected to run? And then, beyond that, why one would do this (this is more detail than the other passes here, but I think the others are simpler/more obvious :-))

17

Nit: space before ( (here and below)

20

What happens if the RankedTensorType already has an encoding? I'm guessing there are no expectations here if another pass after this one also changes the encoding.

22

Is there scoping for the symbol? E.g., is this a global value, or per function definition/call site?

24

And the arguments are symbols again?

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
76

Yes, that will help in error reporting. If this turns out to not be sufficient we can look at FusedLoc (but that'll be very large in some cases)
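
For illustration, attaching a NameLoc could look roughly like this; the helper name and signature are assumptions, not the landed code:

#include "mlir/IR/Builders.h"
#include "mlir/IR/Location.h"
#include "llvm/ADT/StringRef.h"

// Wrap the original location with the outlined function's name so error
// messages point back to a recognizable symbol.
static mlir::Location getShapeFuncLoc(mlir::OpBuilder &b,
                                      llvm::StringRef funcName,
                                      mlir::Location baseLoc) {
  return mlir::NameLoc::get(b.getStringAttr(funcName), baseLoc);
}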

89

Could you document the return value and expected usage? (DenseMap has non-deterministic iteration order I believe, so it shouldn't be iterated upon unless the result of the iteration doesn't affect the produced IR)
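
To illustrate the concern, a minimal sketch of one way to keep output deterministic when a map's contents do end up ordered in the IR or in debug output (illustrative only, not from the patch):

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

// DenseMap iteration order is unspecified, so if the order could leak into
// the produced output, collect and sort the keys first and iterate those.
static llvm::SmallVector<int> sortedKeys(const llvm::DenseMap<int, int> &map) {
  llvm::SmallVector<int> keys;
  for (const auto &entry : map)
    keys.push_back(entry.first);
  llvm::sort(keys);
  return keys;
}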

244

And does this affect the main function, or only inside the outlined shape function(s)?

369

What does cal mean here? (Perhaps val/value?)

yaochengji added inline comments.
mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
17

Done.

17

Added more details, including

  1. usually there's shape reification before this pass
  2. which level
  3. why this pass is needed

I'm not a good doc writer, feel free to modify it directly :)

20

It is not handled for now, and I added a TODO here.

22

It is global, considering a symbol might be referred to across functions.

24

Yes. Symbols are added in the Type.

This is because Value and Operation are volatile, while Type is relatively stable.

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
244

This pattern is applied on the main function before the shape-outlining logic.

369

cal is short for calculate :)

Nice, and sorry for the delays; as mentioned I'm traveling a bit, but luckily it's not a blocker.

mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
20

The pass should not overwrite the type in that case / should exit when these are encountered.

That means this pass can't be used with sparsity today, and shortly not with MHLO & StableHLO either (I think today it already can't be, but that is only partially rolled out). That's undesirable. Instead we can get more use if this pass is split in two: have the outlining of functions & the populating of an Analysis in this pass (for the Analysis a key of Value and index should be sufficient; extending that to subtypes with nested shapes would be a natural extension later), and the pass (potentially downstream) that follows it would serialize the Analysis into the type and perform type propagation/joins across call sites (the follow-up pass could also mark the Analysis as retained).

This would allow this outlining and analysis to be used in more general contexts (incl. MHLO ones, torch-mlir with its custom tensor types, and I believe the MemRef/Vector type cases), which would be great.
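
A rough sketch of the kind of Analysis being proposed here; the type and field names are guesses for illustration, not necessarily the final API:

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Operation.h"
#include "mlir/IR/Value.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"

// For each dynamically shaped Value, record which outlined shape function
// computes its shape and which values to pass as arguments.
struct ShapeMappingValue {
  mlir::FlatSymbolRefAttr funcSymbol;
  llvm::SmallVector<mlir::Value> inputs;
};

// MLIR analyses are constructed from the operation they are attached to;
// this one is simply a container the outlining pass fills in.
struct ShapeMappingAnalysis {
  ShapeMappingAnalysis(mlir::Operation *op) : operation(op) {}
  mlir::Operation *operation;
  llvm::DenseMap<mlir::Value, ShapeMappingValue> shapeMapping;
};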

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
369

Ha :-)

mlir/test/Dialect/Shape/outline-shape-computation.mlir
2

Could you add a test case where the shape computation shares values with a value being computed? (E.g., shape computation interleaved with and feeding into other computations/returned)

Could you add a test where you have multiple reused shape computations? Say you have two dims, concat, associate, concat dims while repeating the last dim 1x, associate, concat dims while repeating the last dim 2x, associate, etc., and return all associated values, as if you had unrolled a loop with a fill inside. This one would be interesting in both unsimplified and CSE'd form.

63

You may need a CHECK-NOT to verify only one is generated.

yaochengji added inline comments.Aug 30 2022, 11:38 PM
mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
20
  1. To confirm, I'm using two dynamic shapes: a trivial symbol and a shape.func test as an example.

After the pass, the IR will be like this? The difference from the current one is that all the types remain unchanged, and there will be a retained Analysis used to map the Values to their shape computation functions. In this case, the table will be {%0 : %arg0, %1 : [@shape_cal_0, %arg0]} ?

func.func @main(%arg0: tensor<?x4x?xf32>, %arg1: tensor<2x4x?xf32>) -> tensor<?x4x?xf32> {
    %0 = "test.abs"(%arg0) : (tensor<?x4x?xf32>) -> tensor<?x4x?xf32>
    %1 = "test.concat"(%0, %arg1) {axis = 0 : i64} : (tensor<?x4x?xf32>, tensor<2x4x?xf32>) -> tensor<?x4x?xf32>
    return %1 : tensor<?x4x?xf32>
}

shape.func private @shape_cal_0
    ...

If so (specifically, the mapping table uses Value as a key), it will conflict with our use cases. The original purpose of the pass is to preserve shape computation info while not letting the computation part badly affect lowering or optimization passes (e.g. the shape computation part would add more users to the original Value). Based on these two preconditions, it is quite possible that %0 and %1 will be replaced by new Values after some passes, which invalidates the mapping table of the above Analysis.

  2. I'm also interested in why the type cannot be overwritten with sparsity/mhlo/stablehlo; could you give me more details, a link to a doc, or code? We also use the encoding attr of RankedTensorType to represent bounded shape info. Maybe I need to persuade my colleagues to think of a better way.

Sorry fell off review queue while I was out.

mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
20
  1. Yes, so that's why in this pass you populate an Analysis, and the pass just after this one encodes it in the type in some way. Encoding it in the type here would limit utility, but if it is useful for your flow then the pass just after (and both have to be module passes here, whether it is one pass or two, as you are mutating types across call sites) would capture it in the type. That way you do get the behavior supporting your original use case, but also have this as a general utility pass that can be used where folks have different assumptions. E.g., you _still_ encode in the Type, just in the pass that runs directly after this one, which is more specialized for a deployment route (see 2 below).
  2. Sure, so if you and the sparse compiler folks both use the encoding attribute, what attribute is stored there? Sparse uses mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td while MHLO & StableHLO use (or use some variant of) https://github.com/tensorflow/mlir-hlo/blob/8a3ec6e5187e94a6c4f91d67a69545b9a1075d3e/include/mlir-hlo/Dialect/mhlo/IR/hlo_ops_attrs.td#L171. Now what you could do here is create a new Attribute that contains both of these or implements both sets of classes (and can be cast to both), and then you can still encode in the Type. BUT you can't do that here, as you don't have access to the downstream Attribute type. Which is why, if you make this two passes, you could have this pass here that does 99% of the work and then another pass that does know about the MHLO/StableHLO attributes and can produce that new Attribute. For sparse you could do the same, but it gets a bit messy to just accumulate all the different encodings into a new mega encoding, so I'd consider: where do you need what? Potentially where you do the outlining you could just nest sparsity inside your attribute and then, past where you utilize it in codegen, flip to the nested sparse attribute and let the sparse codegen do its thing (or vice versa). Dropping info once you've utilized it is fine; doing so needlessly would be bad.

An alternative would be to encode it via an op. This is what some of the shape assuming ops do (and this would be a good candidate to add one that has a function reference, to make it even simpler). It's also what IREE does on the flow dialect side. The main downside to ops is that they obscure use-def chains. So doing it such that it gets out of the way of what you do want to retain or optimize is good (of course you could also have passes that knowingly optimize across such metadata ops). I think the op route will be better long term. It'll be cheaper and a bit more flexible/robust. But I don't think an ergonomic version of it exists at the moment.

yaochengji edited the summary of this revision. (Show Details)

populate shape mapping information into an Analysis instead and add two required tests.

jpienaar accepted this revision.Sep 29 2022, 11:15 AM

I think this is a good place to iterate from! It can be refined. Thanks for making this into two. A couple of changes, but nothing major. I'll help with landing afterwards.

[seems I didn't send my last reply of "Sorry fell off review queue while I was out.", I get confused with all the review systems when which one sends what comments]

mlir/include/mlir/Dialect/Shape/Analysis/ShapeMappingAnalysis.h
24

Could you document these?

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
179

The naming convention here doesn't use a _ suffix for class members.

241

This would be wrong here I believe, as you are mutating the module structure, and the folding above could change behavior too. You don't need to do this one here if the goal is to avoid invalidation before use; that you can control in your pipeline for this case. Unless there is another analysis that existed before here that you wish to retain.

mlir/test/Dialect/Shape/outline-shape-computation.mlir
8

Unfortunately I think DAG will allow this block arg matching to be interchanged. E.g., even if you had the result from shape_cal_1 swapped with this one, it would not complain. Unfortunately we don't have a DAG-NEXT thing.

Is DAG here for the blocks of the function? (E.g., could lines 5-8 be switched with 10-13 conceptually?) In your pass I think the ordering is fixed: the order in which the functions are emitted is not load-bearing, but it is deterministically fixed. I just want to avoid the case where the result is arbitrarily mangled. Another alternative is to sort the shape functions in the print function; given it's only for debugging, the extra work/cost is OK, and then here you don't need DAG.

This revision is now accepted and ready to land.Sep 29 2022, 11:15 AM

[seems I didn't send my last reply of "Sorry fell off review queue while I was out.", I get confused with all the review systems when which one sends what comments]


I saw this and it's fine. I didn't actively ping you because I was also quite busy previously.

yaochengji added inline comments.Sep 30 2022, 12:01 AM
mlir/include/mlir/Dialect/Shape/Analysis/ShapeMappingAnalysis.h
24

Done.

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
179

Done.

241

Deleted. I cannot find an example of controlling an analysis in a pass pipeline. Could you point it out for me if possible? And since the analysis is currently not preserved, its output is not tested in the lit test for now.

mlir/test/Dialect/Shape/outline-shape-computation.mlir
8

Lines 5-8 could be switched with 10-13, because it's a map and the order of the keys is undetermined when iterating.

BTW, the order of the created shape.funcs is deterministic.

format code

This revision was landed with ongoing or failed builds.Oct 2 2022, 8:25 PM
This revision was automatically updated to reflect the committed changes.

Did a scan and submitted with small changes.

mlir/lib/Dialect/Shape/Transforms/OutlineShapeComputation.cpp
61

Note: this returns a reference to an on-stack object. Fixed it.

241

I couldn't find a good reference; that needs to be fixed (Analyses themselves need a bit of TLC). And you were correct: one has to mark the Analysis as being retained here, as this is not a pure Analysis. Added back the check and marked only this Analysis as retained.

I think this can still be improved, but I don't want to block on that now as it's not directly related to this change.
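
For reference, retaining a single analysis at the end of runOnOperation() looks roughly like this; a sketch with an abbreviated pass class, not the actual pass definition:

#include "mlir/Dialect/Shape/Analysis/ShapeMappingAnalysis.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Pass/Pass.h"

namespace {
// Sketch only: the module structure is mutated, so every analysis gets
// invalidated except the one explicitly marked as preserved below.
struct OutlineShapeComputationSketch
    : public mlir::PassWrapper<OutlineShapeComputationSketch,
                               mlir::OperationPass<mlir::ModuleOp>> {
  MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(OutlineShapeComputationSketch)

  void runOnOperation() override {
    // ... outline the shape computations, populate ShapeMappingAnalysis ...
    markAnalysesPreserved<mlir::shape::ShapeMappingAnalysis>();
  }
};
} // namespace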

mlir/test/Dialect/Shape/outline-shape-computation.mlir
8

Yes, but CHECK-DAG will allow interleaving at the line level. So

Shape function symbol: @shape_cal_1
Shape function symbol: @shape_cal_0
Shape function arguments:
...

would pass the test, and so in particular here it would be unable to detect if you got the analysis wrong and switched the results of 1 with those of 0. Here, the block arguments being the same would result in not finding this.

Updated to make this more concise and have DAG matching work without allowing the above accidental passing case. Can revise later.

csigg added a subscriber: csigg.Oct 3 2022, 6:50 AM

FYI, I pushed 5faebb5, a fix for Windows, for the test added in this change.