
[mlir][linalg] Remove IndexedGenericOp support from FusionOnTensors...

Authored by gysit on May 10 2021, 5:19 AM.



after introducing the IndexedGenericOp to GenericOp canonicalization (

Diff Detail

Event Timeline

gysit created this revision. May 10 2021, 5:19 AM
gysit requested review of this revision. May 10 2021, 5:19 AM
gysit updated this revision to Diff 344027. May 10 2021, 5:43 AM

minor changes.

mravishankar requested changes to this revision. May 10 2021, 9:10 AM

Looking through this (will take me a bit), but just adding a "request change" until I finish my review.


I am ok with dropping the ", 4", but one place where it gets complicated is when you try something like

SmallVector<T> vec = llvm::to_vector<4>(...);

AFAIK, this will create a SmallVector<T, 4> via llvm::to_vector<4> and then, since the type differs from SmallVector<T>, perform a copy assignment instead of a move assignment. So something to watch out for.
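The underlying issue can be illustrated with a toy container (this is not LLVM's SmallVector; all names and the copy counter below are invented for illustration). Because the inline capacity is a template parameter, instantiations with different capacities are distinct types, and in this sketch converting between them goes through a copying constructor — which is the concern raised above; whether real llvm::to_vector actually copies depends on which SmallVector constructors are available.

```cpp
// Toy stand-in for an inline-capacity vector: the capacity N is part of
// the type, so Toy<int, 4> and Toy<int> (default N) are different types.
int crossTypeCopies = 0;

template <typename T, unsigned N = 8>
struct Toy {
  Toy() = default;
  Toy(const Toy &) = default;
  Toy(Toy &&) = default;
  // Only a copying cross-capacity constructor exists in this sketch,
  // mirroring the suspected copy when assigning to_vector<4>(...) to
  // a SmallVector<T> with a different default capacity.
  template <unsigned M>
  Toy(const Toy<T, M> &) { ++crossTypeCopies; }
};

Toy<int, 4> makeVec() { return Toy<int, 4>(); }

// Same type as the initializer: moved or elided, no converting copy.
int copiesForSameType() {
  crossTypeCopies = 0;
  Toy<int, 4> v = makeVec();
  (void)v;
  return crossTypeCopies;
}

// Different inline capacity: the converting copy constructor runs once.
int copiesForDefaultCapacity() {
  crossTypeCopies = 0;
  Toy<int> v = makeVec();
  (void)v;
  return crossTypeCopies;
}
```

The point is only that the type mismatch silently changes a move into a copy; keeping the explicit `4` (or matching the destination type) avoids it.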

This revision now requires changes to proceed. May 10 2021, 9:10 AM

Nice simplification! I'll let @mravishankar finish the review and provide approval; LGTM from eyeballing.

Mostly looks fine. A couple of minor comments about the use of SmallVector<T>.

The major concern, though, is the number of tests deleted. I think these should be adapted to use the linalg.index operations, and it should be verified that fusion works as expected.


Nit: could you change this to either return failure() (or better, return notifyMatchFailure())? There is better error reporting now than when this was written (and it hasn't been updated). This just adds noise to the logs.
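For illustration, a sketch of the suggestion (MLIR C++, not standalone-compilable; the op and the precondition shown are hypothetical, not from this patch):

```cpp
LogicalResult matchAndRewrite(GenericOp genericOp,
                              PatternRewriter &rewriter) const override {
  // Instead of a bare `return failure();`, attach a reason that shows up
  // in pattern-application debug output:
  if (!genericOp.hasTensorSemantics())
    return rewriter.notifyMatchFailure(genericOp, "expected tensor semantics");
  // ... rewrite logic ...
  return success();
}
```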


See comment above. If I am not mistaken, this is going to be a copy assignment. If so, I would rather leave the 4 in there. (Ideally, llvm::to_vector could be modified to handle the default case.)


Flagging this as potential copy assignment.


Looking at this, I need to revisit this one because it seems buggy. We need to check whether all the arguments "after" the dropped operand need to be remapped as well, since we now have a smaller list of arguments.


I would prefer not dropping these tests. Instead change them to use the index operation in the body and make sure that the generated op is as expected. The argument to the index operation should be verified.

gysit added a comment. May 10 2021, 9:44 AM

Thanks for the review! I think most points are clear but I have a question regarding the tests and I need to think about the block argument remapping myself.


We can probably delete entryBlock.getNumArguments() - genericOp.getNumShapedOperands() since this is always zero now that we have no index arguments! I will search for more such patterns tomorrow.

Regarding the mapping, I am not sure; I believe the mapper will take care of remapping all the remaining arguments. But it is one of those things I am never 100% sure about myself...


Are you referring to the two constant fusion tests here? I believe for the other tests there should be an index op counterpart.

At the moment, the constant fusion tests do not make use of the indices, and there are already generic op constant fusion tests. But I could enhance the tests to, for example, add one of the indices to the fused constant?

gysit updated this revision to Diff 344332. May 11 2021, 2:24 AM
gysit marked 5 inline comments as done.

Address comments and rebase:

  • fix all instances of to_vector
  • improve error handling
  • fix operand index computation
  • add constant tests again
gysit marked an inline comment as done. May 11 2021, 2:25 AM
gysit updated this revision to Diff 344400 (edited). May 11 2021, 7:43 AM

Adapt comment in include/mlir/Dialect/Linalg/Transforms/Transforms.h

mravishankar requested changes to this revision. May 11 2021, 11:19 AM
mravishankar added inline comments.

Sorry, I just left a blanket comment here and was not specific enough. I see these tests are not really working on indexed_generic operations. They are really generic operations (they don't use the index bb args). But there are tests below that do use them (see below).


This is the test I am talking about. The producer here is using index_cast %arg etc., i.e. actually using the index basic block arguments. These need to be kept so that the transformation of linalg.index can be verified.


Same here. This needs to be kept as well.


This needs to be kept too. Any tests that you see using the index bb args need to be kept to verify that the fusion is working as expected.

This revision now requires changes to proceed. May 11 2021, 11:19 AM
gysit marked 4 inline comments as done. May 11 2021, 11:53 AM
gysit added inline comments.

Ah, now I understand... These tests have already existed for some time. I added them when I introduced index op support, meaning they unfortunately do not show up in the diff! But they are already there (the index op counterpart should always be right below the deleted indexed_generic test):

generic_op_indexed_generic_op_fusion -> producer_indexed_consumer_fusion
indexed_generic_op_generic_op_fusion -> indexed_producer_consumer_fusion
indexed_producer_indexed_consumer_fusion -> indexed_generic_op_fusion

Sorry I should have mentioned that as a review comment.

gysit updated this revision to Diff 344702. May 12 2021, 12:20 AM
gysit marked an inline comment as done.

Remove wrongly added tests.

mravishankar accepted this revision. May 12 2021, 9:47 AM
mravishankar added inline comments.

Ah ok! I see these now. Thanks for the clarifications.

Looking at these tests now, they seem to only test the case where the linalg.index values are the same before and after fusion. It would be good to add one more test of this form:

#map1 = affine_map<(d0) -> (d0)>
#map2 = affine_map<(d0, d1) -> (d0, d1)>
#map3 = affine_map<(d0, d1) -> (d1)>
func @test(%arg0 : tensor<?xf32>, %arg1 : tensor<?x?xf32>) -> tensor<?x?xf32> {
  %c0 = constant 0 : index
  %c1 = constant 1 : index
  %d0 = memref.dim %arg0, %c0 : tensor<?xf32>
  %0 = linalg.init_tensor [%d0] : tensor<?xf32>
  %1 = linalg.generic
      {indexing_maps = [#map1, #map1],
       iterator_types = ["parallel"]}
      ins(%arg0 : tensor<?xf32>) outs(%0 : tensor<?xf32>) {
      ^bb0(%arg2 : f32, %arg3 : f32):
        %2 = linalg.index 0 : index
        %3 = index_cast %2 : index to i32
        %4 = sitofp %3 : i32 to f32
        %5 = addf %arg2, %4 : f32
        linalg.yield %5 : f32
      } -> tensor<?xf32>
  %2 = memref.dim %arg1, %c0 : tensor<?x?xf32>
  %3 = memref.dim %arg1, %c1 : tensor<?x?xf32>
  %4 = linalg.init_tensor [%2, %3] : tensor<?x?xf32>
  %5 = linalg.generic
      {indexing_maps = [#map2, #map3, #map2],
       iterator_types = ["parallel", "parallel"]}
      ins(%arg1, %1 : tensor<?x?xf32>, tensor<?xf32>)
      outs(%4 : tensor<?x?xf32>) {
      ^bb0(%arg2 : f32, %arg3 : f32, %arg4 : f32):
        %6 = addf %arg2, %arg3 : f32
        linalg.yield %6 : f32
      } -> tensor<?x?xf32>
  return %5 : tensor<?x?xf32>
}
If my understanding is right, the fused op should have linalg.index 1. I am sure it works, but this is more to make sure no future changes break it. I realize this test is not there already and it should have been; I only noticed the missing coverage now, when looking at all the tests.
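A hedged sketch of what the fused op could look like (SSA names and operand order are illustrative, not taken from actual pass output):

```mlir
// The producer's linalg.index 0 is composed with the consumer's map for the
// producer result, #map3 = (d0, d1) -> (d1), so it becomes linalg.index 1.
%fused = linalg.generic
    {indexing_maps = [#map2, #map3, #map2],
     iterator_types = ["parallel", "parallel"]}
    ins(%arg1, %arg0 : tensor<?x?xf32>, tensor<?xf32>)
    outs(%4 : tensor<?x?xf32>) {
    ^bb0(%a : f32, %b : f32, %c : f32):
      %idx = linalg.index 1 : index
      %cast = index_cast %idx : index to i32
      %fp = sitofp %cast : i32 to f32
      %sum = addf %b, %fp : f32
      %res = addf %a, %sum : f32
      linalg.yield %res : f32
    } -> tensor<?x?xf32>
```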

Not blocking on this, but really hope you can just add this test.

This revision is now accepted and ready to land. May 12 2021, 9:47 AM
gysit updated this revision to Diff 345114. May 13 2021, 6:08 AM

Added unit test.