after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

# Details

# Diff Detail

- Repository
- rG LLVM Github Monorepo

### Event Timeline

Looking through this (it will take me a bit), but just adding a "request changes" till I finish my review.

mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp

- Lines 333–334: I am ok with dropping the `SmallVector<T> vec = llvm::to_vector<4>(...);`. AFAIK, this will create a vector with the …

Nice simplification! I'll let @mravishankar finish the review and provide approval; LGTM from eyeballing.

Mostly looks fine. A couple of minor comments about the use of `SmallVector<T>`.

The major concern, though, is the number of tests deleted. I think these should be adapted to use the `linalg.index` operations and verified to work as expected.

mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp

- Lines 569–570: Nit: could you change this to either …
- Line 585: See comment above. If I am not mistaken, this is going to be a …
- Line 1187: Flagging this as a potential …
- Lines 1216–1217: Looking at this, I need to revisit this one because it seems buggy. We need to check whether all the arguments "after" the dropped operand need to be remapped as well, since we now have a smaller list of arguments.

mlir/test/Dialect/Linalg/fusion-tensor.mlir

- Line 208: I would prefer not dropping these tests. Instead, change them to use the …

Thanks for the review! I think most points are clear, but I have a question regarding the tests, and I need to think about the block argument remapping myself.

mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp

- Lines 1216–1217: We can probably delete `entryBlock.getNumArguments() - genericOp.getNumShapedOperands()`, since this is always zero now that we have no index arguments! I will search for more such patterns tomorrow. Regarding the mapping, I am not sure; I believe the mapper will take care of remapping all the remaining arguments, but it is one thing I am never 100% sure about myself...

mlir/test/Dialect/Linalg/fusion-tensor.mlir

- Line 208: Are you referring to the two constant fusion tests here? I believe for the other tests there should be an index op counterpart. At the moment, the constant fusion tests do not make use of the indices, and there are already generic op constant fusion tests. But I could enhance the tests to, for example, add one of the indices to the fused constant?

Address comments and rebase:

- fix all instances of to_vector
- improve error handling
- fix operand index computation
- add constant tests again

mlir/test/Dialect/Linalg/fusion-tensor.mlir

- Line 208: Sorry, I just left a blanket comment here and was not specific enough. I see these tests are not really working on …
- Line 307: This is the test I am talking about. The producer here is using …
- Line 409: Same here. This needs to be kept as well.
- Line 509: This needs to be kept too. Any tests that you see using the …

mlir/test/Dialect/Linalg/fusion-tensor.mlir

- Line 208: Ah, now I understand... These tests have existed for some time already. I added them when I introduced index op support (https://reviews.llvm.org/D100479), meaning they unfortunately do not show up in the diff! But they are already there (the index op counterpart should always be right below the deleted indexed_generic test): generic_op_indexed_generic_op_fusion -> producer_indexed_consumer_fusion. Sorry, I should have mentioned that as a review comment.

mlir/test/Dialect/Linalg/fusion-tensor.mlir

- Line 208: Ah ok! I see these now. Thanks for the clarifications. Looking at these tests now, they seem to be only testing the case where the …

  ```mlir
  #map1 = affine_map<(d0) -> (d0)>
  #map2 = affine_map<(d0, d1) -> (d0, d1)>
  #map3 = affine_map<(d0, d1) -> (d1)>
  func (%arg0 : tensor<?xf32>, %arg1 : tensor<?x?xf32>) -> tensor<?x?xf32> {
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %d0 = memref.dim %arg0, %c0 : tensor<?xf32>
    %0 = linalg.init_tensor [%d0] : tensor<?xf32>
    %1 = linalg.generic {indexing_maps = [#map1, #map1],
                         iterator_types = ["parallel"]}
        ins(%arg0 : tensor<?xf32>) outs(%0 : tensor<?xf32>) {
      ^bb0(%arg2 : f32, %arg3 : f32):
        %2 = linalg.index 0 : index
        %3 = index_cast %2 : index to i32
        %4 = sitofp %3 : i32 to f32
        %5 = addf %arg2, %4 : f32
        linalg.yield %5 : f32
    } -> tensor<?xf32>
    %2 = memref.dim %arg1, %c0 : tensor<?x?xf32>
    %3 = memref.dim %arg1, %c1 : tensor<?x?xf32>
    %4 = linalg.init_tensor [%2, %3] : tensor<?x?xf32>
    %5 = linalg.generic {indexing_maps = [#map2, #map3, #map2],
                         iterator_types = ["parallel", "parallel"]}
        ins(%arg1, %1 : tensor<?x?xf32>, tensor<?xf32>)
        outs(%4 : tensor<?x?xf32>) {
      ^bb0(%arg2 : f32, %arg3 : f32, %arg4 : f32):
        %6 = addf %arg2, %arg3 : f32
        linalg.yield %6 : f32
    } -> tensor<?x?xf32>
    return %5 : tensor<?x?xf32>
  }
  ```

  If my understanding is right, the fused op should have … Not blocking on this, but I really hope you can just add this test.