Move the existing optimization for transfer op on tensor to folder and canonicalization. This handles the write after write case and read after write and also add write after read case.
Details
Diff Detail
Event Timeline
mlir/lib/Dialect/Vector/VectorOps.cpp | ||
---|---|---|
2525 | I'd suggest to rename foldRAW so it becomes very clear what type of folding this is and how it relates to the others. | |
2530 | I'd suggest to rename checkSameValueRAW (or a better name but with RAW in it) so it becomes very clear what type of folding this is and how it relates to the others. | |
2770 | I'd suggest to rename foldWAR so it becomes very clear what type of folding this is and how it relates to the others. | |
2779 | I'd suggest hoisting that in a helper checkSameValueWAR (or a better name but with RAW in it) so it becomes very clear what type of folding this is and how it relates to the others. | |
2836 | I'd suggest to rename foldRAW so it becomes very clear what type of folding this is and how it relates to the others. | |
2839 | Why does this need to be a canonicalization pattern and not a simple fold ? | |
2846 | I'd suggest to rename checkSameValueWAW (or a better name but with WAW in it) so it becomes very clear what type of folding this is and how it relates to the others. |
mlir/lib/Dialect/Vector/VectorOps.cpp | ||
---|---|---|
2839 | If we want to support looking more than one level up in the WaW case we need to modify a Transfer_write that isn't the root like in the example here: %w0 = vector.transfer_write %v0, %arg0[%c1, %c0] {in_bounds = [true, true]} : vector<1x4xf32>, tensor<4x4xf32> %w1 = vector.transfer_write %v0, %arg0[%c2, %c0] {in_bounds = [true, true]} : vector<1x4xf32>, tensor<4x4xf32> %w2 = vector.transfer_write %v1, %w1[%c1, %c0] {in_bounds = [true, true]} : vector<1x4xf32>, tensor<4x4xf32> In this case we need to update %w1. It could be implemented as a folding starting from %w0 considering only the case where the write has a single use and going through the SSA chain in the other direction. Do you think this is better? |
mlir/lib/Dialect/Vector/VectorOps.cpp | ||
---|---|---|
2839 | I had missed that is more powerful than what you could get with a folder and hop-1 use-def. |
clang-format: please reformat the code