This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Memref] Fold nvgpu.device_async_copy on src memref- to dst memref-subviews
ClosedPublic

Authored by manishucsd on Apr 12 2023, 1:23 PM.

Details

Summary

This patch folds nvgpu.device_async_copy ops that operate on subviews into the original memref, eliminating the memref.subview ops.

Event Timeline

manishucsd created this revision. Apr 12 2023, 1:23 PM
manishucsd requested review of this revision. Apr 12 2023, 1:23 PM
ThomasRaoux added inline comments. Apr 12 2023, 3:06 PM
mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
13

nit: I would leave this empty line

mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
556

would be good to test cases where the index is not 0

Applied comment. Adding back an empty line after the includes.

mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
556

The test itself is not anchored on c0. After the subview, all the reads are on %gmem_memref_subview_2d[%c0, %c0], and %c0 is defined in the original IR for use there. The check is that the subview indices are used in the nvgpu.device_async_copy ops after the folding, %[[GMEM_MEMREF_3d]][%[[IDX_1]], %[[IDX_2]], %[[IDX_3]]], i.e., IDX_1, IDX_2, IDX_3. %smem_memref_4d is not folded, so we can ignore that for now.

ThomasRaoux added inline comments. Apr 12 2023, 6:17 PM
mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
556

If the original indices are not zero, they need to be combined with the subview indices, right? Don’t we miss testing this part?
Also, that means the folding of the dst subview is not tested?
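
To make the point about combining indices concrete, here is a minimal sketch of the arithmetic involved. It assumes a non-rank-reducing subview with static offsets and strides. The helper name resolveSourceIndices is hypothetical and is not the function used in FoldMemRefAliasOps.cpp; the real pass builds this computation on SSA values rather than on plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an MLIR API): maps indices applied to a subview
// back to indices on the source memref.
// Assumption: the subview is not rank-reducing, so every source dimension
// has exactly one offset, one stride, and one index.
std::vector<int64_t> resolveSourceIndices(const std::vector<int64_t> &offsets,
                                          const std::vector<int64_t> &strides,
                                          const std::vector<int64_t> &indices) {
  assert(offsets.size() == strides.size() && offsets.size() == indices.size());
  std::vector<int64_t> source(indices.size());
  for (std::size_t d = 0; d < indices.size(); ++d) {
    // Source index = subview offset + index into the subview * subview stride.
    source[d] = offsets[d] + indices[d] * strides[d];
  }
  return source;
}
```

With all-zero indices the result is just the subview offsets, so a test that only indexes the subview with %c0 never exercises the addition or the stride multiplication. A case such as offsets (4, 8), strides (1, 2), and indices (3, 5) does, and should resolve to (7, 18) under this sketch.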

manishucsd marked an inline comment as done. Apr 13 2023, 11:18 PM

Added more tests.

manishucsd marked an inline comment as done. Apr 14 2023, 8:17 AM
ThomasRaoux accepted this revision. Apr 14 2023, 8:20 AM
This revision is now accepted and ready to land. Apr 14 2023, 8:20 AM