This work adds the nvgpu.tma.async.load op, which requests a TMA load asynchronously using an mbarrier object.
It also creates the nvgpu.tma.descriptor type. The type is expected to be created by the cuTensorMapEncodeTiled CUDA driver API.
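As a rough illustration of the host-side work implied by creating a descriptor with cuTensorMapEncodeTiled, the sketch below prepares the dimension, stride, and box arrays for a row-major tensor. It does not call the driver. The type `TiledDescriptorArgs` and the function `makeTiledDescriptorArgs` are hypothetical names invented for this example and are not part of this revision, MLIR, or CUDA. The checks (rank of at most 5, box extents between 1 and 256, byte strides that are multiples of 16) reflect my reading of the driver documentation and are only a subset of its requirements, so verify them against the CUDA driver API reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of MLIR or the CUDA driver API):
// host-side arrays in the layout cuTensorMapEncodeTiled is assumed to expect.
struct TiledDescriptorArgs {
  std::vector<std::uint64_t> globalDim;      // extents, innermost dimension first
  std::vector<std::uint64_t> globalStrides;  // byte strides, rank - 1 entries
  std::vector<std::uint32_t> boxDim;         // tile extents, innermost first
};

// Builds the arrays for a densely packed row-major tensor.
// `shape` and `tile` are given outermost-first, as in memref<64x128xf32>.
// Returns false if an assumed driver constraint is violated.
inline bool makeTiledDescriptorArgs(const std::vector<std::uint64_t>& shape,
                                    const std::vector<std::uint32_t>& tile,
                                    std::uint64_t elemBytes,
                                    TiledDescriptorArgs& out) {
  const std::size_t rank = shape.size();
  if (rank == 0 || rank > 5 || tile.size() != rank || elemBytes == 0)
    return false;

  out = TiledDescriptorArgs{};
  std::uint64_t strideBytes = elemBytes;
  for (std::size_t i = 0; i < rank; ++i) {
    const std::size_t src = rank - 1 - i;  // reverse to innermost-first order
    if (shape[src] == 0 || tile[src] == 0 || tile[src] > 256)
      return false;
    out.globalDim.push_back(shape[src]);
    out.boxDim.push_back(tile[src]);

    // Stride of the next-outer dimension; the innermost stride is implied
    // by the element size, so only rank - 1 strides are recorded.
    strideBytes *= shape[src];
    if (i + 1 < rank) {
      if (strideBytes % 16 != 0)
        return false;
      out.globalStrides.push_back(strideBytes);
    }
  }
  return true;
}
```

For a 64x128 tensor of 4-byte elements with a 32x64 tile, this yields `globalDim = {128, 64}`, `globalStrides = {512}`, and `boxDim = {64, 32}`. Those arrays would then be passed, together with the global address and the interleave, swizzle, L2 promotion, and out-of-bounds fill settings, to the driver call that produces the descriptor.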
Differential D155453
[mlir][nvgpu] Add nvgpu.tma.async.load and nvgpu.tma.descriptor
Closed, Public. Authored by guraypp on Jul 17 2023, 5:56 AM.
Event Timeline
nicolasvasilache added inline comments.
guraypp added inline comments.
This revision is now accepted and ready to land. Jul 20 2023, 11:59 PM
Closed by commit rG70c2e0618a0f: [mlir][nvgpu] Add nvgpu.tma.async.load and nvgpu.tma.descriptor (authored by guraypp). Jul 21 2023, 1:23 AM
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 540979
mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td
mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
mlir/lib/Dialect/NVGPU/IR/NVGPUDialect.cpp
mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
typo: parameter