This is an archive of the discontinued LLVM Phabricator instance.

[mlir][nvgpu] Add nvgpu.tma.async.load and nvgpu.tma.descriptor
ClosedPublic

Authored by guraypp on Jul 17 2023, 5:56 AM.

Details

Summary

This work adds the nvgpu.tma.async.load Op, which requests a TMA load asynchronously using an mbarrier object.

It also creates the nvgpu.tma.descriptor type. The type is supposed to be created by the cuTensorMapEncodeTiled CUDA driver API.

Event Timeline

guraypp created this revision. Jul 17 2023, 5:56 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td
109

typo: parameter

mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
870

Add some comment as to why you need the Trunc.

Also, can it ever be Ext? (i.e. 16b or 24b addressing that we want to turn back into 32b?)

guraypp updated this revision to Diff 541216. Jul 17 2023, 1:32 PM

add enum types for descriptor

guraypp added inline comments. Jul 17 2023, 1:35 PM
mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td
109

qq: Can I use an attribute here? I tried the one below, and I could make tablegen compile.

DefaultValuedOptionalAttr<TensorMapSwizzleAttr, TensorMapSwizzleKind::none>:$swizzle,
guraypp updated this revision to Diff 541714. Jul 18 2023, 1:36 PM

update on types

guraypp published this revision for review. Jul 19 2023, 1:24 AM
mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td
109

If it works as expected, this makes sense to me (never tried to use with custom enums myself).

I suppose the benefit is that we need less boilerplate when using ?

mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
889

seems this should be prefixed with NVGPU to be consistent

mlir/lib/Dialect/NVGPU/IR/NVGPUDialect.cpp
337

typo: destination

mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
575

nit: nl

580

You want to expand this test a little; we should see the mbarrier ops appear and be consumed.

588

Is the box rank supposed to follow the memref rank?
At the moment this looks inconsistent.

guraypp updated this revision to Diff 541928. Jul 19 2023, 3:02 AM

add attributes

guraypp updated this revision to Diff 541970. Jul 19 2023, 5:41 AM
guraypp marked 5 inline comments as done.

address some comments

guraypp updated this revision to Diff 541972. Jul 19 2023, 5:48 AM

fix the test

guraypp marked an inline comment as done. Jul 19 2023, 5:49 AM
guraypp added inline comments.
mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
580

I need to land one more mbarrier PR. After that I can add realistic examples.

588

Yes, it has to match at least for the time being. I fixed the test.

I use only the base pointer of the memref.
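
For illustration, the rank constraint discussed in this comment could be checked roughly as follows. This is only a sketch under assumptions: the helper name boxMatchesMemrefRank and the use of plain std::vector shapes are hypothetical and are not taken from the patch, which works on MLIR types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the patch): returns true when the TMA box
// has the same number of dimensions as the memref it describes.
// Only the ranks are compared; the extents themselves are not inspected.
bool boxMatchesMemrefRank(const std::vector<int64_t> &memrefShape,
                          const std::vector<int64_t> &boxShape) {
  return memrefShape.size() == boxShape.size();
}
```

With this sketch, a 2-D memref paired with a 2-D box passes, while a 2-D memref paired with a 1-D box is rejected, which is the inconsistency the reviewer pointed at in the test.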

guraypp added inline comments. Jul 19 2023, 5:51 AM
mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
870

AFAIK, 64-bit registers are emulated via 2x 32-bit registers. Here I truncate when we don't need 64-bit addressing.
We don't need Ext.
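
As a rough host-side illustration of the narrowing described here, assuming the value being truncated is a 64-bit integer that is expected to fit in 32 bits: the helper below is hypothetical and is not the code from NVGPUToNVVM.cpp, which emits an LLVM dialect truncation instead. It refuses values that would not survive the round trip.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not from the patch): narrow a 64-bit value to 32 bits.
// Returns std::nullopt when the value is outside the int32_t range, i.e. when
// 64-bit addressing would actually be needed.
std::optional<int32_t> truncToI32Checked(int64_t value) {
  if (value < std::numeric_limits<int32_t>::min() ||
      value > std::numeric_limits<int32_t>::max())
    return std::nullopt;
  return static_cast<int32_t>(value);
}
```

The check makes the assumption explicit: truncation is only safe when the upper 32 bits carry no information, which is why no extension in the other direction is needed.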

guraypp updated this revision to Diff 542325. Jul 19 2023, 11:53 PM

fix enums

guraypp updated this revision to Diff 542405. Jul 20 2023, 3:05 AM

remove using managed memory

This revision is now accepted and ready to land. Jul 20 2023, 11:59 PM