This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][NVVM] Introduction of the `wgmma.mma_async` Op
ClosedPublic

Authored by guraypp on Aug 8 2023, 1:01 AM.

Details

Summary

This work introduces the wgmma.mma_async Op along with PTX generation using BasicPtxBuilderOpInterface. The Op executes a matrix multiply-and-accumulate operation across a warpgroup (128 threads). Note that this operation is supported on devices with the sm_90a capability.

The matrix multiply-and-accumulate operation can take one of the following forms. In both cases, matrix D is referred to as the accumulator:
D = A * B + D : Result is added to the accumulator matrix D.
D = A * B : The input from the accumulator matrix D is not utilized.
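
The two forms can be illustrated with a scalar reference model. This is only a sketch of the arithmetic: the function name `mma` and the list-of-lists matrices are assumptions made for illustration, and the sketch ignores the warpgroup fragment layout, shapes, and element types that the real Op deals with.

```python
def mma(a, b, d=None):
    """Reference semantics: return A*B + D, or A*B when d is None."""
    m, k, n = len(a), len(b), len(b[0])
    # Every row of A must have as many entries as B has rows.
    assert all(len(row) == k for row in a)
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Start from the accumulator input only in the D = A * B + D form.
            acc = d[i][j] if d is not None else 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
d = [[1.0, 1.0], [1.0, 1.0]]

print(mma(a, b))     # D = A * B
print(mma(a, b, d))  # D = A * B + D
```

In the second call every output element is larger by the corresponding accumulator input, which is the only difference between the two forms.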

Event Timeline

guraypp created this revision. Aug 8 2023, 1:01 AM
Herald added a reviewer: dcaballe.
Herald added a project: Restricted Project.
guraypp requested review of this revision. Aug 8 2023, 1:01 AM
guraypp updated this revision to Diff 548112. Aug 8 2023, 2:05 AM

add invalid test. improve the verifier

guraypp updated this revision to Diff 548113. Aug 8 2023, 2:06 AM

add newline

guraypp updated this revision to Diff 548570. Aug 9 2023, 5:09 AM

Now the Op takes and returns a struct in SSA fashion.
Removed the + PTX constraint modifier. Read-write operands now use = and are also mapped as reads by index number.
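
For context, LLVM inline assembly can express a read-write register as a write-only `=` output plus an input tied to that output by its index number. The sketch below builds such a constraint string under that reading of the change. `build_constraints` and its parameters are hypothetical names for illustration only; this is not the BasicPtxBuilder code from the patch.

```python
def build_constraints(num_read_write, rw_reg="f", read_regs=()):
    """Build an LLVM inline-asm constraint string (hypothetical helper).

    Each read-write operand becomes one "=<reg>" output plus one input
    tied to that output by its index. Plain read operands follow.
    """
    outputs = [f"={rw_reg}" for _ in range(num_read_write)]
    tied_inputs = [str(i) for i in range(num_read_write)]
    return ",".join(outputs + tied_inputs + list(read_regs))


# Four f32 accumulator registers plus two 64-bit read-only operands.
print(build_constraints(4, "f", ["l", "l"]))
```

The tied inputs carry the incoming accumulator values, so the asm body can both read and write the same registers without a dedicated read-write modifier.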

guraypp updated this revision to Diff 548575. Aug 9 2023, 5:22 AM

add convert-arith-to-llvm in the test to get the same lowering as convert-to-llvm

nicolasvasilache accepted this revision. Aug 9 2023, 8:41 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
1502

please add the tablegen constraint that result and operand 0 have the same type, see:

class TCresIsSameAsOpBase<int i, int j>

mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
271

nit: nl

This revision is now accepted and ready to land. Aug 9 2023, 8:41 AM
guraypp updated this revision to Diff 548640. Aug 9 2023, 8:57 AM

added TCresIsSameAsOpBase in tablegen

This revision was automatically updated to reflect the committed changes.