wmma intrinsics have a large number of combinations, ideally we want to be able to target all the different variants. To avoid a combinatorial explosion in the number of mlir op we use attributes to represent the different variation of load/store/mma ops.
We also can generate with tablegen helpers to know which combination are available.
Using this we can avoid having too hardcode a path for specific shapes and can support more types.
This patch also adds boiler plates for tf32 op support.
Details
Diff Detail
Event Timeline
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | ||
---|---|---|
163–172 | This does hold all the parameters attached to the matrix/mma intrinsic if that's what you mean. |
mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir | ||
---|---|---|
26 | Is the type here correct? the instruction is taking two operands and returning one. The type in the modified code is just taking a (ptr) -> (struct). Is there a reason for the change? |
mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir | ||
---|---|---|
26 | The previous version was using anytype for the stride, since it has to be i32 I changed it in the declaration and we don't need to print it explicitly. |
mlir/test/Conversion/GPUToNVVM/wmma-ops-to-nvvm.mlir | ||
---|---|---|
26 | That's overall how it is done for rest of the ops I see in LLVM dialect, if there is no ambiguity on the type to use it is not printed out. |
Overall, it mostly looks like a refactoring, but a lot of tablegen ninja code . If possible could we move most of it into C++ and use the attributes directly to get what you want?
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | ||
---|---|---|
163–172 | Ok, I see your point. I was thinking of some better way of representing this, but here is better to mirror LLVM | |
275 | THis is very hard to read. COuld we just add a C++ method to do whatever this is doing? | |
402 | Could we move this to c++ file. Its hard to read here. | |
439 | THis isnt the format of the new instructions? Probably needs to be restated |
Fix descriptions
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | ||
---|---|---|
275 | The problem is that because of how getIntrinsicID is my source of truth is in tablegen so I need to have some function generated from tablegen to query the different operation available. Unless we can dynamically generate the intrinsicID annd move getIntrinsicID I don't see an obvious way to avoid some sort of tablegen for this. | |
402 | I don't think it is possible because it needs to instantiate the intrinsic ID. Unfortunately it is not possible to build the intrinsic iD dynamically from C++ and since there are over 100 intrinsics it would be a lot of code to write and would be error prone. | |
439 | Good catch, fixed it. |
After looking at the generated code, this looks much better! Nice cleanup! Its hard to read the tblgen, but presumably nobody has to.
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | ||
---|---|---|
549 | Restate this cause nvvm.wmma.m*n*k.mma is not the instruction after this change. | |
558 | Restate this cause nvvm.wmma.m*n*k.store is not the instruction after this change. |
These are mostly requests for documentation and code comments.
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | ||
---|---|---|
163–172 | Can you have a doc comment for these individual classes? I see a high-level top-level comment but individual doc comments are important for posterity. | |
174–183 | Doc comment here. | |
275 | Can you add a comment capturing this note? | |
289–350 | Again, without additional comments, these are going to be hard to read and maintain. | |
355–363 | Doc comment here again. | |
412–422 | We need to be documenting what these methods return. | |
mlir/lib/Conversion/GPUToNVVM/WmmaOpsToNvvm.cpp | ||
38–39 | Nit: -> when an unimplemented | |
mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp | ||
137 | Doc comment. |
mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp | ||
---|---|---|
137 | Oh, sorry, I missed that. We shouldn't duplicate it then. |
Nit typo : s/instanciate/instantiate