This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.
mlir/include/mlir/Dialect/GPU/GPUDialect.h | ||
---|---|---|
169–174 ↗ | (On Diff #422369) | Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch? |
mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
99–110 | You can just inherit from GPU_IndexOp instead. | |
mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
65–66 ↗ | (On Diff #422369) | Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types. |
74 ↗ | (On Diff #422369) | I don't think this comment applies here. |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
226–228 | for future patch that should probably move into ldmatrix verifier | |
280–290 | nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though. | |
384 | nit: use SmallVector<Value> if possible | |
443 | ||
mlir/lib/Dialect/GPU/IR/GPUDialect.cpp | ||
62–65 ↗ | (On Diff #422369) | I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched. |
mlir/test/Dialect/GPU/invalid.mlir | ||
479–487 ↗ | (On Diff #422369) | shouldn't need to remove that |
This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.
mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
---|---|---|
68 ↗ | (On Diff #422369) | Doc comment here. |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
31–32 | Most of these includes are redundant. Please prune. | |
232 | getDimSize(0) | |
237 | Likewise. | |
252–264 | These block needs code comments. | |
384 | Doc comment here. | |
410 | Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else. |
mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
---|---|---|
105 | nit: remove : index |
nit: remove : index