This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.
mlir/include/mlir/Dialect/GPU/GPUDialect.h | ||
---|---|---|
169–174 | Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch? | |
mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
99–110 | You can just inherit from GPU_IndexOp instead. | |
mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
65–66 | Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types. | |
74 | I don't think this comment applies here. | |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
236–238 | for future patch that should probably move into ldmatrix verifier | |
290–300 | nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though. | |
394 | nit: use SmallVector<Value> if possible | |
453 | ||
mlir/lib/Dialect/GPU/IR/GPUDialect.cpp | ||
62–65 | I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched. | |
mlir/test/Dialect/GPU/invalid.mlir | ||
479–487 | shouldn't need to remove that |
This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.
mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
---|---|---|
68 | Doc comment here. | |
mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
36–41 | Most of these includes are redundant. Please prune. | |
242 | getDimSize(0) | |
247 | Likewise. | |
262–274 | These block needs code comments. | |
394 | Doc comment here. | |
420 | Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else. |
mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
---|---|---|
105 | nit: remove : index |
Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch?