This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.
| mlir/include/mlir/Dialect/GPU/GPUDialect.h | ||
|---|---|---|
| 169–174 ↗ | (On Diff #422369) | Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch? |
| mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
| 99–110 | You can just inherit from GPU_IndexOp instead. | |
| mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
| 65–66 ↗ | (On Diff #422369) | Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types. |
| 74 ↗ | (On Diff #422369) | I don't think this comment applies here. |
| mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
| 226–228 | for future patch that should probably move into ldmatrix verifier | |
| 280–290 | nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though. | |
| 384 | nit: use SmallVector<Value> if possible | |
| 443 | ||
| mlir/lib/Dialect/GPU/IR/GPUDialect.cpp | ||
| 62–65 ↗ | (On Diff #422369) | I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched. |
| mlir/test/Dialect/GPU/invalid.mlir | ||
| 479–487 ↗ | (On Diff #422369) | shouldn't need to remove that |
This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.
| mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
|---|---|---|
| 68 ↗ | (On Diff #422369) | Doc comment here. |
| mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
| 31–32 | Most of these includes are redundant. Please prune. | |
| 232 | getDimSize(0) | |
| 237 | Likewise. | |
| 252–264 | These block needs code comments. | |
| 384 | Doc comment here. | |
| 410 | Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else. | |
| mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
|---|---|---|
| 105 | nit: remove : index | |
nit: remove : index