This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.
| mlir/include/mlir/Dialect/GPU/GPUDialect.h | ||
|---|---|---|
| 169–174 | Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch? | |
| mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
| 99–110 | You can just inherit from GPU_IndexOp instead. | |
| mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
| 65–66 | Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types. | |
| 74 | I don't think this comment applies here. | |
| mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
| 236–238 | for future patch that should probably move into ldmatrix verifier | |
| 290–300 | nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though. | |
| 394 | nit: use SmallVector<Value> if possible | |
| 453 | ||
| mlir/lib/Dialect/GPU/IR/GPUDialect.cpp | ||
| 62–65 | I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched. | |
| mlir/test/Dialect/GPU/invalid.mlir | ||
| 479–487 | shouldn't need to remove that | |
This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.
| mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h | ||
|---|---|---|
| 68 | Doc comment here. | |
| mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | ||
| 36–41 | Most of these includes are redundant. Please prune. | |
| 242 | getDimSize(0) | |
| 247 | Likewise. | |
| 262–274 | These block needs code comments. | |
| 394 | Doc comment here. | |
| 420 | Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else. | |
| mlir/include/mlir/Dialect/GPU/GPUOps.td | ||
|---|---|---|
| 105 | nit: remove : index | |
Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch?