This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][GPU] Add GPU ops nvvm.mma.sync, nvvm.mma.ldmatrix, lane_id
ClosedPublic

Authored by christopherbate on Apr 12 2022, 5:05 PM.

Details

Summary

This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.

Diff Detail

Event Timeline

christopherbate requested review of this revision.Apr 12 2022, 5:05 PM

Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.

mlir/include/mlir/Dialect/GPU/GPUDialect.h
169–174

Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch?

mlir/include/mlir/Dialect/GPU/GPUOps.td
99–110

You can just inherit from GPU_IndexOp instead.
That will also remove the index type from the printing which is not really needed since the result has to be an index.

mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h
65–66

Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types.

74

I don't think this comment applies here.

mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
236–238

for future patch that should probably move into ldmatrix verifier

290–300

nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though.

394

nit: use SmallVector<Value> if possible

453
mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
62–65

I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched.

mlir/test/Dialect/GPU/invalid.mlir
479–487

shouldn't need to remove that

This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.

mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h
68

Doc comment here.

mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
36–41

Most of these includes are redundant. Please prune.

242

getDimSize(0)

247

Likewise.

262–274

These block needs code comments.

394

Doc comment here.

420

Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else.

christopherbate marked 14 inline comments as done.

Address reviewer comments.

I updated the revision to address the comments so far, thanks!

mlir/include/mlir/Dialect/GPU/GPUDialect.h
169–174

removed

mlir/include/mlir/Dialect/GPU/GPUOps.td
99–110

Since the GPU_IndexOp includes a dimension argument which is not needed here, I just updated the assembly format.

ThomasRaoux accepted this revision.Apr 13 2022, 9:15 AM
ThomasRaoux added inline comments.
mlir/include/mlir/Dialect/GPU/GPUOps.td
105

nit: remove : index

This revision is now accepted and ready to land.Apr 13 2022, 9:15 AM
This revision was landed with ongoing or failed builds.Apr 13 2022, 3:50 PM
This revision was automatically updated to reflect the committed changes.