This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][GPU] Add GPU ops nvvm.mma.sync, nvvm.mma.ldmatrix, lane_id
ClosedPublic

Authored by christopherbate on Apr 12 2022, 5:05 PM.

Details

Summary

This change adds three new operations to the GPU dialect: gpu.mma.sync,
gpu.mma.ldmatrix, and gpu.lane_id. The former two are meant to target
the lower level nvvm.mma.sync and nvvm.ldmatrix instructions, respectively.
Lowerings are added for the new GPU operations for conversion to
NVVM.

Diff Detail

Event Timeline

christopherbate requested review of this revision.Apr 12 2022, 5:05 PM

Looks good, I added some minor comments. Some of the MMAMatrixType change seem unrelated. I would remove it and have them in a separate patch if needed.

mlir/include/mlir/Dialect/GPU/GPUDialect.h
169–174 ↗(On Diff #422369)

Do you need this? I don't see it used anywhere and seems unrelated to the rest of the patch?

mlir/include/mlir/Dialect/GPU/GPUOps.td
99–110

You can just inherit from GPU_IndexOp instead.
That will also remove the index type from the printing which is not really needed since the result has to be an index.

mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h
65–66 ↗(On Diff #422369)

Since the name suggest that this is meant to convert LaneId, it probably shouldn't be a template. I would use directly the Op types.

74 ↗(On Diff #422369)

I don't think this comment applies here.

mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
226–228

for future patch that should probably move into ldmatrix verifier

280–290

nit: In general static functions are not members (although I can't find it mentioned in the coding standard so might be a soft rule). For consistency it would be good to change it though.

384

nit: use SmallVector<Value> if possible

443
mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
62–65 ↗(On Diff #422369)

I don't think you need that, the rest of the patch should be independent of MMAMatrixType so it is better to leave that untouched.

mlir/test/Dialect/GPU/invalid.mlir
479–487 ↗(On Diff #422369)

shouldn't need to remove that

This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.

mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.h
68 ↗(On Diff #422369)

Doc comment here.

mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
31–32

Most of these includes are redundant. Please prune.

232

getDimSize(0)

237

Likewise.

252–264

These block needs code comments.

384

Doc comment here.

410

Use e = ... ; i < e; .. form to avoid repeated evaluation. Here and everywhere else.

christopherbate marked 14 inline comments as done.

Address reviewer comments.

I updated the revision to address the comments so far, thanks!

mlir/include/mlir/Dialect/GPU/GPUDialect.h
169–174 ↗(On Diff #422369)

removed

mlir/include/mlir/Dialect/GPU/GPUOps.td
99–110

Since the GPU_IndexOp includes a dimension argument which is not needed here, I just updated the assembly format.

ThomasRaoux accepted this revision.Apr 13 2022, 9:15 AM
ThomasRaoux added inline comments.
mlir/include/mlir/Dialect/GPU/GPUOps.td
105

nit: remove : index

This revision is now accepted and ready to land.Apr 13 2022, 9:15 AM
This revision was landed with ongoing or failed builds.Apr 13 2022, 3:50 PM
This revision was automatically updated to reflect the committed changes.