This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Adding support for transposed mma_load_matrix
ClosedPublic

Authored by qedawkins on Nov 27 2022, 3:50 PM.

Details

Summary

Enables transposed gpu.subgroup_mma_load_matrix and updates the lowerings in Vector to GPU and GPU to SPIRV. Needed to enable B transpose matmuls lowering to wmma ops.

Taken over from author: stanley-nod <stanley@nod-labs.com>

Diff Detail

Event Timeline

qedawkins created this revision.Nov 27 2022, 3:50 PM
qedawkins requested review of this revision.Nov 27 2022, 3:50 PM
qedawkins edited the summary of this revision. (Show Details)Nov 27 2022, 3:54 PM

You also need to change the lowering to llvm otherwise the transpose attribute is just ignored which would cause a miscompile. You don’t need to implement the full lowering (I wouldn’t at least not on this patch) but you need to at least fail the lowering pattern.

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1121

Use OptionalAttr<UnitAttr> type instead. This will make the syntax and code simpler

mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
143

I don’t think this is the correct way to check for transpose, you need to check that the map is actually a transpose.

antiagainst added inline comments.Nov 28 2022, 11:10 AM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1121

Can we also add explanation to this attribute in the description?

Switch to OptionalAttr<UnitAttr> and update tests accordingly. Additionally changes the check for a transposed vector.transfer_read to directly check for a 2d transpose.

qedawkins added inline comments.Nov 28 2022, 11:47 AM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1121

Should the semantics here be that the result of the load is transposed or that the values are arranged in memory already transposed?

ThomasRaoux added inline comments.Nov 28 2022, 11:53 AM
mlir/lib/Conversion/GPUToSPIRV/WmmaOpsToSPIRV.cpp
91

you can just do bool useColMajor = subgroupMmaLoadMatrixOp.getTranspose()?

mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
147

why do we need this restriction? This is something we will need to relax. we should basically detect transpose by check equality to a map like this:
`(d0, d1, d2, ...) -> (dn-2, dn-1)

qedawkins added inline comments.Nov 28 2022, 12:05 PM
mlir/lib/Conversion/GPUToSPIRV/WmmaOpsToSPIRV.cpp
91

It's an OptionalAttr<UnitAttr> so it needs to be this. It could just be a UnitAttr directly, although I don't know what the standard approach here is.

ThomasRaoux added inline comments.Nov 28 2022, 12:21 PM
mlir/lib/Conversion/GPUToSPIRV/WmmaOpsToSPIRV.cpp
91

I believe you can cast it to bool directly using static_cast<bool>

qedawkins updated this revision to Diff 478352.Nov 28 2022, 1:25 PM

Add description of transpose and relax check for transposed load.

ThomasRaoux accepted this revision.Nov 28 2022, 1:36 PM
This revision is now accepted and ready to land.Nov 28 2022, 1:36 PM
qedawkins marked 5 inline comments as done.Nov 28 2022, 1:45 PM
qedawkins marked an inline comment as done.
qedawkins set the repository for this revision to rG LLVM Github Monorepo.Nov 28 2022, 2:15 PM
antiagainst accepted this revision.Nov 28 2022, 2:41 PM
This revision was automatically updated to reflect the committed changes.