This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Transform] Add support for mapping to GPU warps and to linear ids
ClosedPublic

Authored by nicolasvasilache on Mar 15 2023, 5:08 AM.

Details

Summary

This revisions refactors the implementation of mapping to threads to additionally allow warps and linear ids to be specified.

warp_dims is currently specified along with block_dims as a transform attribute.

Linear ids on th other hand use the flattened block_dims to predicate on the first (linearized) k threads.
An additional GPULinearIdMappingAttr is added to the GPU dialect to allow specifying loops mapped to this new scheme.

Various implementation and transform op semantics cleanups are also applied.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision.Mar 15 2023, 5:08 AM
Herald added a project: Restricted Project. · View Herald Transcript
nicolasvasilache planned changes to this revision.Mar 15 2023, 5:08 AM
nicolasvasilache retitled this revision from [mlir][Transform] Add support for mapping to GPU warps to [mlir][Transform] Add support for mapping to GPU warps and to linear ids.
nicolasvasilache edited the summary of this revision. (Show Details)

Merge mapping to warps and linear ids.

Delete unused code and better handle insertion points.

bondhugula requested changes to this revision.Mar 16 2023, 8:08 PM

Please split out the IR/ changes out from the TransformOps/ changes if appropriate; add test cases for linear_id/lane_id op and its lowering - it doesn't seem to itself appear in test cases. It looks odd that the only code manipulating a GPU IR dialect op appears in TransformOps/. Commit summary appears incomplete - doesn't seems to capture all the changes.

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
124 ↗(On Diff #505973)

This doesn't seem to have any test cases here. Is this PR ready for review?

124 ↗(On Diff #505973)

linear_id or GPU_LaneIdOp to be consistent.

mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
79

This description needs to be completed (recommend focusing on "what" instead of what it allows in general).

mlir/include/mlir/Dialect/GPU/TransformOps/GPUTransformOps.h
36

Doc comment.

mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
121–229

Missing code comments for these.

This revision now requires changes to proceed.Mar 16 2023, 8:08 PM
nicolasvasilache marked 5 inline comments as done.Mar 17 2023, 8:22 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
124 ↗(On Diff #505973)

Good point, there is indeed no value in introducing new GPU ops that do not lower through the common pass for the purpose of the transformations I am implementing.

Refactored the implementation to avoid relying on a new op.

This op may make sense to be introduced separately and lowered through the existing schemes that are mindful of index type length. But this is indeed a separate concern and a separate PR.

nicolasvasilache marked an inline comment as done.
nicolasvasilache edited the summary of this revision. (Show Details)

Address comments and refactor.

Dismissing change request that has now been addressed. This is now purely a GPU transform ops related revision. Any of the other reviewers familiar/interested in transforms ops may be able to review it.

ThomasRaoux accepted this revision.Mar 19 2023, 10:21 PM

LGTM, added a few nits

mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
68

Doesn't have to be addressed in this patch but DimX, DimY and DimZ don't really make sense for linearID, I wonder if we could just have an index to give the order of distribution?

mlir/include/mlir/Dialect/GPU/TransformOps/GPUTransformOps.h
52

nit: replace A the -> The?

mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
608

nit: doesn't feel like a very useful comment

This revision is now accepted and ready to land.Mar 19 2023, 10:21 PM
This revision was landed with ongoing or failed builds.Mar 20 2023, 1:05 AM
This revision was automatically updated to reflect the committed changes.
nicolasvasilache marked 3 inline comments as done.
mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
68

ack, we can also go higher-D than 3