This is an archive of the discontinued LLVM Phabricator instance.

[mlir][transforms] Revamp the implementation of mapping loops to GPUs
ClosedPublic

Authored by nicolasvasilache on Jul 21 2023, 4:20 AM.

Details

Summary

This revision significantly simplifies the specification and implementation of mapping loops to GPU ids.

Each type of mapping (block, warpgroup, warp, thread) now comes with 2 mapping modes:

  1. a 3-D "grid-like" mode, subject to alignment considerations on threadIdx.x, on which predication may occur on a per-dimension 3-D sub-rectangle basis.
  2. a n-D linearized mode, on which predication may only occur on a linear basis.

In the process, better size and alignment requirement inference are introduced along with improved runtime verification messages.

The warp_dims attribute was deemed confusing and is removed from the transform in favor of better size inference.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision.Jul 21 2023, 4:20 AM
Herald added a project: Restricted Project. · View Herald Transcript
nicolasvasilache planned changes to this revision.Jul 21 2023, 4:22 AM
nicolasvasilache added reviewers: springerm, ftynse.
springerm added inline comments.Jul 21 2023, 5:27 AM
mlir/include/mlir/Dialect/SCF/IR/DeviceMappingInterface.td
40–43

We usually write /*desc=*/ etc here.

ftynse added inline comments.Jul 21 2023, 7:03 AM
mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
23–41

Should we just make it a proper (non-enum) attribute parameterized by an integer?

nicolasvasilache edited the summary of this revision. (Show Details)

Finish the impl and drop warp_dims

nicolasvasilache marked an inline comment as done.

Beef up tests

qedawkins added inline comments.
mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
216

Is the division by (kNumWarpsPerGroup * kWarpSize) correct here?

mlir/include/mlir/Dialect/GPU/TransformOps/Utils.h
103

Is this part of the planned changes? Previously warp_dims effectively allowed specifying the warp size, but that is removed here.

Fix the block-linear case which also needs to adapt to existence/creation of gpu.launch.

nicolasvasilache marked an inline comment as done.Jul 24 2023, 6:12 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/GPU/TransformOps/GPUDeviceMappingAttr.td
23–41

I am not sure how to mix an optional enum and an optional attr with a name.
Let's punt on the cosmetic for now if you don't mind, I'll read up and update at a later time.

216

ah thanks!

mlir/include/mlir/Dialect/GPU/TransformOps/Utils.h
103

Good point, yes I can see how the num_warps could be made to support other warp sizes.

I'll add an attribute for passing this quantity.
I'll leave kNumWarpsPerGroup to 4 for now, we can generalize in the future if needed.

nicolasvasilache marked an inline comment as done.

Add option to allow specifying the waro size for a particular target.
In the future this should be some better integrated HW model attribute.

qedawkins added inline comments.Jul 24 2023, 8:02 AM
mlir/include/mlir/Dialect/GPU/TransformOps/Utils.h
103

Perfect, thanks!

Drop spurious test file

springerm added inline comments.Jul 24 2023, 3:41 PM
mlir/include/mlir/Dialect/Utils/StaticValueUtils.h
88

ofrs

mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
725

double parentheses

767

double parentheses

1055

typo

1057

typo

springerm accepted this revision.Jul 24 2023, 5:10 PM
This revision is now accepted and ready to land.Jul 24 2023, 5:10 PM

Rework multiplicity and predication to be more clear about quantities in the scaled and original bases.

qedawkins accepted this revision.Jul 25 2023, 2:00 PM
qedawkins added inline comments.
mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
947

nit: gridDims is now effectively just being used to check that there are <= 3 mappingSizes after rewriteOneForallCommonImpl.

nicolasvasilache marked 8 inline comments as done.Jul 25 2023, 3:07 PM
nicolasvasilache added inline comments.
mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
947

Not sure I can relate this comment to code anymore, gridDims is either used if provided or filled if not.
I can update in a followup if necessary.