This is an archive of the discontinued LLVM Phabricator instance.

[mlir][transform] Assign blockDim automatically
Needs Review · Public

Authored by guraypp on Oct 5 2022, 3:14 AM.

Details

Summary

The current map_nested_foreach_to_threads op expects the blockDim argument
to be present, and providing it by hand is tedious.

This revision sets blockDim automatically when the trip counts of all
scf.foreach_thread ops are known at compile time. It traverses all sibling
scf.foreach_thread ops, finds the largest trip count in each dimension at
the same level, and assigns that trip count to blockDim.

For the example shown below, it sets blockDim = [12, 9, 1], corresponding to the x, y and z dimensions respectively.

gpu.launch() {
  scf.foreach_thread (%i, %j) in (7,9)  --> parallelized threadIdx.x + threadIdx.y
  scf.foreach_thread (%i) in (12)       --> parallelized threadIdx.x 
}
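
The rule described in the summary can be sketched in standalone C++ as follows. This is only an illustration, not the code from this revision: `TripCounts` and `inferBlockDim` are hypothetical names, and the sketch assumes that the maximum is taken per dimension and that any trip count unknown at compile time disables the inference.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one scf.foreach_thread op: its trip counts in
// x, y, z order. std::nullopt marks a trip count that is not a compile-time
// constant.
using TripCounts = std::vector<std::optional<int64_t>>;

// Hypothetical helper (not part of MLIR): returns the per-dimension maximum
// trip count over all sibling ops, or std::nullopt if blockDim cannot be
// inferred.
std::optional<std::array<int64_t, 3>>
inferBlockDim(const std::vector<TripCounts> &siblings) {
  // Dimensions that no op uses keep the default size of 1.
  std::array<int64_t, 3> blockDim = {1, 1, 1};
  for (const TripCounts &counts : siblings) {
    // Assumption: at most three dimensions (x, y, z) can be mapped.
    if (counts.size() > blockDim.size())
      return std::nullopt;
    for (std::size_t dim = 0; dim < counts.size(); ++dim) {
      // Assumption: a dynamic trip count means the caller must still
      // provide blockDim explicitly.
      if (!counts[dim].has_value())
        return std::nullopt;
      blockDim[dim] = std::max(blockDim[dim], *counts[dim]);
    }
  }
  return blockDim;
}
```

For the example above, the two sibling ops have trip counts (7, 9) and (12), so the sketch yields x = max(7, 12) = 12, y = 9 and z = 1, matching blockDim = [12, 9, 1].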

Event Timeline

guraypp created this revision. Oct 5 2022, 3:14 AM
guraypp requested review of this revision. Oct 5 2022, 3:14 AM
Herald added a project: Restricted Project.
guraypp edited the summary of this revision. Oct 5 2022, 4:35 AM
nicolasvasilache requested changes to this revision. Oct 5 2022, 4:54 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/GPU/TransformOps/GPUTransformOps.h
45 (On Diff #465329)

this is a bad API smell, please split out the part that assigns from the part that uses and remains const

mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
493–505

default computation needs to happen here, not hidden under the map function

504–505

you're significantly changing the design of the op, this needs serious documentation

mlir/test/Dialect/GPU/transform-gpu.mlir
205

nl

This revision now requires changes to proceed. Oct 5 2022, 4:54 AM
guraypp updated this revision to Diff 465362. Oct 5 2022, 6:06 AM

address nicolasvasilache comments

guraypp marked 3 inline comments as done. Oct 5 2022, 6:10 AM
guraypp added inline comments.
mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
493–505

I moved the calculation here as you suggested.
Now I need to replicate this part in the other places that call the map function. I had hidden it inside the map function intentionally to avoid that replication, but I understand your concern.

guraypp updated this revision to Diff 473240. Nov 4 2022, 7:43 AM
guraypp marked an inline comment as done.

rebase and ping

guraypp updated this revision to Diff 473241. Nov 4 2022, 7:49 AM

update the test

nicolasvasilache resigned from this revision. Feb 15 2023, 8:59 AM