This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] Add patterns to propagate vector distribution
ClosedPublic

Authored by ThomasRaoux on Jun 6 2022, 4:42 PM.

Details

Summary

Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.

co-authored with @springerm

Diff Detail

Event Timeline

ThomasRaoux created this revision.Jun 6 2022, 4:42 PM
Herald added a project: Restricted Project. · View Herald Transcript
ThomasRaoux requested review of this revision.Jun 6 2022, 4:42 PM
Herald added a project: Restricted Project. · View Herald Transcript
ThomasRaoux retitled this revision from [mlir][vector] Add patterns to ppropagate vector distribution to [mlir][vector] Add patterns to propagate vector distribution.Jun 6 2022, 4:52 PM
csigg added inline comments.Jun 6 2022, 10:46 PM
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
225

How do the two vector types above imply this map?

230

This comment doesn't seem to match the code.

394

I'm starting to think I'm not understanding the semantic of this op.

What is the type of %arg0?

How does the vector<32xf32> operand type from vector.yield and the vector<1xf32> return type of vector.warp_execute_on_lane_0 relate?

ThomasRaoux added inline comments.Jun 7 2022, 8:16 AM
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
225

The map indicates which dimensions are distributed. Here based on the vector shape we know that dimension 0 and 2 are different in the yield and result meaning they must be distributed on threads.

394

I have tried to describe it in the op but probably could use some feedback on better wording. %arg0 is the threadId, just an ID identifying the different threads we want to distribute on.
The Value being yield is distributed on multiple lanes/threads. Therefore the vector<32xf32> value outside of the region is distributed so that the 32 lanes each old 1 element of the vector hence the vector<1xf32>. This conversion implicitly goes through memory to transfer the vector<32xf32> from lane 0 to a Value distributed on all the threads.

Update comment

mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
230

Updated the comment.

springerm accepted this revision.Jun 9 2022, 2:54 AM
springerm added inline comments.
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
188

and is not dead.

204

operation

206

RewriterBase &rewriter

212

rewriter

452

I think we need a loop iterating over all uses, then wrap each update them one-by-one. Each of them wrapped in a rewriter.updateRootInplace(use.getOwner(), ...)

512

Same as above, we need a loop with updateRootInPlace.

541

Same as above, we need a loop with updateRootInPlace. At this point it may be worth adding a helper function.

I suggested deleting the original op during moveRegionToNewWarpOpAndReplaceReturns in the previous change. In that case, you can't iterate over warpOp.getResults here.

585

Use rewriter.

609

This is probably safe, but I would use the rewriter for consistency.

666

updateRootInplace

681–694

I think this can be simplified with rewriter.mergeBlocks. The replacements for the bbArgs can be specified as the third argument. The bvm should not be necessary then.

This is also more efficient because the ops are not cloned. The terminator is also moved, so just remove that one afterwards.

703–704

Use rewriter

This revision is now accepted and ready to land.Jun 9 2022, 2:54 AM

Address review comments.

ThomasRaoux marked 9 inline comments as done.Jun 13 2022, 9:27 AM
ThomasRaoux added inline comments.
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
452

I don't see user being notified when the uses are replaced in replaceOp or other cases. I'd rather avoid having a special case here. If it is needed we most likely need to update Rewriter class to support such thing. I think at this point it is safe so I'd prefer keeping it like that.

512

same comment.

681–694

Good point, changed it.

This revision was landed with ongoing or failed builds.Jun 13 2022, 9:39 AM
This revision was automatically updated to reflect the committed changes.