This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] Add pattern to distribute vector reduction to GPU shuffles
ClosedPublic

Authored by ThomasRaoux on Jun 6 2022, 7:40 PM.

Details

Summary

Add a pattern to do ad hoc lowering of vector.reduction to a sequence of warp shuffles. This allow distributing reduction on a warp for GPU targets.
Also add an execution test for warp reduction.

co-authored with @springerm

Diff Detail

Event Timeline

ThomasRaoux created this revision.Jun 6 2022, 7:40 PM
Herald added a project: Restricted Project. · View Herald Transcript
ThomasRaoux requested review of this revision.Jun 6 2022, 7:40 PM
Herald added a project: Restricted Project. · View Herald Transcript
csigg added inline comments.Jun 6 2022, 10:35 PM
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
746

These %e look wrong.

769

Could you use rewriter.notifyMatchFailure() to make debugging easier? Here and below.

784–785

Would it be cleaner to initialize those vectors during construction instead of push_back?

785

.getType()?

790

Would getResults().back() be easier to parse?

791

I would have expected this to be 'laneId' of some sort. Does this happen latter in the lowering?

807

Could we use 'bfly' reduction instead of broadcasting separately?

ThomasRaoux marked an inline comment as done.

Address review comments

ThomasRaoux marked an inline comment as done.Jun 7 2022, 8:08 AM
ThomasRaoux added inline comments.
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
746

good catch, fixed it.

784–785

indeed.

790

The new op has extra results so this is not the last element of the newWarpOp results.

791

The op will return Value vector<1xf32> this extract just converts it back to a scalar. The laneId indexing is implicitly done due to the distribution of the vector onto simt lanes.

807

That’s a good point. One potential problem I see with the

807

sorry outdated reply, changed it to butterfly reduction.

springerm accepted this revision.Jun 9 2022, 2:55 AM
springerm added inline comments.
mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
806

Better to wrap in updateRootInplace

This revision is now accepted and ready to land.Jun 9 2022, 2:55 AM
mehdi_amini added inline comments.Jun 14 2022, 3:32 AM
mlir/lib/Dialect/Vector/Transforms/CMakeLists.txt
28 ↗(On Diff #436659)

Seems like a layering violation?

springerm added inline comments.Jun 14 2022, 3:37 AM
mlir/lib/Dialect/Vector/Transforms/CMakeLists.txt
28 ↗(On Diff #436659)

Note this is VectorTransforms not VectorDialect. But this code could also live in GPUTransforms (adding a dependency on VectorDialect to GPUTransforms instead).

ThomasRaoux added inline comments.Jun 15 2022, 8:20 AM
mlir/lib/Dialect/Vector/Transforms/CMakeLists.txt
28 ↗(On Diff #436659)

address review comment