Introduce a subgroup_reduce operation, similar to all_reduce, but operating at subgroup scope instead of workgroup scope.
It is intended as a low-level building block for higher-level abstractions (e.g. for workgroup-wide all_reduce ops).
Only the version taking a reduce operation enum is introduced, for simplicity's sake.
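To illustrate the "building block" idea, here is a minimal host-side C++ sketch of a workgroup-wide sum composed from per-subgroup sums. It only simulates the data flow on the CPU and is not code from this patch. The function names `subgroupReduceAdd` and `workgroupAllReduceAdd`, and the assumption that the workgroup size is a multiple of the subgroup size, are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a subgroup-scoped reduction: sums the values held
// by the lanes [begin, end) of one subgroup.
float subgroupReduceAdd(const std::vector<float> &values, std::size_t begin,
                        std::size_t end) {
  auto first = values.begin() + static_cast<std::ptrdiff_t>(begin);
  auto last = values.begin() + static_cast<std::ptrdiff_t>(end);
  return std::accumulate(first, last, 0.0f);
}

// Workgroup-wide sum built on top of the subgroup reduction.
// Assumption: the workgroup size is a multiple of subgroupSize.
float workgroupAllReduceAdd(const std::vector<float> &workgroup,
                            std::size_t subgroupSize) {
  assert(subgroupSize > 0 && workgroup.size() % subgroupSize == 0);
  // Step 1: every subgroup reduces its own lanes to one partial result.
  std::vector<float> partials;
  for (std::size_t begin = 0; begin < workgroup.size(); begin += subgroupSize)
    partials.push_back(subgroupReduceAdd(workgroup, begin, begin + subgroupSize));
  // Step 2: the partial results are combined into the workgroup-wide value
  // (on a GPU this step would typically go through workgroup-shared memory).
  return std::accumulate(partials.begin(), partials.end(), 0.0f);
}
```

For example, calling `workgroupAllReduceAdd` on 16 values with a subgroup size of 4 performs four subgroup reductions followed by one combining step.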
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Are you planning to add a lowering for it? This can be useful to at least "enforce the semantics". For instance, it should be easy to have a lowering of this to shuffle ops.
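As a rough illustration of what a shuffle-based lowering could compute, the following host-side C++ sketch simulates a butterfly (XOR-shuffle) sum across the lanes of one subgroup. It is an assumption-laden sketch, not the lowering proposed in this review: the function name `simulateSubgroupReduceAdd` is hypothetical, the subgroup size is assumed to be a power of two, and all lanes are assumed to be active.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates a butterfly sum over one subgroup. `lanes` holds each lane's input
// value. Returns the per-lane results; every lane ends up with the full sum.
std::vector<float> simulateSubgroupReduceAdd(std::vector<float> lanes) {
  const std::size_t width = lanes.size();
  // Assumption: the subgroup size is a power of two.
  assert((width & (width - 1)) == 0 && "subgroup size must be a power of two");
  for (std::size_t offset = width / 2; offset > 0; offset /= 2) {
    std::vector<float> shuffled(width);
    // Emulates an XOR shuffle: lane i reads the value held by lane i ^ offset.
    for (std::size_t i = 0; i < width; ++i)
      shuffled[i] = lanes[i ^ offset];
    // Each lane combines its own value with the value it received.
    for (std::size_t i = 0; i < width; ++i)
      lanes[i] += shuffled[i];
  }
  return lanes;
}
```

The loop runs log2(width) rounds, which is why this pattern is a common way to express subgroup reductions with shuffles.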
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td | ||
---|---|---|
727–728 | Can you clarify the behavior when some lanes in the subgroup are inactive? |
We are planning to lower them directly to the GroupNonUniformXYZ SPIR-V ops (those ops allow both Workgroup and Subgroup scope, but the Intel Level Zero / Intel Graphics Compiler stack we are targeting only supports the Subgroup version at the moment).
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td | ||
---|---|---|
727–728 | The SPIR-V ops we are targeting, such as GroupNonUniformFAddOp, allow non-uniform execution, but we don't need that at the moment. I will add a uniformity requirement similar to the one on all_reduce. We can always add something like a non-uniform flag to these ops later if needed. |