This is an archive of the discontinued LLVM Phabricator instance.

[mlir] [scf] Add RegionBranchOpInterface to scf.forall and scf.parallel op
ClosedPublic

Authored by cxy-1993 on May 23 2023, 9:59 PM.

Details

Summary

Add RegionBranchOpInterface to the scf.forall and scf.parallel ops so that analyses can trace through their subregions.

Diff Detail

Event Timeline

cxy-1993 created this revision. May 23 2023, 9:59 PM
Herald added a project: Restricted Project. May 23 2023, 9:59 PM
cxy-1993 requested review of this revision. May 23 2023, 9:59 PM

As discussed on https://discourse.llvm.org/t/why-scf-forall-op-doesnt-have-regionbranchop-interface/70789/4, these two operations have special terminators.

The terminator of scf.forall is InParallelOp, which has no input operands.
The terminator of scf.parallel is a yield op, which has no input operands.

So the "RegionBranchTerminatorOpInterface" is no longer needed: because the terminators return no operands, no "value propagation" takes place through them.

cxy-1993 updated this revision to Diff 525023. May 23 2023, 11:23 PM

Update coding style.

scf.parallel can have an scf.reduce terminator.
E.g.:

%init = arith.constant 0.0 : f32
scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init) -> f32 {
  %elem_to_reduce = load %buffer[%iv] : memref<100xf32>
  scf.reduce(%elem_to_reduce) : f32 {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}

scf.parallel has an implicit scf.yield terminator (https://github.com/llvm/llvm-project/blob/cf1ef4161006e8119761b3a137423c23436bcf33/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td#L812).
And scf.reduce doesn't have the terminator trait (https://github.com/llvm/llvm-project/blob/cf1ef4161006e8119761b3a137423c23436bcf33/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td#L900).

The actual result of scf.parallel is produced by scf.reduce, but the terminator of scf.parallel yields nothing. So we can prevent value propagation without registering RegionBranchTerminatorOpInterface on the terminator.
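
To illustrate the argument above, here is a minimal toy sketch in plain C++. It does not use the MLIR API: `ToyTerminator`, `forwardedToParent`, and the `make*` helpers are hypothetical names introduced only for this illustration. It assumes an analysis forwards exactly the terminator's operands to the parent op's results, so a terminator with no operands forwards nothing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a region terminator (hypothetical, not an MLIR class):
// only its name and the values it passes back to the parent op.
struct ToyTerminator {
  std::string name;
  std::vector<std::string> operands;
};

// Hypothetical helper: the values a dataflow analysis would forward from a
// region's terminator to the parent op's results in this toy model.
std::vector<std::string> forwardedToParent(const ToyTerminator &term) {
  return term.operands;
}

// scf.for-like case: scf.yield carries the loop-carried value.
ToyTerminator makeForYield() { return {"scf.yield", {"%next"}}; }

// scf.parallel case: the implicit scf.yield has no operands.
ToyTerminator makeParallelYield() { return {"scf.yield", {}}; }

// scf.forall case: scf.forall.in_parallel has no operands.
ToyTerminator makeForallInParallel() {
  return {"scf.forall.in_parallel", {}};
}

// Small self-check of the toy model.
void runToyChecks() {
  // A terminator with operands forwards them to the parent results.
  assert(forwardedToParent(makeForYield()).size() == 1);
  // Terminators without operands forward nothing, so no value propagates.
  assert(forwardedToParent(makeParallelYield()).empty());
  assert(forwardedToParent(makeForallInParallel()).empty());
}
```

In this model the results of scf.parallel and scf.forall never receive a value from the terminator, which matches the reasoning that the terminator does not need to participate in value propagation.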

springerm accepted this revision. May 24 2023, 12:35 AM
This revision is now accepted and ready to land. May 24 2023, 12:35 AM

@springerm, could you please help me push this patch to master?