This is an archive of the discontinued LLVM Phabricator instance.


Differential D151287

[mlir] [scf] Add RegionBranchOpInterface to scf.forall and scf.parallel op
Closed, Public

Authored by cxy-1993 on May 23 2023, 9:59 PM.

Details

Reviewers

springerm
nicolasvasilache

Commits

rGe8bfec26a6da: [mlir] [scf] Add RegionBranchOpInterface to scf.forall and scf.parallel op

Summary

Add RegionBranchOpInterface to the scf.forall and scf.parallel ops so that analyses can trace through their subregions.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

cxy-1993 created this revision. May 23 2023, 9:59 PM

Herald added a project: Restricted Project. May 23 2023, 9:59 PM

Herald added subscribers: bviyer, Moerafaat, bzcheeseman and 20 others.

cxy-1993 requested review of this revision. May 23 2023, 9:59 PM

Herald added a reviewer: nicolasvasilache. May 23 2023, 9:59 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

As discussed on https://discourse.llvm.org/t/why-scf-forall-op-doesnt-have-regionbranchop-interface/70789/4, these two operations have special terminators.

The terminator of scf.forall is InParallelOp, which takes no operands.
The terminator of scf.parallel is a yield op, which takes no operands.

So RegionBranchTerminatorOpInterface is not needed here: the terminators return no operands, so no values are propagated through them.

Harbormaster completed remote builds in B234073: Diff 525001. May 23 2023, 10:22 PM

Update coding style.

scf.parallel can have an scf.reduce terminator.
E.g.:

%init = arith.constant 0.0 : f32
scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init) -> f32 {
  %elem_to_reduce = load %buffer[%iv] : memref<100xf32>
  scf.reduce(%elem_to_reduce) : f32 {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}

Harbormaster completed remote builds in B234092: Diff 525023. May 23 2023, 11:41 PM

In D151287#4367113, @springerm wrote:

scf.parallel can have an scf.reduce terminator.
E.g.:

%init = arith.constant 0.0 : f32
scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init) -> f32 {
  %elem_to_reduce = load %buffer[%iv] : memref<100xf32>
  scf.reduce(%elem_to_reduce) : f32 {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}

scf.parallel has an implicit scf.yield terminator (https://github.com/llvm/llvm-project/blob/cf1ef4161006e8119761b3a137423c23436bcf33/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td#L812).
And scf.reduce does not have the terminator trait (https://github.com/llvm/llvm-project/blob/cf1ef4161006e8119761b3a137423c23436bcf33/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td#L900).

The actual result of scf.parallel is generated by scf.reduce, but the terminator of scf.parallel yields nothing. So we can prevent value propagation without registering RegionBranchTerminatorOpInterface on the terminator.
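
As a rough illustration of the point above, here is a toy C++ model of how a region-branch analysis pairs terminator operands with successor inputs. It is only a sketch and does not use MLIR: `ToyTerminator` and `forwardedValues` are hypothetical names invented for this example, and values are represented as plain strings. The sketch assumes a one-to-one match between forwarded operands and successor inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a region terminator; not an MLIR class.
struct ToyTerminator {
  // Names of the values the terminator forwards to its successor.
  std::vector<std::string> operands;
};

// Pairs each forwarded operand with the successor input at the same position.
// Each returned pair is one edge along which an analysis would propagate.
std::vector<std::pair<std::string, std::string>>
forwardedValues(const ToyTerminator &terminator,
                const std::vector<std::string> &successorInputs) {
  // Assumption of this toy model: operand and input counts must match.
  assert(terminator.operands.size() == successorInputs.size() &&
         "operand and successor input counts must match");
  std::vector<std::pair<std::string, std::string>> pairs;
  for (std::size_t i = 0; i < terminator.operands.size(); ++i)
    pairs.emplace_back(terminator.operands[i], successorInputs[i]);
  return pairs;
}
```

In this model, a terminator with no operands (like the implicit terminators discussed here) produces an empty edge list, so nothing flows from the body to the parent through the terminator.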

springerm accepted this revision. May 24 2023, 12:35 AM

This revision is now accepted and ready to land. May 24 2023, 12:35 AM

@springerm, could you please help me push this patch to master?

Closed by commit rGe8bfec26a6da: [mlir] [scf] Add RegionBranchOpInterface to scf.forall and scf.parallel op (authored by cxy-1993, committed by springerm). May 24 2023, 3:15 AM

This revision was automatically updated to reflect the committed changes.

springerm added a commit: rGe8bfec26a6da: [mlir] [scf] Add RegionBranchOpInterface to scf.forall and scf.parallel op.

Revision Contents

Path                                          Size
mlir/include/mlir/Dialect/SCF/IR/SCFOps.td    2 lines
mlir/lib/Dialect/SCF/IR/SCF.cpp               39 lines

Diff 525001

mlir/include/mlir/Dialect/SCF/IR/SCFOps.td

 ...
 // ForallOp
 //===----------------------------------------------------------------------===//

 def ForallOp : SCF_Op<"forall", [
        AttrSizedOperandSegments,
        AutomaticAllocationScope,
        RecursiveMemoryEffects,
        SingleBlockImplicitTerminator<"scf::InParallelOp">,
+       DeclareOpInterfaceMethods<RegionBranchOpInterface>,
      ]> {
   let summary = "evaluate a block multiple times in parallel";
   let description = [{
     `scf.forall` is a target-independent multi-dimensional parallel
     region application operation. It has exactly one block that represents the
     parallel body and it takes index operands that specify lower bounds, upper
     bounds and steps.

 ...
 // ParallelOp
 //===----------------------------------------------------------------------===//

 def ParallelOp : SCF_Op<"parallel",
     [AutomaticAllocationScope,
      AttrSizedOperandSegments,
      DeclareOpInterfaceMethods<LoopLikeOpInterface>,
      RecursiveMemoryEffects,
+     DeclareOpInterfaceMethods<RegionBranchOpInterface>,
      SingleBlockImplicitTerminator<"scf::YieldOp">]> {
   let summary = "parallel for operation";
   let description = [{
     The "scf.parallel" operation represents a loop nest taking 4 groups of SSA
     values as operands that represent the lower bounds, upper bounds, steps and
     initial values, respectively. The operation defines a variadic number of
     SSA values for its induction variables. It has one region capturing the
     loop body. The induction variables are represented as an argument of this
 ...

mlir/lib/Dialect/SCF/IR/SCF.cpp

 ...

 void ForallOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                            MLIRContext *context) {
   results.add<DimOfForallOp, FoldTensorCastOfOutputIntoForallOp,
               ForallOpControlOperandsFolder,
               ForallOpSingleOrZeroIterationDimsFolder>(context);
 }

+/// Given the region at `index`, or the parent operation if `index` is None,
+/// return the successor regions. These are the regions that may be selected
+/// during the flow of control. `operands` is a set of optional attributes that
+/// correspond to a constant value for each operand, or null if that operand is
+/// not a constant.
+void ForallOp::getSuccessorRegions(std::optional<unsigned> index,
+                                   ArrayRef<Attribute> operands,
+                                   SmallVectorImpl<RegionSuccessor> &regions) {
+  // If the predecessor is ForallOp, branch into the body with empty arguments.
+  if (!index) {
+    regions.push_back(RegionSuccessor(&getRegion()));
+    return;
+  }
+
+  // Otherwise, the loop should branch back to the parent operation.
+  assert(*index == 0 && "expected loop region");
+  regions.push_back(RegionSuccessor());
+}
+
 //===----------------------------------------------------------------------===//
 // InParallelOp
 //===----------------------------------------------------------------------===//

 // Build a InParallelOp with mixed static and dynamic entries.
 void InParallelOp::build(OpBuilder &b, OperationState &result) {
   OpBuilder::InsertionGuard g(b);
   Region *bodyRegion = result.addRegion();
 ...

 void ParallelOp::getCanonicalizationPatterns(RewritePatternSet &results,
                                              MLIRContext *context) {
   results
       .add<ParallelOpSingleOrZeroIterationDimsFolder, MergeNestedParallelLoops>(
           context);
 }

+/// Given the region at `index`, or the parent operation if `index` is None,
+/// return the successor regions. These are the regions that may be selected
+/// during the flow of control. `operands` is a set of optional attributes that
+/// correspond to a constant value for each operand, or null if that operand is
+/// not a constant.
+void ParallelOp::getSuccessorRegions(std::optional<unsigned> index,
+                                     ArrayRef<Attribute> operands,
+                                     SmallVectorImpl<RegionSuccessor> &regions) {
+  // If the predecessor is ParallelOp, branch into the body with empty
+  // arguments.
+  if (!index) {
+    regions.push_back(RegionSuccessor(&getRegion()));
+    return;
+  }
+
+  assert(*index == 0 && "expected loop region");
+  // Otherwise, the loop should branch back to the parent operation.
+  regions.push_back(RegionSuccessor());
+}
+
 //===----------------------------------------------------------------------===//
 // ReduceOp
 //===----------------------------------------------------------------------===//

 void ReduceOp::build(
     OpBuilder &builder, OperationState &result, Value operand,
     function_ref<void(OpBuilder &, Location, Value, Value)> bodyBuilderFn) {
   auto type = operand.getType();
 ...
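
The two cases handled by the `getSuccessorRegions` implementations in this diff can be mimicked with a small standalone C++ sketch. This is a toy model under stated assumptions, not MLIR code: `ToySuccessor` and `getToySuccessors` are hypothetical names, and a successor is reduced to an optional region index, where an empty index stands for "return to the parent op".

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for mlir::RegionSuccessor; not the MLIR class.
// `region` is the index of the region to enter, or empty when control
// returns to the parent operation.
struct ToySuccessor {
  std::optional<unsigned> region;
  bool isParent() const { return !region.has_value(); }
};

// Models an op with exactly one body region (index 0), following the two
// branches of the patch: parent -> body, and body -> parent.
std::vector<ToySuccessor> getToySuccessors(std::optional<unsigned> index) {
  std::vector<ToySuccessor> successors;
  if (!index) {
    // The predecessor is the op itself: enter the single body region.
    successors.push_back(ToySuccessor{0u});
    return successors;
  }
  // The predecessor is the body: branch back to the parent operation.
  assert(*index == 0 && "expected loop region");
  successors.push_back(ToySuccessor{std::nullopt});
  return successors;
}
```

Calling `getToySuccessors(std::nullopt)` yields a single successor pointing at region 0, and `getToySuccessors(0u)` yields a single successor for which `isParent()` is true. As in the diff, the body region's only successor is the parent.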