
Details

Reviewers

mravishankar
rriddle
nicolasvasilache
mehdi_amini

Summary

-cse does not eliminate common sub-expressions that appear on both branches of an scf.if op. Such IR is sometimes produced by the bufferization (e.g., duplicate memref.subview ops). CSE'ing can enable additional optimizations such as the removal of self-copies.

This revision adds a new pass to CSE ops within scf.if branches.

Depends On: D143253

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

springerm created this revision. Jan 24 2023, 8:29 AM

Herald added a project: Restricted Project. Jan 24 2023, 8:29 AM

Herald added subscribers: Moerafaat, bzcheeseman, sdasgup3 and 19 others.

springerm requested review of this revision. Jan 24 2023, 8:29 AM

Herald added a project: Restricted Project. Jan 24 2023, 8:29 AM

Herald added a subscriber: stephenneuendorffer.

This is awesome! Out of curiosity, is the intent for this to be run as part of a fixed-point iteration in order to handle nested scf.ifs?

In D142480#4077492, @benvanik wrote:

This is awesome! Out of curiosity, is the intent for this to be run as part of a fixed-point iteration in order to handle nested scf.ifs?

void SCFCSEBranches::runOnOperation() {
  getOperation()->walk([](IfOp ifOp) {

This walk is post-order, so a single run of the pass should be sufficient to handle nested scf.ifs. But it may be beneficial to run this pass multiple times interleaved with the normal -cse pass because this pass only hoists+CSE's direct children ops of scf.if.
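To illustrate why a post-order walk covers nested scf.if ops in a single run, here is a minimal, self-contained sketch. It does not use MLIR; `ToyOp`, `walkPostOrder`, `ifVisitOrder`, and `makeNestedExample` are hypothetical names made up for this illustration. The point is only that nested ops are visited before their parent, so anything hoisted out of an inner if is already sitting in the outer branch by the time the outer if is processed.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an op with nested ops (NOT the MLIR Operation class).
struct ToyOp {
  std::string name;
  std::vector<std::unique_ptr<ToyOp>> nested;
};

// Post-order walk: visit all nested ops first, then the op itself.
template <typename Fn>
void walkPostOrder(ToyOp &op, Fn &&visit) {
  for (auto &child : op.nested)
    walkPostOrder(*child, visit);
  visit(op);
}

// Returns the names of all "if" ops in the order the walk visits them.
std::vector<std::string> ifVisitOrder(ToyOp &root) {
  std::vector<std::string> order;
  walkPostOrder(root, [&](ToyOp &op) {
    // Ops whose name starts with "if" play the role of scf.if here.
    if (op.name.rfind("if", 0) == 0)
      order.push_back(op.name);
  });
  return order;
}

// Builds: func { if.outer { if.inner { } } }
std::unique_ptr<ToyOp> makeNestedExample() {
  auto inner = std::make_unique<ToyOp>();
  inner->name = "if.inner";
  auto outer = std::make_unique<ToyOp>();
  outer->name = "if.outer";
  outer->nested.push_back(std::move(inner));
  auto func = std::make_unique<ToyOp>();
  func->name = "func";
  func->nested.push_back(std::move(outer));
  return func;
}
```

Calling `ifVisitOrder(*makeNestedExample())` visits `if.inner` before `if.outer`. Ops that only become hoistable after the generic -cse pass has run are not covered by this ordering, which is why interleaving with -cse may still help.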

Could we just use the hoisting part here? You can just hoist operations outside of the if and let standard CSE common out the users... This is duplicating a lot of functionality.

This revision now requires changes to proceed. Jan 24 2023, 9:25 AM

In D142480#4077671, @mravishankar wrote:

Could we just use the hoisting part here? You can just hoist operations outside of the if and let standard CSE common out the users... This is duplicating a lot of functionality.

We could, but this allows us to CSE irrespective of op speculatability, and it has no effect on performance (because CSE'd ops are executed in either branch).

I'm going to address the TODO tomorrow and reuse isEqual() from the other CSE.cpp, then there's no code duplication anymore.

In D142480#4077680, @springerm wrote:

In D142480#4077671, @mravishankar wrote:

Could we just use the hoisting part here? You can just hoist operations outside of the if and let standard CSE common out the users... This is duplicating a lot of functionality.

We could, but this allows us to CSE irrespective of op speculatability, and it has no effect on performance (because CSE'd ops are executed in either branch).

You are CSE-ing and hoisting... that's already within the purview of treating it as a speculatable op. Doing this on non-speculatable ops would be illegal AFAIK.

I'm going to address the TODO tomorrow and reuse isEqual() from the other CSE.cpp, then there's no code duplication anymore.

You are CSE-ing and hoisting... that's already within the purview of treating it as a speculatable op. Doing this on non-speculatable ops would be illegal AFAIK.

That's why this pass is specific to scf.if. It does not hoist or CSE non-speculatable ops from other ops. That should be legal.

In D142480#4077799, @springerm wrote:

You are CSE-ing and hoisting... that's already within the purview of treating it as a speculatable op. Doing this on non-speculatable ops would be illegal AFAIK.

That's why this pass is specific to scf.if. It does not hoist or CSE non-speculatable ops from other ops. That should be legal.

Then I am confused. If it is speculatable only, then you can just hoist.. Then CSE will handle the equivalence and replacing uses.... Not sure what I am missing....

Harbormaster completed remote builds in B209687: Diff 491819. Jan 24 2023, 11:25 AM

In D142480#4077817, @mravishankar wrote:

Then I am confused. If it is speculatable only, then you can just hoist.. Then CSE will handle the equivalence and replacing uses.... Not sure what I am missing....

What I meant:

%r = scf.if ... {
  "non_speculatable_op"()
  ...
} else {
  "non_speculatable_op"()
  ...
}

==> /* This transformation does not affect performance. */

"non_speculatable_op"()
%r = scf.if ... {
  ...
} else {
  ...
}

Hoisting and CSE'ing "non_speculatable_op" from both branches is OK. Any op, for that matter. Just hoisting from the "then" branch is not OK; that's only allowed if the op is pure and speculatable.

My main concerns about "just hoisting" are performance implications. Should we hoist a tensor.insert_slice from an scf.if? That's a buffer copy after bufferization, so it can lead to a performance regression. We could go even further: Hoist the entire "then" and "else" bodies (assuming they are all pure+speculatable ops; usually the case in tensor IR), so that only the terminators are left over; scf.if will become an arith.select. That's probably not desirable.

So we would need to define a set of "hoistable" ops: ops that we can hoist without causing UB (speculatable) or performance regressions. The latter condition is currently not modeled in MLIR. Sure, I could write a pattern to just hoist memref.subview for now, but we will probably need to hoist additional ops soon (e.g., memref.cast) and I'm thinking of a more general solution here.

Share operation equivalence check with CSE.cpp

This change to CSE doesn't seem ideal to me in general: it's leaking the internal details of what CSE is looking for, which may change in the future, in which case we'd have to propagate that to the hoisting pass being added here.

This revision now requires changes to proceed. Jan 25 2023, 1:08 AM

In D142480#4079266, @rriddle wrote:

This change to CSE doesn't seem that ideal to me in general, it's leaking the internal details of what CSE is looking for -- which may change in the future, in which case we'd have to propagate that to the hoisting pass being added here.

How about extending OperationEquivalence in OperationSupport.h so that it also takes into account the contents of nested regions? (E.g., adding a new mlir::isEquivalent(Operation *, Operation *) helper to that file. It would have to be a bit smarter than the check in this file; i.e., support >1 regions/blocks.) Then we don't need a custom isEqual check in either CSE pass.

Harbormaster completed remote builds in B209811: Diff 492016. Jan 25 2023, 1:18 AM

In D142480#4079278, @springerm wrote:

In D142480#4079266, @rriddle wrote:

This change to CSE doesn't seem that ideal to me in general, it's leaking the internal details of what CSE is looking for -- which may change in the future, in which case we'd have to propagate that to the hoisting pass being added here.

How about extending OperationEquivalence in OperationSupport.h so that it also takes into account the contents of nested regions? (E.g., adding a new mlir::isEquivalent(Operation *, Operation *) helper to that file. It would have to be a bit smarter than the check in this file; i.e., support >1 regions/blocks.) Then we don't need a custom isEqual check in either CSE pass.

OperationEquivalence::isEquivalentTo already checks regions, but corresponding bbArgs of two ops are not considered "equivalent" by default (unless they are added to mapOperands). E.g.:

%0 = tensor.generate %size {
  ^bb0(%arg0: index):
  ...
} : tensor<?xf32>

%1 = tensor.generate %size {
  ^bb0(%arg1: index):
  ...
} : tensor<?xf32>

%arg0 and %arg1 are not equivalent by default. They must be marked as such in mapOperands. That's why OperationEquivalence::isEquivalentTo cannot be directly used in CSE.
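The role of the block-argument mapping can be sketched with a small toy model. This is not the MLIR API: `ToyOp`, `ToyBody`, `bodiesEquivalent`, and `makeGenerateBody` are hypothetical names, values are plain strings, and op results are not modeled. It only shows that two structurally identical bodies compare unequal until their block arguments are explicitly mapped onto each other.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy op: a name plus the names of the values it uses.
struct ToyOp {
  std::string name;
  std::vector<std::string> operands;
};

// Toy single-block region body: block arguments plus ops.
struct ToyBody {
  std::vector<std::string> blockArgs;
  std::vector<ToyOp> ops;
};

// Compares two bodies op by op. If `mapBlockArgs` is set, the i-th block
// argument of `a` is treated as equivalent to the i-th block argument of `b`;
// otherwise operands must have identical names.
bool bodiesEquivalent(const ToyBody &a, const ToyBody &b, bool mapBlockArgs) {
  if (a.blockArgs.size() != b.blockArgs.size() || a.ops.size() != b.ops.size())
    return false;
  std::map<std::string, std::string> mapping;
  if (mapBlockArgs)
    for (std::size_t i = 0; i < a.blockArgs.size(); ++i)
      mapping[a.blockArgs[i]] = b.blockArgs[i];
  for (std::size_t i = 0; i < a.ops.size(); ++i) {
    const ToyOp &lhs = a.ops[i];
    const ToyOp &rhs = b.ops[i];
    if (lhs.name != rhs.name || lhs.operands.size() != rhs.operands.size())
      return false;
    for (std::size_t j = 0; j < lhs.operands.size(); ++j) {
      std::string expected = lhs.operands[j];
      auto it = mapping.find(expected);
      if (it != mapping.end())
        expected = it->second; // Translate through the block-arg mapping.
      if (expected != rhs.operands[j])
        return false;
    }
  }
  return true;
}

// Body with one block argument `arg` and one op that uses it.
ToyBody makeGenerateBody(const std::string &arg) {
  return ToyBody{{arg}, {ToyOp{"arith.index_cast", {arg}}}};
}
```

With bodies built from `%arg0` and `%arg1`, `bodiesEquivalent` returns false without the mapping and true with it, which mirrors the tensor.generate example above.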

address comments

springerm edited the summary of this revision. Jan 25 2023, 8:55 AM

springerm added a parent revision: D142558: [mlir][transforms] Simplify OperationEquivalence and CSE.

Harbormaster completed remote builds in B209892: Diff 492139. Jan 25 2023, 9:59 AM

In D142480#4079206, @springerm wrote:
In D142480#4077817, @mravishankar wrote:

Then I am confused. If it is speculatable only, then you can just hoist.. Then CSE will handle the equivalence and replacing uses.... Not sure what I am missing....

What I meant:

%r = scf.if ... {
  "non_speculatable_op"()
  ...
} else {
  "non_speculatable_op"()
  ...
}

==> /* This transformation does not affect performance. */

"non_speculatable_op"()
%r = scf.if ... {
  ...
} else {
  ...
}

Hoisting and CSE'ing "non_speculatable_op" from both branches is OK. Any op, for that matter. Just hoisting from the "then" branch is not OK; that's only allowed if the op is pure and speculatable.

My main concerns about "just hoisting" are performance implications. Should we hoist a tensor.insert_slice from an scf.if? That's a buffer copy after bufferization, so it can lead to a performance regression. We could go even further: Hoist the entire "then" and "else" bodies (assuming they are all pure+speculatable ops; usually the case in tensor IR), so that only the terminators are left over; scf.if will become an arith.select. That's probably not desirable.

That seems to be mixing concerns here. If a tensor.insert_slice can be hoisted, bufferization should be immune to that (I'd think it would even be better, but I don't know the example that you have in mind). Hoisting entire then and else branches and ending up with a select seems to be a good thing in general.

So we would need to define a set of "hoistable" ops: ops that we can hoist without causing UB (speculatable) or performance regressions. The latter condition is currently not modeled in MLIR. Sure, I could write a pattern to just hoist memref.subview for now, but we will probably need to hoist additional ops soon (e.g., memref.cast) and I'm thinking of a more general solution here.

I don't fully understand this. Hoisting should always be better since the dependence chain you track doesn't need to go through conditionals. So I am not able to see the regression aspect... Hoisting can also reduce lifetimes of values...

Overall, I'd like some more input on these. Maybe @rriddle, since he is already looking into it... I don't have very strong concerns, but this seems like a specific solution when it doesn't need to be.

Herald added a subscriber: thopre. Jan 25 2023, 10:07 PM

If a tensor.insert_slice can be hoisted, bufferization should be immune to that (I'd think it would even be better, but I don't know the example that you have in mind). Hoisting entire then and else branches and ending up with a select seems to be a good thing in general.

Yes it is actually better for the bufferization — a flat structure is easier to bufferize than one with nested blocks; fewer things can go wrong.

But hoisting things also means that ops that were previously inside a branch and executed 50% of the time are now executed 100% of the time. (After bufferization, pure tensor ops have a side effect and therefore a cost.)

In D142480#4081907, @springerm wrote:

If a tensor.insert_slice can be hoisted, bufferization should be immune to that (I'd think it would even be better, but I don't know the example that you have in mind). Hoisting entire then and else branches and ending up with a select seems to be a good thing in general.

Yes it is actually better for the bufferization — a flat structure is easier to bufferize than one with nested blocks; fewer things can go wrong.

But hoisting things also means that ops that were previously inside a branch and executed 50% of the time are now executed 100% of the time. (After bufferization, pure tensor ops have a side effect and therefore a cost.)

IMO that is just how tensor-based code generation works... You cannot rely on that not being hoisted, unless there is some other mechanism to stop it from hoisting (either move it into a region that is not isolated from above, or maybe we need a way to say "don't move any op outside of this region"). But like I said, I can live with what you are doing here... but I would really hope someone else can weigh in if they have a more informed opinion.

@rriddle This is no longer leaking internal details of CSE. Does this look good to you?

Hardcode84 added a subscriber: Hardcode84. Jan 31 2023, 8:33 AM

Hardcode84 added inline comments.

mlir/lib/Dialect/SCF/Transforms/CSE.cpp
Line 35: I think the first check, `ifOp.getThenRegion().hasOneBlock()`, is useless and will always succeed.
Line 44: I think `llvm::make_early_inc_range(*thenBlock)` will work.

Hardcode84 added inline comments. Jan 31 2023, 8:36 AM

mlir/lib/Dialect/SCF/Transforms/CSE.cpp
Line 59: This algorithm is quadratic as it compares each op from the `then` block with each op from the `else` block. Is this OK for MLIR? It can probably be optimized by putting ops from the `then` block into a hash map first and using it while traversing the `else` block.
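The hash-map idea from this comment could look roughly like the following self-contained sketch. It is a toy model, not MLIR code: `ToyOp`, `signature`, and `matchBranches` are hypothetical names, and a string signature stands in for a proper operation hash. It also leaves out the side-effect ordering and "uses values defined in this block" checks that the pass performs, so it only illustrates replacing the nested loop with one linear pass per branch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy op: a name plus the names of the values it uses.
struct ToyOp {
  std::string name;
  std::vector<std::string> operands;
};

// String key that is equal exactly when name and operands are equal.
std::string signature(const ToyOp &op) {
  std::string key = op.name + "(";
  for (const auto &operand : op.operands)
    key += operand + ",";
  return key + ")";
}

// Pairs every "else" op with the first equivalent "then" op, if any.
// Returns (thenIndex, elseIndex) pairs in "else" order.
std::vector<std::pair<std::size_t, std::size_t>>
matchBranches(const std::vector<ToyOp> &thenOps,
              const std::vector<ToyOp> &elseOps) {
  // One pass over the "then" branch; emplace keeps the first occurrence.
  std::unordered_map<std::string, std::size_t> firstThenIndex;
  for (std::size_t i = 0; i < thenOps.size(); ++i)
    firstThenIndex.emplace(signature(thenOps[i]), i);

  // One pass over the "else" branch with average O(1) lookups.
  std::vector<std::pair<std::size_t, std::size_t>> matches;
  for (std::size_t j = 0; j < elseOps.size(); ++j) {
    auto it = firstThenIndex.find(signature(elseOps[j]));
    if (it != firstThenIndex.end())
      matches.emplace_back(it->second, j);
  }
  return matches;
}
```

The number of equivalence checks drops from the product of the two branch lengths to roughly their sum. Whether the side-effect bookkeeping of the pass can be kept correct with this traversal order is not addressed by the sketch.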

address comments

springerm edited the summary of this revision. Feb 3 2023, 2:24 AM

springerm edited parent revisions, added: D143253: [mlir][SCF] Disallow multiple blocks in scf.if "else" region; removed: D142558: [mlir][transforms] Simplify OperationEquivalence and CSE.

springerm marked 2 inline comments as done.

springerm added inline comments.

mlir/lib/Dialect/SCF/Transforms/CSE.cpp
Line 35: I wasn't sure because the "else" block used to allow multiple blocks. I think this was an oversight, addressed in D143253.

springerm added a reviewer: mehdi_amini. Feb 3 2023, 2:26 AM

Harbormaster completed remote builds in B211675: Diff 494563. Feb 3 2023, 2:59 AM

Hardcode84 added inline comments. Feb 3 2023, 3:16 AM

mlir/lib/Dialect/SCF/Transforms/CSE.cpp
Line 35: The check for the else region is still needed, but it can be `!elseRegion.empty()` or `ifOp.getNumResults() != 0`. Also, please add a test for scf.if without else, just to check we don't crash on it.

address comments

springerm marked an inline comment as done. Feb 3 2023, 3:25 AM

Harbormaster completed remote builds in B211690: Diff 494582. Feb 3 2023, 3:52 AM

Couldn't CSE be extended to support the BranchOpInterface and work on scf.if the same way it works on CFG?

mlir/include/mlir/Dialect/SCF/Transforms/Passes.h
Line 26: Typo: `eliminates`

In D142480#4102856, @mehdi_amini wrote:

Couldn't CSE be extended to support the BranchOpInterface and work on scf.if the same way it works on CFG?

This CSE variant is a bit different from Transforms/CSE.cpp in the sense that even ops with side effects can be CSE'd if they appear on both branches. So far so good, it should be possible to do this as part of CSE::simplifyOperation.

However, I think the RegionBranchOpInterface is not powerful enough. From the documentation of getSuccessorRegions:

These are the regions that may be selected during the flow of control.

The problem here is "may be". By querying this interface, we cannot deduce that either the "then" branch or the "else" branch will be executed. getSuccessorRegions will return both regions, but it is possible that neither region is executed.

Diff 492139

mlir/include/mlir/Dialect/SCF/Transforms/Passes.h

[... 17 lines hidden ...]
  namespace mlir {

  #define GEN_PASS_DECL
  #include "mlir/Dialect/SCF/Transforms/Passes.h.inc"

  /// Creates a pass that bufferizes the SCF dialect.
  std::unique_ptr<Pass> createSCFBufferizePass();

+ /// Creates a pass that hoists and elimiantes common sub-expressions from
[Inline comment, mehdi_amini (Not Done)] Typo: `eliminates`
+ /// scf.if branches.
+ std::unique_ptr<Pass> createSCFCSEBranchesPass();
+
  /// Creates a pass that specializes for loop for unrolling and
  /// vectorization.
  std::unique_ptr<Pass> createForLoopSpecializationPass();

  /// Creates a pass that peels for loops at their upper bounds for
  /// better vectorization.
  std::unique_ptr<Pass> createForLoopPeelingPass();

[... 42 lines hidden ...]

mlir/include/mlir/Dialect/SCF/Transforms/Passes.td

[... 12 lines hidden ...]

  def SCFBufferize : Pass<"scf-bufferize"> {
    let summary = "Bufferize the scf dialect.";
    let constructor = "mlir::createSCFBufferizePass()";
    let dependentDialects = ["bufferization::BufferizationDialect",
                             "memref::MemRefDialect"];
  }

+ def SCFCSEBranches : Pass<"scf-cse-branches"> {
+   let summary = "Hoist and eliminate common sub-expressions in scf.if branches";
+   let description = [{
+     This pass eliminates common sub-expressions inside the "then" and the "else"
+     branches of scf.if ops. This pass relies on information provided by the
+     MemoryEffectOpInterface to identify when it is safe to eliminate operations.
+
+     Note: This pass does not eliminate duplicate operations outside of scf.if
+     ops. The `-cse` pass can be used in such cases.
+   }];
+   let constructor = "mlir::createSCFCSEBranchesPass()";
+ }
+
  // Note: Making these canonicalization patterns would require a dependency
  // of the SCF dialect on the Affine/Tensor/MemRef dialects or vice versa.
  def SCFForLoopCanonicalization
      : Pass<"scf-for-loop-canonicalization"> {
    let summary = "Canonicalize operations within scf.for loop bodies";
    let constructor = "mlir::createSCFForLoopCanonicalizationPass()";
    let dependentDialects = ["AffineDialect", "tensor::TensorDialect",
                             "memref::MemRefDialect"];
[... 97 lines hidden ...]

mlir/lib/Dialect/SCF/Transforms/CMakeLists.txt

  add_mlir_dialect_library(MLIRSCFTransforms
    BufferizableOpInterfaceImpl.cpp
    Bufferize.cpp
+   CSE.cpp
    ForToWhile.cpp
    LoopCanonicalization.cpp
    LoopPipelining.cpp
    LoopRangeFolding.cpp
    LoopSpecialization.cpp
    ParallelLoopCollapsing.cpp
    ParallelLoopFusion.cpp
    ParallelLoopTiling.cpp
[... 30 lines hidden ...]

mlir/lib/Dialect/SCF/Transforms/CSE.cpp

This file was added.

#include "mlir/Dialect/SCF/Transforms/Passes.h"

#include "mlir/Dialect/SCF/IR/SCF.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"

namespace mlir {
#define GEN_PASS_DEF_SCFCSEBRANCHES
#include "mlir/Dialect/SCF/Transforms/Passes.h.inc"
} // namespace mlir

using namespace mlir;
using namespace mlir::scf;

/// Return `true` if `op` uses an OpResult defined inside `block`.
static bool usesValuesDefinedInBlock(Operation *op, Block *block) {
  WalkResult status = op->walk([&](Operation *op) {
    if (llvm::any_of(op->getOperands(), [&](Value v) {
          return v.isa<OpResult>() && v.getDefiningOp()->getBlock() == block;
        }))
      return WalkResult::interrupt();
    return WalkResult::advance();
  });
  return status.wasInterrupted();
}

namespace {
struct SCFCSEBranches : public impl::SCFCSEBranchesBase<SCFCSEBranches> {
  void runOnOperation() override;
};
} // namespace

void SCFCSEBranches::runOnOperation() {
  getOperation()->walk([](IfOp ifOp) {
    // Only scf.if ops with a single block are supported.
    if (!ifOp.getThenRegion().hasOneBlock() ||
[Inline comment, Hardcode84 (Done)] I think first check `ifOp.getThenRegion().hasOneBlock()` is useless and will always succeed.
[Inline comment, springerm, author (Done)] I wasn't sure because the "else" block used to allow multiple blocks. I think this was an oversight, addressed in D143253.
[Inline comment, Hardcode84 (Done)] Check for else region is still needed, but it can be `!elseRegion.empty()` or `IfOp.getNumResults() != 0`, also pls add test for scf.if without else just to check we don't crash on it.
        !ifOp.getElseRegion().hasOneBlock())
      return;
    Block *thenBlock = &ifOp.getThenRegion().front();
    Block *elseBlock = &ifOp.getElseRegion().front();

    // Indicates if there is an impure op in the "then" branch before `thenOp`.
    bool thenImpureOpFound = false;

    for (auto &thenOp : llvm::make_early_inc_range(
[Inline comment, Hardcode84 (Done)] I think `llvm::make_early_inc_range(*thenBlock)` will work.
             llvm::make_range(thenBlock->begin(), thenBlock->end()))) {
      // Terminators and ops that use values defined in this block are skipped.
      if (isa<scf::YieldOp>(&thenOp) ||
          usesValuesDefinedInBlock(&thenOp, thenBlock)) {
        thenImpureOpFound |= !isPure(&thenOp);
        continue;
      }

      // Set to true if `thenOp` was hoisted.
      bool hoisted = false;
      // Indicates if there is an impure op in the "else" branch before
      // `elseOp`.
      bool elseImpureOpFound = false;

      for (auto &elseOp : llvm::make_early_inc_range(
[Inline comment, Hardcode84 (Not Done)] This algorithm is quadratic as it compares each op from `then` block with each op from `else` block. Is this ok for MLIR? It probably can be optimized by putting ops from `then` block into hashmap first and using it while traversing `else` block.
               llvm::make_range(elseBlock->begin(), elseBlock->end()))) {
        // Do not CSE if:
        // 1. `thenOp` and `elseOp` are not equivalent, or
        bool equalOps = OperationEquivalence::isEquivalentTo(
            &thenOp, &elseOp, OperationEquivalence::Flags::IgnoreLocations);
        // 2. CSE'ing would change side effects. If the ops are pure, there then
        //    there can be no change in side effects. Otherwise, hoist and CSE
        //    them only if there is no side-effecting (impure) op in the branch
        //    before `thenOp` or `elseOp`.
        bool sideEffectViolation =
            !isPure(&thenOp) && (thenImpureOpFound || elseImpureOpFound);
        if (!equalOps || sideEffectViolation) {
          elseImpureOpFound |= !isPure(&elseOp);
          continue;
        }

        // There may be multiple matches for `thenOp` in the "else" branch.
        // Hoist `thenOp` only once.
        if (!hoisted) {
          thenOp.moveBefore(ifOp);
          hoisted = true;
        }

        // Erase duplicate op in the "else" branch.
        elseOp.replaceAllUsesWith(thenOp.getResults());
        elseOp.erase();
      }

      if (!hoisted) {
        thenImpureOpFound |= !isPure(&thenOp);
      }
    }
  });
}

std::unique_ptr<Pass> mlir::createSCFCSEBranchesPass() {
  return std::make_unique<SCFCSEBranches>();
}
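
The side-effect rule in the loop above can be tried out on a toy model. The following sketch is not MLIR code and is not part of the patch: `ToyOp` and `hoistCommonOps` are hypothetical names, ops are identified by name only, and the "uses values defined in this block" check is omitted because the toy ops have no operands. It mirrors the two flags of the pass: an impure op is hoisted only if no impure op precedes it in either branch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy op: identified by name; `pure` means "has no side effects".
struct ToyOp {
  std::string name;
  bool pure;
};

// Returns the names of the ops that would be hoisted in front of the "if".
// Both branches are taken by value, so the caller's vectors stay untouched.
std::vector<std::string> hoistCommonOps(std::vector<ToyOp> thenOps,
                                        std::vector<ToyOp> elseOps) {
  std::vector<std::string> hoistedNames;
  // True once an impure op that stays in the "then" branch has been seen.
  bool thenImpureOpFound = false;
  for (const ToyOp &thenOp : thenOps) {
    bool hoisted = false;
    // True once an impure op that stays in the "else" branch has been seen.
    bool elseImpureOpFound = false;
    for (std::size_t j = 0; j < elseOps.size();) {
      const ToyOp &elseOp = elseOps[j];
      bool equalOps = thenOp.name == elseOp.name && thenOp.pure == elseOp.pure;
      bool sideEffectViolation =
          !thenOp.pure && (thenImpureOpFound || elseImpureOpFound);
      if (!equalOps || sideEffectViolation) {
        elseImpureOpFound = elseImpureOpFound || !elseOp.pure;
        ++j;
        continue;
      }
      // Match: hoist once and erase the duplicate from the "else" branch.
      hoisted = true;
      elseOps.erase(elseOps.begin() + static_cast<std::ptrdiff_t>(j));
    }
    if (hoisted)
      hoistedNames.push_back(thenOp.name);
    else
      thenImpureOpFound = thenImpureOpFound || !thenOp.pure;
  }
  return hoistedNames;
}
```

Feeding it the op sequence of the test case below (with all "test." ops marked impure and the constant marked pure) hoists `dummy_a`, `dummy_b`, and `c5`, and leaves `dummy_c` in both branches because the impure `other_side_effecting_op` precedes it in the "then" branch.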

mlir/test/Dialect/SCF/cse.mlir

This file was added.

// RUN: mlir-opt -split-input-file -scf-cse-branches %s | FileCheck %s

// CHECK-LABEL: func @cse_side_effecting_ops
//  CHECK-NEXT:   %[[a:.*]] = "test.dummy_a"
//  CHECK-NEXT:   %[[b:.*]] = "test.dummy_b"(%[[a]])
//  CHECK-NEXT:   %[[c5:.*]] = arith.constant 5 : index
//  CHECK-NEXT:   %[[if:.*]] = scf.if %{{.*}} {
//  CHECK-NEXT:     "test.other_side_effecting_op"
//  CHECK-NEXT:     %[[c0:.*]] = "test.dummy_c"(%[[b]], %[[c5]])
//  CHECK-NEXT:     scf.yield %[[c0]]
//  CHECK-NEXT:   } else {
//  CHECK-NEXT:     %[[c1:.*]] = "test.dummy_c"(%[[b]], %[[c5]])
//  CHECK-NEXT:     scf.yield %[[c1]]
//  CHECK-NEXT:   }
//  CHECK-NEXT:   return %[[if]]
func.func @cse_side_effecting_ops(%arg0: i1) -> f32 {
  %0 = scf.if %arg0 -> (f32) {
    // %1 is CSE'd because there is no preceding impure op.
    %1 = "test.dummy_a"() : () -> (f32)
    // %2 is CSE'd because there is no preceding impure op (after %1 was CSE'd).
    %2 = "test.dummy_b"(%1) : (f32) -> (f32)
    // Not CSE'd, this op does not exist in the "else" branch.
    "test.other_side_effecting_op"() : () -> ()
    // %c5 is CSE'd because it is pure and does not depend on values in this
    // block.
    %c5 = arith.constant 5 : index
    // %3 is not CSE'd because it is impure and there is a preceding impure op
    // in the "then" branch.
    %3 = "test.dummy_c"(%2, %c5) : (f32, index) -> (f32)
    // Terminator is not hoisted.
    scf.yield %3 : f32
  } else {
    %1 = "test.dummy_a"() : () -> (f32)
    %2 = "test.dummy_b"(%1) : (f32) -> (f32)
    %c5 = arith.constant 5 : index
    %3 = "test.dummy_c"(%2, %c5) : (f32, index) -> (f32)
    scf.yield %3 : f32
  }
  return %0 : f32
}

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][SCF] CSE and hoist operations in scf.if branches
Needs Review · Public
