This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Transforms][NFC] CSE: Add C++ entry point
ClosedPublic

Authored by springerm on Mar 3 2023, 3:04 AM.

Details

Summary
  • All IR modifications are done with a rewriter.
  • The new C++ entry point takes a RewriterBase &, which may have a listener attached to it.

This revision is useful because it allows users to run CSE and track IR modifications via a listener that can be attached to the rewriter.

This is a reupload. The original CL was reverted (9979417d4db4) due to a memory leak. The memory leak is unrelated to this change and fixed with D154185.

Depends On: D154185

Diff Detail

Event Timeline

springerm created this revision.Mar 3 2023, 3:04 AM
Herald added a project: Restricted Project. · View Herald Transcript
springerm requested review of this revision.Mar 3 2023, 3:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2023, 3:04 AM

I am not sure I understand the overall direction: is the goal to instrument every single transformation manually by injecting listeners that way?
That seems a bit ad-how to me and I am not convinced that we should just scale this approach in an uncoordinated fashion. I would like to have a more comprehensive idea of the general pattern we aim for.

This is also intersecting with the IR listener discussions on Discourse.

I am not sure I understand the overall direction: is the goal to instrument every single transformation manually by injecting listeners that way?
That seems a bit ad-how to me and I am not convinced that we should just scale this approach in an uncoordinated fashion. I would like to have a more comprehensive idea of the general pattern we aim for.

This is also intersecting with the IR listener discussions on Discourse.

Yes IR listeners would be ideal here. Then we wouldn't need any of this.

We currently maintain a copy of the CSE pass in IREE because we cannot get notifications for the transform dialect. This was the only transform I was going to add this to (apart from the GreedyPatternRewriteDriver which already has it). But it does leave some asymmetry because some transforms have a listener and other do not.

An alternative would be to give the eliminateCommonSubExpressions a RewriterBase & argument instead of a listener. The rewriter could already come with a listener attached (or not). Or we could just wait until IR listeners have landed... (There would still be some benefit of being able to run CSE without building a pass pipeline with a single pass and running the pipeline.)

springerm planned changes to this revision.Mar 3 2023, 4:26 AM
springerm updated this revision to Diff 502124.Mar 3 2023, 7:04 AM

remove config and listener

springerm retitled this revision from [mlir][Transforms][NFC] CSE: Add CSERewriteConfig and non-pass entry point to [mlir][Transforms][NFC] CSE: Add non-pass entry point.Mar 3 2023, 7:05 AM
springerm edited the summary of this revision. (Show Details)
springerm updated this revision to Diff 535758.Jun 29 2023, 6:04 AM
springerm edited the summary of this revision. (Show Details)

rebase

springerm edited the summary of this revision. (Show Details)Jun 29 2023, 6:08 AM
mehdi_amini accepted this revision.Jun 29 2023, 7:21 AM
mehdi_amini added inline comments.
mlir/lib/Transforms/CSE.cpp
410

Shouldn't this function takes an optional DominanceInfo here?
We could then call this from the runOnOperation()

This revision is now accepted and ready to land.Jun 29 2023, 7:21 AM
springerm marked an inline comment as done.Jun 29 2023, 7:35 AM
springerm added inline comments.
mlir/lib/Transforms/CSE.cpp
410

We still need the driver in runOnOperation to query stats, but I added a DominanceInfo parameter anyway, because the caller may already have one.

springerm updated this revision to Diff 535791.Jun 29 2023, 7:35 AM
springerm marked an inline comment as done.
springerm edited the summary of this revision. (Show Details)

address comments

This revision was landed with ongoing or failed builds.Jun 29 2023, 7:42 AM
This revision was automatically updated to reflect the committed changes.

I suspect that this change caused memory leak for an internal program.

	@ 0x7f5e80df3165 llvm::DenseMap<>::grow()
	@ 0x7f5e80dffbce llvm::DenseMapBase<>::InsertIntoBucket<>()
	@ 0x7f5e80ddf537 llvm::DominatorTreeBase<>::createNode()
	@ 0x7f5e80dec536 llvm::DomTreeBuilder::SemiNCAInfo<>::CalculateFromScratch()
	@ 0x7f5e80de5505 mlir::detail::DominanceInfoBase<>::getDominanceInfo()
	@ 0x7f5e88317572 (anonymous namespace)::CSEDriver::simplifyRegion()
	@ 0x7f5e88317a4e (anonymous namespace)::CSEDriver::simplifyBlock()
	@ 0x7f5e8831766f (anonymous namespace)::CSEDriver::simplifyRegion()
	@ 0x7f5e883171ee (anonymous namespace)::CSEDriver::simplify()
	@ 0x7f5e8831b8ed (anonymous namespace)::CSE::runOnOperation()
	@ 0x7f5e823657ca mlir::detail::OpToOpPassAdaptor::run()
	@ 0x7f5e82366028 mlir::detail::OpToOpPassAdaptor::runPipeline()
	@ 0x7f5e823682a4 mlir::PassManager::run()

I suspect that this change caused memory leak for an internal program.

	@ 0x7f5e80df3165 llvm::DenseMap<>::grow()
	@ 0x7f5e80dffbce llvm::DenseMapBase<>::InsertIntoBucket<>()
	@ 0x7f5e80ddf537 llvm::DominatorTreeBase<>::createNode()
	@ 0x7f5e80dec536 llvm::DomTreeBuilder::SemiNCAInfo<>::CalculateFromScratch()
	@ 0x7f5e80de5505 mlir::detail::DominanceInfoBase<>::getDominanceInfo()
	@ 0x7f5e88317572 (anonymous namespace)::CSEDriver::simplifyRegion()
	@ 0x7f5e88317a4e (anonymous namespace)::CSEDriver::simplifyBlock()
	@ 0x7f5e8831766f (anonymous namespace)::CSEDriver::simplifyRegion()
	@ 0x7f5e883171ee (anonymous namespace)::CSEDriver::simplify()
	@ 0x7f5e8831b8ed (anonymous namespace)::CSE::runOnOperation()
	@ 0x7f5e823657ca mlir::detail::OpToOpPassAdaptor::run()
	@ 0x7f5e82366028 mlir::detail::OpToOpPassAdaptor::runPipeline()
	@ 0x7f5e823682a4 mlir::PassManager::run()

I have double checked that the leak is reliable and reverting this fixes some tensorflow tests. I've informed @springerm internally as well.
Apologies but I'll revert this commit soon.

Maybe I'm missing something, but I don't see how the problems with this patch have been resolved. Pushing tracking throughout passes isn't scalable, and I don't think we should have landed this.

Maybe I'm missing something, but I don't see how the problems with this patch have been resolved. Pushing tracking throughout passes isn't scalable, and I don't think we should have landed this.

This revision does not introduce any tracking. It just adds a C++ entry point to perform CSE with a RewriterBase. The rewriter may or may not have a listener attached.

I am not sure what caused the memory leak but it is unrelated to any op tracking and/or the transform dialect.

Should I post an RFC on Discourse before re-uploading a fixed version?

springerm added a comment.EditedJun 30 2023, 1:06 AM

The memory leak was caused by a bug in DominanceInfo (fixed in D154185).

springerm reopened this revision.Jun 30 2023, 11:11 AM
This revision is now accepted and ready to land.Jun 30 2023, 11:11 AM
springerm retitled this revision from [mlir][Transforms][NFC] CSE: Add non-pass entry point to [mlir][Transforms][NFC] CSE: Add C++ entry point.Jun 30 2023, 11:12 AM
springerm edited the summary of this revision. (Show Details)
springerm edited the summary of this revision. (Show Details)

Do not invalidate DominanceInfo. This is unnecessary because CSE does not erase ops with regions.

This revision was automatically updated to reflect the committed changes.