Page MenuHomePhabricator

[mlir] Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow.
Needs ReviewPublic

Authored by dfki-mako on Wed, May 13, 4:42 AM.

Details

Summary

The current BufferPlacement pass does not support allocation nodes that carry additional dependencies (like in the case of dynamic shaped types). These allocations can often not be moved freely and in turn might remain in divergent control-flow branches. This requires a different strategy with respect to block arguments and aliases. This CL adds additional functionality to support allocation nodes in divergent control flow while avoiding memory leaks.

Diff Detail

Event Timeline

dfki-mako created this revision.Wed, May 13, 4:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptWed, May 13, 4:42 AM
dfki-mako retitled this revision from Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow. to [mlir] Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow..Wed, May 13, 4:43 AM
dfki-mako edited the summary of this revision. (Show Details)
dfki-mako added a project: Restricted Project.
herhut requested changes to this revision.Thu, May 14, 8:09 AM

Can you add tests for dynamic allocation cases, as well?

mlir/lib/Transforms/BufferPlacement.cpp
11

Incomplete sentence to ensure that all buffers

37

In other words, the analysis uses the post-dominator to free the allocated memory. If no such post-dominator exists, aliases are removed by inserting allocs and copies. Is that the idea?

45

Nit: Remove highly.

So alloc works the same way but this time with a common dominator? If no such dominator exists, copies are inserted?

175

Optimizations should be independent of BufferPlacement, so this TODO does not belong here.

194

Would it make sense to compute this once and share it between the different phases? The moving of allocs does not influence the aliasing.

210

This should not be an assert. Instead either ignore the alloc (which AFAIK was the mode before) or fail the transformation and report an error.

Generally, the reason this was not supported before was that multiple results creates situations where the alloc cannot be moved. Can the new system not tolerate this with copies?

I am OK with this not being supported, as a multi-result alloc seems a very boutique use.

226

Allocation node with know shape? Another way to look at it would be to check whether the allocation has any operands, which structurally is the reason it cannot be moved.

236

Same here, this could just use the operands of the alloc.

237–260

Is this dependency information that you are lacking?

238

It is a clever reuse of findPlacementBlock. Maybe it should no longer talk about aliases. Instead, it is a helper that finds a dominator/postdominator for a range of values.

354–355

This does not account for nested blocks in nested regions. Also, why is this needed?

371

Would it make sense to store BlockArgument to start with? Aliases are always BlockArguments, right?

373

Somewhat sad that MutableOperandRange does not support to get values out...

374

This breaks the aliasing between source and value. If there are further aliases of source that alias via value, they would no longer be an alias. But we would still potentially insert a copy to break the aliasing?

448

Why would we want to insert before the end operation? That would free it before the last use?

453

a an -> an

mlir/test/Transforms/buffer-placement.mlir
87

The numbering is strange of ALLOC02

211

I cannot follow this without the block numbers in the checks. Why do we get copies here?

301

Same here.

This revision now requires changes to proceed.Thu, May 14, 8:09 AM
dfki-mako updated this revision to Diff 264912.Tue, May 19, 7:55 AM
dfki-mako marked 12 inline comments as done.

Implemented a fix-point iteration to reduce the number of required copies.
Fixed minor issues (see review comments).

dfki-mako marked 11 inline comments as done.Tue, May 19, 7:59 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
37

Yep, that's the general idea.

210

In principal yes; however, we currently remove all Dealloc nodes in the beginning of the transformation. If we keep them for unsupported Alloc ops, this should not be an issue.

354–355

Obsolete; we have added a new fix-point iteration.

373

Unfortunately, it seems to be the only way at the moment...

mlir/test/Transforms/buffer-placement.mlir
87

The name ALLOC02 should indicate that the allocations ALLOC0 and ALLOC2 will be freed at this location.

211

The updated algorithm does not insert any copies in these cases any more.

mehdi_amini added inline comments.Tue, May 19, 1:27 PM
mlir/lib/Transforms/BufferPlacement.cpp
210

Can we be *safe* in this case instead of failing entirely?

dfki-mako marked 4 inline comments as done.Wed, May 20, 12:48 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
210

@mehdi_amini If we do not remove Dealloc nodes (or nodes with free semantics) that are related to unsupported alloc nodes, we can be sure that the program will not be "worse" than before with respect to these buffers.

herhut added inline comments.Wed, May 20, 2:14 AM
mlir/lib/Transforms/BufferPlacement.cpp
281

Aren't there two cases here? If this immediate alias is not dominated, then a copy is inserted, the alias turns effectively into an allocation and also needs to be processed. If we do not insert a copy, then the alias still has to be processed, using the original allocation as the block that needs to dominate.

I think the second case is missing here.

291

Nit: Remove the { }.

mehdi_amini added inline comments.Sat, May 23, 3:31 PM
mlir/lib/Transforms/BufferPlacement.cpp
210

Right, so we shouldn't need to emit an error here then? We can just skip here?