This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow.
ClosedPublic

Authored by dfki-mako on May 13 2020, 4:42 AM.

Details

Summary

The current BufferPlacement pass does not support allocation nodes that carry additional dependencies (like in the case of dynamic shaped types). These allocations can often not be moved freely and in turn might remain in divergent control-flow branches. This requires a different strategy with respect to block arguments and aliases. This CL adds additional functionality to support allocation nodes in divergent control flow while avoiding memory leaks.

Diff Detail

Event Timeline

dfki-mako created this revision.May 13 2020, 4:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2020, 4:42 AM
dfki-mako retitled this revision from Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow. to [mlir] Extended BufferPlacement to support more sophisticated scenarios in which allocations cannot be moved freely and can remain in divergent control flow..May 13 2020, 4:43 AM
dfki-mako edited the summary of this revision. (Show Details)
dfki-mako added a project: Restricted Project.
herhut requested changes to this revision.May 14 2020, 8:09 AM

Can you add tests for dynamic allocation cases, as well?

mlir/lib/Transforms/BufferPlacement.cpp
11

Incomplete sentence to ensure that all buffers

37

In other words, the analysis uses the post-dominator to free the allocated memory. If no such post-dominator exists, aliases are removed by inserting allocs and copies. Is that the idea?

45

Nit: Remove highly.

So alloc works the same way but this time with a common dominator? If no such dominator exists, copies are inserted?

167

Optimizations should be independent of BufferPlacement, so this TODO does not belong here.

186

Would it make sense to compute this once and share it between the different phases? The moving of allocs does not influence the aliasing.

203

This should not be an assert. Instead either ignore the alloc (which AFAIK was the mode before) or fail the transformation and report an error.

Generally, the reason this was not supported before was that multiple results creates situations where the alloc cannot be moved. Can the new system not tolerate this with copies?

I am OK with this not being supported, as a multi-result alloc seems a very boutique use.

217

Allocation node with know shape? Another way to look at it would be to check whether the allocation has any operands, which structurally is the reason it cannot be moved.

227

Same here, this could just use the operands of the alloc.

229

It is a clever reuse of findPlacementBlock. Maybe it should no longer talk about aliases. Instead, it is a helper that finds a dominator/postdominator for a range of values.

233

Is this dependency information that you are lacking?

281

Would it make sense to store BlockArgument to start with? Aliases are always BlockArguments, right?

284

This breaks the aliasing between source and value. If there are further aliases of source that alias via value, they would no longer be an alias. But we would still potentially insert a copy to break the aliasing?

293

This does not account for nested blocks in nested regions. Also, why is this needed?

313

Somewhat sad that MutableOperandRange does not support to get values out...

388

Why would we want to insert before the end operation? That would free it before the last use?

446

a an -> an

mlir/test/Transforms/buffer-placement.mlir
87

The numbering is strange of ALLOC02

211

I cannot follow this without the block numbers in the checks. Why do we get copies here?

309

Same here.

This revision now requires changes to proceed.May 14 2020, 8:09 AM
dfki-mako updated this revision to Diff 264912.May 19 2020, 7:55 AM
dfki-mako marked 12 inline comments as done.

Implemented a fix-point iteration to reduce the number of required copies.
Fixed minor issues (see review comments).

dfki-mako marked 11 inline comments as done.May 19 2020, 7:59 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
37

Yep, that's the general idea.

203

In principal yes; however, we currently remove all Dealloc nodes in the beginning of the transformation. If we keep them for unsupported Alloc ops, this should not be an issue.

293

Obsolete; we have added a new fix-point iteration.

313

Unfortunately, it seems to be the only way at the moment...

mlir/test/Transforms/buffer-placement.mlir
87

The name ALLOC02 should indicate that the allocations ALLOC0 and ALLOC2 will be freed at this location.

211

The updated algorithm does not insert any copies in these cases any more.

mehdi_amini added inline comments.May 19 2020, 1:27 PM
mlir/lib/Transforms/BufferPlacement.cpp
203

Can we be *safe* in this case instead of failing entirely?

dfki-mako marked 4 inline comments as done.May 20 2020, 12:48 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
203

@mehdi_amini If we do not remove Dealloc nodes (or nodes with free semantics) that are related to unsupported alloc nodes, we can be sure that the program will not be "worse" than before with respect to these buffers.

herhut added inline comments.May 20 2020, 2:14 AM
mlir/lib/Transforms/BufferPlacement.cpp
242

Nit: Remove the { }.

255

Aren't there two cases here? If this immediate alias is not dominated, then a copy is inserted, the alias turns effectively into an allocation and also needs to be processed. If we do not insert a copy, then the alias still has to be processed, using the original allocation as the block that needs to dominate.

I think the second case is missing here.

mehdi_amini added inline comments.May 23 2020, 3:31 PM
mlir/lib/Transforms/BufferPlacement.cpp
203

Right, so we shouldn't need to emit an error here then? We can just skip here?

dfki-mako updated this revision to Diff 266480.May 27 2020, 3:46 AM

Fixed corner cases in which no copies have been created due to aliasing (see reviewer comments).
Added additional test cases to simulate such special control-flow scenarios.
Changed behavior of the BufferPlacement pass to ignore Alloc operations with multiple allocation results.

dfki-mako marked 6 inline comments as done.May 27 2020, 3:48 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
203

The latest version skips these allocation nodes without aborting the transformation process.

255

We have added a separate test case to simulate such a case and changed the CL accordingly.

herhut added inline comments.May 27 2020, 1:26 PM
mlir/lib/Transforms/BufferPlacement.cpp
177

Nit: Finds -> Find?

250

With this change, you consider all transitive aliases of a value. However, as soon as you insert a copy (by adding a blockarg to the list of blocks to free) that set would need updating, as you now have fewer transitive aliases.

In you example, consider the case where there is another aliasing block (say b7) after b6. Then we would get a copy from b5 to b6, as b2 does not dominate b6. This would also take care of the block b7 after b6, as long as b6 dominates it. However, with the current approach, you would still check whether b2 dominates this additional block b7 and like insert a copy from b6 to b7.

I think that, when adding a blockarg to the free list, you have to use that block as the dominating block (as it now becomes the allocating place) when processing aliases of the new freed blockarg.

305

I do not understand what the block argument refers to here.

314

Nit: delloc -> dealloc

mlir/test/Transforms/buffer-placement.mlir
146

Could you add all block numbers here. That makes it easier to follow where the br operations belong.

pifon2a added inline comments.May 28 2020, 7:03 AM
mlir/test/Transforms/buffer-placement.mlir
54

can this (and all other comments and also IR below) fit 80 chars?

dfki-mako updated this revision to Diff 268124.Jun 3 2020, 3:33 AM
dfki-mako marked 10 inline comments as done.

Updated processing of aliases to avoid the aggressive insertion of copy operations in special cases.
Reformatted test cases.

herhut accepted this revision.Jun 8 2020, 7:20 AM

Thanks. This looks correct to me now.

You mentioned offline that you added a test for the case that was not covered before. I did not find it in this CL though.

mlir/lib/Transforms/BufferPlacement.cpp
241

You could also populate toProcess here and then just have the loop below. This works for me too, though.

This revision is now accepted and ready to land.Jun 8 2020, 7:20 AM
dfki-mako updated this revision to Diff 269474.Jun 9 2020, 3:13 AM

Resolved merge conflicts and integrated an additional test case into an existing one in order to capture an additional special case.

dfki-mako marked 2 inline comments as done.Jun 9 2020, 3:15 AM
dfki-mako added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
241

This would also work. However, we would prefer the more explicit version to help others understand the code more easily.

herhut accepted this revision.Jun 10 2020, 2:07 AM

Thanks for expanding the test.

This revision was automatically updated to reflect the committed changes.
dfki-mako marked an inline comment as done.