This is an archive of the discontinued LLVM Phabricator instance.

Providing buffer assignment for MLIR
ClosedPublic

Authored by dfki-ehna on Apr 20 2020, 4:32 AM.

Details

Summary

We have provided a generic buffer assignment transformation ported from TensorFlow. This generic transformation pass automatically analyzes the values and their aliases (also in other blocks) and returns the valid positions for Alloc and Dealloc operations. To find these positions, the algorithm uses the block Dominator and Post-Dominator analyses. In our proposed algorithm, we have considered aliasing, liveness, nested regions, branches, conditional branches, critical edges, and independency to custom block terminators. This implementation doesn't support block loops. However, we have considered this in our design. For this purpose, it is only required to have a loop analysis to insert Alloc and Dealloc operations outside of these loops in some special cases.

Diff Detail

Event Timeline

dfki-ehna created this revision.Apr 20 2020, 4:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2020, 4:32 AM
dfki-ehna updated this revision to Diff 258747.Apr 20 2020, 8:09 AM

Replacing fake test-purpose operations by Linalg.Generic operations. For testing computeAllocPosition of Buffer Assignment (BA), GenericOpConverter is introduced inside TestBufferAssignmentPreparationPass to convert tensor-type linalg.generic operations to memref ones. FunctionAndBlockSignatureConverter and NonVoidToVoidReturnOpConverter of BA are also tested.

rriddle requested changes to this revision.Apr 20 2020, 9:55 AM

Thanks! Left some stylistic comments for now, will review in more detail later.

mlir/include/mlir/Transforms/BufferAssignment.h
74 ↗(On Diff #258747)

Please use /// for top-level comments.

91 ↗(On Diff #258747)

Same here.

102 ↗(On Diff #258747)

Same here.

107 ↗(On Diff #258747)

Do we need to hard-code FuncOp here? Could we instead just check the number of arguments in the entry block of the function? Or is there some better way to make this more general?

mlir/lib/Transforms/BufferAssignment.cpp
41 ↗(On Diff #258747)

Please drop the (username) from the TODO.

105 ↗(On Diff #258747)

Use the predecessor iterators instead, so that you get the successor index for free:

for (auto it = block.pred_begin(), e = block.pred_end(); it !=e; ++it) {
  unsigned successorIndex = it.getSuccessorIndex();
}
113 ↗(On Diff #258747)

Please prefer early exit when possible, e.g.:

auto branchInterface = ...;
if (!branchInterface)
  continue;
197 ↗(On Diff #258747)

nit: auto -> Value

Please use the full type if possible.

198 ↗(On Diff #258747)

nit: auto -> Operation *

330 ↗(On Diff #258747)

nit: Please remove the (username)

330 ↗(On Diff #258747)

Generally using traits and interfaces are how you should generalize a pass, instead of templating.

384 ↗(On Diff #258747)

Please just construct an InsertionPoint directly if you need one.

446 ↗(On Diff #258747)

Why block2? Can we just use block?

460 ↗(On Diff #258747)

PassRegistration should use the declarative tablegen backend, i.e. you should add your pass here:

https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/Passes.td

This revision now requires changes to proceed.Apr 20 2020, 9:55 AM
mehdi_amini added a project: Restricted Project.Apr 20 2020, 4:12 PM

Thanks for upstreaming this!

mlir/include/mlir/Transforms/BufferAssignment.h
9 ↗(On Diff #258747)

Typo assginment

mlir/test/Transforms/buffer-assignment.mlir
1 ↗(On Diff #258747)

Can you try to use only operations with registered dialects? The test dialect in-tree is a potential host for such operations, and std.call @external_func is also frequently a good escape hatch.

64 ↗(On Diff #258747)

Can you document the property you're trying to test in each test-case? I really helps to interpret minimal check lines.

mehdi_amini added inline comments.Apr 20 2020, 8:09 PM
mlir/include/mlir/Transforms/BufferAssignment.h
50 ↗(On Diff #258747)

There seems to be a precondition that is does not work for a BlockArgument, can you take an OpResult as argument instead?

58 ↗(On Diff #258747)

Can you expand on the intended uses of it? Maybe a small example?

107 ↗(On Diff #258747)

I agree. Some tricky part is that this goes with the other pattern above which is also hard-coded on FuncOp and this pattern can only operate correctly after the other one rewrote the function. This is an even stronger point of coupling I believe.

113 ↗(On Diff #258747)

Nit: numFuncArgs - numReturnValues is a loop invariant, and it could help readability to hoist it out and name it, like int firstReturnParameter = numFuncArgs - numReturnValues; or similar.

120 ↗(On Diff #258747)

I am puzzled about why not just rewriter.setInsertionPoint(returnOp.getOperation()); here?

mlir/lib/Transforms/BufferAssignment.cpp
44 ↗(On Diff #258747)

When you say "does not support loops": can you be more precise in what will happen in the presence of loops (miscompile, crash, or just suboptimal allocations?)

117 ↗(On Diff #258747)

successorOps->successorOperands ("Op" is for "Operation" in general)

120 ↗(On Diff #258747)

Nit: Value arg

122 ↗(On Diff #258747)

I'd consider iterating on llvm::zip(block.getArguments(), successorOps.getValue())

126 ↗(On Diff #258747)

what happens in the "else" case with respect to aliasing? It isn't clear to me that this is safe?

185 ↗(On Diff #258747)

Nit: avoid trivial braces

194 ↗(On Diff #258747)

Isn't there a trait we could use instead of hard-coding the AllocOp here (and everywhere)?

232 ↗(On Diff #258747)

Can you write the API as taking an OpResult instead of the base Value?

249 ↗(On Diff #258747)

Does it need to be a template? This is a private method and I see a single use?

368 ↗(On Diff #258747)

Isn't this all assuming a particular model with a single allocation / deallocation per buffer?
What about a pattern like the following with two allocs feeding into one dealloc? (we can imagine more complex cases)

  cond_br %cond, ^bb1, ^bb2
^bb1:
  %buffer1 = alloc ...
  br ^exit(%buffer1 : ....)
^bb2:
  %buffer2 = alloc ...
  br ^exit(%buffer2 : ...)
^exit(%buffer: ...):
  dealloc %buffer
388 ↗(On Diff #258747)

This does not seem to be "computing" anything? Is this like "future work"?

409 ↗(On Diff #258747)

Nit: auto -> Type

416 ↗(On Diff #258747)

Typo: arugments

426 ↗(On Diff #258747)

Can't you just call setType() on the BlockArgument?
(also don't use auto when unnecessary)

446 ↗(On Diff #258747)

block instead of block2 here?
Also don't use auto when it does not make the code mode readable, I think that Block &block would be just fine here.

449 ↗(On Diff #258747)

Remove the legality boolean and just if (all_of(...)) return false; here.

You can even replace the outer loop with an all_of/any_of ;)

return all_of(funcOp.getBlocks(), [] (Block &block) {
  return all_of(block2.getArguments(), isLegalBlockArg);
}
464 ↗(On Diff #258747)

It seems to me that the pass is a misnomer: it does not really "assign" buffers, but optimizes the placement, would there be a more accurate name?

Also saying "into their proper positions" seems like it is intended for correctness.

mlir/test/lib/Transforms/TestBufferAssignment.cpp
29 ↗(On Diff #258747)

Can you clarify this sentence? I am missing context

mlir/tools/mlir-opt/mlir-opt.cpp
97

This isn't in the test directory, so it shouldn't be registered here but use the same mechanism as the other non-tests passes.

benvanik added inline comments.
mlir/lib/Transforms/BufferAssignment.cpp
398 ↗(On Diff #258747)

How tied is this pass to memref? If we have our own dialect type that represents buffers that we want to use with our own dialect alloc/dealloc ops, how can we use that here?

Specifically this kind of function type conversion seems better served by a TypeConverter that can be provided by the target dialect. For us, for example, we'd not have it change types at all probably, and instead just use this for inserting our alloc/dealloc markers.

silvas added a subscriber: silvas.Apr 21 2020, 2:22 PM
silvas added inline comments.
mlir/test/lib/Transforms/TestBufferAssignment.cpp
57 ↗(On Diff #258747)

How would this pass be extended to support dynamic shapes? It would be good to have that written down in a comment (and hopefully implemented as a fast follow-on to this CL).

dfki-mako added inline comments.Apr 22 2020, 1:00 AM
mlir/lib/Transforms/BufferAssignment.cpp
330 ↗(On Diff #258747)

Definitely. We are going to make this pass more generic in one of the next CLs.

388 ↗(On Diff #258747)

This function does not "compute" anything. This is just a simple abstraction to make the integration into the dialect-specific legalization phases more convenient and pratical with respect to future extensions. For instance, adding support for additional types and interfaces might require an extension of this functionality.

398 ↗(On Diff #258747)

Currently, the implementation is strongly coupled to memref types. However, this only affects the helper converters provided. The underlying pass will be extended in a follow-up CL to work on alloc and free interfaces instead of AllocOp and DeallocOp. This will make the general pass compatible with arbitrary dialects that implement the required interfaces.

464 ↗(On Diff #258747)

I guess a better name would be "BufferPlacement", since the pass moves allocations/deallocations into "better" positions.

mlir/test/lib/Transforms/TestBufferAssignment.cpp
57 ↗(On Diff #258747)

We can add more specific comments about the feature you mentioned. Adding support for dynamic shapes would be in fact on the next follow-up CLs. One thing that has to be adapted is the computation of the alloc and free positions. Moreover, it might be necessary to adapt the BufferAssignmentPlacer (however, I don't think so at the moment).

dfki-ehna updated this revision to Diff 259228.Apr 22 2020, 4:00 AM

Resolving all the comments and adding description for tests.

dfki-ehna marked 33 inline comments as done.Apr 22 2020, 4:21 AM
dfki-mako added inline comments.Apr 22 2020, 4:31 AM
mlir/lib/Transforms/BufferAssignment.cpp
194 ↗(On Diff #258747)

There is a MemoryEffectsOpInterface that could be used in favor of hard code standard allocation operations. However, I would prefer making BA more generic in a follow-up CL. Furthermore, this would require this CL to be merged first in order to support standard Alloc and Dealloc nodes.

368 ↗(On Diff #258747)

This pass currently assumes a single-allocation/deallocation model that usually appears during straight-forward legalization (lowering) of operations. We wanted to keep the first version simple and would like to significantly extend the functionality in one of the follow-up CLs.

dfki-ehna updated this revision to Diff 259263.Apr 22 2020, 6:24 AM

Change NonVoidToVoidReturnOpConverter to use arguments of entry block arguments instead of FuncOp.

dfki-ehna marked an inline comment as done.Apr 22 2020, 6:35 AM
dfki-ehna added inline comments.
mlir/include/mlir/Transforms/BufferAssignment.h
107 ↗(On Diff #258747)

Thanks for the suggestion. Replaced with the block entry.

benvanik added inline comments.Apr 22 2020, 11:54 AM
mlir/lib/Transforms/BufferAssignment.cpp
398 ↗(On Diff #258747)

If that future work could remove the use of MemRefType to instead use a TypeConverter that'd be awesome. Is there a bug you are using to track this work that I could follow along on to see when it lands? I really like the behavior of the pass and am excited to plug it in to our stuff :)

Overall LGTM, it'd be nice if someone else could also also approve though (@silvas ?)

mlir/include/mlir/Transforms/BufferPlacement.h
120

Almost all the uses of auto above could benefit from using the actual types.

mlir/lib/Transforms/BufferAssignment.cpp
368 ↗(On Diff #258747)

Sure, but please document this restriction in the TableGen pass description and add a TODO in the code so the reader is aware that this is a known limitation
(in general it helps me reviewing code when the known limitations are spelled out and there is explicit acknowledgement of what will be addressed in followup revisions)

Even in single-allocation/deallocation, would the pass pessimizes case of conditional allocation? For example in the following (pseudo) code I suspect you'd increase the dynamic memory consumption by always allocating both buffers:

if (cond)
 %buffer1 = alloc ...
else
 %buffer2 = alloc ...
...
if (cond) {
 consume(buffer1)
 dealloc(buffer1)
} else {
 consume(buffer2)
 dealloc(buffer2)
}
mlir/lib/Transforms/BufferPlacement.cpp
252

Can you spell the types above?

283

(here as well)

292

Operation * ?

mlir/test/lib/Transforms/TestBufferPlacement.cpp
128

FuncOp

rriddle added inline comments.Apr 22 2020, 1:41 PM
mlir/include/mlir/Transforms/BufferPlacement.h
14

typo TRANSFORM -> TRANSFORMS

21

This header isn't necessary I believe.

69

Please remove the trailing _ from these variable names.

mlir/include/mlir/Transforms/Passes.td
109

Can you add some example input/output here?

mlir/lib/Transforms/BufferPlacement.cpp
95

nit: auto -> Value

183

nit: Add punctuation here.

350

Can you emit an error and use signalPassFailure() instead of assert here?

422

This isn't really valid to do directly in a pattern, as it is being done outside of the rewriter. Seems like this pattern can just be replaced by using a TypeConverter instead.

dfki-mako added inline comments.Apr 23 2020, 4:40 AM
mlir/lib/Transforms/BufferAssignment.cpp
368 ↗(On Diff #258747)

In this case, the current allocation policies will cause the dynamic memory allocation to increase, as you outlined above. However, we are going to extend the documentation to describe the current limitations and assumptions. In one of the future CLs, the buffer placement transformation will include several optimization passes to optimize the overall memory consumption.

dfki-ehna marked 2 inline comments as done.Apr 23 2020, 6:45 AM
dfki-ehna added inline comments.
mlir/lib/Transforms/BufferAssignment.cpp
398 ↗(On Diff #258747)

@benvanik We are definitely going to use TypeConverter instead either in this CL or in the following up one. Are you referring to an open discussion issue?

mlir/lib/Transforms/BufferPlacement.cpp
422

TypeConverter has convertBlockSignature which returns SignatureConversion but there is no applySignatureConversion for the rewriter that gets a Block as an input (the current version only accepts a region). Are we missing the point?

dfki-ehna updated this revision to Diff 259570.Apr 23 2020, 7:39 AM

Resolved second pass comments.

dfki-ehna updated this revision to Diff 259573.Apr 23 2020, 7:55 AM
dfki-ehna marked an inline comment as done.

Polish BufferPlacement.h

dfki-ehna marked 12 inline comments as done.Apr 23 2020, 7:59 AM
dfki-ehna added inline comments.
mlir/include/mlir/Transforms/BufferPlacement.h
21

BufferAssignmentOpConversionPattern inherits from OpConversionPattern in this file.

Harbormaster completed remote builds in B54400: Diff 259573.
rriddle added inline comments.Apr 23 2020, 11:11 AM
mlir/include/mlir/Transforms/BufferPlacement.h
21

Sorry, I meant the one above it. Support/LLVM.h is included transitively.

mlir/lib/Transforms/BufferPlacement.cpp
422

When passing a type converter the non-entry blocks are converted automatically using that converter. After that I would expect that the default function conversion pattern would remove the need for this pattern:
https://github.com/llvm/llvm-project/blob/3d178581ac7f5336b1ac75e31001de074ecca937/mlir/include/mlir/Transforms/DialectConversion.h#L300

dfki-ehna updated this revision to Diff 259873.Apr 24 2020, 5:38 AM
dfki-ehna marked an inline comment as done.

Provide BufferAssignmentTypeConverter for using inside FunctionAndBlockSignatureConverter.

dfki-ehna marked 3 inline comments as done.Apr 24 2020, 5:39 AM
dfki-ehna added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
422

Thanks. Resolved.

rriddle accepted this revision.Apr 24 2020, 11:04 AM
rriddle added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
355

Can you change this walk to return WalkResult? That allows for interrupting the walk early.

You can return WalkResult::interrupt()(or failure()) to stop the walk, and WalkResult::advance()/success() to continue the walk.

438

An easy way of doing this is just:

target.addDynamicallyLegalOp<FuncOp>([&](FuncOp funcOp) {
  return typeConverter.isSignatureLegal(funcOp.getType());
});
mlir/test/lib/Transforms/TestBufferPlacement.cpp
30

A function pass is not allowed to mutate the public type of the function, so this should be a module pass.

131

Can you just do this inside of the converter constructor? Otherwise, you don't need a specific converter class.

This revision is now accepted and ready to land.Apr 24 2020, 11:04 AM
dfki-ehna updated this revision to Diff 260263.Apr 27 2020, 3:07 AM
dfki-ehna marked an inline comment as done.

Resolve the latest comments.

dfki-ehna marked 5 inline comments as done.Apr 27 2020, 3:14 AM
dfki-ehna added inline comments.
mlir/lib/Transforms/BufferPlacement.cpp
438

Thanks. We got rid of FunctionAndBlockSignatureConverter::addDynamicallyLegalFuncOp and added it directly to the TestBufferPlacement pass. So, the dialect experts also need to add this to their targets in their legalization passes.

dfki-ehna updated this revision to Diff 260276.Apr 27 2020, 4:32 AM
dfki-ehna marked an inline comment as done.

Taking Context and ConversionTarget out of the scope of function walk.

This revision was automatically updated to reflect the committed changes.