This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Introduce CloneOp and adapt test cases in BufferDeallocation.
ClosedPublic

Authored by dfki-jugr on Mar 23 2021, 4:13 AM.

Details

Summary

Add a new clone operation to the memref dialect. This operation implicitly
copies data from a source buffer to a new buffer. In contrast to the linalg.copy
operation, this operation does not accept a target buffer as an argument.
Instead, this operation performs a conceptual allocation which does not need to
be performed manually.

Furthermore, this operation resolves the dependency from the linalg-dialect
in the BufferDeallocation pass. In addition, we also extended the canonicalization
patterns to fold clone operations. This is a competitive alternative to the
copy removal pass which is more expensive.

Diff Detail

Event Timeline

dfki-jugr created this revision.Mar 23 2021, 4:13 AM
dfki-jugr requested review of this revision.Mar 23 2021, 4:13 AM

Nice!

mlir/docs/BufferDeallocationInternals.md
312–313

Nit, should we keep the comment that explains the reason for the clone?

608–609

Is this pass still needed? The copy removal as described here relies on the fact that source and destination of the copy are not mutated after the copy operation. While this is true for memref.clone, it is not true for copy operations. So to do this kind of optimization, one would need to analyse the program first.

As the bufferization pass no longer introduces copy operations, maybe we should rather drop the incomplete pass for now.

696–697

This is no longer true.

mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
354

Can you add a remark that mutating the source and result of clone after the clone operation has executed has undefined behavior? This is a requirement to making the bufferization logic work.

And while at it, could you also add a similar comment to tensor_load (the source may not be mutated afterwards, at least when used in the context of bufferization) and memref.buffer_cast (the result may not be mutated).

As we are restricting the existing tensor_load here, maybe it would be better to have a new operation with the restricted semantics but that is outside of the scope of this diff.

mlir/include/mlir/Dialect/MemRef/Utils/MemRefUtils.h
26

With the guarantee that the cloned buffer cannot be mutated, is this actually required?

30

This only finds one dealloc. For its use, would it not be OK to find the dealloc in the current block (can only be one).

mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp
22

should this be &&?

mlir/lib/Transforms/BufferDeallocation.cpp
405

Is this still true?

mlir/test/Transforms/canonicalize.mlir
1122

Why does this remain?

1184

It is not clear to me what these tests are testing.

dfki-jugr updated this revision to Diff 332971.Mar 24 2021, 6:55 AM
dfki-jugr marked 9 inline comments as done.

Addressed comments.

mlir/lib/Transforms/BufferDeallocation.cpp
405

Yes,
it can still happens that there are clones of clones.

mlir/test/Transforms/canonicalize.mlir
1122

%3 results from the scf.if. The canonicalization pattern only checks, if %3 is an alloc operation.
Since this is not the case, this clone remains.

herhut added inline comments.Mar 25 2021, 5:13 AM
mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
362

nit: source or result.

1157

nit: source source

mlir/include/mlir/Dialect/MemRef/Utils/MemRefUtils.h
26

These are only used for the canonicalization pattern, so I'd prefer to make them local.

mlir/test/Transforms/canonicalize.mlir
1122

Now that the operand to clone is required to not be mutated, why do we need to see the alloc? Is it not good enough if a clone has a following dealloc of the source, if the clone itself does not have an earlier dealloc?

We can land it like it is (which is quite restricted) but this should be revisited again to make it apply more broadly.

dfki-jugr updated this revision to Diff 333524.Mar 26 2021, 3:50 AM
dfki-jugr marked 4 inline comments as done.

Addressed comments.

mlir/include/mlir/Dialect/MemRef/Utils/MemRefUtils.h
26

findDealloc is also used in BufferDeallocation.

pifon2a accepted this revision.Mar 28 2021, 11:16 PM
pifon2a added a subscriber: pifon2a.
pifon2a added inline comments.
mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
359

nit: arg 1 -> arg1

This revision is now accepted and ready to land.Mar 28 2021, 11:16 PM

@dfki-jugr But why does this revision remove the copy removal pass when there are things that generate copies? The extension to canonicalization done eliminates clone ops, but IR with copy ops will no longer have a way to remove copies? For eg. -linalg-bufferize would still generate copy operations, right? @silvas

herhut added a comment.Apr 6 2021, 7:39 AM

@dfki-jugr But why does this revision remove the copy removal pass when there are things that generate copies? The extension to canonicalization done eliminates clone ops, but IR with copy ops will no longer have a way to remove copies? For eg. -linalg-bufferize would still generate copy operations, right? @silvas

I asked for it to be removed, as I was unaware of where it got used. I assumed it was only used for copies created by the bufferization pass itself. The copy removal pass was created for bufferization and only works if all assumptions that bufferization makes about copies hold. We had reports where this was not the case. So instead of keeping the broken one, I thought it would be better to remove it for now and implement a proper one.

@dfki-jugr But why does this revision remove the copy removal pass when there are things that generate copies? The extension to canonicalization done eliminates clone ops, but IR with copy ops will no longer have a way to remove copies? For eg. -linalg-bufferize would still generate copy operations, right? @silvas

I believe all uses of copy in linalg-bufferize can be replaced with memref.clone easily -- it's a much more natural representation and we should do that cleanup.

@dfki-jugr But why does this revision remove the copy removal pass when there are things that generate copies? The extension to canonicalization done eliminates clone ops, but IR with copy ops will no longer have a way to remove copies? For eg. -linalg-bufferize would still generate copy operations, right? @silvas

I believe all uses of copy in linalg-bufferize can be replaced with memref.clone easily -- it's a much more natural representation and we should do that cleanup.

Yes, but I wish -copy-removal was removed *after* this cleanup was done --- since there are now common copy patterns for which there is no longer anything in the infrastructure to eliminate/optimize.

@dfki-jugr But why does this revision remove the copy removal pass when there are things that generate copies? The extension to canonicalization done eliminates clone ops, but IR with copy ops will no longer have a way to remove copies? For eg. -linalg-bufferize would still generate copy operations, right? @silvas

I believe all uses of copy in linalg-bufferize can be replaced with memref.clone easily -- it's a much more natural representation and we should do that cleanup.

Yes, but I wish -copy-removal was removed *after* this cleanup was done --- since there are now common copy patterns for which there is no longer anything in the infrastructure to eliminate/optimize.

If this patch is breaking you please revert it, by all means! If you concern is hypothetical though, I tend to trust that the patch authors did their homework on this to make sure it isn't an issue in practice (AFAIK they are well in tune with all users of this code).

I have concerns about this kind of restriction on an operation that is named so broadly: a "memref.clone" shouldn't impose anything about mutability!
Can you rename this into something really unambiguously tied to bufferization, like memref.__bufferize__clone or something like that please?

mlir/test/Transforms/canonicalize.mlir