This is an archive of the discontinued LLVM Phabricator instance.

[mlir][bufferization] Add bufferization.alloc_tensor op
ClosedPublic

Authored by springerm on May 19 2022, 11:55 AM.

Details

Summary

This change adds a new op alloc_tensor to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot.
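
A minimal sketch of the intended use (assembly syntax abbreviated; see the op definition in BufferizationOps.td for the exact format):

// Reserves tensor-land storage; One-Shot Bufferize lowers this to a buffer
// allocation unless the op is eliminated by a pre-processing pass.
%0 = bufferization.alloc_tensor : tensor<10xf32>
// The SSA use-def chain rooted at %0 is what the bufferization analysis
// reasons about.
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<10xf32>) -> tensor<10xf32>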

This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor.

linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former is only to carry a shape. It does not indicate a memory allocation.

linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because it is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around, CSE them, etc. This is not desirable for alloc_tensor: it represents an explicit buffer allocation while still in tensor land, and such allocations should not suddenly disappear or get moved around when running the canonicalizer, CSE, etc.
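
To illustrate the difference, a sketch with two allocations that happen to look identical:

// linalg.init_tensor has no side effects, so CSE may fold two identical ops
// into one, merging what were meant to be two distinct allocations.
// bufferization.alloc_tensor is not CSE'd: each op below keeps its own
// identity and becomes a separate buffer allocation after bufferization.
%a = bufferization.alloc_tensor : tensor<4xf32>
%b = bufferization.alloc_tensor : tensor<4xf32>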

Diff Detail

Event Timeline

springerm created this revision.May 19 2022, 11:55 AM
Herald added a project: Restricted Project.May 19 2022, 11:55 AM
springerm requested review of this revision.May 19 2022, 11:55 AM
silvas accepted this revision.May 20 2022, 5:37 AM

This direction makes sense to me. I am a little out of the loop from the current bufferization interfaces, so I will let someone else review the code in detail.

This revision is now accepted and ready to land.May 20 2022, 5:37 AM
mehdi_amini added inline comments.May 20 2022, 8:44 AM
mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td
34

This does not document clearly if this op is breaking the value semantics of tensors or not, can you clarify?

springerm added inline comments.May 20 2022, 8:58 AM
mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td
34

I'd say that reading from the result of a bufferization.alloc_tensor is undefined behavior. Would that clarify it?

Similarly, reading from an uninitialized portion of a tensor is undefined behavior. E.g.:

%0 = alloc_tensor : tensor<10xf32>
%1 = tensor.insert_slice ... into %0[0] [5] [1] : tensor<5xf32> into tensor<10xf32>
%2 = tensor.extract %1[%c6] : tensor<10xf32>   // undefined: element 6 was never written
springerm added inline comments.May 20 2022, 9:15 AM
mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td
34

Just read this again, and I think the last sentence, "The contents of the buffer are unspecified.", may be badly formulated. This op returns a tensor, not a buffer, so talking about buffers in the op description could be confusing. The tensor returned by this op is read-only (just like any other tensor); there is no concept of writing into a tensor, etc. That's probably what you mean by "value semantics".

mehdi_amini added inline comments.May 20 2022, 1:59 PM
mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td
34

The tensor returned by this op is read-only (just as any other tensor), there is no concept of writing into a tensor, etc. That's probably what you mean by "value semantics".

Yes. I am also not sure why there is UB involved here? Couldn't we leave it at "reading from the result of an alloc_tensor op yields an undefined value"?

Also, something like this should be legal IR and bufferization should gracefully manage the need of two allocations for example:

%0 = alloc_tensor : tensor<10xf32>
%1 = linalg.generic outs(%0)
%2 = linalg.generic outs(%0)
springerm added inline comments.May 20 2022, 5:25 PM
mlir/include/mlir/Dialect/Bufferization/IR/BufferizationOps.td
34

Ah yes you're right. I was thinking of tensor.insert_slice, which is essentially a copy. But it's actually just another example of "reading a tensor".

Your example above is legal IR and the bufferization indeed generates two allocations. (Assuming that %1 is read at some point.)
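
For readers following along, a rough sketch of what the bufferized output could look like in that case (names hypothetical; the actual output depends on the analysis):

// Both linalg.generic ops write into the tensor produced by alloc_tensor.
// Since %1 is still read after %2 is computed, the two writes must not
// share a buffer, so bufferization materializes two allocations.
%buf0 = memref.alloc() : memref<10xf32>
linalg.generic ... outs(%buf0 : memref<10xf32>)
%buf1 = memref.alloc() : memref<10xf32>
linalg.generic ... outs(%buf1 : memref<10xf32>)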

This revision was landed with ongoing or failed builds.May 20 2022, 5:57 PM
This revision was automatically updated to reflect the committed changes.
mlir/lib/Dialect/Linalg/Transforms/CMakeLists.txt