Previously, linalg-bufferize was a "finalizing" bufferization pass (it
did a "full" conversion). This wasn't great because it couldn't be used
composably with other bufferization passes like std-bufferize and
scf-bufferize.
This patch makes linalg-bufferize a composable bufferization pass.
Notice that the integration tests are switched over to using a pipeline
of std-bufferize, linalg-bufferize, and (to finalize the conversion)
func-bufferize. It all "just works" together.
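A minimal sketch of what such a composed pipeline invocation looks like (the
input file name is hypothetical and the exact flags used by the integration
tests may differ, but these three passes do compose in this order):

    mlir-opt input.mlir -std-bufferize -linalg-bufferize -func-bufferize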
While doing this transition, I ran into a nasty bug in the 1-use special
case logic for forwarding init tensors. That logic, while
well-intentioned, was fundamentally flawed, because it assumed that if
the original tensor value had one use, then the converted memref could
be mutated in place. That assumption is wrong in many cases. For
example:
    %0 = some_tensor : tensor<4xf32>
    br ^bb0(%0, %0 : tensor<4xf32>, tensor<4xf32>)
  ^bb0(%bbarg0: tensor<4xf32>, %bbarg1: tensor<4xf32>):
    // %bbarg0 is an alias of %bbarg1. We cannot safely write
    // to it without analyzing uses of %bbarg1.
    linalg.generic ... init(%bbarg0) {...}
A similar situation can arise in many scenarios involving function arguments.
Even more sinister: if the converted memref is produced by a
std.get_global_memref of a constant global memref, then we might
attempt to write into read-only statically allocated storage! Not all
memrefs are writable!
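To make that hazard concrete, here is a rough sketch, assuming the
std.global_memref / std.get_global_memref syntax of the time (the symbol and
function names are hypothetical, and the linalg.generic is elided just as in
the example above):

    global_memref "private" constant @const_gv : memref<4xf32> = dense<[0.0, 1.0, 2.0, 3.0]>

    func @f() -> tensor<4xf32> {
      %m = get_global_memref @const_gv : memref<4xf32>
      %t = tensor_load %m : memref<4xf32>
      // %t has exactly one use, so the old 1-use logic would conclude that
      // the memref backing it (%m, read-only constant storage) can be
      // mutated in place as the init buffer.
      %r = linalg.generic ... init(%t) {...}
      return %r : tensor<4xf32>
    }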
Clearly, this 1-use check cannot justify an in-place update as a purely
local, on-the-fly decision within this pattern, so I removed it.
The test is now drastically shorter, and I basically rewrote the CHECK
lines from scratch (a rough sketch of the new-style checks follows this
list) because:
- the new composable linalg-bufferize just doesn't do as much, so there
is less to test
- a lot of the tests were related to the 1-use check, which is now gone,
so there is less to test
- -buffer-hoisting and -buffer-deallocation are no longer mixed in, so
  the checks related to them had to be rewritten
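As a rough illustration of the smaller surface under test: the composable
pass just rewrites linalg ops on tensors into linalg ops on memrefs, with
tensor_to_memref / tensor_load materializations at the boundaries. The checks
look roughly like this (paraphrased from memory, not copied from the test;
note that no dealloc appears, since -buffer-deallocation is a separate pass):

    // CHECK-LABEL: func @basic
    // CHECK: %[[IN:.*]] = tensor_to_memref %{{.*}} : memref<4xf32>
    // CHECK: %[[OUT:.*]] = alloc() : memref<4xf32>
    // CHECK: linalg.generic
    // CHECK: tensor_load %[[OUT]] : memref<4xf32>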