getAliasingOpOperands/getAliasingOpResults now encode the aliasing OpOperand/OpResult, the buffer relation, and a degree of certainty. E.g.:
    // aliasingOpOperands(%r) = {(%t, EQUIV, DEFINITE)}
    // aliasingOpResults(%t) = {(%r, EQUIV, DEFINITE)}
    %r = tensor.insert %f into %t[%idx] : tensor<?xf32>

    // aliasingOpOperands(%r) = {(%t0, EQUIV, MAYBE), (%t1, EQUIV, MAYBE)}
    // aliasingOpResults(%t0) = {(%r, EQUIV, MAYBE)}
    // aliasingOpResults(%t1) = {(%r, EQUIV, MAYBE)}
    %r = arith.select %c, %t0, %t1 : tensor<?xf32>
BufferizableOpInterface::bufferRelation is removed, as it is now part of getAliasingOpOperands/getAliasingOpResults.
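The shape of the new aliasing information can be sketched as a small, self-contained C++ model. This is an illustration only, not the actual MLIR API: the names AliasingEntry, AliasList, insertAliases, selectAliases and isDefinitelyEquivalent are hypothetical, and values are represented by their SSA names as plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the MLIR types.
enum class BufferRelation { Unknown, Equivalent };

// One entry of an aliasing list: (value, buffer relation, certainty).
struct AliasingEntry {
  std::string value;       // SSA name of the aliasing value, e.g. "%t"
  BufferRelation relation; // how the two buffers relate
  bool isDefinite;         // true = DEFINITE, false = MAYBE
};

using AliasList = std::vector<AliasingEntry>;

// aliasingOpOperands(%r) for: %r = tensor.insert %f into %t[%idx]
inline AliasList insertAliases() {
  return {{"%t", BufferRelation::Equivalent, true}};
}

// aliasingOpOperands(%r) for: %r = arith.select %c, %t0, %t1
inline AliasList selectAliases() {
  return {{"%t0", BufferRelation::Equivalent, false},
          {"%t1", BufferRelation::Equivalent, false}};
}

// Assumed helper: the result is known to be equivalent to one specific
// operand only if the list has exactly one entry that is both DEFINITE
// and EQUIV.
inline bool isDefinitelyEquivalent(const AliasList &aliases) {
  return aliases.size() == 1 && aliases.front().isDefinite &&
         aliases.front().relation == BufferRelation::Equivalent;
}
```

In this model, the buffer relation is read off the aliasing entry itself, which mirrors why a separate bufferRelation query is no longer needed.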
This change allows for better analysis, in particular with respect to equivalence. This enables additional optimizations and better error checking (the current checks are sometimes overly conservative). Examples:
- EmptyTensorElimination can eliminate tensor.empty inside scf.if blocks. This requires a more precise modeling of equivalence: it is no longer a per-OpResult property. Instead, it can be specified for each OpOperand and OpResult. This is important because tensor.empty may be eliminated only if all values on the SSA use-def chain to the final consumer (tensor.insert_slice) are equivalent.
- The detection of "returning allocs from a block" can be improved. (Addresses a TODO in assertNoAllocsReturned.) This allows us to bufferize IR such as "yielding a tensor.extract_slice result from an scf.if branch", which currently fails to bufferize because the alloc detection is too conservative.
- Better bufferization of loops. Aliases of the iter_arg can be yielded (even if they are not equivalent) without having to realloc and copy the entire buffer on each iteration.
The above-mentioned examples are not yet implemented with this change. This change just improves the BufferizableOpInterface, its implementations and related helper functions, so that better aliasing information is available for each op.
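As a rough illustration of the equivalence requirement in the EmptyTensorElimination example, the check "all values on the use-def chain are equivalent" could be modeled as below. This is a hypothetical sketch, not code from this change: ChainLink and chainIsEquivalent are made-up names, and the chain is assumed to have been collected already.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; this is not the MLIR type.
enum class BufferRelation { Unknown, Equivalent };

// One step on the use-def chain from tensor.empty to the final consumer.
struct ChainLink {
  std::string from;        // SSA name of the aliasing operand
  std::string to;          // SSA name of the aliasing result
  BufferRelation relation; // relation reported for this operand/result pair
};

// Returns true only if every link on the chain is an equivalence.
// A single non-equivalent link would block the elimination in this model.
inline bool chainIsEquivalent(const std::vector<ChainLink> &chain) {
  return std::all_of(chain.begin(), chain.end(), [](const ChainLink &link) {
    return link.relation == BufferRelation::Equivalent;
  });
}
```

Because the relation is stored per operand/result pair rather than per result, each link can be checked independently.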