Tile and fuse using TilingInterface fuses producers with consumers
at tile granularity, i.e. it uses a tiled implementation of the
producer to compute the tile needed to produce a tile of the
consumer. The current implementation of tile and fuse only yields the
value to replace the consumer. In some cases it is also useful to have
the loop nest yield a value to replace the tiled + fused producer.
This is only valid when the producer is not computed redundantly,
i.e. after tiling, each iteration of the innermost loop body
produces a disjoint portion of the producer. This patch modifies the
implementation of tile and fuse to support this use case.
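
To make the idea concrete, here is a minimal sketch in plain Python rather than MLIR. It assumes an elementwise producer and consumer, so each consumer tile needs exactly the matching producer tile. The names `producer`, `consumer`, and `tile_and_fuse` are hypothetical and are not part of the patch or of the TilingInterface API.

```python
def producer(values):
    # Hypothetical elementwise producer.
    return [v * 2 for v in values]


def consumer(values):
    # Hypothetical elementwise consumer of the producer's result.
    return [v + 1 for v in values]


def tile_and_fuse(inputs, tile_size):
    """Tile the consumer and compute the producer tile inside the loop.

    Returns the consumer result and, because the producer tiles are
    disjoint in this elementwise case, the full producer result too.
    """
    n = len(inputs)
    consumer_result = [None] * n
    producer_result = [None] * n
    for offset in range(0, n, tile_size):
        size = min(tile_size, n - offset)
        # Fused producer: compute only the tile this consumer tile needs.
        producer_tile = producer(inputs[offset:offset + size])
        consumer_tile = consumer(producer_tile)
        # "Yield" both tiles by inserting them into the full results.
        consumer_result[offset:offset + size] = consumer_tile
        producer_result[offset:offset + size] = producer_tile
    return consumer_result, producer_result
```

Calling `tile_and_fuse(list(range(10)), 4)` gives back both lists, and the second one matches what the untiled `producer` would have computed, which is what allows it to replace the original producer.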
Currently there is no way to automatically determine whether fusing
the producer introduces redundant computations. So a new option is
added that lets the caller assert that no redundant computation is
introduced, i.e. the tile sizes selected for the consumer are such
that this assumption can be made. Under this option, the generated
tiled loop nest also yields a value that can be used to replace the
untiled producer.
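
As an illustration of why the caller has to make this assertion, the sketch below (plain Python, not MLIR) computes which producer indices each consumer tile reads when the consumer is assumed to be a 1-D sliding window of width `window`. The helpers `producer_ranges` and `are_disjoint` are hypothetical names used only for this example.

```python
def producer_ranges(num_outputs, tile_size, window):
    """Half-open producer index ranges read by each consumer tile.

    Assumes consumer output i reads producer indices [i, i + window).
    """
    ranges = []
    for offset in range(0, num_outputs, tile_size):
        size = min(tile_size, num_outputs - offset)
        ranges.append((offset, offset + size + window - 1))
    return ranges


def are_disjoint(ranges):
    # Sorted ranges are disjoint if each one starts at or after the
    # end of the previous one.
    ordered = sorted(ranges)
    return all(prev_end <= start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))
```

With `window=1` (an elementwise consumer) the ranges for 8 outputs and tile size 4 are `(0, 4)` and `(4, 8)`, so the producer tiles are disjoint and yielding the producer value is safe. With `window=3` the ranges become `(0, 6)` and `(4, 10)`, which overlap: the producer is recomputed for indices 4 and 5, so the option should not be set.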
Quick fly-by comment.
Generally we want to move towards the SubsetInsert/Extract interfaces; committing more APIs to offsets + sizes is not where we want to go IMO.