This change does two main things
- An operation might have multiple dependences to the same producer. Not tracking them correctly can result in incorrect code generation with fusion. To rectify this the dependence tracking needs to also have the operand number in the consumer.
- Improve the logic used to find the fused loops making it easier to follow. The only constraint for fusion is that linalg ops (on buffers) have update semantics for the result. Fusion should be such that only one iteration of the fused loop (which is also a tiled loop) must touch only one (disjoint) tile of the output. This could be relaxed by allowing for recomputation that is the default when oeprands are tensors, or can be made legal with promotion of the fused view (in future).