The current approach for handling iter_args is to replace all uses
of the value used as the init value with the corresponding region
block argument within the scf.for. This is not always correct.
Instead, a more deliberate approach is needed. If the slice being
fused is a slice of the destination operand of the untiled op,
then:
- Make the destination of the fused producer the init value of the loop nest.
- In the tiled and fused producer op, replace the slice of the destination operand with a slice of the corresponding region iter arg of the innermost loop of the generated loop nest.
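
The two steps above can be sketched with a toy model. This is not MLIR
code: it is a minimal Python illustration in which tensors are plain
lists, and `fill`, `accumulate`, `untiled`, and `tiled_and_fused` are
hypothetical names invented for this example. It assumes a fill-like
producer feeding the destination operand of a consumer, and a single
loop standing in for the loop nest.

```python
from typing import List, Optional

Tensor = List[Optional[int]]


def fill(value: int, dest: Tensor) -> Tensor:
    # Destination-style producer (hypothetical): the result takes its
    # shape from dest and overwrites every element.
    return [value for _ in dest]


def accumulate(inp: List[int], dest: Tensor) -> Tensor:
    # Destination-style consumer (hypothetical): adds inp into dest.
    return [d + x for d, x in zip(dest, inp)]


def untiled(xs: List[int], init_value: int = 1) -> Tensor:
    empty: Tensor = [None] * len(xs)   # destination of the producer
    filled = fill(init_value, empty)   # producer
    return accumulate(xs, filled)      # consumer uses it as destination


def tiled_and_fused(xs: List[int], tile_size: int, init_value: int = 1) -> Tensor:
    empty: Tensor = [None] * len(xs)
    # Step 1: the loop's init value is the producer's destination,
    # not the producer's result.
    iter_arg = empty
    for offset in range(0, len(xs), tile_size):
        end = min(offset + tile_size, len(xs))
        # Step 2: the tiled producer's destination is a slice of the
        # region iter arg.
        dest_tile = iter_arg[offset:end]
        filled_tile = fill(init_value, dest_tile)
        out_tile = accumulate(xs[offset:end], filled_tile)
        # Insert the tile back and yield the updated value.
        iter_arg = iter_arg[:offset] + out_tile + iter_arg[end:]
    return iter_arg
```

In this sketch, `tiled_and_fused(xs, 2)` returns the same list as
`untiled(xs)` for any integer list `xs`, which is the equivalence the
rewrite is meant to preserve.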
nit: double are replaced