Add the plumbing necessary to call the memref dialect's multiBuffer
function. This will allow separation between choosing which buffers
to multi-buffer and the actual transform.
Alter the multibuffer function to return the newly created
allocation if multi-buffering succeeds. This is necessary to
communicate with the transform dialect hooks what allocation
multi-buffering created.
should this capture the fact that the step should has doubled from 4 to 8 ?