Prior to this change, the "ExtractSliceFromReshape" pattern would transform
```mlir
%collapsed = tensor.collapse_shape %input [[0, 1], [2]]
    : tensor<1x11x100xf32> into tensor<11x100xf32>
%slice = tensor.extract_slice %collapsed [%offt, 0] [%size, 100] [1, 1]
    : tensor<11x100xf32> to tensor<?x100xf32>
```
into a loop over the range %size - %offt that pieced together multiple
sub-slices of %input along the first dimension. This is correct but
needlessly inefficient. The key technical condition is that collapsing
at most one non-unit dimension of %src cannot cause a subsequent slice
along the corresponding dimension of %collapsed to map across
discontinuities in the index space of %src. The definition of a
"linearized dimension" (from the perspective of tensor.collapse_shape)
is therefore updated to reflect this condition.
The transform will now generate
```mlir
%slice = tensor.extract_slice %input [0, %offt, 0] [1, %size, 100] [1, 1]
    : tensor<1x11x100xf32> to tensor<1x?x100xf32>
%result = tensor.collapse_shape %slice [[0, 1], [2]]
    : tensor<1x?x100xf32> into tensor<?x100xf32>
```
which can be further canonicalized.
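The contiguity condition can be illustrated with a small standalone Python sketch (a hypothetical helper for intuition only, not part of the patch): a slice of a collapsed dimension is expressible as a single extract_slice of the source exactly when the covered indices form a contiguous hyper-rectangle of the grouped source dimensions.

```python
import itertools
import math

def slice_is_rectangular(group_shape, off, size):
    """Return True iff the slice [off, off+size) of the dimension obtained by
    collapsing `group_shape` (row-major) covers a contiguous hyper-rectangle
    of the original dimensions, i.e. could be rewritten as a single slice of
    the source tensor."""
    assert 0 <= off and off + size <= math.prod(group_shape)
    covered = set()
    for lin in range(off, off + size):
        # Delinearize the collapsed index back into a source multi-index.
        idx, rem = [], lin
        for extent in reversed(group_shape):
            idx.append(rem % extent)
            rem //= extent
        covered.add(tuple(reversed(idx)))
    # Bounding box of the covered indices, one index range per source dim.
    box = [range(min(i[d] for i in covered), max(i[d] for i in covered) + 1)
           for d in range(len(group_shape))]
    # Rectangular iff the slice covers its bounding box exactly.
    return covered == set(itertools.product(*box))

# Group [1, 11] has at most one non-unit dim: every slice stays rectangular.
print(slice_is_rectangular([1, 11], off=3, size=5))   # True
# Group [2, 11]: a slice can wrap around the inner extent mid-row.
print(slice_is_rectangular([2, 11], off=5, size=10))  # False
```

With at most one non-unit dimension per reassociation group, the wrap-around case in the second example can never occur, which is what licenses the single extract_slice rewrite above.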
Additional tests are added to check this family of edge cases.