This patch recognizes when tensor.pack/unpack operations are simple tensor.pad/unpad (a.k.a. tensor.extract_slice) operations and lowers them to a simpler sequence of instructions.
For pack, instead of doing:
  pad -> expand_shape -> transpose
we do:
  pad -> insert_slice
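
To illustrate why the two pack sequences agree, here is a minimal sketch in plain Python that models a 2-D tensor as nested lists. It assumes that a "simple pad" pack is one where every outer (tile-count) dimension of the packed result is 1; that condition, and the helper names `pad`, `pack_general`, and `pack_as_pad`, are illustrative assumptions and not the patch's actual code.

```python
def pad(src, rows, cols, value=0):
    """High-pad a 2-D list to rows x cols (models tensor.pad)."""
    return [[src[i][j] if i < len(src) and j < len(src[0]) else value
             for j in range(cols)] for i in range(rows)]


def pack_general(src, t0, t1, value=0):
    """General lowering: pad -> expand_shape -> transpose."""
    n0 = -(-len(src) // t0)      # tile count along dim 0 (ceil division)
    n1 = -(-len(src[0]) // t1)   # tile count along dim 1
    padded = pad(src, n0 * t0, n1 * t1, value)
    # expand_shape: (n0*t0, n1*t1) -> (n0, t0, n1, t1)
    expanded = [[[[padded[a * t0 + i][b * t1 + j] for j in range(t1)]
                  for b in range(n1)] for i in range(t0)] for a in range(n0)]
    # transpose: (n0, t0, n1, t1) -> (n0, n1, t0, t1)
    return [[[[expanded[a][i][b][j] for j in range(t1)] for i in range(t0)]
             for b in range(n1)] for a in range(n0)]


def pack_as_pad(src, t0, t1, value=0):
    """Simplified lowering: pad -> insert_slice (assumes outer dims are 1x1)."""
    assert len(src) <= t0 and len(src[0]) <= t1, "outer dims must all be 1"
    padded = pad(src, t0, t1, value)
    dest = [[None]]        # stands in for an empty 1 x 1 x t0 x t1 tensor
    dest[0][0] = padded    # insert_slice at offset 0 covering the whole tile
    return dest


src = [[1, 2, 3], [4, 5, 6]]
print(pack_general(src, 4, 4) == pack_as_pad(src, 4, 4))
```

With a single tile per dimension, the expand_shape only adds unit dimensions and the transpose only moves them, so no data is reordered and inserting the padded tensor into the higher-ranked destination is enough.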
For unpack, instead of doing:
  transpose -> collapse_shape -> extract_slice
we do:
  extract_slice
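
The unpack case can be sketched the same way, again in plain Python over nested lists. It assumes a packed tensor of shape (n0, n1, t0, t1) and treats the "simple unpad" case as the one where n0 == n1 == 1; the helper names `unpack_general` and `unpack_as_unpad` are hypothetical and only model the op sequences above.

```python
def unpack_general(packed, rows, cols):
    """General lowering: transpose -> collapse_shape -> extract_slice."""
    n0, n1 = len(packed), len(packed[0])
    t0, t1 = len(packed[0][0]), len(packed[0][0][0])
    # transpose (n0, n1, t0, t1) -> (n0, t0, n1, t1), then
    # collapse_shape to (n0*t0, n1*t1), done here in one comprehension.
    collapsed = [[packed[a][b][i][j] for b in range(n1) for j in range(t1)]
                 for a in range(n0) for i in range(t0)]
    # extract_slice: drop the padding
    return [row[:cols] for row in collapsed[:rows]]


def unpack_as_unpad(packed, rows, cols):
    """Simplified lowering: a single extract_slice that also drops the
    unit outer dims (assumes outer dims are 1x1)."""
    assert len(packed) == 1 and len(packed[0]) == 1, "outer dims must all be 1"
    tile = packed[0][0]
    return [row[:cols] for row in tile[:rows]]


packed = [[[[1, 2, 3, 0], [4, 5, 6, 0], [0, 0, 0, 0], [0, 0, 0, 0]]]]
print(unpack_general(packed, 2, 3) == unpack_as_unpad(packed, 2, 3))
```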
Note: returning nullptr for the transform dialect is fine. The related handles are just ignored by the following transformation.