The InterleavedAccess pass currently matches (de)interleaving
shufflevector instructions with loads or stores, and calls into
target lowering to generate ldN or stN instructions.
Since we can't use shufflevector for scalable vectors (other than a
splat with zeroinitializer), we have the interleave2 and deinterleave2
intrinsics instead. This patch extends InterleavedAccess to recognize
those intrinsics and, where possible, replace them with ld2/st2 via
target lowering.
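
For reference, here is a rough model of the lane semantics of the two
intrinsics. This is only an illustrative sketch, not code from the patch:
std::vector stands in for an IR vector value, and the function names
merely mirror the intrinsic names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative model only (assumption: std::vector stands in for an IR vector).
// deinterleave2: lanes 0,2,4,... go to the first result and lanes 1,3,5,...
// go to the second, which is the lane layout an ld2 produces in two registers.
template <typename T>
std::pair<std::vector<T>, std::vector<T>> deinterleave2(const std::vector<T>& v) {
    std::vector<T> even, odd;
    even.reserve(v.size() / 2);
    odd.reserve(v.size() / 2);
    for (std::size_t i = 0; i + 1 < v.size(); i += 2) {
        even.push_back(v[i]);
        odd.push_back(v[i + 1]);
    }
    return {even, odd};
}

// interleave2: the inverse operation, a0,b0,a1,b1,..., which is the lane
// layout an st2 writes to memory from two registers.
template <typename T>
std::vector<T> interleave2(const std::vector<T>& a, const std::vector<T>& b) {
    std::vector<T> out;
    out.reserve(a.size() + b.size());
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        out.push_back(a[i]);
        out.push_back(b[i]);
    }
    return out;
}
```

In these terms, a load feeding deinterleave2 is the pattern that can become
an ld2, and interleave2 feeding a store is the pattern that can become an st2.
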
Unlike the fixed-length version, we currently cannot 'legalize' the
operation in IR because we don't have a way of concatenating or
splitting vectors at a scalable point, so for now we simply bail
out if the types do not match the actual hardware instructions.
Perhaps add a comment to this and the other function below? That would at least be consistent with the other lowerInterleaved* functions.