Add a new GradualVectorToSCF pass that lowers vector transfer ops to SCF by gradually unpacking one dimension at time. Unpacking stops at 1D, but can be configured to stop earlier, should the HW support (N>1)-d vectors.
The current implementation cannot handle permutation maps, masks, tensor types and unrolling yet. These will be added in subsequent commits. Once features are on par with VectorToSCF, this implementation will replace VectorToSCF.
LLVM has TargetTransformInfo, which you can query at runtime for such values.