This revision adds support for vectorizing more general linalg operations with projected permutation maps.

This is achieved by eagerly broadcasting the intermediate vector to the common size
of the iteration domain of the linalg op. This allows a much more natural expression of
generalized vectorization but may introduce additional computations until all the
proper canonicalizations are implemented.
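
As a rough illustration of the idea (not the actual MLIR implementation), the
following pure-Python sketch models a projected permutation map as a list of
distinct iteration-dimension indices and broadcasts an operand to every point
of the iteration domain. The helper names `is_projected_permutation` and
`broadcast_to_iteration_domain` are hypothetical and exist only for this example.

```python
from itertools import product


def is_projected_permutation(results, num_dims):
    """A map (d0, ..., dN-1) -> (d_r0, d_r1, ...) is modeled by `results`.

    It is a projected permutation when every result is a valid, distinct
    iteration dimension, i.e. a subset of a permutation of the inputs.
    """
    in_range = all(0 <= r < num_dims for r in results)
    return in_range and len(set(results)) == len(results)


def broadcast_to_iteration_domain(operand, results, domain_shape):
    """Replicate `operand` (nested lists) over the full iteration domain.

    Returns a dict from iteration point to the operand element read at that
    point. Dimensions missing from `results` are broadcast dimensions.
    """
    if not is_projected_permutation(results, len(domain_shape)):
        raise ValueError("indexing map is not a projected permutation")
    out = {}
    for point in product(*(range(n) for n in domain_shape)):
        value = operand
        for dim in results:  # index the operand with the projected dims
            value = value[point[dim]]
        out[point] = value
    return out


# (d0, d1) -> (d1): a 1-D operand broadcast along d0.
bias = [10, 20, 30]
bias_b = broadcast_to_iteration_domain(bias, [1], (2, 3))

# (d0, d1) -> (d1, d0): a transposed 3x2 operand read over a 2x3 domain.
a = [[1, 2], [3, 4], [5, 6]]
a_t = broadcast_to_iteration_domain(a, [1, 0], (2, 3))
```

Once every operand is expressed over the same iteration domain, the body of the
op can be applied pointwise. The cost is the redundant replication, which is the
additional computation that later canonicalizations are expected to remove.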

This generalization modifies the vector.transfer_read/write permutation logic and
exposes the fact that the logic employed in vector.contract was too ad-hoc.

As a consequence, the permutation / transposition logic for contraction changes.
In turn, this requires supporting more cases in the lowering of vector.contract
to matrix intrinsics so that the corresponding tests pass.