Three patterns are added to convert into vector.multi_reduction into a sequence of vector.reduction as the following:

- Transpose the inputs so inner most dimensions are always reduction.
- Reduce rank of vector.multi_reduction into 2d with inner most reduction dim (get the 2d canical form)
- 2D canonical form is converted into a sequence of vector.reduction.

There are two things we might worth in a follow up diff:

- An scf.for (maybe optionally) around vector.reduction instead of unrolling it.
- Breakdown the vector.reduction into a sequence of vector.reduction

(e.g tree-based reduction) instead of relying on how downstream dialects

handle it. Note: this will requires passing target-vector-length