Running: mlir-opt -test-vector-warp-distribute=rewrite-warp-ops-to-scf-if -canonicalize -verify-each=0.
Prior to this revision, IR resembling the following would be produced:
%4 = "vector.load"(%3, %arg0) : (memref<1x32xf32, 3>, index) -> vector<1x1xf32>
This fails verification: the memref has rank 2, so the load needs 2 indices, but only 1 is provided.
The distributed dimension should be inferred from the distributed vector type.
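
As a rough illustration of that inference (a standalone sketch, not the code in this revision), one way to find the distributed dimension is to compare the shape of the sequential vector with the shape of the per-lane distributed vector and pick the single dimension whose size differs. The helper name inferDistributedDim and the use of plain std::vector shapes are hypothetical; the actual pass works on MLIR vector types.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of the MLIR API.
// Returns the index of the single dimension whose size differs between the
// sequential (pre-distribution) shape and the per-lane distributed shape.
// Returns std::nullopt if the ranks differ, if no dimension differs, or if
// more than one dimension differs.
std::optional<std::size_t>
inferDistributedDim(const std::vector<std::int64_t> &sequentialShape,
                    const std::vector<std::int64_t> &distributedShape) {
  if (sequentialShape.size() != distributedShape.size())
    return std::nullopt;
  std::optional<std::size_t> dim;
  for (std::size_t i = 0; i < sequentialShape.size(); ++i) {
    if (sequentialShape[i] == distributedShape[i])
      continue;
    if (dim) // A second differing dimension: ambiguous, give up.
      return std::nullopt;
    dim = i;
  }
  return dim;
}
```

Assuming the sequential vector in the example above has shape 1x32 and each lane holds a 1x1 slice, the helper picks dimension 1, so both indices of the rank-2 load can be formed: one passed through unchanged and one offset by the lane id.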