When vectorizing tensor.extract embedded within linalg.generic, the
default option is to rewrite it as vector.gather. When doing so, we need
to make sure that the corresponding indices are vectorized accordingly.
However, the Linalg vectorizer does not vectorize constants such as %c0
and %c1 in the following example, so they must be broadcast explicitly:

    func.func @example(%arg0: tensor<3x3xf32>, %arg2: tensor<1x1x3xf32>) -> tensor<1x1x3xf32> {
      %c0 = arith.constant 1 : index
      %c1 = arith.constant 2 : index
      %1 = linalg.generic { (...) } outs(...) {
      ^bb0(...):
        %2 = tensor.extract %arg0[%c0, %c1] : tensor<3x3xf32>
        linalg.yield %2 : f32
      } -> tensor<1x1x3xf32>
      return %1 : tensor<1x1x3xf32>
    }

This patch makes sure that in this case the vectorizer broadcasts %c0 and %c1.
Additional tests are added to check other scenarios as well.
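
As a rough illustration of the broadcast described above, the sketch below shows the two scalar index constants being turned into index vectors that a vector.gather could consume. It is a hand-written assumption, not output copied from the vectorizer: the function name @broadcast_indices and the SSA names %idx0 and %idx1 are hypothetical, and the vector shape 1x1x3 is taken from the result type in the example. The IR that the patch actually produces may differ.

```mlir
// Hypothetical, hand-written sketch (not vectorizer output).
func.func @broadcast_indices() -> (vector<1x1x3xindex>, vector<1x1x3xindex>) {
  // The same scalar index constants as in the example above.
  %c0 = arith.constant 1 : index
  %c1 = arith.constant 2 : index
  // Broadcast each scalar to the shape of the vectorized result so that
  // every lane carries the same (constant) index.
  %idx0 = vector.broadcast %c0 : index to vector<1x1x3xindex>
  %idx1 = vector.broadcast %c1 : index to vector<1x1x3xindex>
  return %idx0, %idx1 : vector<1x1x3xindex>, vector<1x1x3xindex>
}
```

The scalar constants are left in place here, so any remaining scalar user could still refer to %c0 and %c1.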
Could we use one of the existing utilities that already generate a broadcast op, for example broadcastIfNeeded? I also think we have to make sure that this constant goes through the vectorizeOne code, because a copy is generated for cases where the constant also has a user that still needs a scalar version of it after vectorization.