When we vectorize a scalar constant that is defined outside the loops being vectorized, the vector constant is inserted before its first user. If the scalar constant has more than one user, the vector constant may then fail to dominate all of them, for example when the first user sits in a nested loop and a later user sits in the enclosing loop. To fix the problem, we find the innermost vectorized loop that encloses the first user and insert the vector constant at the top of that loop's body.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thank you, changes look good. Can you please add a brief summary for the fix (unless you think it is generally obvious)?
Without this fix, the output for the example would be as follows, violating SSA dominance for %cst_0 (it is defined inside the inner loop but also used after that loop):
mlir-opt -affine-super-vectorize="virtual-vector-size=128 test-fastest-varying=0" -split-input-file ../mlir/test/Dialect/Affine/input.mlir -verify-each=0

module {
  func @vec_constant_with_two_users(%arg0: index, %arg1: index) -> (f32, f32) {
    %0 = memref.alloc(%arg0, %arg1) : memref<?x?xf32>
    %1 = memref.alloc(%arg0) : memref<?xf32>
    %cst = constant 1.000000e+00 : f32
    affine.for %arg2 = 0 to %arg0 step 128 {
      affine.for %arg3 = 0 to %arg1 {
        %cst_0 = constant dense<1.000000e+00> : vector<128xf32>
        vector.transfer_write %cst_0, %0[%arg3, %arg2] : vector<128xf32>, memref<?x?xf32>
      }
      vector.transfer_write %cst_0, %1[%arg2] : vector<128xf32>, memref<?xf32>
    }
    %c12 = constant 12 : index
    %2 = affine.load %0[%c12, %c12] : memref<?x?xf32>
    %3 = affine.load %1[%c12] : memref<?xf32>
    return %2, %3 : f32, f32
  }
}
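For comparison, here is a sketch of what the output would be expected to look like with the fix applied. It is inferred from the summary (the vector constant goes to the top of the body of the innermost vectorized loop enclosing the first user, which here is assumed to be the `%arg2` loop with step 128) and is not copied from actual `mlir-opt` output or from the patch's test:

```mlir
module {
  func @vec_constant_with_two_users(%arg0: index, %arg1: index) -> (f32, f32) {
    %0 = memref.alloc(%arg0, %arg1) : memref<?x?xf32>
    %1 = memref.alloc(%arg0) : memref<?xf32>
    %cst = constant 1.000000e+00 : f32
    affine.for %arg2 = 0 to %arg0 step 128 {
      // Assumed placement: top of the vectorized loop body, so the
      // definition dominates both transfer_write users below.
      %cst_0 = constant dense<1.000000e+00> : vector<128xf32>
      affine.for %arg3 = 0 to %arg1 {
        vector.transfer_write %cst_0, %0[%arg3, %arg2] : vector<128xf32>, memref<?x?xf32>
      }
      vector.transfer_write %cst_0, %1[%arg2] : vector<128xf32>, memref<?xf32>
    }
    %c12 = constant 12 : index
    %2 = affine.load %0[%c12, %c12] : memref<?x?xf32>
    %3 = affine.load %1[%c12] : memref<?xf32>
    return %2, %3 : f32, f32
  }
}
```

With this placement the IR should pass the verifier, so the `-verify-each=0` flag used to print the invalid output would no longer be needed.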
@nicolasvasilache Thank you for the review. We'll merge it tomorrow if there are no more comments.