Similar to the parallelization strategies, the vectorization strategies
provide control on what loops should be vectorize. Unlike the parallel
strategies, only innermost loops are considered, but including reductions,
with the control of vectorizing dense loops only or dense and sparse loops.
The vectorized loops are always controlled by a vector mask to avoid
overrunning the iterations, but subsequent vector operation folding removes
redundant masks and replaces the operations with more efficient counterparts.
Similarly, we will rely on subsequent loop optimizations to further optimize
masking, e.g. using an unconditional full vector loop and scalar cleanup loop.
The current strategy already demonstrates a nice interaction between the
sparse compiler and all prior optimizations that went into the vector dialect.
Ongoing discussion at:
https://llvm.discourse.group/t/mlir-support-for-sparse-tensors/2020/10
Nit: Upper-case T?