In the following loop the dependence distance is 2 and can only be
vectorized if the vector length is no larger than this.
void foo(int *a, int *b, int N) { #pragma clang loop vectorize(enable) vectorize_width(4) for (int i=0; i<N; ++i) { a[i + 2] = a[i] + b[i]; } }
However, when specifying a VF of 4 via a loop hint this loop is
vectorized. According to [1][2], loop hints are ignored if the
optimization is not safe to apply.
This patch introduces a check to bail of vectorization if the user
specified VF is greater than the maximum feasible VF, unless explicitly
forced with '-force-vector-width=X'.
[1] https://llvm.org/docs/LangRef.html#llvm-loop-vectorize-and-llvm-loop-interleave
[2] https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations
Clamping sounds good.
But I think that there is a difference between the "maximum safe vector width" and "the maximum register bitwidth". The maximum safe distance should be a legality constraint that doesn't depend on the size of the backend registers.
It's perhaps easier to show with examples. This code should produce the same thing (the same width vectors) with and without this patch, I believe, as there is nothing "unsafe" about vectorizing at a higher bitwidth than the vector registers:
https://godbolt.org/z/ePdv3K
(I am trying to not use the term "legal", as it has too many meanings. There is a difference between "legal to vectorize" (as the safety constraints in LoopVectorizationLegality) and "legal vector widths" which just means that the llvm-ir vector can be lowered to a single vector register and I don't think should be very relevant here).