This patch fixes a crash encountered when vectorising the following loop:
void foo(float *dst, float *src, long long n) { for (long long i = 0; i < n; i++) dst[i] = -src[i]; }
using scalable vectors. I've added a test to
Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
and have also cleaned up the other tests in the same file.
From looking at isScalarAfterVectorization and collectLoopScalars, if the node is scalar after vectorization, it could be that:
For the case where the result is scalar after vectorization, I would have expected VectorTy->getVectorElementType() to be passed to TTI.getArithmeticInstrCost, not VectorTy, so I wonder if that's a bug.
Also, for case 2, we shouldn't be multiplying the cost by N. This code should instead check explicitly whether the node is scalarized, rather than relying on the more broadly defined isScalarAfterVectorization.
When scalarization does happen, the cost must be multiplied by VF.getFixedValue() instead (which has the implicit assert that VF is not scalable), so that you can avoid adding an unnecessary branch.
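To make the suggestion concrete, here is a rough, self-contained sketch of the cost logic I have in mind. It does not use the real LLVM classes: ElementCount below is a toy stand-in for llvm::ElementCount, and WideningKind and arithmeticCost are hypothetical names invented for illustration. The two cost parameters stand for what TTI.getArithmeticInstrCost would return for the element type and the vector type respectively.

```cpp
#include <cassert>

// Toy stand-in for llvm::ElementCount (assumption: not the real class).
struct ElementCount {
  unsigned MinVal; // known minimum number of elements
  bool Scalable;   // true for <vscale x N x ty> style counts

  bool isScalable() const { return Scalable; }

  // Mirrors the idea that a fixed value only exists for fixed-width VFs.
  unsigned getFixedValue() const {
    assert(!Scalable && "fixed value requested for a scalable VF");
    return MinVal;
  }
};

// Hypothetical classification of how an instruction is handled.
enum class WideningKind {
  Widened,       // one vector instruction
  UniformScalar, // one scalar instruction (result is scalar after vectorization)
  Scalarized     // replicated once per lane
};

// ScalarOpCost: cost queried with the element type.
// VectorOpCost: cost queried with the vector type.
unsigned arithmeticCost(WideningKind Kind, ElementCount VF,
                        unsigned ScalarOpCost, unsigned VectorOpCost) {
  switch (Kind) {
  case WideningKind::Widened:
    return VectorOpCost;
  case WideningKind::UniformScalar:
    // A single scalar op: use the scalar cost, with no multiplication.
    return ScalarOpCost;
  case WideningKind::Scalarized:
    // Only legal for fixed-width VFs; getFixedValue() asserts otherwise,
    // so no separate "is VF scalable?" branch is needed here.
    return VF.getFixedValue() * ScalarOpCost;
  }
  return 0; // unreachable for valid enum values
}
```

With this shape, a scalable VF never reaches the multiplication unless something has wrongly decided to scalarize, in which case the assert fires rather than the cost model silently returning a misleading number.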