- Steps are scaled by vscale, a runtime value.
- Temporarily circumvents the cost-model so that the cost-model can be implemented separately.
This vectorizes:
  void loop(int N, double *a, double *b) {
    #pragma clang loop vectorize_width(4, scalable)
    for (int i = 0; i < N; i++) {
      a[i] = b[i] + 1.0;
    }
  }
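To illustrate what "steps are scaled by vscale" means for this loop, here is a rough scalar model in C. It is only a sketch, not the code the vectorizer emits: the function name `loop_scalable_model` and the `vscale` parameter are hypothetical, and the scalar remainder loop is an assumption about how leftover iterations might be handled. In real generated code vscale is queried from the target at runtime rather than passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the loop under vectorize_width(4, scalable).
 * `vscale` stands in for the runtime vector-length multiple; passing it
 * as an argument is an assumption made purely for illustration. */
void loop_scalable_model(int N, double *a, const double *b, int vscale) {
  assert(vscale > 0);
  const int step = 4 * vscale; /* elements per vector iteration */
  int i = 0;

  /* Vector body: the induction variable advances by 4 * vscale. */
  for (; i + step <= N; i += step)
    for (int lane = 0; lane < step; lane++)
      a[i + lane] = b[i + lane] + 1.0;

  /* Assumed scalar remainder for the last N % step elements. */
  for (; i < N; i++)
    a[i] = b[i] + 1.0;
}
```

With `vscale = 2`, for example, each vector iteration would cover 8 elements, so a call with `N = 11` runs one vector iteration followed by three scalar ones.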
This could do with a quick clang-format.