When computing the smallest and largest types for selecting the maximum vectorization factor, we currently ignore loads and stores of pointer types if the memory access is non-consecutive. We do this because such accesses must be scalarized regardless of vectorization factor, and thus shouldn't be considered when determining the factor. This patch makes this check less aggressive by also considering non-consecutive accesses that may be vectorized, such as interleaved accesses. Because we don't know at the time of the check if an accesses will certainly be vectorized (this is a cost model decision given a particular VF), we consider all accesses that can potentially be vectorized.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6341 ↗ | (On Diff #89536) | Why would this only apply to pointer types, though? What's special about them? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6341 ↗ | (On Diff #89536) | I'm not sure it makes sense anymore either - I'm happy to remove it. It was added when we could only choose the VF based on the size of the largest type. Maybe the pointer size was just used in place of "a large scalar type size that will cause the max VF to be too small"? I'm actually hoping we can enable -vectorizer-maximize-bandwidth at some point, though. For some context, once I commit D29466 and D29675, ARM/AArch64 should be prepared for the change. I discovered the bug here while testing those two patches (with -vectorizer-maximize-bandwidth=false). I was expecting these patches to be NFC, but for the loop in the test case, we were choosing a very large VF by mistake. |
LGTM
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6341 ↗ | (On Diff #89536) | Hm, right, if we have a loop that mostly works on i8, but gathers the pointers, we'll have a bad time with the MaxVF. And I'd really like to enable maximize-bandwidth as well. I need to run our tests again and see whether we have any regressions on x86. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6341 ↗ | (On Diff #89536) | Sounds good. Thanks, Michael! |