If the number of instructions in horizontal reduction list is not the power of 2 then only PowerOf2Floor(NumberOfInstructions) last elements are actually vectorized, other instructions remain scalar. Patch tries to vectorize the remaining elements either.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4275 ↗ | (On Diff #85156) | As long as you're changing this, maybe name this "Idx" (or leave it as "i" - this is against convention in the strict sense, but that's one of the common exceptions). But I'm also ok with "I" if you think it's better. Also, this is exactly an example of what I was talking about on the other patch - you could commit the name change separately, as a preparatory patch, and then this patch would only include functional changes, which makes it much easier to review. |
4291 ↗ | (On Diff #85156) | Can you explain why you changed ReducedVals[i] to VL[0] here and below? |
4310 ↗ | (On Diff #85156) | This whole ReduxWidth stuff is fishy. If this makes sense, then changing this can be another preparatory NFC patch before this one. If not, then why not? |
test/Transforms/SLPVectorizer/X86/horizontal-list.ll | ||
572 ↗ | (On Diff #85156) | Could you add a test for something like width 31, so that we get 16 + 8 + 4 + 3? |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4275 ↗ | (On Diff #85156) | Reverted it back. clang-tidy made this change. |
4291 ↗ | (On Diff #85156) | Because ReducedVals[i] actually is the same as VL[0]. Take a look at line 4278. VL ArrayRef starts from ReducedVals[i]. I had to make this change because changed the logic of updating of i counter. But anyway, reworked this part a little bit. |
4310 ↗ | (On Diff #85156) | Ok, I'll try to prepare the NFC patch for this member |
test/Transforms/SLPVectorizer/X86/horizontal-list.ll | ||
572 ↗ | (On Diff #85156) | Ok, will do |
One more nit.
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4277 ↗ | (On Diff #85357) | Could you turn this into a while loop? A for with only the middle term looks odd. |