This patch moves the processing of pointer induction variables in collectLoopUniforms from the consecutive pointer phase of the analysis to the phi node phase. Previously, if a pointer induction variable was used by both a scalarized non-memory instruction as well as a vectorized memory instruction, we would incorrectly identify the pointer as uniform. Pointer induction variables should be treated the same as other phi nodes. That is, they are uniform if all users of the induction variable and induction variable update are uniform.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5397 ↗ | (On Diff #71178) | Do we care about bitcasts / GEP chains? (I'm ok with being conservative with this, just making sure it's by choice.) |
5398 ↗ | (On Diff #71178) | Do we need to check that if U is a store, then Ptr is actually the pointer operand of the store, not the value? I'm thinking about: |
5466 ↗ | (On Diff #71178) | The only thing that changed here, functionally, is the addition of isVectorizedMemAccessUse right? |
Thanks for the quick feedback, Michael!
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5397 ↗ | (On Diff #71178) | This is intentionally conservative, and we can probably relax it eventually. This patch was NFC when I tested it on SPEC, by the way. But note that Ptr should be the actual pointer operand of a memory access (see comment below). This covers the case where we have a GEP that is bitcast, and the bitcast is then used by the memory access. We look through bitcasts in isConsecutivePointer, which calls getGEPInstruction. For chains of instructions, we have to prove the user uniform first. This already happens to some degree in the expansion phase of the analysis. If GEP1 is only used by GEP2, and GEP2 remains uniform, GEP1 will be marked uniform as well. The same is true for the GEP in the bitcast case above. |
5398 ↗ | (On Diff #71178) | Yes, nice catch! I'll update the patch. We want all users of the pointer to be memory accesses, where the pointer is the pointer operand. This is all we consider when checking for scalarization. We already know Ptr is the pointer operand of one memory access, but it could be used as the value operand of another. Something like: %x = load i32* %p store i32* %p, i32** %q %p can't remain uniform because we need to store all it's corresponding values to memory. %q remains uniform if the other conditions are met (the store is not scalarized, etc.). |
5466 ↗ | (On Diff #71178) | That's right. I'm happy to save the cleanup for a separate patch if you prefer. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
5398 ↗ | (On Diff #71178) | Yes, this is exactly what I meant. |
5466 ↗ | (On Diff #71178) | My personal preference is to do it the other way around - NFC cleanups go in first, and then you can put up the functional patch (post-cleanup) for review. This kills two birds with one stone:
(If you think the cleanup itself is complex enough to deserve review, then I'd prefer two separate reviews, but, again, cleanup first.) |
Addressed Michael's comments.
- Updated UsersAreMemAccesses to check that uses are pointer operands.
- Added a new test case for the discussed code fragment.
- Rebased on top of the NFC clean up.