We previously only created a vector phi node for an induction variable if its step had a constant integer type. However, the step actually only needs to be loop-invariant. This patch allows a non-constant step if it is loop-invariant and has an integer type.
Details
Diff Detail
- Build Status
Buildable 3971 Build 3971: arc lint + arc unit
Event Timeline
Thanks, Matt!
A few minor comments inline.
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
2376 | Can this really be anything but a trunc? If not, why not CreateTrunc? | |
2384 | Maybe it should? :-) | |
2466 | This doesn't add any new overhead - we'd expand this SCEV into the pre-header anyway, for the scalar induction, right? | |
2489 | Same as above. | |
test/Transforms/LoopVectorize/induction-step.ll | ||
201 | Do we have a negative test-case showing we don't try anything stupid for loop-variant values? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
2376 | It's always a trunc - I thought about that after submitting - I'll update it. | |
2384 | I agree. I'll add FIXME here for now. | |
2466 | Right, no new overhead here. We're just expanding early if the step is loop-invariant. The existing expansion is performed a bit further down at line 2497 (for the scalar IVs and the splat vector IVs). The insertion point for the existing expansion has always been the current IRBuilder insertion point, which is actually within the body of the loop. But for a loop-invariant step, SCEVExpander is smart enough to place it in the pre-header. | |
test/Transforms/LoopVectorize/induction-step.ll | ||
201 | I'll add one. |
test/Transforms/LoopVectorize/induction-step.ll | ||
---|---|---|
201 | Actually, we don't handle inductions with loop-variant steps at all (InductionDescriptor::isInductionPHI). So I'll probably change the patch to assert instead of checking for loop-invariance. |
Addressed Michael's comments.
- Used CreateTrunc instead of CreateSExtOrTrunc.
- Marked IRBuilder comment FIXME.
- Changed loop-invariant check to an assertion (InductionDescriptor only allows loop-invariant steps) and simplified.
Can this really be anything but a trunc? If not, why not CreateTrunc?