Handling FP IVs without SCEV. Covered simple cases with constant and loop invariant step.
float *A;
float x = init;
for (int i=0; i < N; ++i) {
A[i] = x; x -= fp_inc; }
Differential D21330
Loop vectorization with FP induction variables delena on Jun 14 2016, 11:30 AM. Authored by
Details Handling FP IVs without SCEV. Covered simple cases with constant and loop invariant step. float *A; A[i] = x; x -= fp_inc; }
Diff Detail
Event TimelineComment Actions Some comments inline. I cannot review the LoopVectorize.cpp changes.
Comment Actions Michael, thanks a lot for your comments. I'll upload a new patch.
Comment Actions Thanks, Elena. I guess there's one thing I don't understand about the fnegs - why are you using an IR instruction as the marker of whether the original induction had an fadd or an fsub, instead of a property of the IV? It would make sense to me if you actually used the fneg to feed the vector induction, thus simplifying the code (not having to special-case the sub), but instead you special-case it anyway by looking through the fneg. Another thing I forgot earlier - this should only fire when we're in unsafe/fast math mode, right? Is there a check for that?
Comment Actions What do you mean by "property of IV" ? Do you suggest to add a special field to InductionDescriptor? Comment Actions
You know, I probably just don't understand the SCEV situation well enough. Comment Actions
Comment Actions Hi Elena, Sorry for the delay, I was out on vacation. I'm catching up on email now, and I will try to get to review this this week. Thanks! Comment Actions Some comments inline. Please let me know if you want me to look at something specific, but I'm not familiar enough with the code this patch touches to lgtm it.
Comment Actions Some changes according to Sanjoy's comment. Thanks Sanjoy and Michael for review. Comment Actions I want to go over the code again, after the changes, but the reason I don't feel like I can accept the patch is because I wasn't part of the original FP SCEV discussion, and I'm not sure I understand the design considerations. If someone - e.g. sanjoy - OK's the design, I can LGTM the LV code change.
Comment Actions The bottom line of the FP SCEV discussion was the point that FP SCEV is overkill for "secondary" IV (like in the example above). We'll need FP SCEV for primary FP IV like
Comment Actions I've just started looking at this too. Please give me a few mins. So far I only encountered minor things.
Comment Actions Sorry, didn't realize anyone else is still interested in looking at this, given how long this patch has been up. Comment Actions No, *I am* sorry for chiming in this late. I felt obliged because you mentioned that no one looked at this from the original llvm-dev thread and I felt bad ;). And thanks for reviewing it, this is looking pretty good. Comment Actions So with these it should LGTM too. I haven't checked everything (most notably the unsafe math part). I just wanted to see whether this was in line with the direction set in the original llvm-dev thread and it is! Thanks to all of you and sorry about the delay again.
Comment Actions You should probably also have a test for the case where there is not "fast" attribute on the step instruction but we can still vectorize with the hints.
Comment Actions LGTM with the x86 test split into its own test under Transform/LoopVectorize/X86. Thanks!
|