It appears that the way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis.
Details
Diff Detail
Event Timeline
test/Transforms/LoopInterchange/phi-ordering.ll | ||
---|---|---|
15 | Could you add a few more checks, making sure the loop structure is what we expect? |
Hello.
Good suggestion, I've updated the test.
test/Transforms/LoopInterchange/phi-ordering.ll | ||
---|---|---|
15 | This depends on what we expect the loop structure to be :) I originally had this compiling on thumbv6m, where the interchange here is almost certainly going to end up making things slower. The load/store to c would no longer be able to be pulled out of the inner loop. Hence the weak test check that would work even if the performance model here was fixes, and the interchange no longer done. I've changed this test to v7a where I think vectorisation will make this interchange profitable, and added the loop structure checks you suggested. |
Can you not pass the -loop-interchange-threshold option to force the cost model to always interchange? If this isn't possible with the current code then it seems like a major oversight.
LGTM, this is looks like a reasonable fix addressing a FIXME. Please use -loop-interchange-threshold in the test as suggested by @aemerson to ensure the test is always interchanged and hold off committing for a day or so, to give people in other timezones a chance to raise objections.
lib/Transforms/Scalar/LoopInterchange.cpp | ||
---|---|---|
1219 | It might be helpful to keep a comment documenting the need for the induction phi to be the first PHI node here (without the FIXME of course) |
I've updated the test to one that will always be profitable to interchange, as the vectoriser can then kick in.
lib/Transforms/Scalar/LoopInterchange.cpp | ||
---|---|---|
1219 | Good idea. |
It might be helpful to keep a comment documenting the need for the induction phi to be the first PHI node here (without the FIXME of course)