Currently loop-unrolling doesn't preserve loop-simplified form. This patch
fixes it by resimplifying affected loops.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Utils/LoopUnroll.cpp | ||
---|---|---|
386–387 ↗ | (On Diff #66747) | So, this is done because we leave the first copy of the loop alone when unrolling? If not, shouldn't we handle this below so that we don't add inner loops that won't actually exist after unrolling? |
692 ↗ | (On Diff #66747) | Related to the above TODO comment, the ways in which this can break loop simplified form seem fairly constrained. If this is a compile time issue, its likely a more targeted approach could be used... Happy to just leave a comment here if you don't yet know whether we need to consider the compile time here. |
lib/Transforms/Utils/LoopUnroll.cpp | ||
---|---|---|
386–387 ↗ | (On Diff #66747) |
Yes, exactly. When we unroll something like this: outer_loop { inner_loop } we could get something like this: outer_loop { inner_loop inner_loop2 } or even inner_loop inner_loop2 in the case of complete unrolling. inner_loop2 is a new loop, created by loop-unrolling (see lines 417-418). inner_loop existed in the original IR and remains (almost) untouched. It still needs to be simplified, as it'll be using the same exit block, as the cloned loops. |
692 ↗ | (On Diff #66747) |
This is true, that's why I left the TODO there. It should be definitely possible to fix only broken parts (i.e. insert preheaders and split exit-edges). On the other hand, we've been calling simplifyLoop on OuterL without any issues before, which also simplifies all nested loops - that was my motivation for keeping it simple for now. My idea was that comment from above relates to this spot as well, just didn't want to replicate the same TODO. |
Patch LGTM.
Would it make sense to only fill in LoopsToSimplify if OuterL is null? Either way, feel free to land.
Thanks, committed in r278038.
Would it make sense to only fill in LoopsToSimplify if OuterL is null?
I think the checks will clutter the code for not so big benefit, so I decided to go without them.