This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Addressed all the review comments.
I'm ready to land this, unless @Whitney has concerns about the test case.
llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
---|---|---|
42 | Yes, this is intentional to make the two loops not control-flow-equivalent, which is what this specific test is checking. | |
270 | This is a good catch. I missed this. I've fixed it to check for the memory dependencies again. |
llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
---|---|---|
42 | The two loops should already be not control-flow-equivalent, as bb7 is not guarded, while bb22 is guarded. Why is the extra guard needed? |
llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
---|---|---|
42 | If the loop is not guarded, the code will use the preheader; if the loop is guarded, it uses the guard. |
llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
---|---|---|
42 | Notice that before changing the test case to rotated form, there is only one loop guard needed. I think we should change the way fusion compute the control flow equivalent between two loops. As a guarded loop (assuming the condition not always true) and an unguarded loop should be consider not control flow equivalent. I am ok to keep that as future improvement. |
llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
---|---|---|
42 | Yes, this is an existing problem that came up recently. I don't think the current patch exacerbates the problem though. |
This guard was not in the original test case, now the second loop has two guards, is that intensional?