This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 41867 Build 42188: arc lint + arc unit
Event Timeline
Addressed all the review comments.
I'm ready to land this, unless @Whitney has concerns about the test case.
| llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
|---|---|---|
| 42 | Yes, this is intentional to make the two loops not control-flow-equivalent, which is what this specific test is checking. | |
| 270 | This is a good catch. I missed this. I've fixed it to check for the memory dependencies again. | |
| llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
|---|---|---|
| 42 | The two loops should already be not control-flow-equivalent, as bb7 is not guarded, while bb22 is guarded. Why is the extra guard needed? | |
| llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
|---|---|---|
| 42 | If the loop is not guarded, the code will use the preheader; if the loop is guarded, it uses the guard. | |
| llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
|---|---|---|
| 42 | Notice that before changing the test case to rotated form, there is only one loop guard needed. I think we should change the way fusion compute the control flow equivalent between two loops. As a guarded loop (assuming the condition not always true) and an unguarded loop should be consider not control flow equivalent. I am ok to keep that as future improvement. | |
| llvm/test/Transforms/LoopFusion/cannot_fuse.ll | ||
|---|---|---|
| 42 | Yes, this is an existing problem that came up recently. I don't think the current patch exacerbates the problem though. | |
This guard was not in the original test case, now the second loop has two guards, is that intensional?