This is an archive of the discontinued LLVM Phabricator instance.

[LoopFusion] Sorting of undominated FusionCandidates crashes
ClosedPublic

Authored by ram-NK on Dec 13 2022, 8:41 PM.

Details

Summary

This patch tries to fix issue.

If two FusionCandidates are in same level of dominator tree then, they will not be dominates each other. But they are control flow equivalent. To sort those FusionCandidates nonStrictlyPostDominate check is needed.

Diff Detail

Event Timeline

ram-NK created this revision.Dec 13 2022, 8:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2022, 8:41 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
ram-NK requested review of this revision.Dec 13 2022, 8:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2022, 8:41 PM
Narutoworld requested changes to this revision.Dec 14 2022, 12:25 PM
Narutoworld added a subscriber: Narutoworld.
Narutoworld added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
424–447

Some suggestions about grammer.

If two FusionCandidates are in the same level of dominator tree, they will not dominate each other, but may still be control flow equivalent. To sort shose FusionCandidates, nonStrictlyPostDominate() function is needed.

433

I think you also need to add comments on this. When "nonStrictlyPostDominate()" is not useful, using the level of PostDominateTree to compare two candidates.

435–438

A more simplified version.

return (LNode->getLevel() > RNode->getLevel() );

llvm/test/Transforms/LoopFusion/undominated_loops.ll
4

I am not sure if it is good to leave remarks inside a testcase. I perfer to not include the remarks since they are not relevant.

20

Maybe try to use true or false instead. I remembered I recevied a comment before saying limit the use of undef

This revision now requires changes to proceed.Dec 14 2022, 12:25 PM
aeubanks added inline comments.Dec 14 2022, 12:45 PM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
428

please use LLVM coding style, local variables are CamelCase without a prefix

llvm/test/Transforms/LoopFusion/undominated_loops.ll
2

can you get the IR right before loop-fusion? generally we don't want to run multiple passes in a test because that makes it more confusing
(this smells like newgvn isn't properly returning when it invalidates the domtree rather than a loop-fusion issue, but I could be wrong)

StephenFan added inline comments.
llvm/test/Transforms/LoopFusion/undominated_loops.ll
17

Can bb_4 be deleted?

uabelho added inline comments.Dec 14 2022, 10:32 PM
llvm/test/Transforms/LoopFusion/undominated_loops.ll
2

Note that there are two reproducers in the ticket. The one I added in
https://github.com/llvm/llvm-project/issues/56263#issuecomment-1316748908
just requires

opt -passes="loop-fusion"
ram-NK updated this revision to Diff 483186.Dec 15 2022, 7:43 AM

@Narutoworld, @aeubanks, Comments in code are corrected.

ram-NK marked 4 inline comments as done.Dec 15 2022, 7:47 AM
ram-NK updated this revision to Diff 483197.Dec 15 2022, 8:15 AM

Replace with the second test sample from the reported issue. It is simpler and only one pass than the first one. Thanks to @uabelho to pointed out this.

ram-NK marked 5 inline comments as done.Dec 15 2022, 8:17 AM

@uabelho , @aeubanks, @Narutoworld, @StephenFan All comments are updated.

ram-NK updated this revision to Diff 483263.Dec 15 2022, 11:31 AM

format issue fixed.

fhahn added a subscriber: fhahn.Dec 15 2022, 1:51 PM
fhahn added inline comments.
llvm/test/Transforms/LoopFusion/undominated_loops.ll
3

If you are checking debug output, you need to add ; REQUIRES: asserts.

43

are those needed?

fhahn added inline comments.Dec 15 2022, 1:52 PM
llvm/test/Transforms/LoopFusion/undominated_loops.ll
8

If this now fuses loops we didn't fuse before (because it was crashing), it would probably be good to check the generated IR

aeubanks added inline comments.Dec 15 2022, 1:54 PM
llvm/test/Transforms/LoopFusion/undominated_loops.ll
8

+1, instead of checking the debug output, you can use update_test_checks.py

ram-NK updated this revision to Diff 483650.Dec 16 2022, 1:22 PM

test verification updated using update_test_checks.py

ram-NK marked 3 inline comments as done.Dec 16 2022, 1:35 PM

Comments are updated. @fhahn This test cases will not fuse the loops. But this type of FusionCandidates will not allow to proceed the loop fusion.

looks fine, collectFusionCandidates does check isControlFlowEquivalent rather than dominance

llvm/lib/Transforms/Scalar/LoopFuse.cpp
391–398

this comment needs to be updated

428

nit: these local variables should still be capitalized

435–438

no need for parens

ram-NK updated this revision to Diff 484018.Dec 19 2022, 11:05 AM
ram-NK marked 3 inline comments as done.Dec 19 2022, 11:15 AM

@aeubanks All comments are updated.

ram-NK marked an inline comment as done.Dec 19 2022, 11:38 AM

@fhahn,

If this now fuses loops we didn't fuse before (because it was crashing), it would probably be good to check the generated IR

Corrected the IR with generated IR. Loops in the test IR will not fuse. Before, loops are not fused because of crash. Now the loops are not fused based on the proper profitable check.

Narutoworld accepted this revision.EditedDec 21 2022, 10:53 AM

LGTM, please wait a little bit to have other reviewers' opinions.

This revision is now accepted and ready to land.Dec 21 2022, 10:53 AM
bjope resigned from this revision.Dec 22 2022, 12:56 AM
aeubanks added inline comments.Dec 22 2022, 10:35 AM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
432

actually, do we have tests for all the WrongOrder/RightOrder combinations?

ram-NK updated this revision to Diff 484939.Dec 22 2022, 12:59 PM
ram-NK marked an inline comment as done.Dec 22 2022, 1:07 PM
ram-NK added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
432

In the test case, added new predecessor for entry.vector.body_crit_edge from middle.block15 to satisfy this condition.

ram-NK updated this revision to Diff 484960.Dec 22 2022, 2:17 PM
ram-NK marked an inline comment as done.

corrected test case.

This revision was automatically updated to reflect the committed changes.