This is an archive of the discontinued LLVM Phabricator instance.

[LoopFusion] Fix LI after fusion of guarded loops
ClosedPublic

Authored by dcaballe on Apr 22 2020, 3:35 PM.

Details

Summary

Hello!

We run some experiments with LoopFusion pass and found an issue when fusing two loop nests with zero-trip-count check guards. We thought it would be interesting to contribute a fix, assuming it makes sense!

The problem is that after fusion, an unreachable block (FC0.ExitBlock) is not removed from LI. This makes LI verification fail when the unreachable block is inside a loop since unreachable blocks are not allowed to be part of a loop.

This issue happens depending on the number of loops per nest and the form of the input loops. This is what we have found so far:

  1. For single loop nests, the unreachable block (FC0.ExitBlock) is not nested in any other loop so LI verification doesn't complain.
  2. For double loop nests in canonical form, the unreachable block resulting from fusing the inner loops is still nested in the outer loop so LI verification fails.
  3. For non-canonical double loop nests, no unreachable block seems to be generated, probably because loops don't have dedicated exists.
  4. For loop nests with more than two loops, unreachable blocks are generated regardless of whether they are in canonical or non-canonical form so we have the same problem as in #2.

The fix basically removes FC0.ExitBlock from LI after fusion. We also decided to remove FC1GuardBlock from LI and DT to keep them consistent until FC1 loop is removed later on in the code, in case those analyses are used in the future before that point. Hopefully we are not missing anything else.

The patch also includes a couple of tests with double and a triple loop nest cases. They were passed through regular optimizations, including jump threading and CFG simplifications and loops are in canonical form. They don't include CHECK rules for now. I plan to add them once I have a confirmation that this patch makes sense :). For experimentation, the double loop nest can be turned into non-canonical form by running SimplifyCFG on it.

On a side note, I wonder if we should only fuse loops in canonical form and bail out in other cases. That would reduce the number of loop "flavors" to be supported. A check for that could be added to the legality phase.

Thanks!
Diego

Diff Detail

Event Timeline

dcaballe created this revision.Apr 22 2020, 3:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2020, 3:35 PM
kbarton requested changes to this revision.May 1 2020, 8:46 AM

Hi Diego,
Thank you for submitting this patch!
I think this fix makes sense. If you could go ahead and add the check rules for the test cases that would be great!

To answer your question about only fusing loops in canonical form, I had started a patch in early March that would do exactly that (as well as a bunch of refactoring). Unfortunately I haven't been able to get back to it recently. This patch should work with those changes though, and the test cases will be a welcome addition to the testing to make sure nothing regresses. Thanks again for finding this and posting a fix!

This revision now requires changes to proceed.May 1 2020, 8:46 AM
dcaballe updated this revision to Diff 261995.May 4 2020, 8:08 PM

Hi Kit. This should be it. Thanks for taking a look!

kbarton accepted this revision.May 6 2020, 10:18 AM

One minor comment that I just noticed - can you please change the names of the test cases to use _ instead of - (to be consistent with the other tests in the directory)?
Aside from that, LGTM.

This revision is now accepted and ready to land.May 6 2020, 10:18 AM

One minor comment that I just noticed - can you please change the names of the test cases to use _ instead of - (to be consistent with the other tests in the directory)?
Aside from that, LGTM.

Sure! I'll do that before landing it. Thank you so much!

This revision was automatically updated to reflect the committed changes.