This is an archive of the discontinued LLVM Phabricator instance.

[runtimeunroll] Support multiple exits to latch exit w/epilogue loop
ClosedPublic

Authored by reames on Aug 3 2021, 11:28 AM.

Details

Summary

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used.

I decided to restrict this to the epilogue case. Given the changes ended up being pretty generic, we may be able to unblock the prolog case too, but I want to do that in a separate change to reduce the amount of code we all have to understand at one time.

Diff Detail

Event Timeline

reames created this revision.Aug 3 2021, 11:28 AM
reames requested review of this revision.Aug 3 2021, 11:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2021, 11:28 AM
nikic accepted this revision.Aug 17 2021, 11:01 AM

LGTM

llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
749–750

No need to create a SmallVector anymore here, you can directly pass {Latch}.

llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
1–2

In https://github.com/llvm/llvm-project/commit/70ffd65ca97bd7010108ad8c1369c105fb78714a you dropped the -instcombine from the other RUN lines, so this EPILOG-NO-IC line is now redundant and can be dropped.

This revision is now accepted and ready to land.Aug 17 2021, 11:01 AM
This revision was landed with ongoing or failed builds.Aug 17 2021, 5:52 PM
This revision was automatically updated to reflect the committed changes.

Hello! I'm seeing a failure from this commit in this test with our downstream Arm32 compiler. Does it ring any bells? Thanks.

21		DominatorTree is different than a freshly computed one!
22			Current:
23		=============================--------------------------------
24		Inorder Dominator Tree: DFSNumbers invalid: 17 slow queries.
25		  [1] %entry {4294967295,4294967295} [0]
26		    [2] %entry.new {4294967295,4294967295} [1]
27		      [3] %header {4294967295,4294967295} [2]
28		        [4] %for.exiting_block {4294967295,4294967295} [3]
29		          [5] %latch {4294967295,4294967295} [4]
30		            [6] %for.exiting_block.1 {4294967295,4294967295} [5]
31		              [7] %latch.1 {4294967295,4294967295} [6]
32		                [8] %latchExit.unr-lcssa.loopexit {4294967295,4294967295} [7]
33		          [5] %for.exit2.loopexit {4294967295,4294967295} [4]
34		        [4] %latchExit.epilog-lcssa.loopexit {4294967295,4294967295} [3]
35		    [2] %latchExit.unr-lcssa {4294967295,4294967295} [1]
36		      [3] %header.epil.preheader {4294967295,4294967295} [2]
37		        [4] %header.epil {4294967295,4294967295} [3]
38		          [5] %for.exiting_block.epil {4294967295,4294967295} [4]
39		            [6] %latch.epil {4294967295,4294967295} [5]
40		      [3] %latchExit {4294967295,4294967295} [2]
41		    [2] %for.exit2 {4294967295,4294967295} [1]
42		    [2] %latchExit.epilog-lcssa {4294967295,4294967295} [1]
43		Roots: %entry 
44		 
45			Freshly computed tree:
46		=============================--------------------------------
47		Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries.
48		  [1] %entry {4294967295,4294967295} [0]
49		    [2] %latchExit.unr-lcssa {4294967295,4294967295} [1]
50		      [3] %header.epil.preheader {4294967295,4294967295} [2]
51		        [4] %header.epil {4294967295,4294967295} [3]
52		          [5] %for.exiting_block.epil {4294967295,4294967295} [4]
53		            [6] %latch.epil {4294967295,4294967295} [5]
54		    [2] %latchExit.epilog-lcssa {4294967295,4294967295} [1]
55		    [2] %latchExit {4294967295,4294967295} [1]
56		    [2] %for.exit2 {4294967295,4294967295} [1]
57		    [2] %entry.new {4294967295,4294967295} [1]
58		      [3] %header {4294967295,4294967295} [2]
59		        [4] %latchExit.epilog-lcssa.loopexit {4294967295,4294967295} [3]
60		        [4] %for.exiting_block {4294967295,4294967295} [3]
61		          [5] %for.exit2.loopexit {4294967295,4294967295} [4]
62		          [5] %latch {4294967295,4294967295} [4]
63		            [6] %for.exiting_block.1 {4294967295,4294967295} [5]
64		              [7] %latch.1 {4294967295,4294967295} [6]
65		                [8] %latchExit.unr-lcssa.loopexit {4294967295,4294967295} [7]
66		Roots: %entry 
67		opt: /scratch/aphipps/triage_repo/tools/llvm_cgt/llvm-project/llvm/lib/IR/Dominators.cpp:412: virtual void llvm::DominatorTreeWrapperPass::verifyAnalysis() const: Assertion `DT.verify(DominatorTree::VerificationLevel::Full)' failed.
bjope added a subscriber: bjope.Aug 19 2021, 6:36 AM

I'm seeing the same fault as @alanphipps when having build with LLVM_ENABLE_EXPENSIVE_CHECKS=ON and running the test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll test case.

Should we revert this patch until this has been investigated further?

I'm seeing the same fault as @alanphipps when having build with LLVM_ENABLE_EXPENSIVE_CHECKS=ON and running the test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll test case.

Should we revert this patch until this has been investigated further?

Should be fixed in 447256f. If not, please revert both changes and I'll take another look.

I do want to briefly address the revert question though. I would not revert based on the first report. There's no reproducer, and (I'm guessing) they're force enabling an internal flag. That would definitely be an unsupported configuration. Your second report is a supported configuration failing with an intree reproducer. Reverting to green in that case would have been completely reasonable.

p.s. Did I miss a buildbot failure on this? I'm surprised I didn't notice the expensive check failure. I would have expected a buildbot to report that.

I'm seeing the same fault as @alanphipps when having build with LLVM_ENABLE_EXPENSIVE_CHECKS=ON and running the test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll test case.

Should we revert this patch until this has been investigated further?

Should be fixed in 447256f. If not, please revert both changes and I'll take another look.

I do want to briefly address the revert question though. I would not revert based on the first report. There's no reproducer, and (I'm guessing) they're force enabling an internal flag. That would definitely be an unsupported configuration. Your second report is a supported configuration failing with an intree reproducer. Reverting to green in that case would have been completely reasonable.

p.s. Did I miss a buildbot failure on this? I'm surprised I didn't notice the expensive check failure. I would have expected a buildbot to report that.

It showed up there https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/20179/

But I think for some reason that one does not send emails. I'll check with the maintainers.

I'm seeing the same fault as @alanphipps when having build with LLVM_ENABLE_EXPENSIVE_CHECKS=ON and running the test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll test case.

Should we revert this patch until this has been investigated further?

Should be fixed in 447256f. If not, please revert both changes and I'll take another look.

I do want to briefly address the revert question though. I would not revert based on the first report. There's no reproducer, and (I'm guessing) they're force enabling an internal flag. That would definitely be an unsupported configuration. Your second report is a supported configuration failing with an intree reproducer. Reverting to green in that case would have been completely reasonable.

Thanks @reames, that sounds reasonable to me. There was no rush to revert from my point of view here (specially considering the quick fix, so it was more of a suggestion in case it would take days to fix the problem).

Why did this only fail under expensive checks? The test has a -verify-dom-info, so I'd have expected that to fail with a normal build as well.

FYI, pushed the prolog equivalent (D108262) with a fix for the DT issue rolled in. It existed there too. If further problems are found, please feel free to revert that as well.

anna added a comment.Aug 23 2021, 9:47 AM

Why did this only fail under expensive checks? The test has a -verify-dom-info, so I'd have expected that to fail with a normal build as well.

IIRC, verify-dom-info is slightly different from performing a full DT verification. We have the full verification of DT in runtime unrolling which is under the expensive checks flag:

  // Verify that the Dom Tree is correct.
  #if defined(EXPENSIVE_CHECKS) && !defined(NDEBUG)
assert(DT->verify(DominatorTree::VerificationLevel::Full));