This is an archive of the discontinued LLVM Phabricator instance.

[LoopPeel] Peel loops with exits followed by an unreachable or deopt block
ClosedPublic

Authored by dmakogon on Oct 1 2021, 4:59 AM.

Details

Summary

Added support for peeling loops with exits that are followed either by an unreachable-terminated block or block that has a terminatnig deoptimize call. All blocks in the sequence must have an unique successor, maybe except for the last one.

Diff Detail

Event Timeline

dmakogon created this revision.Oct 1 2021, 4:59 AM
dmakogon requested review of this revision.Oct 1 2021, 4:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2021, 4:59 AM
lebedev.ri added a subscriber: lebedev.ri.
lebedev.ri added inline comments.
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
510

Nice, i was about to ask for that.
But doesn't this lead to the issues with branch weights discussed in D108108?

mkazantsev requested changes to this revision.Oct 1 2021, 5:12 AM
mkazantsev added inline comments.
llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h
132

This comment doesn't correspond to what you really want. Imagine that a block has 2 successors, one being deoptimizing and another is just a regular block. It falls into category "any of its children is deoptimizing", but it's definitely not what you are looking for.

What I suggest is to rewrite it like "Check if we can prove that all paths starting from this block will converge to a block that is terminated by unreachable".

llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
496

What DeoptimizingBlocks is used for? To me it looks like a complete doubler of VisitedBlocks. Also no good reason for it to be a parameter.

499

Use SmallPtrSet. contains checks in vector are expensive.

500

Looks like it may be done much easier.

while(BB && VisitedBlocks.insert(BB).second) {
  if ([terminates with unreachable](BB))
    return true;
   BB = BB->getSingleSuccessor();
 }
return false;

Please check if it's what you meant here, and simplify the code.

This revision now requires changes to proceed.Oct 1 2021, 5:12 AM
mkazantsev added inline comments.Oct 1 2021, 5:25 AM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
510

Well, it doesn't introduce a new bug, it just expands the scope of the existing problem. I'm pretty sure it's relatively safe to always set "many:1" metadata for unreached blocks, but need more digging to understand the impact of this.

518

Compile time-wise, I'd prefer to have traversal limit set by an option. In worst case, you'll need to walk dozens blocks for each exit, and here is where things may go astray.

dmakogon updated this revision to Diff 376858.Oct 4 2021, 4:19 AM

Rewritten IsBlockDeoptimizing in a simpler way with the use of SmallPtrSet.
Added an option to limit the maximum traversed block chain depth.

dmakogon marked 5 inline comments as done.Oct 4 2021, 4:20 AM
mkazantsev added inline comments.Oct 5 2021, 1:28 AM
llvm/lib/Transforms/Utils/LoopPeel.cpp
760

Instead of this, please fill a SmallVector with DomTreeUpdates and then DT->applyUpdates once. It's a more canonical way of doing this.

mkazantsev added inline comments.Oct 5 2021, 1:37 AM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
494

This name is misleading. It's not always deoptimizing (if it ends with unreachable), and neither *this* block is deoptimizing. Maybe smth like IsBlockFollowedByDeoptOrUnreached?

502

There is one thing that bugs me. Imagine a sutiation:

loop_exit:
  call foo() // will not return, but is not a deopt
  unreachable

I'm not sure if it's bad actually. But maybe we should consider checking all other (non deopt) instruction with isGuaranteedToTransferExecutionToSuccessor. Opinions?

lebedev.ri added inline comments.Oct 5 2021, 2:12 AM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
502

This is actually exactly the situation would have wanted to ask to be supported.

dmakogon updated this revision to Diff 377270.Oct 5 2021, 8:57 AM

Renamed IsBlockDeoptimizing to IsBlockFollowedByDeoptOrUnreachable
DT is now updated using applyUpdates

dmakogon marked 2 inline comments as done.Oct 5 2021, 8:57 AM
mkazantsev added inline comments.Oct 5 2021, 8:29 PM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
502

Do you mean "supported" as "let's peel it" or "let's NOT peel it unless we know unreachable WILL execute"? :)

lebedev.ri added inline comments.Oct 6 2021, 12:26 AM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
502

"let's peel it"

mkazantsev accepted this revision.Oct 6 2021, 3:14 AM

In that case everything since done; @dmakogon pls add one more test with single-exit non-throwing unreachable exit and I think we can go with it, unless someone has cons.

llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
502

Ok, we peel it now. :)

This revision is now accepted and ready to land.Oct 6 2021, 3:14 AM
lebedev.ri added inline comments.Oct 6 2021, 3:19 AM
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
502

:)

This revision was automatically updated to reflect the committed changes.
foad added a subscriber: foad.Oct 8 2021, 1:59 AM

This causes lots of lit test failures in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Can you please fix or revert?

Failed Tests (26):
  LLVM :: ThinLTO/X86/function_entry_count.ll
  LLVM :: Transforms/LoopFusion/guarded_peel.ll
  LLVM :: Transforms/LoopFusion/peel.ll
  LLVM :: Transforms/LoopUnroll/Hexagon/peel-small-loop.ll
  LLVM :: Transforms/LoopUnroll/dce.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-conditions-pgo-1.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-conditions-pgo-2.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-conditions.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-inner.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-nests.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-noalias-scope-decl.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt-idom-2.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt-idom.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll
  LLVM :: Transforms/LoopUnroll/peel-loop-scev-invalidate.ll
  LLVM :: Transforms/LoopUnroll/peel-loop.ll
  LLVM :: Transforms/LoopUnroll/peel-loop2.ll
  LLVM :: Transforms/LoopUnroll/peel-multiple-unreachable-exits.ll
  LLVM :: Transforms/LoopUnroll/pr33437.ll
  LLVM :: Transforms/LoopUnroll/pr45939-peel-count-and-complete-unroll.ll
  LLVM :: Transforms/LoopUnroll/unroll-after-peel.ll
  LLVM :: Transforms/LoopUnroll/unroll-heuristics-pgo.ll
  LLVM :: Transforms/LoopUnroll/wrong_assert_in_peeling.ll
  LLVM :: Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll

Reverted, Dima pls investigate.

dmakogon updated this revision to Diff 378159.Oct 8 2021, 3:50 AM

Fixed failing tests with expensive checks enabled. Forgot to remove the DT->verify check that happened every iteration after all needed DT updates which were applied immediately, but with the patch we store some updates to apply them all later at once.

This revision was landed with ongoing or failed builds.Oct 8 2021, 4:10 AM

This causes crashes, reverting

$ ./build/rel/bin/opt -passes=loop-unroll-full /tmp/b.ll -disable-output
opt: ../../llvm/include/llvm/Support/CFGUpdate.h:87: void llvm::cfg::LegalizeUpdates(ArrayRef<Update<NodePtr>>, SmallVectorImpl<Update<NodePtr>> &, bool, bool) [NodePtr = llvm::BasicBlock *]: Assertion `std::abs(NumInsertions) <= 1 && "Unbalanced operations!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ./build/rel/bin/opt -passes=loop-unroll-full /tmp/b.ll -disable-output
 #0 0x0000000001f01953 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:565:13
 #1 0x0000000001eff7ce llvm::sys::RunSignalHandlers() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Signals.cpp:98:18
 #2 0x0000000001f01cbf SignalHandler(int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:407:1
 #3 0x00007f998f9e7140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
 #4 0x00007f998f4c6ce1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #5 0x00007f998f4b0537 abort ./stdlib/abort.c:81:7
 #6 0x00007f998f4b040f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8
 #7 0x00007f998f4b040f _nl_load_domain ./intl/loadmsgcat.c:970:34
 #8 0x00007f998f4bf662 (/lib/x86_64-linux-gnu/libc.so.6+0x34662)
 #9 0x0000000001539fd3 AdvancePastEmptyBuckets /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/ADT/DenseMap.h:1281:5
#10 0x0000000001539fd3 operator++ /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/ADT/DenseMap.h:1271:5
#11 0x0000000001539fd3 void llvm::cfg::LegalizeUpdates<llvm::BasicBlock*>(llvm::ArrayRef<llvm::cfg::Update<llvm::BasicBlock*> >, llvm::SmallVectorImpl<llvm::cfg::Update<llvm::BasicBlock*> >&, bool, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/CFGUpdate.h:85:17
#12 0x0000000001537265 llvm::GraphDiff<llvm::BasicBlock*, false>::GraphDiff(llvm::ArrayRef<llvm::cfg::Update<llvm::BasicBlock*> >, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/CFGDiff.h:0:5
#13 0x00000000025a0809 applyUpdates /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/GenericDomTree.h:547:5
#14 0x00000000025a0809 llvm::peelLoop(llvm::Loop*, unsigned int, llvm::LoopInfo*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Transforms/Utils/LoopPeel.cpp:803:7

aeubanks reopened this revision.Oct 8 2021, 10:53 AM
This revision is now accepted and ready to land.Oct 8 2021, 10:53 AM
mkazantsev requested changes to this revision.Oct 9 2021, 12:35 AM

Thanks @aeubanks! I've checked in your test as test/Transforms/LoopUnroll/revert-D110922.ll to not skip this again. @dmakogon pls investigate.

This revision now requires changes to proceed.Oct 9 2021, 12:35 AM
dmakogon updated this revision to Diff 378675.Oct 11 2021, 8:19 AM

Fixed crash on test/Transforms/LoopUnroll/revert-D110922.ll. It happened due to adding the same edges to DomTree due to the fact that one of the loop exiting blocks had a switch terminator and there were multiple edges from that exiting block to an exit. Now we construct a set from the loop exiting edges and then add edge insertion updates only for edges from the set.

Two drive by comments.

One the surface, this seems like it should be two changes. One to do an NFC restructuring of the domtree updates. One to actually enable the broader scope.

The description of the change also needs updated. An exit reaching an unreachable is *not* a "deoptimizing" exit.

+1 to spliting off dom tree update, especially since it was causing troubles.

aeubanks removed a subscriber: aeubanks.Oct 11 2021, 9:33 AM
dmakogon updated this revision to Diff 378905.Oct 12 2021, 12:37 AM
dmakogon retitled this revision from [LoopPeel] Peel loops with deoptimizing exits to [LoopPeel] Peel loops with exits followed by an unreachable or deopt block.
dmakogon edited the summary of this revision. (Show Details)

Split the patch into 2:

  1. D111611 changes the way DT is updated,
  2. This patch adds support for peeling loops with exits followed by unreachable or deopt blocks.

Also changed this patch's title and description to address Philip's comment.

Do I understand correctly that the reason of crash was duplicating exiting edges, which caused duplicating DT, and this is now fixed in D111611?

mkazantsev requested changes to this revision.Oct 17 2021, 9:01 PM
mkazantsev added inline comments.
llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
57

Please rename the option and change description, taking Philip's comment into account. Unreachable-terminated block is not a deopt block.

This revision now requires changes to proceed.Oct 17 2021, 9:01 PM
dmakogon updated this revision to Diff 382211.Oct 26 2021, 12:26 AM

Rebase.

Updated name of the option which limits the maximum path length when checking whether a BB is followed by an unreachable or deoptimizing block.

This revision is now accepted and ready to land.Nov 1 2021, 12:03 AM