Added support for peeling loops with exits that are followed either by an unreachable-terminated block or block that has a terminatnig deoptimize call. All blocks in the sequence must have an unique successor, maybe except for the last one.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
| llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h | ||
|---|---|---|
| 132 | This comment doesn't correspond to what you really want. Imagine that a block has 2 successors, one being deoptimizing and another is just a regular block. It falls into category "any of its children is deoptimizing", but it's definitely not what you are looking for. What I suggest is to rewrite it like "Check if we can prove that all paths starting from this block will converge to a block that is terminated by unreachable". | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
| 496 | What DeoptimizingBlocks is used for? To me it looks like a complete doubler of VisitedBlocks. Also no good reason for it to be a parameter. | |
| 499 | Use SmallPtrSet. contains checks in vector are expensive. | |
| 500 | Looks like it may be done much easier. while(BB && VisitedBlocks.insert(BB).second) {
  if ([terminates with unreachable](BB))
    return true;
   BB = BB->getSingleSuccessor();
 }
return false;Please check if it's what you meant here, and simplify the code. | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 510 | Well, it doesn't introduce a new bug, it just expands the scope of the existing problem. I'm pretty sure it's relatively safe to always set "many:1" metadata for unreached blocks, but need more digging to understand the impact of this. | |
| 518 | Compile time-wise, I'd prefer to have traversal limit set by an option. In worst case, you'll need to walk dozens blocks for each exit, and here is where things may go astray. | |
Rewritten IsBlockDeoptimizing in a simpler way with the use of SmallPtrSet.  
Added an option to limit the maximum traversed block chain depth.
| llvm/lib/Transforms/Utils/LoopPeel.cpp | ||
|---|---|---|
| 760 | Instead of this, please fill a SmallVector with DomTreeUpdates and then DT->applyUpdates once. It's a more canonical way of doing this. | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 494 | This name is misleading. It's not always deoptimizing (if it ends with unreachable), and neither *this* block is deoptimizing. Maybe smth like IsBlockFollowedByDeoptOrUnreached? | |
| 502 | There is one thing that bugs me. Imagine a sutiation: loop_exit: call foo() // will not return, but is not a deopt unreachable I'm not sure if it's bad actually. But maybe we should consider checking all other (non deopt) instruction with isGuaranteedToTransferExecutionToSuccessor. Opinions? | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 502 | This is actually exactly the situation would have wanted to ask to be supported. | |
Renamed IsBlockDeoptimizing to IsBlockFollowedByDeoptOrUnreachable
DT is now updated using applyUpdates
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 502 | Do you mean "supported" as "let's peel it" or "let's NOT peel it unless we know unreachable WILL execute"? :) | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 502 | "let's peel it" | |
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 502 | :) | |
This causes lots of lit test failures in an LLVM_ENABLE_EXPENSIVE_CHECKS build. Can you please fix or revert?
Failed Tests (26): LLVM :: ThinLTO/X86/function_entry_count.ll LLVM :: Transforms/LoopFusion/guarded_peel.ll LLVM :: Transforms/LoopFusion/peel.ll LLVM :: Transforms/LoopUnroll/Hexagon/peel-small-loop.ll LLVM :: Transforms/LoopUnroll/dce.ll LLVM :: Transforms/LoopUnroll/peel-loop-conditions-pgo-1.ll LLVM :: Transforms/LoopUnroll/peel-loop-conditions-pgo-2.ll LLVM :: Transforms/LoopUnroll/peel-loop-conditions.ll LLVM :: Transforms/LoopUnroll/peel-loop-inner.ll LLVM :: Transforms/LoopUnroll/peel-loop-nests.ll LLVM :: Transforms/LoopUnroll/peel-loop-noalias-scope-decl.ll LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt-idom-2.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt-idom.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo-deopt.ll LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll LLVM :: Transforms/LoopUnroll/peel-loop-scev-invalidate.ll LLVM :: Transforms/LoopUnroll/peel-loop.ll LLVM :: Transforms/LoopUnroll/peel-loop2.ll LLVM :: Transforms/LoopUnroll/peel-multiple-unreachable-exits.ll LLVM :: Transforms/LoopUnroll/pr33437.ll LLVM :: Transforms/LoopUnroll/pr45939-peel-count-and-complete-unroll.ll LLVM :: Transforms/LoopUnroll/unroll-after-peel.ll LLVM :: Transforms/LoopUnroll/unroll-heuristics-pgo.ll LLVM :: Transforms/LoopUnroll/wrong_assert_in_peeling.ll LLVM :: Transforms/PhaseOrdering/X86/peel-before-lv-to-enable-vectorization.ll
Fixed failing tests with expensive checks enabled. Forgot to remove the DT->verify check that happened every iteration after all needed DT updates which were applied immediately, but with the patch we store some updates to apply them all later at once.
This causes crashes, reverting
$ ./build/rel/bin/opt -passes=loop-unroll-full /tmp/b.ll -disable-output opt: ../../llvm/include/llvm/Support/CFGUpdate.h:87: void llvm::cfg::LegalizeUpdates(ArrayRef<Update<NodePtr>>, SmallVectorImpl<Update<NodePtr>> &, bool, bool) [NodePtr = llvm::BasicBlock *]: Assertion `std::abs(NumInsertions) <= 1 && "Unbalanced operations!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: ./build/rel/bin/opt -passes=loop-unroll-full /tmp/b.ll -disable-output #0 0x0000000001f01953 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:565:13 #1 0x0000000001eff7ce llvm::sys::RunSignalHandlers() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Signals.cpp:98:18 #2 0x0000000001f01cbf SignalHandler(int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:407:1 #3 0x00007f998f9e7140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140) #4 0x00007f998f4c6ce1 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:51:1 #5 0x00007f998f4b0537 abort ./stdlib/abort.c:81:7 #6 0x00007f998f4b040f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8 #7 0x00007f998f4b040f _nl_load_domain ./intl/loadmsgcat.c:970:34 #8 0x00007f998f4bf662 (/lib/x86_64-linux-gnu/libc.so.6+0x34662) #9 0x0000000001539fd3 AdvancePastEmptyBuckets /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/ADT/DenseMap.h:1281:5 #10 0x0000000001539fd3 operator++ /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/ADT/DenseMap.h:1271:5 #11 0x0000000001539fd3 void llvm::cfg::LegalizeUpdates<llvm::BasicBlock*>(llvm::ArrayRef<llvm::cfg::Update<llvm::BasicBlock*> >, llvm::SmallVectorImpl<llvm::cfg::Update<llvm::BasicBlock*> >&, bool, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/CFGUpdate.h:85:17 #12 0x0000000001537265 llvm::GraphDiff<llvm::BasicBlock*, false>::GraphDiff(llvm::ArrayRef<llvm::cfg::Update<llvm::BasicBlock*> >, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/CFGDiff.h:0:5 #13 0x00000000025a0809 applyUpdates /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/include/llvm/Support/GenericDomTree.h:547:5 #14 0x00000000025a0809 llvm::peelLoop(llvm::Loop*, unsigned int, llvm::LoopInfo*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, bool) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Transforms/Utils/LoopPeel.cpp:803:7
Fixed crash on test/Transforms/LoopUnroll/revert-D110922.ll. It happened due to adding the same edges to DomTree due to the fact that one of the loop exiting blocks had a switch terminator and there were multiple edges from that exiting block to an exit. Now we construct a set from the loop exiting edges and then add edge insertion updates only for edges from the set.
Two drive by comments.
One the surface, this seems like it should be two changes. One to do an NFC restructuring of the domtree updates. One to actually enable the broader scope.
The description of the change also needs updated. An exit reaching an unreachable is *not* a "deoptimizing" exit.
Split the patch into 2:
- D111611 changes the way DT is updated,
- This patch adds support for peeling loops with exits followed by unreachable or deopt blocks.
Also changed this patch's title and description to address Philip's comment.
Do I understand correctly that the reason of crash was duplicating exiting edges, which caused duplicating DT, and this is now fixed in D111611?
| llvm/lib/Transforms/Utils/BasicBlockUtils.cpp | ||
|---|---|---|
| 57 | Please rename the option and change description, taking Philip's comment into account. Unreachable-terminated block is not a deopt block. | |
Rebase.
Updated name of the option which limits the maximum path length when checking whether a BB is followed by an unreachable or deoptimizing block.
This comment doesn't correspond to what you really want. Imagine that a block has 2 successors, one being deoptimizing and another is just a regular block. It falls into category "any of its children is deoptimizing", but it's definitely not what you are looking for.
What I suggest is to rewrite it like "Check if we can prove that all paths starting from this block will converge to a block that is terminated by unreachable".