Page MenuHomePhabricator

[SCCP] Directly remove non-feasible edges
ClosedPublic

Authored by nikic on Jul 21 2020, 1:10 PM.

Details

Summary

Non-feasible control-flow edges are currently removed by replacing the branch condition with a constant and then calling ConstantFoldTerminator. This happens in a rather roundabout manner, by inspecting the users (effectively: predecessors) of unreachable blocks, and further complicated by the need to explicitly materialize the condition for "forced" edges. I would like to extend SCCP to discard switch conditions that are non-feasible based on range information, but this is incompatible with the current approach (as there is no single constant we could use.)

Instead, this patch explicitly removes non-feasible edges. It currently only handles the case where there is a single feasible edge. The llvm_unreachable() branch will need to be implemented for the aforementioned switch improvement.

Diff Detail

Event Timeline

nikic created this revision.Jul 21 2020, 1:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2020, 1:10 PM
efriedma added inline comments.Jul 21 2020, 2:34 PM
llvm/lib/Transforms/Scalar/SCCP.cpp
1825

Iterating over a smallptrset is non-deterministic.

1827

I'm not sure calling removePredecessor once is enough if there are multiple edges to the same successor, for example in a switch.

1837

Even if only one basic block is a possible successor, there could still be multiple edges, for example in a switch. (This matters because you might need to update PHI nodes.)

ConstantFoldTerminator does some stuff with profile metadata; do we need something similar here?

1840

Is it possible that no successor is feasible?

fhahn added inline comments.Jul 21 2020, 2:49 PM
llvm/test/Transforms/SCCP/switch-constantfold-crash.ll
8

These checks seem odd. We care basically just creating a FileCheck variable, but never match the variable elsewhere, right? So we aren't actually checking if we branch to bb9 here, unless I am missing something.

nikic marked 4 inline comments as done.Jul 21 2020, 2:57 PM
nikic added inline comments.
llvm/lib/Transforms/Scalar/SCCP.cpp
1827

I think you are right. removePredecessor() is basically removeIncomingValue() on all phi nodes, and that function only removes a single incoming value. I'll try to come up with a failing test case.

1837

The profile metadata adjustment is only done when merging cases into the default destination. We don't need to adjust metadata if we convert into an unconditional branch.

1840

This shouldn't be possible right now, as we always force one successor to be feasible, even if it was undef/unknown. It might be possible in the future if we want to make use of branch-on-undef being UB and all successors being non-feasible.

llvm/test/Transforms/SCCP/switch-constantfold-crash.ll
8

Yes, this is a weakness of update_test_checks generated check lines. Do you want me to manually undo these?

nikic updated this revision to Diff 279899.Jul 22 2020, 11:44 AM

Fix handling of multi-edges. The corresponding test case has been added in https://github.com/llvm/llvm-project/commit/eae6bb3807977a2998ac9114a1d6ecb6bdafc3cd.

nikic marked 5 inline comments as done.Jul 22 2020, 11:47 AM
nikic added inline comments.
llvm/lib/Transforms/Scalar/SCCP.cpp
1825

The updated version no longer iterates over SmallPtrSet :)

1827

This is now fixed. Previously the test case from https://github.com/llvm/llvm-project/commit/eae6bb3807977a2998ac9114a1d6ecb6bdafc3cd crashed.

llvm/test/Transforms/SCCP/switch.ll
21

This test change shows another weakness of the previous approach: Even though we already determined that here only the default edge is feasible, we did not make use of this fact, because the target block is still reachable through a different edge. Because ConstantFoldTerminator was called for the predecessors of unreachable blocks, this case was missed.

efriedma added inline comments.Jul 22 2020, 12:31 PM
llvm/lib/Transforms/Scalar/SCCP.cpp
1831

Can we skip creating the unconditional branch if the terminator is already an unconditional branch?

Is it possible for this code to see other branches which inherently only have one successor, e.g. catchret?

1837

Even if only one basic block is a possible successor, there could still be multiple edges, for example in a switch. (This matters because you might need to update PHI nodes.)

Any response to this?

nikic updated this revision to Diff 279935.Jul 22 2020, 1:38 PM
nikic marked 2 inline comments as done.

Fix multi-edge handling more thoroughly.

nikic marked 2 inline comments as done.Jul 22 2020, 1:41 PM
nikic added inline comments.
llvm/lib/Transforms/Scalar/SCCP.cpp
1831

Any of this code is only executed if there is at least one non-feasible successor, i.e. only cases for which SCCP supports feasible edge analysis (that is more precise than "all successors"). Those are conditional branches, switches and indirectbr, thus the types in the assert.

1837

Sorry, I misunderstood what you meant here and thought this was resolved by the previous change. I've now added two additional test cases in https://github.com/llvm/llvm-project/commit/e20b3079c14d8e68ebc90125eb95156f2e1ee9e4 and fixed this to remove all but one of the edges.

This revision is now accepted and ready to land.Jul 22 2020, 2:31 PM
This revision was automatically updated to reflect the committed changes.
nikic marked an inline comment as done.
MaskRay reopened this revision.Jul 23 2020, 5:53 PM
MaskRay added a subscriber: MaskRay.

Reverted this revision and its follow-up 5db5b4bc4394ca247c9eb665e03b851848aa2fbf in 4637daa9905cbe12adba85fc88ea95a3be7db85d

This broke a stage-2 build

This revision is now accepted and ready to land.Jul 23 2020, 5:53 PM
nikic added a comment.Jul 24 2020, 9:55 AM

@MaskRay How would I go about reproducing this issue? This is the first time I'm trying a stage2 build. I've used this cmake configuration

cmake -G Ninja ../llvm \
  -DCLANG_ENABLE_BOOTSTRAP=On \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="X86;Hexagon" \
  -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
  -DLLVM_APPEND_VC_REV=false \
  -DCLANG_ENABLE_ARCMT=false \
  -DCLANG_ENABLE_STATIC_ANALYZER=false

and then ran ninja stage2. However, I did not get an error this way.

@MaskRay How would I go about reproducing this issue? This is the first time I'm trying a stage2 build. I've used this cmake configuration

cmake -G Ninja ../llvm \
  -DCLANG_ENABLE_BOOTSTRAP=On \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="X86;Hexagon" \
  -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
  -DLLVM_APPEND_VC_REV=false \
  -DCLANG_ENABLE_ARCMT=false \
  -DCLANG_ENABLE_STATIC_ANALYZER=false

and then ran ninja stage2. However, I did not get an error this way.

I usually set up things myself, by treating it as a regular build, just with additionally setting -DCMAKE_C_COMPILER= and -DCMAKE_CXX_COMPILER= to the local build I want to use to build stage2. Before building stage2, you just have to make sure the stage1 compiler build is up-to-date with the changes you want.

@MaskRay I guess some additional options need to be set for the build? It would be great if you could provide additional details, ideally with the full traceback as well.

nikic added a comment.Jul 25 2020, 1:28 AM

I usually set up things myself, by treating it as a regular build, just with additionally setting -DCMAKE_C_COMPILER= and -DCMAKE_CXX_COMPILER= to the local build I want to use to build stage2. Before building stage2, you just have to make sure the stage1 compiler build is up-to-date with the changes you want.

Thanks, that is indeed much simpler to work with.

After some tinkering with different configs, I now managed to reproduce this by compiling with -fexperimental-new-pass-manager.

clang++: /home/nikic/llvm-project/llvm/include/llvm/Support/GenericDomTree.h:641: void llvm::DominatorTreeBase<NodeT, IsPostDom>::eraseNode(NodeT*) [with NodeT = llvm::BasicBlock; bool IsPostDom = false]: Assertion `Node->isLeaf() && "Node is not a leaf node."' failed.
...
 #9 0x000055e32f884bab llvm::DomTreeUpdater::eraseDelBBNode(llvm::BasicBlock*) (/home/nikic/llvm-project/build/bin/clang+++0x528bbab)
#10 0x000055e32f884d15 llvm::DomTreeUpdater::forceFlushDeletedBB() (/home/nikic/llvm-project/build/bin/clang+++0x528bd15)
#11 0x000055e32f885ebc llvm::DomTreeUpdater::dropOutOfDateUpdates() (/home/nikic/llvm-project/build/bin/clang+++0x528cebc)
#12 0x000055e32d566aa1 llvm::runIPSCCP(llvm::Module&, llvm::DataLayout const&, std::function<llvm::TargetLibraryInfo const& (llvm::Function&)>, llvm::function_ref<llvm::AnalysisResultsForFn (llvm::Function&)>) (/home/nikic/llvm-project/build/bin/clang+++0x2f6daa1)
#13 0x000055e32d0ad35c llvm::IPSCCPPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/nikic/llvm-project/build/bin/clang+++0x2ab435c)
nikic added a comment.Jul 25 2020, 2:21 AM
  • How do I get the pre-ipsccp IR under NewPM -- without getting a full print-after-all dump? There is no opt-bisect and I didn't find anything else. Is inserting manual M.dump() calls really the state of the art here?
  • How do I reduce test cases under NewPM? After wrangling with it for a while, I finally understood that bugpoint just doesn't understand NewPM at all. So I tried llvm-reduce. After learning how to make that work -- which seems way more complicated than bugpoint -- it just managed to break the module after a couple iterations :(

Per a recent mailing list thread, people have been using NewPM in production for years -- how do you debug issues under it?

For the record, this is the IR dump I managed to get: https://gist.github.com/nikic/138cbd24f0c87b847d52fa9d941a4ae4 And the reason why this only affects NewPM is that we don't preserve DT/PDT under the LegacyPM in this pass at all.

nikic added a comment.Jul 25 2020, 3:00 AM

Okay, it looks like you can use bugpoint with NewPM after all, you just need to use it the same way as llvm-reduce, via -compile-custom -compile-command. With that and some manual cleanup, here is a reduced test case:

; opt -passes=ipsccp
define i32 @test() {
entry:
  br label %for.body

for.body:                                         ; preds = %entry
  br i1 true, label %if.then2, label %if.else

if.then2:                                         ; preds = %for.body
  br label %for.inc

if.else:                                          ; preds = %for.body
  br i1 undef, label %lor.rhs, label %if.then19.critedge

lor.rhs:                                          ; preds = %if.else
  br i1 undef, label %if.then19, label %for.inc

if.then19.critedge:                               ; preds = %if.else
  br label %if.then19

if.then19:                                        ; preds = %if.then19.critedge, %lor.rhs
  unreachable

for.inc:                                          ; preds = %lor.rhs, %if.then2
  unreachable
}
nikic updated this revision to Diff 280668.Jul 25 2020, 4:36 AM

Apply DTU updates only after changing the CFG and add previously crashing test. This is the first point in the CAUTION list at https://github.com/llvm/llvm-project/blob/3c1476d26c769cd97a631a129b30c62232ac96b6/llvm/include/llvm/Analysis/DomTreeUpdater.h#L135-L139.

fhahn accepted this revision.Jul 25 2020, 4:50 AM

Apply DTU updates only after changing the CFG and add previously crashing test. This is the first point in the CAUTION list at https://github.com/llvm/llvm-project/blob/3c1476d26c769cd97a631a129b30c62232ac96b6/llvm/include/llvm/Analysis/DomTreeUpdater.h#L135-L139.

That sounds right, thanks! LGTM

llvm/test/Transforms/SCCP/domtree-update.ll
2 ↗(On Diff #280668)

perhaps also add verify<domtree>

Also, even though we don't actively preserve the DT with the legacy pass manager, we might as well run the test with the legacy pass manager as well. Or add a note why we only check the NewPM

This revision was automatically updated to reflect the committed changes.

Glad that you figured out the problem! We are indeed using -fexperimental-new-pass-manager. Sorry that I was not very responsive yesterday because I was/am investigating a mysterious PGO regression.