This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)
AcceptedPublic

Authored by lebedev.ri on Jul 19 2021, 2:54 PM.

Details

Summary

LLVM IR SSA form is "implicit" in @pr51125. While is a valid LLVM IR,
and does not require any PHI nodes, that completely breaks the further logic
in CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses()
that updates the live-out uses of the bonus instructions.

What i believe we need to do, is to first make the SSA form explicit,
by inserting tautological PHI nodes, and rewriting the offending uses.

$ /builddirs/llvm-project/build-Clang12/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -tv -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=10 -tv -o /dev/null /tmp/test.ll 

----------------------------------------
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  br label %L

%L:
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  br i1 %iszero, label %exit, label %L2

%L2:
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp eq i32 %ld, 4294967295
  br i1 %cmp, label %L, label %exit

%exit:
  %r = phi i32 [ %ld, %L2 ], [ %ld, %L ]
  ret i32 %r
}
=>
@global_pr51125 = global 4 bytes, align 4

define i32 @pr51125() {
%entry:
  %ld.old = load i32, * @global_pr51125, align 4
  %iszero.old = icmp eq i32 %ld.old, 0
  br i1 %iszero.old, label %exit, label %L2

%L2:
  %ld2 = phi i32 [ %ld.old, %entry ], [ %ld, %L2 ]
  store i32 4294967295, * @global_pr51125, align 4
  %cmp = icmp ne i32 %ld2, 4294967295
  %ld = load i32, * @global_pr51125, align 4
  %iszero = icmp eq i32 %ld, 0
  %or.cond = select i1 %cmp, i1 1, i1 %iszero
  br i1 %or.cond, label %exit, label %L2

%exit:
  %ld1 = phi i32 [ poison, %L2 ], [ %ld.old, %entry ]
  %r = phi i32 [ %ld2, %L2 ], [ %ld.old, %entry ]
  ret i32 %r
}
Transformation seems to be correct!

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125

Diff Detail

Event Timeline

lebedev.ri created this revision.Jul 19 2021, 2:54 PM
lebedev.ri requested review of this revision.Jul 19 2021, 2:54 PM

It seems odd that we need to "prepare the IR" so late. At first glance, only looking at pr51125, it seems we treat uses in PHIs as uses of the phi block and not the predecessor (edge). Is that the problem?

It seems odd that we need to "prepare the IR" so late.

Well, i view this part of CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(), just done before the CFG changes.

At first glance, only looking at pr51125, it seems we treat uses in PHIs as uses of the phi block and not the predecessor (edge). Is that the problem?

"as uses *in* the phi block" you mean? I'm afraid i'm not really seeing it?

At first glance, only looking at pr51125, it seems we treat uses in PHIs as uses of the phi block and not the predecessor (edge). Is that the problem?

"as uses *in* the phi block" you mean? I'm afraid i'm not really seeing it?

yes, uses *in* the phi block. In pr51125 we have 4 uses of %ld and right now we treat one different than the remaining 3. The reason, as I imagine, is
that we do not consider the use of %ld in the phi %r to be inside of %L` but inside of %exit instead. Conceptually it's on the edge but it's certainly
closer to the end of %L than the beginning of %exit.

I was playing around with this a bit and while the follow code seems to work on that example it is by no means thought through very well:

     SSAUpdate.AddAvailableValue(PredBlock, NewBonusInst);
     for (Use &U : make_early_inc_range(BonusInst.uses())) {
       auto *UI = cast<Instruction>(U.getUser());
+      if (auto *PHI = dyn_cast<PHINode>(UI))
+        if (PHI->getIncomingBlock(U) == PredBlock) {
+          Value *V = SSAUpdate.GetValueInMiddleOfBlock(PredBlock);
+          if (!isa<UndefValue>(V)) {
+            U.set(V);
+            continue;
+          }
+        }
       if (UI->getParent() != PredBlock)
         SSAUpdate.RewriteUseAfterInsertions(U);
       else // Use is in the same block as, and comes before, NewBonusInst.

Streamline the code, NFCI.

With an upcoming branch (tomorrow!), does anyone believe that this is an incorrect fix?

nikic added inline comments.Aug 8 2021, 9:41 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
1101–1115

Do we still need SSAUpdater here? Can this be just if (PN->getIncomingBlock(U) != BB) U.set(NewBonusInst); or so?

3044

This comment would really benefit from a description of what problem this is solving and how it does so. "makes miscompiles easy" doesn't really tell us anything and "sudo-LCSSA" form is somewhat confusing when not talking about loops.

3052

What is an "offending" use? Based on the implementation, it's a use that is potentially not dominated by BonusInst.

I'd personally invert the predicate and call it IsTriviallyDominatedUse or something.

3062

receives

lebedev.ri marked 4 inline comments as done.
lebedev.ri retitled this revision from [SimplifyCFG] performBranchToCommonDestFolding(): form sudo-LCSSA before cloning instructions (PR51125) to [SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125).

Thank you for taking a look!
Addressing review notes.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
1101–1115

Hmm, yes, this is better.

3044

Reworded, but not sure how better this is.

Better naming, NFC.

nikic accepted this revision.Aug 15 2021, 8:14 AM

This is rather weird, but LGTM.

llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll
169 ↗(On Diff #365755)

Unrelated regeneration, precommit?

This revision is now accepted and ready to land.Aug 15 2021, 8:14 AM

This is rather weird, but LGTM.

Thank you for the review!

This had a suspiciously large impact on code size (-4% on mafft with LTO): https://llvm-compile-time-tracker.com/compare.php?from=60dd0121c92e93a33cf39b5f7006923ac9e4f127&to=3d9beefc7d713ad8462d92427ccd17b9532ce904&stat=size-text Not necessarily a problem, but I didn't expect to see any significant impact from this patch.

lebedev.ri reopened this revision.Aug 16 2021, 1:08 AM

Perhaps we simply can't do this here without proper domtree.

This revision is now accepted and ready to land.Aug 16 2021, 1:08 AM