This is an archive of the discontinued LLVM Phabricator instance.

[llvm][CallBrPrepare] split critical edges
ClosedPublic

Authored by nickdesaulniers on Dec 12 2022, 12:01 PM.

Details

Summary

If we have a CallBrInst with output that's used, we need to split
critical edges so that we have some place to insert COPYs for physregs
to virtregs.

Part 2a of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8.

Test cases and logic re-purposed from D138078.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 12:01 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
nickdesaulniers requested review of this revision.Dec 12 2022, 12:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 12:01 PM
  • remove setPreservesAll
llvm/lib/CodeGen/CallBrPrepare.cpp
64

@aeubanks does this delta look correct?

128–129

@aeubanks does this look correct?

nickdesaulniers edited subscribers, added: arsenm, nhaehnle; removed: jyknight, void.
nickdesaulniers planned changes to this revision.Dec 12 2022, 2:20 PM
nickdesaulniers added inline comments.
llvm/test/CodeGen/AArch64/callbr-prepare.ll
140–142

looks like this should have been split...

llvm/test/CodeGen/AArch64/callbr-prepare.ll
140–142

oh, right, no uses of the output in the indirect edge.

aeubanks added inline comments.Dec 12 2022, 2:32 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
59

this needs to be changed to be like this to initialize dom tree (annoying legacy-PM boilerplate)

67–72

both ShouldRun can be static

71

I'd just return the list of CallBrInsts and check if it's empty in the caller, that's a lot more explicit

90–91

moving these variables into the loop makes this less errorprone

114

unnecessary if

128–129

yup, lg

  • add test case about missing use in indirect branch
nickdesaulniers marked 5 inline comments as done.
  • INITIALIZE_PASS_DEPENDENCY
  • return SmallVector (NVRO)
  • mark ShouldRun static
  • create new SmallPtrSet, SmallVector per loop iter
llvm/lib/CodeGen/CallBrPrepare.cpp
114

SplitKnownCriticalEdge is fallible. If we don't end up splitting the critical edge, then we shouldn't mark denote that the pass has made changes, right?

aeubanks added inline comments.Dec 12 2022, 3:02 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
114

there's an assert right above that Synth != nullptr, it doesn't make sense to assert on something then check if it's true

looking at SplitKnownCriticalEdge, it only returns nullptr for the options we're passing it when DestBB->isEHPad(), can that happen?

llvm/lib/CodeGen/CallBrPrepare.cpp
114

there's an assert right above that Synth != nullptr, it doesn't make sense to assert on something then check if it's true

If assertions are disabled, we should still denote correctly whether the pass made modifications.

looking at SplitKnownCriticalEdge, it only returns nullptr for the options we're passing it when DestBB->isEHPad(), can that happen?

If someone was deranged enough to mix C++ structured exception handling with asm goto, perhaps.

int y;
asm goto ("":"=r"(y):::out);
try {} catch (int x) {
out:
}

Though looks like the frontend rejects it: https://godbolt.org/z/dKGnjKK49.

I've also tried:

define i32 @x() {
  %out = callbr i32 asm "", "=r,!i"() to label %direct [label %lp]
direct:
  br label %lp
lp:
  %foo = landingpad { ptr, i32 } cleanup
  ret i32 42
}

which fails the verifier check

Block containing LandingPadInst must be jumped to only by the unwind edge of an invoke.
  %foo = landingpad { ptr, i32 }
          cleanup
LandingPadInst needs to be in a function with a personality.
  %foo = landingpad { ptr, i32 }
          cleanup

I don't know enough about the rest of the EHPad instructions to know if someone could conjure up such an abomination, hence extra checks. "Let's support those if we actually ever do see them in the wild."

llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

Oh, I guess later on (in a later commit) I will need to query DT.dominates. If DOM tree info isn't available, wat do?

llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

Perhaps I would manually need to construct a DomTreeUpdater or something?

I guess there's prior art in SafeStackLegacyPass::runOnFunction

efriedma added inline comments.Dec 12 2022, 4:23 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

If you need a domtree, just explicitly request one. With LegacyPM, something like AU.addRequired<DominatorTreeWrapperPass>(); in CallBrPrepare::getAnalysisUsage

nickdesaulniers marked an inline comment as done.
  • require domtree
nickdesaulniers added inline comments.
llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

So I think the issue with AU.addRequired<DominatorTreeWrapperPass>(); is that it might now introduce a Dominator Tree Construction into -O0 pipelines pessimistically.

The idea being that if we scanned the IR for callbrs (which are highly unlikely to exist in most programs outside of the Linux kernel and tcmalloc), we could lazily compute the DOMTree only if we needed (I think that's why SafeStackLegacyPass::runOnFunction has that pattern? cc @ab @pcc @davide @lebedev.ri )

efriedma added inline comments.Dec 15 2022, 2:47 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

Oh, hmm, I see what you mean. Yes, that's what the code in SafeStackLegacyPass involving the std::optional<DominatorTree> is doing: if an existing domtree is available, it uses it, otherwise it only computes the domtree if it's needed.

nickdesaulniers planned changes to this revision.Dec 15 2022, 5:17 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
128–129

So it looks like there's a tradeoff (FWICT) with that pattern:

+  // It's highly likely that most programs do not contain CallBrInsts. Follow a
+  // similar pattern from SafeStackLegacyPass::runOnFunction to reuse previous
+  // domtree analysis if available, otherwise compute it lazily. This avoids
+  // forcing Dominator Tree Construction at -O0 for programs that likely do not
+  // contain CallBrInsts. It does pessimize programs with callbr at higher
+  // optimization levels, as the DominatorTree created here is not reused by
+  // subsequent passes.
  • lazily compute domtree
void added inline comments.Dec 21 2022, 3:05 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
114

I agree with @aeubanks. This sequence looks bad. I know that asserts aren't enabled for a release, but it's either bad or it isn't. If it's bad, we should at the very least report it.

llvm/lib/CodeGen/CallBrPrepare.cpp
114

Is this really better looking?

diff --git a/llvm/lib/CodeGen/CallBrPrepare.cpp b/llvm/lib/CodeGen/CallBrPrepare.cpp
index 9835b5b41e2f..752e2d6f442f 100644
--- a/llvm/lib/CodeGen/CallBrPrepare.cpp
+++ b/llvm/lib/CodeGen/CallBrPrepare.cpp
@@ -117,8 +117,8 @@ bool CallBrPrepare::SplitCriticalEdges(ArrayRef<CallBrInst *> CBRs,
     for (unsigned i : UniqueCriticalSuccessorIndices) {
       BasicBlock *Synth = SplitKnownCriticalEdge(CBR, i, Options);
       assert(Synth && "Failed to split a known critical edge from callbr");
-      if (Synth)
-        Changed = true;
+      (void)Synth;
+      Changed = true;
     }
   }
   return Changed;
void added inline comments.Dec 22 2022, 12:45 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
114

That's how it's done in other places. E.g. in llvm/include/llvm/CodeGen/LiveVariables.h:

/// removeVirtualRegisterKilled - Remove the specified kill of the virtual
/// register from the live variable information. Returns true if the
/// variable was marked as killed by the specified instruction,
/// false otherwise.
bool removeVirtualRegisterKilled(Register Reg, MachineInstr &MI) {
  if (!getVarInfo(Reg).removeKill(MI))
    return false;

  bool Removed = false;
  for (MachineOperand &MO : MI.operands()) {
    if (MO.isReg() && MO.isKill() && MO.getReg() == Reg) {
      MO.setIsKill(false);
      Removed = true;
      break;
    }
  }

  assert(Removed && "Register is not used by this instruction!");
  (void)Removed;
  return true;
}

It's not particularly satisfying, but at least it appears to be consistent with the rest of the code base.

nickdesaulniers marked 4 inline comments as done.
void accepted this revision.Jan 3 2023, 3:16 PM

I've been thinking about this and I'm not really sure why LLVM hasn't created their own version of assert yet. It could be something as simple as this:

#define LLVM_ASSERT(cond, msg)  if (!cond) assert(false && msg)

But that's a separate issue.

This pass looks okay to me. I'll accept, but I think others who commented should also accept.

This revision is now accepted and ready to land.Jan 3 2023, 3:16 PM
  • rebase, format
efriedma accepted this revision.Jan 18 2023, 11:37 AM
void accepted this revision.Jan 18 2023, 3:39 PM

LGTM. One suggestion that you can ignore if you wish.

llvm/lib/CodeGen/CallBrPrepare.cpp
136

You can iterate over the indirect destinations. Something like:

for (BasicBlock *BB : CBR->getIndirectDests()) {
}
llvm/lib/CodeGen/CallBrPrepare.cpp
136

I would have preferred to use range-for iteration here. IIRC, the issue with that approach was that isCriticalEdge takes an unsigned as the successor number (rather than the BasicBlock successor itself. If it did, that would probably have improved the ergonomics here. Let me know if you think I should add an overload of isCriticalEdge perhaps, or if I missed something else?

void added inline comments.Jan 18 2023, 4:08 PM
llvm/lib/CodeGen/CallBrPrepare.cpp
136

isCriticalEdge can also take a BB:

llvm/include/llvm/Analysis/CFG.h:
bool isCriticalEdge(const Instruction *TI, const BasicBlock *Succ,
                    bool AllowIdenticalEdges = false);
llvm/lib/CodeGen/CallBrPrepare.cpp
136

Ah, it was SplitKnownCriticalEdge used later outside of this loop that expected the index of the successor number. That's why this loop builds up a vector of indices; the next loop will then split them. So I don't think I can change this first loop, but let me know if I'm still missing something.

jyknight added inline comments.Jan 19 2023, 11:01 AM
llvm/lib/CodeGen/CallBrPrepare.cpp
75–116

Doesn't seem to add anything to have "ShouldRun" as a separate function. I'd just inline the check here.

If not that, rename it; the name "ShouldRun" is not adding any understandability here.

132–133

I don't understand this comment. With a single loop, if we iterate over the successors in order, and successors 1 and 3 are both to the same block, the MergeIdenticalEdges option should ensure that when we call SplitKnownCriticalEdge on 1, it'll also rewrite 3.

Oh, I see -- I think you needed isCriticalEdge(CBR, i, /*AllowIdenticalEdges=*/true), so that we don't report the edge as critical when it's only used from a single block (and won't split the already-split edge!)

llvm/lib/CodeGen/CallBrPrepare.cpp
132–133

if we iterate over the successors in order, and successors 1 and 3 are both to the same block, the MergeIdenticalEdges option should ensure that when we call SplitKnownCriticalEdge on 1, it'll also rewrite 3.

Correct, see @split_me1 in llvm/test/CodeGen/AArch64/callbr-prepare.ll which covers this part of the code.

I think you needed isCriticalEdge(CBR, i, /*AllowIdenticalEdges=*/true), so that we don't report the edge as critical when it's only used from a single block (and won't split the already-split edge!)

I don't think setting AllowIdenticalEdges to true works. I have vague memories of trying to use it, but I can't recall specifically why I wasn't able to use it and haven't been able to rework this block via setting AllowIdenticalEdges to true.

The comment also mentions SplitCriticalEdge rather than SplitKnownCriticalEdge. That may be an artifact from an earlier revision. At the least, I should fix that part of the comment.

Do you have thoughts on how I could better rephrase the comment block you didn't understand (assuming now that you do)?

Are you asking that I try to use AllowIdenticalEdges set to true?

llvm/lib/CodeGen/CallBrPrepare.cpp
136

It looks like llvm::GetSuccessorNumber could be used. So even if SplitKnownCriticalEdge needs an index, I could use range-for then llvm::GetSuccessorNumber. That may be more readable, but it looks like it would be O(N^2) rather than two sequential O(N) loops? I guess N is likely too small to matter though.

llvm/lib/CodeGen/CallBrPrepare.cpp
132–133

For instance, I would have like for this code to simply have been something like:

for (BasicBlock *BB : CBR->getIndirectDests())
  if (BasicBlock *Synth = SplitCriticalEdge(CBR, GetSuccessorNumber(CBR->getParent(), BB), Options))
    Changed = true;

but that doesn't work, hence the two loops and (obtuse) commentary.

nickdesaulniers marked an inline comment as done.
  • inline ShouldRun, as per @jyknight
  • update comment s/SplitCriticalEdge/SplitKnownCriticalEdge/
llvm/lib/CodeGen/CallBrPrepare.cpp
132–133

So I think the issue here is what happens if the direct and indirect destination are the same. Maybe @split_me5 test case added later, where we don't want to split both edges, vs two indirect branches that are the same, in case we'd like to split both of them @split_me1 but only once.

llvm/lib/CodeGen/CallBrPrepare.cpp
132–133

@jyknight what are your thoughts on rewording the comment block as such (sorry, the block has shifted around from edits since your comment; we're referring to the comment blocks in CallBrPrepare::SplitCriticalEdges)

Conceptually, these two loops are doing:

for (BasicBlock *Dest : CBR->getIndirectDests())
  if (BasicBlock *Synth = SplitCriticalEdge(CBR, GetSuccessorNumber(CBR->getParent(), Dest)))
    Changed = true;

But there is a problem resulting in two special cases; a single callbr instruction may have multiple edges to the
same destination.  Consider the first case:

  %0 = callbr ... to label %x [label %y, label %y]

We want to split the edge to %y once, not twice so that both instance of %y go to the new destination.
Now consider the second case:

  %0 = callbr ... to label %x [label %x]

We want to split the indirect edge, but not the direct edge (since the direct edge doesn't
have the problem of where to store copies).  This pair of sequential loops is necessary to
handle what is logically a pass over indirect edges to split them.

(with necessary word wrapping, since I haven't typed it out in code yet) (I would then delete all existing comments in that method).

Thoughts?

nickdesaulniers marked 3 inline comments as done.
  • refactor and significantly simply CallBrPrepare::SplitCriticalEdges.

@jyknight figured out how to properly use AllowIdenticalEdges together with
MergeIdenticalEdges. Use code he sent me and add a comment.

jyknight accepted this revision.Jan 25 2023, 8:11 AM

Thanks, looks good now!

  • final rebase
This revision was automatically updated to reflect the committed changes.