This is an archive of the discontinued LLVM Phabricator instance.

BreakCriticalEdges for callbr indirect dests
ClosedPublic

Authored by nickdesaulniers on Jun 10 2020, 11:37 AM.

Details

Summary

llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors). It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return). Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).

Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2020, 11:37 AM
llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
14

noob question: this totally fails for me, even though the test was autogenerated. FileCheck is complaining that:

/android0/llvm-project/llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll:13:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: call void @callee(i1 false)
              ^
...
           9: Top.split: ; preds = %Top
next:13'0                X~~~~~~~~~~~~~ error: no match found
          10:  call void @callee(i1 false)
next:13'0     ~~~~~~~~~~~~~~~~~~~
next:13'1      ?                           possible intended match
...

which makes no sense to me. Am I holding it wrong?

llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
24

Note that update_test_checks.py rewrites the label references, but seemingly not for the indirect label list...

void added inline comments.Jun 10 2020, 12:04 PM
llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
14

FileCheck should ignore the comment...Try changing

CHECK: Top.split:

to

CHECK-LABEL: Top.split:

or add the comment?

  • fix test via CHECK-LABEL
nickdesaulniers marked 3 inline comments as done.Jun 10 2020, 1:33 PM
nickdesaulniers added inline comments.
llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
14

That works, thanks @void!

update_test_checks.py tries to change them back though...

fhahn added a subscriber: fhahn.Jun 11 2020, 3:48 AM
fhahn added inline comments.
llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
40

Does callbr always take the destinations as blockaddress or could an arbitrary i8* value be used? Then doing the splitting here would only be safe if we can update the blockaddress constant and not in other cases I think.

nickdesaulniers marked an inline comment as done.Jun 11 2020, 9:53 AM
nickdesaulniers added a subscriber: efriedma.

@nathanchance ran significant testing on this to ensure this patch doesn't regress linux ToT "mainline" or 5.4 stable kernels.

https://del.dog/raw/ocyphupiph

https://del.dog/raw/mollilekad

llvm/test/Transforms/CallSiteSplitting/callsite-split-callbr.ll
40

Yes; always blockaddresses. @efriedma and I have discussed removing the duplication in callbr; the indirect destination list ([label %End2]) duplicates the arguments to the inline asm (blockaddress(@caller, %End2).

This is an invariant that is checked by IR verification.

(Using grammar from https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html)

But the duplication is brittle because while setSuccessor does work correctly, the lower level setOperand does not update callbr operands correctly. It's also hard to distinguish between the labels passed in for asm goto vs extended inline asm inputs that are labels, ex foo: asm goto (""::"r"(foo)::bar); bar:; (we know bar is a potential destination, but provide no guarantees that foo can be safely jumped to from the inline asm. (InputOperands vs GotoLabels)

Removing the duplication should fix that brittleness, but it's also orthogonal to this patch. Only blockaddresses can be indirect destinations that we preserve. Arbitrary addresses can be arguments to inline asm (InputOperands), but we provide no guarantees unless the label in in the final inline asm parameter list (GotoLabels), as opposed to the list of inputs (InputOperands).

nickdesaulniers added a comment.EditedJun 11 2020, 1:19 PM

@nathanchance did further testing of this patch on Linux kernel's staging tree, linux-next: https://del.dog/raw/aisucysteb (no issues reported).

(Note to reviewers: this test case fails an assertion before this patch is applied. Rewritten from an actual assertion failure seen in the wild).

efriedma accepted this revision.Jun 12 2020, 12:03 PM

If the plan is to drop blockaddresses from callbr, then this LGTM.

This revision is now accepted and ready to land.Jun 12 2020, 12:03 PM

If the plan is to drop blockaddresses from callbr, then this LGTM.

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

I would like to resolve @fhahn 's question. TBH, I understand what a critical edge is, but I don't understand why splitting them is necessary? I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

void added a comment.Jun 12 2020, 1:32 PM

If the plan is to drop blockaddresses from callbr, then this LGTM.

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

I would like to resolve @fhahn 's question. TBH, I understand what a critical edge is, but I don't understand why splitting them is necessary? I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

In general, splitting a critical edge is necessary because of how PHI nodes are resolved. A PHI node isn't a "real" instruction, but needs to be converted to a "copy to register" when translating to machine code. Because the edge is "critical" you can't just place the copy to register at the end of the parent block. So you need a place to put the copy, which is where the block created by splitting the edge comes into play.

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

Wasn't intended to be conditional on any code changes. Just that we don't plan to ever allow jumping to blockaddresses passed in as register operands.

I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

Wasn't intended to be conditional on any code changes. Just that we don't plan to ever allow jumping to blockaddresses passed in as register operands.

I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

While it's not the case right now, I think it would be very desirable to allow this in the future for callbr. Some of the other envisioned use-cases for callbr would require the ability to pass the target block in as a value (in which case there may be no blockaddress arguments directly on the callbr.)

fhahn added a comment.Jun 12 2020, 3:03 PM

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

Wasn't intended to be conditional on any code changes. Just that we don't plan to ever allow jumping to blockaddresses passed in as register operands.

I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

While it's not the case right now, I think it would be very desirable to allow this in the future for callbr. Some of the other envisioned use-cases for callbr would require the ability to pass the target block in as a value (in which case there may be no blockaddress arguments directly on the callbr.)

I think as long as we have block address constants we can adjust, doing the splitting is fine. But if this restriction gets relaxed in the future, it would probably cause problems here. CallSiteSplitting already bails out for indirectbr terminators, so alternatively we could also disable it for callbr (https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/CallSiteSplitting.cpp#L198)

Ok, that's probably useful for @rnk to consider, should callbr be reused for other constructs other than asm goto. I plan on landing this Monday, please explicitly request changes if there's anything more for me to do here.

Thanks all for the additional comments and consideration.

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

Wasn't intended to be conditional on any code changes. Just that we don't plan to ever allow jumping to blockaddresses passed in as register operands.

I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

Wait, that sounds like a problem. Imagine we have 3 callbrs, two that share the same indirect destination, and we decide to split one their critical edges. Then we update the blockaddress of that callbr, but not the others where the others' indirect branch now doesn't preserve the previous operations. As if we added another callbr in the entry of this test case with an indirect destination of %CallSiteBB.

While it's not the case right now, I think it would be very desirable to allow this in the future for callbr. Some of the other envisioned use-cases for callbr would require the ability to pass the target block in as a value (in which case there may be no blockaddress arguments directly on the callbr.)

Being indecisive about this is creating a mess in both the LLVM IR and MachineInstrs. I've repeatedly asked about this, and gotten contradictory answers every time. We need to make a decision here and stick to it.

void added a comment.Jun 12 2020, 6:37 PM

Disclaimer: I don't plan to rework callbr's operands before landing this. Was that a conditional LGTM?

Wasn't intended to be conditional on any code changes. Just that we don't plan to ever allow jumping to blockaddresses passed in as register operands.

I assume there's something important @fhahn was alluding to, but I'm not sure precisely what, and I would like to make sure everyone's happy with this.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

Wait, that sounds like a problem. Imagine we have 3 callbrs, two that share the same indirect destination, and we decide to split one their critical edges. Then we update the blockaddress of that callbr, but not the others where the others' indirect branch now doesn't preserve the previous operations. As if we added another callbr in the entry of this test case with an indirect destination of %CallSiteBB.

We can't really support PHI nodes in the indirect destinations of a callbr. It reverts to the original issue we had with the first proposal allowing callbr outputs to be live on the indirect branches. There's no good way to do this without messing the CFG up tremendously, and I suspect that it would be buggy as hell. So any use of a value in the path starting from an indirect block must dominate that indirect block or be defined within that path.

void added a comment.Jun 12 2020, 6:43 PM

While it's not the case right now, I think it would be very desirable to allow this in the future for callbr. Some of the other envisioned use-cases for callbr would require the ability to pass the target block in as a value (in which case there may be no blockaddress arguments directly on the callbr.)

Being indecisive about this is creating a mess in both the LLVM IR and MachineInstrs. I've repeatedly asked about this, and gotten contradictory answers every time. We need to make a decision here and stick to it.

My vote is for allowing non-blockaddresses as acceptable input to a callbr. (I would suggest using the indirect destination list, but that list is unordered.) I don't think there's a good enough reason to require only blockaddresses.

The reason we can't split edges coming out of indirectbr instructions is that in general, there is no way to fix the related blockaddresses in functions that contain multiple indirectbrs. We can't rename the successor in one indirectbr without renaming it in all of them. And if we rename it in all of them, the edge is still critical after the transform.

We want to be sure the same issue doesn't apply to callbr instructions. If it does, then we would need to fix the callers of SplitCriticalEdge, not SplitCriticalEdge itself.

Wait, that sounds like a problem. Imagine we have 3 callbrs, two that share the same indirect destination, and we decide to split one their critical edges. Then we update the blockaddress of that callbr, but not the others where the others' indirect branch now doesn't preserve the previous operations. As if we added another callbr in the entry of this test case with an indirect destination of %CallSiteBB.

Ok, so I modified the added test case to do this, which also involves adding a case to the phi for the new incoming edge. In that case, call site splitting does not kick in or perform any transformations. Since call site splitting can't transform that, rewriting the blockaddresses is not an issue in this case. Generally, that may be a problem, but I'd argue that's an issue with blockaddresses regardless of this patch. It's currently on callers of SplitCriticalEdge not to miss updating blockaddresses, if there's more than one to worry about. For instance, adding to the test case:

@foo = global i8* blockaddress(@caller, %CallSiteBB)

call site splitting will modify %CallSiteBB such that the call to @callee is hoisted out into the newly split blocks. If someone tried to use @foo for control flow, then llvm::SplitCriticalEdge would have made a transform that doesn't preserve previous control flow (is there a better term for that?).

(Going further, I'd be curious if having a blockaddress not scoped to a function is even generally useful?)

If we plan to radically change callbr, I think @fhahn 's suggestion to simply disable call site splitting for callbr is likely sufficient (though that makes me think that we can still reach the failing assertion in llvm::SplitEdge, just no longer from call site splitting). Or we could likely add a similar check I'm removing here to llvm::SplitEdge for indirect branches from callbr, then work through any issues that might cause in call site splitting. I don't *need* call site splitting to work for callbr, I'm just trying to fix the currently failing assertion in llvm::SplitEdge that this test case reproduces, so I don't particularly care whether we take this form of the patch, or just disable this.

I plan on committing this tomorrow morning, as I'm trying to pin down regressions that D79794 in current form will cause. Please explicitly Accept or Reject the patch by then, and please feel empowered to revert otherwise. Rereading the comments both @eli.friedman and @fhahn sound cautious, but no one has explicitly Rejected the patch, which leads me to hesitate on committing.

LGTM, please merge.

I guess we can always revert this later in favor of another approach once we start seriously looking at complete blockaddress support. This isn't making a huge commitment on its own

(Going further, I'd be curious if having a blockaddress not scoped to a function is even generally useful?)

blockaddresses in global variables are required for compatibility with the gcc indirect goto extension. (Really, we only need them in C static local variables, but that's the same thing in LLVM IR.)

(Going further, I'd be curious if having a blockaddress not scoped to a function is even generally useful?)

blockaddresses in global variables are required for compatibility with the gcc indirect goto extension. (Really, we only need them in C static local variables, but that's the same thing in LLVM IR.)

If you're referring to https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html, then the labels still are required to be scoped. ie.

void* a [] = {&&bar};
void foo(void) {
    bar:;
}

is invalid; hence I don't like how blockaddress in LLVM don't follow the same scoping restrictions. (The above idea *can* be expressed in LLVM IR).

The following is allowed:

void foo(void) {
    static void* a [] = {&&bar};
    bar:;
}

The following is allowed:

void foo(void) {
    static void* a [] = {&&bar};
    bar:;
}

*shakes fist*

This revision was automatically updated to reflect the committed changes.

Hi, you can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people. Please keep the tag. (I have updated my script to use --date=now (setting author date to committer date))

https://reviews.llvm.org/D80978 contains a git pre-push hook to automate this.