This is an archive of the discontinued LLVM Phabricator instance.

Differential D69868

Allow "callbr" to return non-void values
Closed, Public

Authored by void on Nov 5 2019, 2:34 PM.

Details

Reviewers

jyknight
nickdesaulniers
hfinkel
MaskRay
lattner

Commits

rG23c2a5ce33f0: Allow "callbr" to return non-void values

Summary

Terminators in LLVM aren't prohibited from returning values. This means that
the "callbr" instruction, which is used for "asm goto", can support "asm goto
with outputs."

This patch removes all restrictions against "callbr" returning values. The
heavy lifting is done by the code generator. The "INLINEASM_BR" instruction's
a terminator, and the code generator doesn't allow non-terminator instructions
after a terminator. In order to correctly model the feature, we need to copy
outputs from "INLINEASM_BR" into virtual registers. Of course, those copies
aren't terminators.

To get around this issue, we split the block containing the "INLINEASM_BR"
right before the "COPY" instructions. This results in two cheats:

- Any physical registers defined by "INLINEASM_BR" need to be marked as live-in to the block with the "COPY" instructions. This violates an assumption that physical registers aren't marked as live-in until after register allocation. But it seems that the live-in information only needs to be correct after register allocation, so we're able to get away with this.
- The indirect branches from the "INLINEASM_BR" are moved to the "COPY" block. This is to satisfy PHI nodes.
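The split described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyInstr`, `ToyBlock` and `splitAfterInlineAsmBr` are hypothetical names, not LLVM classes or functions, and the model simply moves every successor to the copy block rather than distinguishing indirect targets from the fallthrough.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for MachineInstr / MachineBasicBlock (hypothetical, not LLVM's).
struct ToyInstr {
  std::string Opcode;             // e.g. "INLINEASM_BR", "COPY", "JMP"
  std::vector<unsigned> PhysDefs; // physical registers this instruction defines
};

struct ToyBlock {
  std::string Name;
  std::vector<ToyInstr> Instrs;
  std::vector<ToyBlock *> Succs;
  std::set<unsigned> LiveIns;
};

// Split BB right after its INLINEASM_BR; everything after it (the COPYs and
// any trailing branch) moves into CopyBB. Returns false if there is nothing
// to split off.
bool splitAfterInlineAsmBr(ToyBlock &BB, ToyBlock &CopyBB) {
  auto It = std::find_if(
      BB.Instrs.begin(), BB.Instrs.end(),
      [](const ToyInstr &I) { return I.Opcode == "INLINEASM_BR"; });
  if (It == BB.Instrs.end() || std::next(It) == BB.Instrs.end())
    return false;

  // Cheat 1: registers defined by the asm become live-in to the copy block.
  CopyBB.LiveIns.insert(It->PhysDefs.begin(), It->PhysDefs.end());

  // Move the trailing instructions into the copy block.
  CopyBB.Instrs.assign(std::next(It), BB.Instrs.end());
  BB.Instrs.erase(std::next(It), BB.Instrs.end());

  // Cheat 2: the successors now hang off the copy block, and the original
  // block simply falls through into it.
  CopyBB.Succs = std::move(BB.Succs);
  BB.Succs.clear();
  BB.Succs.push_back(&CopyBB);
  return true;
}
```

In this model the original block ends with the `INLINEASM_BR` as its only terminator, which is the property the code generator cares about.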

I've been told that MLIR can support this handily, but until we're able to
use it, we'll have to stick with the above.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

void created this revision. Nov 5 2019, 2:34 PM

Herald added a project: Restricted Project. Nov 5 2019, 2:34 PM

Herald added subscribers: llvm-commits, hiraditya.

Harbormaster completed remote builds in B40548: Diff 227969. Nov 5 2019, 2:38 PM

JonChesterfield added a subscriber: JonChesterfield. Nov 5 2019, 2:43 PM

Update language reference.

Harbormaster completed remote builds in B40647: Diff 228306. Nov 7 2019, 1:47 PM

nickdesaulniers added inline comments. Nov 7 2019, 2:42 PM

llvm/docs/LangRef.rst
7331	Rather than the distinction between "normal" and "other," can we use the terms "fallthrough" and "indirect?"
7334	I assume that's the WIP part? Otherwise there's nothing else here that explicitly adds poison from what I can tell.

void marked an inline comment as done. Nov 7 2019, 4:31 PM

void added inline comments.

llvm/docs/LangRef.rst
7334	You're able to have the "fallthrough" listed in the "indirect" list as well. I'm just pointing out that if the value hasn't been calculated and shoved into the output object (register, etc.) then using that value in the fallthrough branch causes a poisoned value.

Update description to use "fallthrough" and "indirect"

Harbormaster completed remote builds in B40663: Diff 228336. Nov 7 2019, 4:37 PM

Add testcase for multiple output constraints.

Harbormaster completed remote builds in B40723: Diff 228596. Nov 10 2019, 1:15 AM

This change is now ready for review. PTAL.

Don't split critical edges into a callbr indirect destination.

Harbormaster completed remote builds in B40792: Diff 228813. Nov 11 2019, 10:37 PM

Temporary change to determine if a block is used in a callbr.

Harbormaster completed remote builds in B40794: Diff 228820. Nov 11 2019, 11:58 PM

Simplify check that a basic block is the target of an indirect jump from callbr
and add testcase for this.

Harbormaster completed remote builds in B40832: Diff 228926. Nov 12 2019, 11:33 AM

Friendly ping. :-)

> The value returned by "callbr" is only valid on the "normal" path. Return values used on "abnormal" paths are poisoned.

Please edit the revision comment to use the terms "fallthrough" and "indirect" targets or paths, rather than "normal" and "abnormal."

llvm/docs/LangRef.rst
7332	I still don't understand how the `poison` value gets added. Can you please clarify, or maybe add a test case that does?
7332	If the outputs are only valid on the fallthrough, will that allow us to build the extension to the existing `asm goto` that has been asked for? I worry that if we implement this restriction, it may not be fully useful unless the outputs on all paths are valid. Do you have a link to the previous discussion about this?
llvm/lib/IR/BasicBlock.cpp
481 ↗	(On Diff #228926)	In the past, I've been shot down for adding to `BasicBlock`. Can this be implemented as a method on a `CallBrInst` that accepts a `BasicBlock`?
487 ↗	(On Diff #228926)	`PI` is only iterating `BasicBlocks` in this `Function`, right? Can't `callbr` refer to labels in other functions (unlike `asm goto` in C, due to scoping)?

This revision now requires changes to proceed. Nov 18 2019, 10:51 AM

void edited the summary of this revision. Nov 18 2019, 1:00 PM

Don't add to BasicBlock, but just loop directly.

llvm/docs/LangRef.rst
7332	> I still don't understand how the poison value gets added. Can you please clarify, or maybe add a test case that does?
Poison values are a result of execution, not specified directly. So during execution, the value becomes poison if used in the indirect block, but not if used in the fallthrough. It's just fancy language for saying "don't do that!" :-)
From the docs:
> Poison value behavior is defined in terms of value dependence: ...
> * An instruction control-depends on a terminator instruction if the terminator instruction has multiple successors and the instruction is always executed when control transfers to one of the successors, and may not be executed when control is transferred to another.
7332	After going back and forth with James and Hal, I now believe that having the value valid only on the fallthrough path will be sufficient for most purposes (if not all). However, this implementation doesn't restrict us from relaxing it in the future, but I would like to see a use case for that before we sink a lot of time into it. Below is a link to the start of the thread. http://lists.llvm.org/pipermail/llvm-dev/2019-June/133428.html
llvm/lib/IR/BasicBlock.cpp
481 ↗	(On Diff #228926)	Better, I'll just inline it. :-)
487 ↗	(On Diff #228926)	I don't believe it can refer to labels in other functions. Anyway, I changed this so that this function is gone.

Harbormaster completed remote builds in B41137: Diff 229918. Nov 18 2019, 1:52 PM

Ping? :-)

In D69868#1758211, @void wrote:

Ping? :-)

Sorry, busy week going into the Thanksgiving holiday. I really think this needs more time to bake in code review and input from @jyknight (and maybe @rnk). In particular, re-reading through the thread (https://lists.llvm.org/pipermail/llvm-dev/2019-June/133428.html), I'm curious to see more tests for cases that we don't expect to work. For the clang patch in particular, can we somehow warn the user?

Should you add a test case here for the case @jyknight describes in: https://lists.llvm.org/pipermail/llvm-dev/2019-June/133431.html (ie. differing output constraints)?

Add testcase for callbrs with different output constraints jumping to the same
default location.

Harbormaster completed remote builds in B41478: Diff 231008. Nov 25 2019, 11:20 PM

rnk added inline comments. Dec 3 2019, 3:48 PM

llvm/lib/CodeGen/MachineBasicBlock.cpp
1118	As a minor efficiency golf thing, I would start off this chain of checks with `if (Succ->hasAddressTaken())`. In most cases, this code will be splitting the critical edge to the fallthrough destination, which will not be marked as address taken.
1119	This code really should be looking at the machine IR to know if there is a callbr. I think INLINEASM_BR is target independent, so you really can just loop through the MachineInstrs looking for it, and then look for the MBB in the operand list. I wonder if we should have a flag on the MBB to indicate that it ends in an INLINEASM_BR, since those break the MIR invariant that terminators are grouped together at the end of the block.
1120	Is it possible for a BB to be both an indirect successor and a fallthrough successor? I suppose that could be the case with the Linux macro that gets the current PC. In any case, it's probably safe to remove this condition, and then we don't have to worry.
llvm/test/CodeGen/X86/callbr-asm-outputs.ll
1	Please add -verify-machineinstrs to the tests, since that isn't usually on by default, and it may highlight some latent verifier issues I don't know about.
llvm/test/CodeGen/X86/callbr-asm.ll
2	Please add -verify-machineinstrs.

Add "verify-machineinstrs" flag to callbr tests. Relax some of the verification
tests to account for callbrs.

llvm/lib/CodeGen/MachineBasicBlock.cpp
1119	The problem is that I need to distinguish between the indirect branches and the default one from the `callbr` instruction. It's way more cumbersome to do that with the `INLINEASM_BR` instruction. If you feel strongly about it I'll make the change, but the `INLINEASM_BR` instruction is already strongly tied to `callbr`. Do you expect it to change anytime in the future?
1120	It is possible. I have a testcase for it in this patch. :-)

Harbormaster completed remote builds in B41836: Diff 232049. Dec 4 2019, 12:54 AM

nickdesaulniers added inline comments. Dec 17 2019, 2:53 PM

llvm/lib/CodeGen/MachineBasicBlock.cpp
1119–1120	I think it may be worthwhile to swap these conditions with each other, too. It's highly unlikely the BB is terminated with a `CallBrInst`, yet highly likely the MBB refers to a BB.
1120	But you don't have a test case for what @rnk is asking about. (Unsure if it would be necessary). I think @rnk is getting at this case from C:

    void foo(void) {
      asm goto ("#NICK":: "r"(&&hello) :: hello);
    hello:
      return;
    }

Clang will emit this as:

    define dso_local void @foo() #0 {
    entry:
      callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %hello)) #1
              to label %asm.fallthrough [label %hello], !srcloc !2

    asm.fallthrough:                                  ; preds = %entry
      br label %hello

    hello:                                            ; preds = %asm.fallthrough, %entry
      ret void
    }

i.e. the `blockaddress` is passed twice, once as the address of a label (GNU C extension), once as the indirect destination of the `asm goto`. So to @rnk's question:
> Is it possible for a BB to be both an indirect successor and a fallthrough successor?
It is valid LLVM IR to have a BB be both; Clang today (or with https://reviews.llvm.org/D69876) will not emit such a formation (but could). i.e. imagine the above:

    callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %hello)) #1
            to label %asm.fallthrough [label %hello], !srcloc !2

to instead be:

    callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %asm.fallthrough)) #1
            to label %asm.fallthrough [label %asm.fallthrough], !srcloc !2

I still don't understand @rnk's point about
> In any case, it's probably safe to remove this condition, and then we don't have to worry.
though.
1122–1123	Please use a range-based for: `for (const BasicBlock* ID : cbr->getIndirectDests()) if (ID == bb)`, or `llvm::any_of`.
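For comparison, here are the two forms suggested above, written against toy types so the snippet stands alone. `ToyBasicBlock`, `ToyCallBr` and both helper functions are hypothetical; the sketch uses `std::any_of` from the standard library where the review suggests the range-based `llvm::any_of` helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-ins for BasicBlock and CallBrInst (hypothetical, not LLVM's).
struct ToyBasicBlock {
  int Id;
};

struct ToyCallBr {
  std::vector<const ToyBasicBlock *> IndirectDests;
  const std::vector<const ToyBasicBlock *> &getIndirectDests() const {
    return IndirectDests;
  }
};

// Range-based for: stop at the first indirect destination equal to BB.
bool isIndirectDestLoop(const ToyCallBr &CBR, const ToyBasicBlock *BB) {
  for (const ToyBasicBlock *ID : CBR.getIndirectDests())
    if (ID == BB)
      return true;
  return false;
}

// Same query expressed with an algorithm instead of a hand-written loop.
bool isIndirectDestAnyOf(const ToyCallBr &CBR, const ToyBasicBlock *BB) {
  const auto &Dests = CBR.getIndirectDests();
  return std::any_of(Dests.begin(), Dests.end(),
                     [BB](const ToyBasicBlock *ID) { return ID == BB; });
}
```

Both functions answer the same question; the algorithm form mainly saves the explicit early return.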

Update

Herald added a project: Restricted Project. Dec 17 2019, 3:23 PM

Herald added a subscriber: cfe-commits.

void added inline comments. Dec 17 2019, 3:23 PM

llvm/lib/CodeGen/MachineBasicBlock.cpp
1120	I meant that I had a testcase that required the conditional he suggested I remove. Sorry for the confusion. You're right that we can generate the clang code where the fallthrough is the same as the indirect. I didn't mean to imply that it wasn't the case. I can add a testcase. I asked a question on the "cfe-dev" mailing list to determine how I could identify the "fallthrough" block from the CFG. So far no one has responded. Until I can determine that, I won't be able to handle that properly with this conditional.

Harbormaster completed remote builds in B42690: Diff 234402. Dec 17 2019, 3:24 PM

Fix bad update

Harbormaster completed remote builds in B42691: Diff 234403. Dec 17 2019, 3:29 PM

xbolva00 added subscribers: aaron.ballman, xbolva00. Dec 25 2019, 3:37 PM

xbolva00 added inline comments.

llvm/lib/CodeGen/MachineBasicBlock.cpp
1120	Maybe @aaron.ballman can answer your question related to the "fallthrough" block.

More reviewers: +@echristo @MaskRay

llvm/include/llvm/CodeGen/MachineBasicBlock.h
134	I think "IsInlineAsmBrIndirectTarget" would be better. I think of "Pad" as being something EH related, it's usually a "landing pad" or "eh pad".
llvm/lib/CodeGen/MachineBasicBlock.cpp
1118	Somehow I missed that you added isInlineAsmBrIndirectPad, you could use that here instead of hasAddressTaken, and it would be more precise.
1119	I don't think it will actually be that hard to implement. The MIR already knows a few things:
- After block layout, MBB->getFallthrough() will return the default destination. If non-null, you can use it.
- If getFallthrough() is null and this block terminated in callbr in IR, there must be an analyzable unconditional branch terminator (JMP or B).
- All other address taken successors must come from abnormal control flow, i.e. exception handling or callbr.
Actually, I'm surprised this utility doesn't return false after analyzeBranch below if Succ is not one of TBB or FBB.
In any case, if these ideas don't work out, I don't feel that strongly, but I feel obligated to ask you to try a variety of ways to make this work with just MIR info.
1120	I had thought that it would be safe to return false (don't allow this edge to be split) if BB appears in the indirect destination list, even if it's also the default destination, so this condition could be reduced to `contains(cbr->getIndirectDests(), bb)`. I looked for a test where a BB is both a default and indirect destination, so I don't see why we can't remove the default dest check.

void marked 5 inline comments as done. Jan 7 2020, 3:50 PM

void added inline comments.

llvm/lib/CodeGen/MachineBasicBlock.cpp
1119–1120	Okay. No one likes this addition. I'm going to remove it and just check `IsInlineAsmBrIndirectTarget`.

Remove check that the successor may be in the indirect list of a callbr instruction.

Harbormaster completed remote builds in B43472: Diff 236710. Jan 7 2020, 3:51 PM

Missed a change.

Harbormaster completed remote builds in B43474: Diff 236716. Jan 7 2020, 4:12 PM

The plus constraints are now at the end of the input/output/label list.

Harbormaster completed remote builds in B43651: Diff 237224. Jan 9 2020, 4:35 PM

jyknight added inline comments. Jan 10 2020, 12:09 PM

llvm/lib/CodeGen/MachineVerifier.cpp
699	This isn't correct.
This line here is looking at a block which doesn't end in a jump to a successor. So, it's trying to verify that the successor list makes sense in that context.
The unstated assumption in the code is that the only successors will be landing pads. Instead of actually checking each one, it just checks that the count is the number of landing pads, with the assumption that all the successors should be landing pads, and that all the landing pads should be successors.
The next clause is then checking for the case where there's a fallthrough to the next block. In that case, the successors should've been all the landing pads, and the single fallthrough block.
Adding similar code to check for the number of callbr targets doesn't really make sense. It's certainly not the case that all callbr targets are targets of all callbr instructions. And even if it was, this still wouldn't be counting things correctly.
However -- I think I'd expect analyzeBranch to error out (returning true) when confronted by a callbr instruction, because it cannot actually tell what's going on there. If that were the case, nothing in this block should even be invoked.
But I guess that's probably not happening, due to the terminator being followed by non-terminators. That seems likely to be a problem that needs to be fixed. (And if that is fixed, I think the changes here aren't needed anymore.)

void marked an inline comment as done. Jan 10 2020, 12:23 PM

void added inline comments.

llvm/lib/CodeGen/MachineVerifier.cpp
699	Your comment is very confusing. Could you please give an example of where this fails?

jyknight added inline comments. Jan 10 2020, 3:34 PM

llvm/lib/CodeGen/MachineVerifier.cpp
699	Sorry about that, I should've delimited the parts of that message better... Basically:
- Paragraphs 2-4 are describing why the code before this patch appears to be correct for landing pad, even though it's taking some shortcuts and making some non-obvious assumptions.
- Paragraph 5 ("Adding similar code"...) is why it's not correct for callbr.
- Paragraph 6-7 are how I'd suggest to resolve it.
I believe the code as of your patch will fail validation if you have a callbr instruction which has a normal-successor block which is an indirect target of a different callbr in the function. I believe it'll also fail if you have any landing-pad successors, since those aren't being added to the count of expected successors, but rather checked separately.
But more seriously than these potential verifier failures, I expect that analyzeBranch returning wrong answers (in that it may report that a block unconditionally-jumps to a successor, while it really has both a callbr and jump, separated by the non-terminator copies) will cause miscompilation. I'm not sure exactly how that will exhibit, but I'm pretty sure it's not going to be good. And, if analyzeBranch properly said "no idea" when confronted by callbr control flow, then this code in the verifier wouldn't be reached.
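One way to read the first concern above is with a toy count-based check. This is a sketch of the worry only, not the verifier's actual code: `ToyMBB`, its `IsIndirectTarget` flag and `countCheckFallthrough` are hypothetical, and the assumption baked in is that the flag is set on a block whenever any `INLINEASM_BR` in the function targets it indirectly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy block (hypothetical). IsIndirectTarget is assumed to be a per-block
// flag: it is set if *any* INLINEASM_BR in the function jumps here.
struct ToyMBB {
  bool IsIndirectTarget = false;
  std::vector<const ToyMBB *> Succs;
};

// Count-based check for a block that falls through: expect exactly the
// flagged successors plus one fallthrough successor.
bool countCheckFallthrough(const ToyMBB &MBB) {
  std::size_t Flagged = 0;
  for (const ToyMBB *S : MBB.Succs)
    if (S->IsIndirectTarget)
      ++Flagged;
  return MBB.Succs.size() == Flagged + 1;
}
```

Under these assumptions, a block with one indirect target `X` and an unflagged fallthrough `Y` passes. If a different block's asm also targets `Y` indirectly, `Y` becomes flagged, the expected count rises to three, and the same two-successor block no longer passes.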

void marked 2 inline comments as done. Jan 13 2020, 2:36 AM

void added inline comments.

llvm/lib/CodeGen/MachineVerifier.cpp
699	I didn't need a delineation of the parts of the comment. I needed a clearer description of what your concern is, and an example of code that fails here.
This bit of code is simply saying that if the block containing the `INLINEASM_BR` doesn't end with a `BR` instruction, then the number of its successors should be equal to the number of indirect successors. This is correct, as it's not valid to have a duplicate label used in a `callbr` instruction:

    $ llc -o /dev/null x.ll
    Duplicate callbr destination!
      %3 = callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${2:l}", "={si},r,X,0,~{dirflag},~{fpsr},~{flags}"(i32 %2, i8* blockaddress(@test1, %asm.fallthrough), i32 %1) #2
              to label %asm.fallthrough [label %asm.fallthrough], !srcloc !6
    ./bin/llc: x.ll: error: input module is broken!

A `callbr` with a normal successor block that is the indirect target of a different `callbr` isn't really relevant here, unless I'm misunderstanding what `analyzeBranch` returns. There would be two situations:
- The MBB ends in a fallthrough, which is the case I mentioned above, or
- The MBB ends in a `BR` instruction, in which case it won't be in this block of code, but the block below.
If `analyzeBranch` is not taking into account potential `COPY` instructions between `INLINEASM_BR` and `BR`, then it needs to be addressed there (I'll verify that it is). I do know that this code is reached by the verifier, so it handles it to some degree. :-)

void marked an inline comment as done. Jan 15 2020, 4:38 PM

void marked an inline comment as done. Jan 15 2020, 5:30 PM

void added inline comments.

llvm/lib/CodeGen/MachineVerifier.cpp
699	> But more seriously than these potential verifier failures, I expect that analyzeBranch returning wrong answers (in that it may report that a block unconditionally-jumps to a successor, while it really has both a callbr and jump, separated by the non-terminator copies) will cause miscompilation. I'm not sure exactly how that will exhibit, but I'm pretty sure it's not going to be good.
Here are two proposals that may help alleviate these concerns:
- Have analyzeBranch skip over the COPYs between the INLINEASM_BR and the JMP. It's relatively straightforward to do, but it would have to be done for all analyzeBranch calls.
- Create a new pseudo-instruction called `INLINEASM_BR_COPY` (or some better name) that's a terminator which behaves like a normal `COPY`, but the analyzeBranch and other methods that look at terminators will be able to handle it without modifications, since it'll look similar to an `INLINEASM_BR` instruction. It doesn't require changing all of the analyzeBranch implementations, but it's a much larger change.
Thoughts?
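The first proposal can be sketched as a backward scan that tolerates `COPY`s sitting between two terminators. This is a toy illustration only: opcodes are plain strings, `isToyTerminator` and `firstTerminatorSkippingCopies` are hypothetical helpers, and a real target's analyzeBranch does considerably more than locate the terminator region.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy notion of a terminator (hypothetical); in real MIR this is a property
// of each instruction, not a string comparison.
bool isToyTerminator(const std::string &Op) {
  return Op == "INLINEASM_BR" || Op == "JMP";
}

// Walk backward from the end of the block and return the index of the first
// terminator, allowing COPYs that have a terminator somewhere after them.
// Returns Ops.size() if the block has no terminator at all.
std::size_t firstTerminatorSkippingCopies(const std::vector<std::string> &Ops) {
  std::size_t First = Ops.size();
  for (std::size_t I = Ops.size(); I-- > 0;) {
    if (isToyTerminator(Ops[I]))
      First = I; // extend the terminator region upward
    else if (Ops[I] != "COPY" || First == Ops.size())
      break; // an ordinary instruction, or a COPY with no terminator after it
    // Otherwise: a COPY below which a terminator exists; keep scanning.
  }
  return First;
}
```

A `COPY` that precedes the first terminator is scanned past but never counted, so the returned index always points at a terminator or at the end of the block.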

Update so that each MBB has a list of indirect dests of INLINEASM_BR instructions.

Split the machine basic block after an INLINEASM_BR instruction that has
outputs. The copies then end up in a separate block and the back end doesn't
have to deal with COPY instructions between two terminators.

Herald added a subscriber: MatzeB. Jan 20 2020, 11:15 PM

Harbormaster completed remote builds in B44454: Diff 239231. Jan 20 2020, 11:19 PM

@jyknight Do you think you'll have time to review this patch this week? I'd like to get it into the 10.0 release if possible. :-)

In D69868#1832559, @void wrote:

@jyknight Do you think you'll have time to review this patch this week? I'd like to get it into the 10.0 release if possible. :-)

I volunteer as a reviewer:)

In D69868#1832725, @MaskRay wrote:

In D69868#1832559, @void wrote:

@jyknight Do you think you'll have time to review this patch this week? I'd like to get it into the 10.0 release if possible. :-)

I volunteer as a reviewer:)

W00t! :-)

The idea of moving the copies to a new MachineBasicBlock seems a reasonable solution. That said, it does mean there will be allocatable physical registers which are live-in to the following block, which is generally not allowed. As far as I can tell, I _think_ that should be fine in this particular circumstance, but I'm a little uneasy that I might be missing some reason why it'll be incorrect.

I'm more uncomfortable with the scanning/splitting after-the-fact in ScheduleDAGSDNodes.cpp, and then the successor lists subsequently being incorrect. However, I wasn't immediately able to say what I'd suggest doing instead, so I spent some time yesterday poking around with this patch to see if I could find something that seemed better. I now believe it will be cleaner to have the inline-asm tell SelectionDAGISel::FinishBasicBlock to just emit the copies in another block, which we do in some other situations, already. But, I'm still poking at that to see how it'll end up -- I can't say I'm sure that's the right answer at the moment.

In D69868#1836777, @jyknight wrote:

The idea of moving the copies to a new MachineBasicBlock seems a reasonable solution. That said, it does mean there will be allocatable physical registers which are live-in to the following block, which is generally not allowed. As far as I can tell, I _think_ that should be fine in this particular circumstance, but I'm a little uneasy that I might be missing some reason why it'll be incorrect.

There are some passes (e.g. machine CSE; MachineCSE.cpp:692) that handle physical live-in registers before register allocation. Do those passes run after a pass which handles the physical live-in registers? (By "handle" I mean analyzes and/or modifies the machine IR so that physical live-ins are "okay".)

I'm more uncomfortable with the scanning/splitting after-the-fact in ScheduleDAGSDNodes.cpp, and then the successor lists subsequently being incorrect. However, I wasn't immediately able to say what I'd suggest doing instead, so I spent some time yesterday poking around with this patch to see if I could find something that seemed better. I now believe it will be cleaner to have the inline-asm tell SelectionDAGISel::FinishBasicBlock to just emit the copies in another block, which we do in some other situations, already. But, I'm still poking at that to see how it'll end up -- I can't say I'm sure that's the right answer at the moment.

I looked at FinishBasicBlock at one point. I put it in EmitSchedule because I wanted to allow the basic block to go through any post processing that may occur. I can place it in FinishBasicBlock if you think it fits in there better.

As for the successor list, I justify it this way: The default behavior is exactly the same as executing the inline asm and continuing directly out as if it's straight line code. If INLINEASM_BR wasn't a terminator, we would add the COPYs directly after it and before the JMP. The only issue is whether the successors being on the copy block instead of the block containing the INLINEASM_BR would cause something to go wrong. Like you I was concerned about this. But I think that since the behavior is the same as if the two blocks were a single block (the assembly code isn't changed, etc.), and the fact that the successors being on the INLINEASM_BR block shouldn't affect any machine passes (i.e. no analysis or transformation should care, because the IR can't model potential branches by the asm), it should be okay. If you wish I could add a part to the machine instruction verifier to ensure that assumptions we're making are enforced.

It seems that callbr (with output) will now be similar to a catchpad. It can set live-in physical register information. (See test/CodeGen/X86/{seh-catch-all.ll,seh-exception-code.ll,wineh-coreclr.ll,wineh-exceptionpointer.ll}) Is there any caveat doing this? Add @rnk to the attention list as the author of rL249492 and rL249786...

MaskRay mentioned this in D69876: Support output constraints on "asm goto". Jan 27 2020, 2:49 PM

void added a child revision: D69876: Support output constraints on "asm goto". Jan 27 2020, 3:04 PM

@rnk Thoughts on passing live-in information (a bit similar to WinEH catchpad) to callbr at the SelectionDAG stage?

In D69868#1848327, @MaskRay wrote:

@rnk Thoughts on passing live-in information (a bit similar to WinEH catchpad) to callbr at the SelectionDAG stage?

To help with the review, here are some of the assumptions I'm making about the INLINEASM_BR and the copy block this patch creates:

- The two blocks (CallBrBB and CopyBB) are "tightly coupled". I.e., you can't split the edge between them (not that you should want to).
- The CopyBB block will *always* be the fall-through block. This implies that INLINEASM_BR is the only terminator in CallBrBB.
- If one block moves, then both must move.

I would be happy to add code in the machine instruction verifier to ensure that these assumptions are met if you think it's necessary.
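As a rough picture of what such a check could look like, here is a toy version of the assumptions listed above. It is a sketch, not proposed verifier code: `ToyMBB`, its `IsCopyBlock` flag and `callBrPairIsWellFormed` are hypothetical, and "tightly coupled" is modeled simply as the copy block being next in layout order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy block (hypothetical): only the terminator opcodes and a marker for
// the block that holds the COPYs are modeled.
struct ToyMBB {
  std::vector<std::string> Terminators;
  bool IsCopyBlock = false;
};

// Check the assumed invariants for the block at index I in layout order:
// INLINEASM_BR is its only terminator, and the copy block immediately
// follows it, so the pair falls through and stays adjacent.
bool callBrPairIsWellFormed(const std::vector<ToyMBB> &Layout, std::size_t I) {
  if (I >= Layout.size())
    return false;
  const ToyMBB &CallBrBB = Layout[I];
  if (CallBrBB.Terminators.size() != 1 ||
      CallBrBB.Terminators[0] != "INLINEASM_BR")
    return false;
  return I + 1 < Layout.size() && Layout[I + 1].IsCopyBlock;
}
```

Moving either block on its own breaks the adjacency test, which is how this model expresses "if one block moves, then both must move".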

Friendly ping. :-)

My apologies for being a pest, but I wanted to know the status of reviews for this bug. @jyknight & @rnk, do you have further comments or need more time?

I think this is fine, but want to hear from @jyknight and @rnk.

llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
1043	`for (const MachineOperand &MO : Last.operands()) {`
1066	`for (const MachineOperand &MO : MI.operands()) {`

Use iterator for machine operands.

void marked 2 inline comments as done. Feb 5 2020, 2:47 PM

Harbormaster completed remote builds in B45813: Diff 242752. Feb 5 2020, 2:53 PM

Another friendly ping. :-)

Use the BB when creating the MBB.

Harbormaster completed remote builds in B46582: Diff 244840. Feb 15 2020, 2:10 PM

void added a reviewer: lattner. Feb 18 2020, 4:55 PM

It's been almost a month since the last comments on this review. If you need more time, please comment here. Otherwise, I will submit this with the current approvals by the end of the week.

In D69868#1881985, @void wrote:

It's been almost a month since the last comments on this review. If you need more time, please comment here. Otherwise, I will submit this with the current approvals by the end of the week.

Explicit ping @nickdesaulniers, who blocked this, and @jyknight and @rnk, both explicitly requested by @MaskRay.

This code has now been tested on a running Linux kernel making use of the feature.

I still would like @jyknight to clarify his comments, consider explicitly requesting changes to this CL, or consider resigning as reviewer.

llvm/include/llvm/CodeGen/MachineBasicBlock.h
137	It's likely the count here is 0, or maybe 1. We don't see too often a large list of labels here.
llvm/lib/CodeGen/MachineVerifier.cpp
699	> Instead of actually checking each one, instead it just checks that the count is the number of landing pads, with the assumption that all the successors should be landing pads, and that all the landing pads should be successors.
What do you mean "instead of actually checking each one?" What check should be done?
> It's certainly not the case that all callbr targets are targets of all callbr instructions. And even if it was, this still wouldn't be counting things correctly.
Right, so you could have a `MachineBasicBlock` that's the target of a `INLINEASM_BR`, and a different `MachineBasicBlock` that also branches to the `INLINEASM_BR` target (but itself wasn't an `INLINEASM_BR`). But `IndirectTargetSuccs` is only built up from successors of the current block that are `isInlineAsmBrIndirectTarget`'s. So I don't understand how the count is "wrong" just because you could have other MBB's also target the current MBB's indirect successor.
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
1076	Isn't `Fallthrough` from above one of the potential successors? Do we have to skip it in the below conditional? What happens if we call `addSuccessor` with the same `MBB` twice?

This revision is now accepted and ready to land. Feb 19 2020, 10:50 AM

Ugh, it's actually been that long, hasn't it...I'm really sorry about that. :(

I've been actively spending time to look at this over the last couple weeks. I haven't been able to convince myself that the weird-successors and having allocatable registers across BBs here is not going to cause codegen issues after optimization passes run on it. Unfortunately, despite spending time looking into it, I also wasn't able to convince myself that this *IS* broken. Maybe someone else can chime in here and either assuage or confirm my worries?

I also see some other minor issues, which I need to write up, but I'd been blocking writing up a comment for that behind the larger question.

Anyways, while looking at this I started thinking it might actually be better to actually have INLINEASM_BR (both with or without outputs) _not_ be a terminator MachineInstr. (remaining a terminator in IR form, however). This would then be similar to how "invoke" works -- at the MachineInstr level, the call is not a terminator, even though it can jump out to the EH successors. And, the return-value handling remains in the same basic block as the call. I tried making that change, but doing it correctly has other impacts -- much of the code which currently special-cases isEHPad() needs to be updated (Which does mean I now feel like I have a good handle on the problems the _previous_ version of this code, which didn't split the block, had.). MachineBasicBlock::updateTerminator is a good example of the problematic cases -- it trawls the successor list to look for the "correct" successor, filtering out isEHPad successors. But unlike EH Pads, indirect targets of a INLINEASM_BR are not used only for that purpose, so can't be so quickly distinguished in the successors list.

So, I started updating some of that code to not have that assumption. While doing so, I noticed that fixing this would also allow analyzeBranch to ignore the inlineasm branch instructions -- and remain correct -- which actually would allow llvm to do better block placement of the inlineasm_br blocks, because it enables it to move the normal successor jump/fallthrough. So, I think that would actually be a good idea to make that change.

But whether it'd be _necessary_ to make such a change (or other changes) for this patch, or just a nice thing to fix, I'm still just unsure about.

In D69868#1883687, @jyknight wrote:

> Ugh, it's actually been that long, hasn't it...I'm really sorry about that. :(

No worries. Thanks for getting back to us!

> I've been actively spending time to look at this over the last couple weeks. I haven't been able to convince myself that the weird-successors and having allocatable registers across BBs here is not going to cause codegen issues after optimization passes run on it. Unfortunately, despite spending time looking into it, I also wasn't able to convince myself that this *IS* broken. Maybe someone else can chime in here and either assuage or confirm my worries?

I've been concerned about the register live-ins too (I'm less concerned about the successors issue). Is there documentation on the original decision to disallow physical register live-ins for MBBs before register allocation? We could then check to see if we're violating the original reasoning.

> I also see some other minor issues, which I need to write up, but I'd been blocking writing up a comment for that behind the larger question.

> Anyways, while looking at this I started thinking it might actually be better to have INLINEASM_BR (both with or without outputs) _not_ be a terminator MachineInstr (while remaining a terminator in IR form). This would then be similar to how "invoke" works -- at the MachineInstr level, the call is not a terminator, even though it can jump out to the EH successors. And the return-value handling remains in the same basic block as the call. I tried making that change, but doing it correctly has other impacts -- much of the code which currently special-cases isEHPad() needs to be updated (which does mean I now feel like I have a good handle on the problems the _previous_ version of this code, which didn't split the block, had). MachineBasicBlock::updateTerminator is a good example of the problematic cases -- it trawls the successor list to look for the "correct" successor, filtering out isEHPad successors. But unlike EH pads, indirect targets of an INLINEASM_BR are not used only for that purpose, so they can't be so quickly distinguished in the successors list.

I had a very similar idea. There are some people *cough*Chris*cough* who insist that INLINEASM_BR makes sense as a terminator. I don't deny that it's very tempting to make it one, but because of this instruction's behavior it doesn't *act* like a normal terminator (explicitly branching to some destination).

As for isEHPad, do you think it would make sense to have a generic isIndirectTarget predicate, which we could then use everywhere and it would automagically filter out blocks that aren't interesting?
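
To make the idea concrete, here is a minimal standalone C++ sketch (not LLVM code) of what such a predicate might look like. `Block`, `isIndirectTargetOf`, and `normalSuccessors` are hypothetical names; the per-predecessor set loosely mirrors the `InlineAsmBrIndirectTargets` member this patch adds. The point it tries to show is that the EH-pad test is a property of the successor block, while the indirect-target test has to be a property of the edge.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock; not the LLVM class.
struct Block {
  bool IsEHPad = false;
  std::vector<const Block *> Succs;
  // Blocks reachable from *this* block only through its inline-asm branch.
  std::set<const Block *> IndirectTargets;
};

// Edge predicate: is Succ an indirect target of Pred's inline-asm branch?
inline bool isIndirectTargetOf(const Block &Pred, const Block *Succ) {
  return Pred.IndirectTargets.count(Succ) != 0;
}

// Successors left after filtering out EH pads and indirect targets, i.e. the
// candidates a routine like updateTerminator would have to choose from.
inline std::vector<const Block *> normalSuccessors(const Block &Pred) {
  std::vector<const Block *> Result;
  for (const Block *Succ : Pred.Succs)
    if (!Succ->IsEHPad && !isIndirectTargetOf(Pred, Succ))
      Result.push_back(Succ);
  return Result;
}
```

The same block can be an indirect target of one predecessor and an ordinary successor of another, which is why a single per-block flag would probably not be enough. A block that is both the fallthrough and an indirect target of the same predecessor is filtered out by this sketch and would need separate handling.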

> So, I started updating some of that code to not have that assumption. While doing so, I noticed that fixing this would also allow analyzeBranch to ignore the inlineasm branch instructions -- and remain correct -- which actually would allow LLVM to do better block placement of the inlineasm_br blocks, because it enables it to move the normal successor jump/fallthrough. So, I think it would actually be a good idea to make that change.

Now that I know I'm not crazy for wanting to make INLINEASM_BR a non-terminator (I'm assuming you're not crazy at least :-), I'll give it a go. Feel free to send me patches if you have them.

> But whether it'd be _necessary_ to make such a change (or other changes) for this patch, or just a nice thing to fix, I'm still just unsure about.

Given your concerns over the patch in its current form, let's at least try the non-terminator path. If we can make it work, then your concerns will be assuaged (as would mine).

Would you be okay with me submitting this and working on making INLINEASM_BR a non-terminator? I'd like to give this feature some bake time.

I'm super excited to see this progress towards supporting 'asm goto' with results! Great work!

> I've been concerned about the register live-ins too (I'm less concerned about the successors issue). Is there documentation on the original decision to disallow physical register live-ins for MBBs before register allocation? We could then check to see if we're violating the original reasoning.

IIRC the regallocators don't rely on this. Actually, I think the live-in sets are supposed to be correct only after regalloc (except for the entry block; I don't know about the landing pads).
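
As a rough illustration of the rule being discussed, the pre-regalloc check can be modeled in standalone C++. This is a sketch, not the MachineVerifier code: `BlockInfo` and `liveInsAllowedBeforeRegAlloc` are hypothetical names, and the exemption for the INLINEASM_BR default target reflects the `isInlineAsmBrDefaultTarget()` condition this patch adds to the verifier.

```cpp
#include <vector>

// Hypothetical summary of a machine basic block; not an LLVM type.
struct BlockInfo {
  bool IsEntry = false;
  bool IsEHPad = false;
  bool IsInlineAsmBrDefaultTarget = false;
  // One flag per live-in physical register: true if it is allocatable.
  std::vector<bool> LiveInIsAllocatable;
};

// Before register allocation, allocatable physical registers may only be
// live-in to the entry block, a landing pad, or (with this patch) the
// default target of an INLINEASM_BR.
inline bool liveInsAllowedBeforeRegAlloc(const BlockInfo &BB) {
  if (BB.IsEntry || BB.IsEHPad || BB.IsInlineAsmBrDefaultTarget)
    return true;
  for (bool Allocatable : BB.LiveInIsAllocatable)
    if (Allocatable)
      return false;
  return true;
}
```

Non-allocatable live-ins (reserved registers, for example) pass the check on any block, which matches the `isAllocatable(LI.PhysReg)` guard in the verifier hunk.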

void edited the summary of this revision. · Feb 21 2020, 3:12 PM

Herald added a subscriber: rriddle. · Feb 21 2020, 3:12 PM

void marked 2 inline comments as done. · Feb 24 2020, 4:36 PM

void added inline comments.

llvm/include/llvm/CodeGen/MachineBasicBlock.h
137	I'll make it 2 instead. :-)
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
1076	Yes. It's added to CopyBB above and the `!CopyBB->isSuccessor(Succ)` makes sure it's not re-added.
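
The de-duplication described in that reply can be sketched in standalone C++. This is an illustration under assumptions rather than the code in ScheduleDAGSDNodes.cpp: `Block` and `moveSuccessors` are hypothetical stand-ins, and only the `!CopyBB->isSuccessor(Succ)` guard comes from the comment above.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock; not the LLVM class.
struct Block {
  std::vector<Block *> Succs;

  bool isSuccessor(const Block *BB) const {
    return std::find(Succs.begin(), Succs.end(), BB) != Succs.end();
  }
};

// Move every successor of From over to CopyBB, skipping any block CopyBB
// already lists so that no edge is added twice.
inline void moveSuccessors(Block &From, Block &CopyBB) {
  for (Block *Succ : From.Succs)
    if (!CopyBB.isSuccessor(Succ))
      CopyBB.Succs.push_back(Succ);
  From.Succs.clear();
}
```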

Closed by commit rG23c2a5ce33f0: Allow "callbr" to return non-void values (authored by void). · Feb 24 2020, 6:34 PM

This revision was automatically updated to reflect the committed changes.

nickdesaulniers mentioned this in D114895: [SelectionDagBuilder] improve CallBrInst BlockAddress constraint handling. · Dec 1 2021, 11:19 AM

nickdesaulniers mentioned this in D115688: [SelectionDAG] treat X constrained labels as i for asm. · Dec 14 2021, 4:17 PM

nickdesaulniers mentioned this in rG4edb9983cb8c: [SelectionDAG] treat X constrained labels as i for asm. · Jan 11 2022, 10:30 AM

Revision Contents

Path	Size
llvm/docs/LangRef.rst	29 lines
llvm/include/llvm/CodeGen/MachineBasicBlock.h	33 lines
llvm/lib/AsmParser/LLParser.cpp	3 lines
llvm/lib/CodeGen/MachineBasicBlock.cpp	6 lines
llvm/lib/CodeGen/MachineVerifier.cpp	43 lines
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp	63 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	15 lines
llvm/lib/IR/Verifier.cpp	2 lines
llvm/test/CodeGen/X86/callbr-asm-outputs.ll	162 lines
llvm/test/CodeGen/X86/callbr-asm.ll	2 lines

Diff 246358

llvm/docs/LangRef.rst

	[7,265 lines not shown]
	^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	<result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>\|<fnty> <fnptrval>(<function args>) [fn attrs]			<result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>\|<fnty> <fnptrval>(<function args>) [fn attrs]
	[operand bundles] to label <normal label> [other labels]			[operand bundles] to label <fallthrough label> [indirect labels]

	Overview:			Overview:
	"""""""""			"""""""""

	The '``callbr``' instruction causes control to transfer to a specified			The '``callbr``' instruction causes control to transfer to a specified
	function, with the possibility of control flow transfer to either the			function, with the possibility of control flow transfer to either the
	'``normal``' label or one of the '``other``' labels.			'``fallthrough``' label or one of the '``indirect``' labels.

	This instruction should only be used to implement the "goto" feature of gcc			This instruction should only be used to implement the "goto" feature of gcc
	style inline assembly. Any other usage is an error in the IR verifier.			style inline assembly. Any other usage is an error in the IR verifier.

	Arguments:			Arguments:
	""""""""""			""""""""""

	This instruction requires several arguments:			This instruction requires several arguments:
	[10 lines not shown]
	#. '``ty``': the type of the call instruction itself which is also the			#. '``ty``': the type of the call instruction itself which is also the
	type of the return value. Functions that return no value are marked			type of the return value. Functions that return no value are marked
	``void``.			``void``.
	#. '``fnty``': shall be the signature of the function being called. The			#. '``fnty``': shall be the signature of the function being called. The
	argument types must match the types implied by this signature. This			argument types must match the types implied by this signature. This
	type can be omitted if the function is not varargs.			type can be omitted if the function is not varargs.
	#. '``fnptrval``': An LLVM value containing a pointer to a function to			#. '``fnptrval``': An LLVM value containing a pointer to a function to
	be called. In most cases, this is a direct function call, but			be called. In most cases, this is a direct function call, but
	indirect ``callbr``'s are just as possible, calling an arbitrary pointer			other ``callbr``'s are just as possible, calling an arbitrary pointer
	to function value.			to function value.
	#. '``function args``': argument list whose types match the function			#. '``function args``': argument list whose types match the function
	signature argument types and parameter attributes. All arguments must			signature argument types and parameter attributes. All arguments must
	be of :ref:`first class <t_firstclass>` type. If the function signature			be of :ref:`first class <t_firstclass>` type. If the function signature
	indicates the function accepts a variable number of arguments, the			indicates the function accepts a variable number of arguments, the
	extra arguments can be specified.			extra arguments can be specified.
	#. '``normal label``': the label reached when the called function			#. '``fallthrough label``': the label reached when the inline assembly's
	executes a '``ret``' instruction.			execution exits the bottom.
	#. '``other labels``': the labels reached when a callee transfers control			#. '``indirect labels``': the labels reached when a callee transfers control
	to a location other than the normal '``normal label``'. The blockaddress			to a location other than the '``fallthrough label``'. The blockaddress
	constant for these should also be in the list of '``function args``'.			constant for these should also be in the list of '``function args``'.
	#. The optional :ref:`function attributes <fnattrs>` list.			#. The optional :ref:`function attributes <fnattrs>` list.
	#. The optional :ref:`operand bundles <opbundles>` list.			#. The optional :ref:`operand bundles <opbundles>` list.

	Semantics:			Semantics:
	""""""""""			""""""""""

	This instruction is designed to operate as a standard '``call``'			This instruction is designed to operate as a standard '``call``'
	instruction in most regards. The primary difference is that it			instruction in most regards. The primary difference is that it
	establishes an association with additional labels to define where control			establishes an association with additional labels to define where control
	flow goes after the call.			flow goes after the call.

				Outputs of a '``callbr``' instruction are valid only on the '``fallthrough``'
				nickdesaulniers (not done): Rather than the distinction between "normal" and "other," can we use the terms "fallthrough" and "indirect?"
				path. Use of outputs on the '``indirect``' path(s) results in :ref:`poison
				nickdesaulniers (not done): I still don't understand how the `poison` value gets added. Can you please clarify, or maybe add a test case that does?
				void (author, done):
				> I still don't understand how the poison value gets added. Can you please clarify, or maybe add a test case that does?
				Poison values are a result of execution, not specified directly. So during execution, the value becomes poison if used in the indirect block, but not if used in the fallthrough. It's just fancy language for saying "don't do that!" :-) From the docs:
				> Poison value behavior is defined in terms of value dependence:
				> ...
				> * An instruction control-depends on a terminator instruction if the terminator instruction has multiple successors and the instruction is always executed when control transfers to one of the successors, and may not be executed when control is transferred to another.
				nickdesaulniers (not done): If the outputs are only valid on the fallthrough, will that allow us to build the extension to the existing `asm goto` that has been asked for? I worry that if we implement this restriction, it may not be fully useful unless the outputs on all paths are valid. Do you have a link to the previous discussion about this?
				void (author, done): After going back and forth with James and Hal, I now believe that having the value valid only on the fallthrough path will be sufficient for most purposes (if not all). However, this implementation doesn't restrict us from relaxing it in the future, but I would like to see a use case for that before we sink a lot of time into it. Below is a link to the start of the thread. http://lists.llvm.org/pipermail/llvm-dev/2019-June/133428.html
				values <poisonvalues>`.

				nickdesaulniers (not done): I assume that's the WIP part? Otherwise there's nothing else here that explicitly adds poison from what I can tell.
				void (author, done): You're able to have the "fallthrough" listed in the "indirect" list as well. I'm just pointing out that if the value hasn't been calculated and shoved into the output object (register, etc.) then using that value in the fallthrough branch causes a poisoned value.
	The only use of this today is to implement the "goto" feature of gcc inline			The only use of this today is to implement the "goto" feature of gcc inline
	assembly where additional labels can be provided as locations for the inline			assembly where additional labels can be provided as locations for the inline
	assembly to jump to.			assembly to jump to.

	Example:			Example:
	""""""""			""""""""

	.. code-block:: text			.. code-block:: llvm

	callbr void asm "", "r,x"(i32 %x, i8 *blockaddress(@foo, %fail))			; "asm goto" without output constraints.
	to label %normal [label %fail]			callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
				to label %fallthrough [label %indirect]

				; "asm goto" with output constraints.
				<result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
				to label %fallthrough [label %indirect]

	.. _i_resume:			.. _i_resume:

	'``resume``' Instruction			'``resume``' Instruction
	^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""
	[11,580 lines not shown]

llvm/include/llvm/CodeGen/MachineBasicBlock.h

[124 lines not shown]	private:
bool IsEHScopeEntry = false;		bool IsEHScopeEntry = false;

/// Indicate that this basic block is the entry block of an EH funclet.		/// Indicate that this basic block is the entry block of an EH funclet.
bool IsEHFuncletEntry = false;		bool IsEHFuncletEntry = false;

/// Indicate that this basic block is the entry block of a cleanup funclet.		/// Indicate that this basic block is the entry block of a cleanup funclet.
bool IsCleanupFuncletEntry = false;		bool IsCleanupFuncletEntry = false;

		/// Default target of the callbr of a basic block.
		bool InlineAsmBrDefaultTarget = false;
		rnk (done): I think "IsInlineAsmBrIndirectTarget" would be better. I think of "Pad" as being something EH related, it's usually a "landing pad" or "eh pad".

		/// List of indirect targets of the callbr of a basic block.
		SmallPtrSet<const MachineBasicBlock*, 4> InlineAsmBrIndirectTargets;
		nickdesaulniers (not done): It's likely the count here is 0, or maybe 1. We don't see too often a large list of labels here.
		void (author, done): I'll make it 2 instead. :-)

/// since getSymbol is a relatively heavy-weight operation, the symbol		/// since getSymbol is a relatively heavy-weight operation, the symbol
/// is only computed once and is cached.		/// is only computed once and is cached.
mutable MCSymbol *CachedMCSymbol = nullptr;		mutable MCSymbol *CachedMCSymbol = nullptr;

// Intrusive list support		// Intrusive list support
MachineBasicBlock() = default;		MachineBasicBlock() = default;

explicit MachineBasicBlock(MachineFunction &MF, const BasicBlock *BB);		explicit MachineBasicBlock(MachineFunction &MF, const BasicBlock *BB);
[263 lines not shown]	#endif
void setIsEHFuncletEntry(bool V = true) { IsEHFuncletEntry = V; }		void setIsEHFuncletEntry(bool V = true) { IsEHFuncletEntry = V; }

/// Returns true if this is the entry block of a cleanup funclet.		/// Returns true if this is the entry block of a cleanup funclet.
bool isCleanupFuncletEntry() const { return IsCleanupFuncletEntry; }		bool isCleanupFuncletEntry() const { return IsCleanupFuncletEntry; }

/// Indicates if this is the entry block of a cleanup funclet.		/// Indicates if this is the entry block of a cleanup funclet.
void setIsCleanupFuncletEntry(bool V = true) { IsCleanupFuncletEntry = V; }		void setIsCleanupFuncletEntry(bool V = true) { IsCleanupFuncletEntry = V; }

		/// Returns true if this is the indirect dest of an INLINEASM_BR.
		bool isInlineAsmBrIndirectTarget(const MachineBasicBlock *Tgt) const {
		return InlineAsmBrIndirectTargets.count(Tgt);
		}

		/// Indicates if this is the indirect dest of an INLINEASM_BR.
		void addInlineAsmBrIndirectTarget(const MachineBasicBlock *Tgt) {
		InlineAsmBrIndirectTargets.insert(Tgt);
		}

		/// Transfers indirect targets to INLINEASM_BR's copy block.
		void transferInlineAsmBrIndirectTargets(MachineBasicBlock *CopyBB) {
		for (auto *Target : InlineAsmBrIndirectTargets)
		CopyBB->addInlineAsmBrIndirectTarget(Target);
		return InlineAsmBrIndirectTargets.clear();
		}

		/// Returns true if this is the default dest of an INLINEASM_BR.
		bool isInlineAsmBrDefaultTarget() const {
		return InlineAsmBrDefaultTarget;
		}

		/// Indicates if this is the default deft of an INLINEASM_BR.
		void setInlineAsmBrDefaultTarget() {
		InlineAsmBrDefaultTarget = true;
		}

/// Returns true if it is legal to hoist instructions into this block.		/// Returns true if it is legal to hoist instructions into this block.
bool isLegalToHoistInto() const;		bool isLegalToHoistInto() const;

// Code Layout methods.		// Code Layout methods.

/// Move 'this' block before or after the specified block. This only moves		/// Move 'this' block before or after the specified block. This only moves
/// the block, it does not modify the CFG or adjust potential fall-throughs at		/// the block, it does not modify the CFG or adjust potential fall-throughs at
/// the end of the block.		/// the end of the block.
[550 lines not shown]

llvm/lib/AsmParser/LLParser.cpp

[6,417 lines not shown]	bool LLParser::ParseCallBr(Instruction *&Inst, PerFunctionState &PFS) {
CalleeID.FTy = Ty;		CalleeID.FTy = Ty;

// Look up the callee.		// Look up the callee.
Value *Callee;		Value *Callee;
if (ConvertValIDToValue(PointerType::getUnqual(Ty), CalleeID, Callee, &PFS,		if (ConvertValIDToValue(PointerType::getUnqual(Ty), CalleeID, Callee, &PFS,
/*IsCall=*/true))		/*IsCall=*/true))
return true;		return true;

if (isa<InlineAsm>(Callee) && !Ty->getReturnType()->isVoidTy())
return Error(RetTypeLoc, "asm-goto outputs not supported");

// Set up the Attribute for the function.		// Set up the Attribute for the function.
SmallVector<Value *, 8> Args;		SmallVector<Value *, 8> Args;
SmallVector<AttributeSet, 8> ArgAttrs;		SmallVector<AttributeSet, 8> ArgAttrs;

// Loop through FunctionType's arguments and ensure they are specified		// Loop through FunctionType's arguments and ensure they are specified
// correctly. Also, gather any parameter attributes.		// correctly. Also, gather any parameter attributes.
FunctionType::param_iterator I = Ty->param_begin();		FunctionType::param_iterator I = Ty->param_begin();
FunctionType::param_iterator E = Ty->param_end();		FunctionType::param_iterator E = Ty->param_end();
[2,502 lines not shown]

llvm/lib/CodeGen/MachineBasicBlock.cpp

	[1,107 lines not shown]

	bool MachineBasicBlock::canSplitCriticalEdge(			bool MachineBasicBlock::canSplitCriticalEdge(
	const MachineBasicBlock *Succ) const {			const MachineBasicBlock *Succ) const {
	// Splitting the critical edge to a landing pad block is non-trivial. Don't do			// Splitting the critical edge to a landing pad block is non-trivial. Don't do
	// it in this generic function.			// it in this generic function.
	if (Succ->isEHPad())			if (Succ->isEHPad())
	return false;			return false;

	const MachineFunction *MF = getParent();			// Splitting the critical edge to a callbr's indirect block isn't advised.
				// Don't do it in this generic function.
				if (isInlineAsmBrIndirectTarget(Succ))
				rnk (done): As a minor efficiency golf thing, I would start off this chain of checks with `if (Succ->hasAddressTaken())`. In most cases, which will handle splitting the critical edge to the fall through destination, which will not be marked as address taken.
				rnk (done): Somehow I missed that you added isInlineAsmBrIndirectPad, you could use that here instead of hasAddressTaken, and it would be more precise.
				return false;
				rnk (not done): This code really should be looking at the machine IR to know if there is a callbr. I think INLINEASM_BR is target independent, so you really can just loop through the MachineInstrs looking for it, and then look for the MBB in the operand list. I wonder if we should have a flag on the MBB to indicate that it ends in an INLINEASM_BR, since those break the MIR invariant that terminators are grouped together at the end of the block.
				void (author, done): The problem is that I need to distinguish between the indirect branches and the default one from the `callbr` instruction. It's way more cumbersome to do that with the `INLINEASM_BR` instruction. If you feel strongly about it I'll make the change, but the `INLINEASM_BR` instruction is already strongly tied to `callbr`. Do you expect it to change anytime in the future?
				rnk (done): I don't think it will actually be that hard to implement. The MIR already knows a few things:
				- After block layout, MBB->getFallthrough() will return the default destination. If non-null, you can use it.
				- If getFallthrough() is null and this block terminated in callbr in IR, there must be an analyzable unconditional branch terminator (JMP or B).
				- All other address taken successors must come from abnormal control flow, i.e. exception handling or callbr.
				Actually, I'm surprised this utility doesn't return false after analyzeBranch below if Succ is not one of TBB or FBB. In any case, if these ideas don't work out, I don't feel that strongly, but I feel obligated to ask you to try a variety of ways to make this work with just MIR info.

				rnk (done): Is it possible for a BB to be both an indirect successor and a fallthrough successor? I suppose that could be the case with the Linux macro that gets the current PC. In any case, it's probably safe to remove this condition, and then we don't have to worry.
				void (author, done): It is possible. I have a testcase for it in this patch. :-)
				nickdesaulniers (not done): But you don't have a test case for what @rnk is asking about. (Unsure if it would be necessary). I think @rnk is getting at this case from C:
				    void foo(void) {
				      asm goto ("#NICK":: "r"(&&hello) :: hello);
				    hello:
				      return;
				    }
				Clang will emit this as:
				    define dso_local void @foo() #0 {
				    entry:
				      callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %hello)) #1
				              to label %asm.fallthrough [label %hello], !srcloc !2
				    asm.fallthrough: ; preds = %entry
				      br label %hello
				    hello: ; preds = %asm.fallthrough, %entry
				      ret void
				    }
				ie. the `blockaddress` is passed twice, once as the address of a label (GNU C extension), once as the indirect destination of the `asm goto`. So to @rnk's question:
				> Is it possible for a BB to be both an indirect successor and a fallthrough successor?
				It is valid LLVM IR to have a BB be both; Clang today (or with https://reviews.llvm.org/D69876) will not emit such formation (but could). ie. imagine the above:
				    callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %hello)) #1
				            to label %asm.fallthrough [label %hello], !srcloc !2
				to instead be:
				    callbr void asm sideeffect "#NICK", "r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@foo, %hello), i8* blockaddress(@foo, %asm.fallthrough)) #1
				            to label %asm.fallthrough [label %asm.fallthrough], !srcloc !2
				I still don't understand @rnk's point about
				> In any case, it's probably safe to remove this condition, and then we don't have to worry.
				though.
				void (author, done): I meant that I had a testcase that required the conditional he suggested I remove. Sorry for the confusion. You're right that we can generate the clang code where the fallthrough is the same as the indirect. I didn't mean to imply that it wasn't the case. I can add a testcase. I asked a question on the "cfe-dev" mailing list to determine how I could identify the "fallthrough" block from the CFG. So far no one has responded. Until I can determine that, I won't be able to handle that properly with this conditional.
				xbolva00 (not done): Maybe @aaron.ballman can answer your question related to the "fallthrough" block.
				rnk (not done): I had thought that it would be safe to return false (don't allow this edge to be split) if BB appears in the indirect destination list, even if it's also the default destination, so this condition could be reduced to `contains(cbr->getIndirectDests(), bb)`. I looked for a test where a BB is both a default and indirect destination, so I don't see why we can't remove the default dest check.
				nickdesaulniers (not done): I think it may be worthwhile to swap these conditions with each other, too. It's highly unlikely the BB is terminated with a `CallBrInst`, yet highly likely the MBB refers to a BB.
				void (author, done): Okay. No one likes this addition. I'm going to remove it and just check `IsInlineAsmBrIndirectTarget`.
				const MachineFunction *MF = getParent();
	// Performance might be harmed on HW that implements branching using exec mask			// Performance might be harmed on HW that implements branching using exec mask
	// where both sides of the branches are always executed.			// where both sides of the branches are always executed.
				nickdesaulniers (done): please use a range-based for:
				    for (const BasicBlock* ID : cbr->getIndirectDests())
				      if (ID == bb)
				or `llvm::any_of`.
	if (MF->getTarget().requiresStructuredCFG())			if (MF->getTarget().requiresStructuredCFG())
	return false;			return false;

	// We may need to update this's terminator, but we can't do that if			// We may need to update this's terminator, but we can't do that if
	// analyzeBranch fails. If this uses a jump table, we won't touch it.			// analyzeBranch fails. If this uses a jump table, we won't touch it.
	const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();			const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
	MachineBasicBlock *TBB = nullptr, *FBB = nullptr;			MachineBasicBlock *TBB = nullptr, *FBB = nullptr;
	SmallVector<MachineOperand, 4> Cond;			SmallVector<MachineOperand, 4> Cond;
	[382 lines not shown]

llvm/lib/CodeGen/MachineVerifier.cpp

[616 lines not shown]	MachineVerifier::visitMachineBasicBlockBefore(const MachineBasicBlock *MBB) {
FirstNonPHI = nullptr;		FirstNonPHI = nullptr;

if (!MF->getProperties().hasProperty(		if (!MF->getProperties().hasProperty(
MachineFunctionProperties::Property::NoPHIs) && MRI->tracksLiveness()) {		MachineFunctionProperties::Property::NoPHIs) && MRI->tracksLiveness()) {
// If this block has allocatable physical registers live-in, check that		// If this block has allocatable physical registers live-in, check that
// it is an entry block or landing pad.		// it is an entry block or landing pad.
for (const auto &LI : MBB->liveins()) {		for (const auto &LI : MBB->liveins()) {
if (isAllocatable(LI.PhysReg) && !MBB->isEHPad() &&		if (isAllocatable(LI.PhysReg) && !MBB->isEHPad() &&
		!MBB->isInlineAsmBrDefaultTarget() &&
MBB->getIterator() != MBB->getParent()->begin()) {		MBB->getIterator() != MBB->getParent()->begin()) {
report("MBB has allocatable live-in, but isn't entry or landing-pad.", MBB);		report("MBB has allocatable live-in, but isn't entry or landing-pad.", MBB);
report_context(LI.PhysReg);		report_context(LI.PhysReg);
}		}
}		}
}		}

// Count the number of landing pad successors.		// Count the number of landing pad successors.
SmallPtrSet<MachineBasicBlock*, 4> LandingPadSuccs;		SmallPtrSet<const MachineBasicBlock*, 4> LandingPadSuccs;
for (MachineBasicBlock::const_succ_iterator I = MBB->succ_begin(),		for (const auto *succ : MBB->successors()) {
E = MBB->succ_end(); I != E; ++I) {		if (succ->isEHPad())
if ((*I)->isEHPad())		LandingPadSuccs.insert(succ);
LandingPadSuccs.insert(*I);		if (!FunctionBlocks.count(succ))
if (!FunctionBlocks.count(*I))
report("MBB has successor that isn't part of the function.", MBB);		report("MBB has successor that isn't part of the function.", MBB);
if (!MBBInfoMap[*I].Preds.count(MBB)) {		if (!MBBInfoMap[succ].Preds.count(MBB)) {
report("Inconsistent CFG", MBB);		report("Inconsistent CFG", MBB);
errs() << "MBB is not in the predecessor list of the successor "		errs() << "MBB is not in the predecessor list of the successor "
<< printMBBReference(*(*I)) << ".\n";		<< printMBBReference(*succ) << ".\n";
		}
		}

		// Count the number of INLINEASM_BR indirect target successors.
		SmallPtrSet<const MachineBasicBlock*, 4> IndirectTargetSuccs;
		for (const auto *succ : MBB->successors()) {
		if (MBB->isInlineAsmBrIndirectTarget(succ))
		IndirectTargetSuccs.insert(succ);
		if (!FunctionBlocks.count(succ))
		report("MBB has successor that isn't part of the function.", MBB);
		if (!MBBInfoMap[succ].Preds.count(MBB)) {
		report("Inconsistent CFG", MBB);
		errs() << "MBB is not in the predecessor list of the successor "
		<< printMBBReference(*succ) << ".\n";
}		}
}		}

// Check the predecessor list.		// Check the predecessor list.
for (MachineBasicBlock::const_pred_iterator I = MBB->pred_begin(),		for (MachineBasicBlock::const_pred_iterator I = MBB->pred_begin(),
E = MBB->pred_end(); I != E; ++I) {		E = MBB->pred_end(); I != E; ++I) {
if (!FunctionBlocks.count(*I))		if (!FunctionBlocks.count(*I))
report("MBB has predecessor that isn't part of the function.", MBB);		report("MBB has predecessor that isn't part of the function.", MBB);
Show All 24 Lines	if (!TII->analyzeBranch(*const_cast<MachineBasicBlock *>(MBB), TBB, FBB,
if (!TBB && !FBB) {		if (!TBB && !FBB) {
// Block falls through to its successor.		// Block falls through to its successor.
MachineFunction::const_iterator MBBI = MBB->getIterator();		MachineFunction::const_iterator MBBI = MBB->getIterator();
++MBBI;		++MBBI;
if (MBBI == MF->end()) {		if (MBBI == MF->end()) {
// It's possible that the block legitimately ends with a noreturn		// It's possible that the block legitimately ends with a noreturn
// call or an unreachable, in which case it won't actually fall		// call or an unreachable, in which case it won't actually fall
// out the bottom of the function.		// out the bottom of the function.
} else if (MBB->succ_size() == LandingPadSuccs.size()) {		} else if (MBB->succ_size() == LandingPadSuccs.size() ||
		MBB->succ_size() == IndirectTargetSuccs.size()) {
jyknight (inline comment, Not Done): This isn't correct.

This line is looking at a block which doesn't end in a jump to a successor, so it's trying to verify that the successor list makes sense in that context.

The unstated assumption in the code is that the only successors will be landing pads. Instead of actually checking each one, it just checks that the count is the number of landing pads, with the assumption that all the successors should be landing pads, and that all the landing pads should be successors.

The next clause is then checking for the case where there's a fallthrough to the next block. In that case, the successors should've been all the landing pads, and the single fallthrough block.

Adding similar code to check for the number of callbr targets doesn't really make sense. It's certainly not the case that all callbr targets are targets of all callbr instructions. And even if it was, this still wouldn't be counting things correctly.

However, I think I'd expect analyzeBranch to error out (returning true) when confronted by a callbr instruction, because it cannot actually tell what's going on there. If that were the case, nothing in this block should even be invoked. But I guess that's probably not happening, due to the terminator being followed by non-terminators.

That seems likely to be a problem that needs to be fixed. (And if that is fixed, I think the changes here aren't needed anymore.)
void (author, Done): Your comment is very confusing. Could you please give an example of where this fails?
jyknight (inline comment, Done): Sorry about that, I should've delimited the parts of that message better...

Basically:
Paragraphs 2-4 are describing why the code before this patch appears to be correct for landing pad, even though it's taking some shortcuts and making some non-obvious assumptions.
Paragraph 5 ("Adding similar code"...) is why it's not correct for callbr.
Paragraphs 6-7 are how I'd suggest to resolve it.

I believe the code as of your patch will fail validation if you have a callbr instruction which has a normal-successor block which is an indirect target of a different callbr in the function. I believe it'll also fail if you have any landing-pad successors, since those aren't being added to the count of expected successors, but rather checked separately.

But more seriously than these potential verifier failures, I expect that analyzeBranch returning wrong answers (in that it may report that a block unconditionally-jumps to a successor, while it really has both a callbr and jump, separated by the non-terminator copies) will cause miscompilation. I'm not sure exactly how that will exhibit, but I'm pretty sure it's not going to be good.

And, if analyzeBranch properly said "no idea" when confronted by callbr control flow, then this code in the verifier wouldn't be reached.
void (author, Done): I didn't need a delineation of the parts of the comment. I needed a clearer description of what your concern is, and to give an example of code that fails here.

This bit of code is simply saying that if the block containing the `INLINEASM_BR` doesn't end with a `BR` instruction, then the number of its successors should be equal to the number of indirect successors. This is correct, as it's not valid to have a duplicate label used in a `callbr` instruction:

    $ llc -o /dev/null x.ll
    Duplicate callbr destination!
    %3 = callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${2:l}", "={si},r,X,0,~{dirflag},~{fpsr},~{flags}"(i32 %2, i8* blockaddress(@test1, %asm.fallthrough), i32 %1) #2
    to label %asm.fallthrough [label %asm.fallthrough], !srcloc !6
    ./bin/llc: x.ll: error: input module is broken!

A `callbr` with a normal successor block that is the indirect target of a different `callbr` isn't really relevant here, unless I'm misunderstanding what `analyzeBranch` returns. There would be two situations:

1. The MBB ends in a fallthrough, which is the case I mentioned above, or
2. The MBB ends in a `BR` instruction, in which case it won't be in this block of code, but the block below.

If `analyzeBranch` is not taking into account potential `COPY` instructions between `INLINEASM_BR` and `BR`, then it needs to be addressed there (I'll verify that it is). I do know that this code is reached by the verifier, so it handles it to some degree. :-)
void (author, Done):
> But more seriously than these potential verifier failures, I expect that analyzeBranch returning wrong answers (in that it may report that a block unconditionally-jumps to a successor, while it really has both a callbr and jump, separated by the non-terminator copies) will cause miscompilation. I'm not sure exactly how that will exhibit, but I'm pretty sure it's not going to be good.

Here are two proposals that may help alleviate these concerns:

1. Have analyzeBranch skip over the COPYs between the INLINEASM_BR and the JMP. It's relatively straightforward to do, but it would have to be done for all analyzeBranch calls.
2. Create a new pseudo-instruction called `INLINEASM_BR_COPY` (or some better name) that's a terminator which behaves like a normal `COPY`, but the analyzeBranch and other methods that look at terminators will be able to handle it without modifications, since it'll look similar to an `INLINEASM_BR` instruction. It doesn't require changing all of the analyzeBranch implementations, but it's a much larger change.

Thoughts?
nickdesaulniers (inline comment, Not Done):
> Instead of actually checking each one, instead it just checks that the count is the number of landing pads, with the assumption that all the successors should be landing pads, and that all the landing pads should be successors.

What do you mean "instead of actually checking each one?" What check should be done?

> It's certainly not the case that all callbr targets are targets of all callbr instructions. And even if it was, this still wouldn't be counting things correctly.

Right, so you could have a `MachineBasicBlock` that's the target of an `INLINEASM_BR`, and a different `MachineBasicBlock` that also branches to the `INLINEASM_BR` target (but itself wasn't an `INLINEASM_BR`). But `IndirectTargetSuccs` is only built up from successors of the current block that are `isInlineAsmBrIndirectTarget`'s. So I don't understand how the count is "wrong" just because you could have other MBBs also target the current MBB's indirect successor.
// It's possible that the block legitimately ends with a noreturn		// It's possible that the block legitimately ends with a noreturn
// call or an unreachable, in which case it won't actually fall		// call or an unreachable, in which case it won't actually fall
// out of the block.		// out of the block.
} else if (MBB->succ_size() != 1+LandingPadSuccs.size()) {		} else if ((LandingPadSuccs.size() &&
		MBB->succ_size() != 1 + LandingPadSuccs.size()) ||
		(IndirectTargetSuccs.size() &&
		MBB->succ_size() != 1 + IndirectTargetSuccs.size())) {
report("MBB exits via unconditional fall-through but doesn't have "		report("MBB exits via unconditional fall-through but doesn't have "
"exactly one CFG successor!", MBB);		"exactly one CFG successor!", MBB);
} else if (!MBB->isSuccessor(&*MBBI)) {		} else if (!MBB->isSuccessor(&*MBBI)) {
report("MBB exits via unconditional fall-through but its successor "		report("MBB exits via unconditional fall-through but its successor "
"differs from its CFG successor!", MBB);		"differs from its CFG successor!", MBB);
}		}
if (!MBB->empty() && MBB->back().isBarrier() &&		if (!MBB->empty() && MBB->back().isBarrier() &&
!TII->isPredicated(MBB->back())) {		!TII->isPredicated(MBB->back())) {
report("MBB exits via unconditional fall-through but ends with a "		report("MBB exits via unconditional fall-through but ends with a "
"barrier instruction!", MBB);		"barrier instruction!", MBB);
}		}
if (!Cond.empty()) {		if (!Cond.empty()) {
report("MBB exits via unconditional fall-through but has a condition!",		report("MBB exits via unconditional fall-through but has a condition!",
MBB);		MBB);
}		}
} else if (TBB && !FBB && Cond.empty()) {		} else if (TBB && !FBB && Cond.empty()) {
// Block unconditionally branches somewhere.		// Block unconditionally branches somewhere.
// If the block has exactly one successor, that happens to be a		// If the block has exactly one successor, that happens to be a
// landingpad, accept it as valid control flow.		// landingpad, accept it as valid control flow.
if (MBB->succ_size() != 1+LandingPadSuccs.size() &&		if (MBB->succ_size() != 1+LandingPadSuccs.size() &&
(MBB->succ_size() != 1 || LandingPadSuccs.size() != 1 ||		(MBB->succ_size() != 1 || LandingPadSuccs.size() != 1 ||
MBB->succ_begin() != LandingPadSuccs.begin())) {		MBB->succ_begin() != LandingPadSuccs.begin()) &&
		MBB->succ_size() != 1 + IndirectTargetSuccs.size() &&
		(MBB->succ_size() != 1 || IndirectTargetSuccs.size() != 1 ||
		MBB->succ_begin() != IndirectTargetSuccs.begin())) {
report("MBB exits via unconditional branch but doesn't have "		report("MBB exits via unconditional branch but doesn't have "
"exactly one CFG successor!", MBB);		"exactly one CFG successor!", MBB);
} else if (!MBB->isSuccessor(TBB)) {		} else if (!MBB->isSuccessor(TBB)) {
report("MBB exits via unconditional branch but the CFG "		report("MBB exits via unconditional branch but the CFG "
"successor doesn't match the actual successor!", MBB);		"successor doesn't match the actual successor!", MBB);
}		}
if (MBB->empty()) {		if (MBB->empty()) {
report("MBB exits via unconditional branch but doesn't contain "		report("MBB exits via unconditional branch but doesn't contain "
▲ Show 20 Lines • Show All 2,156 Lines • Show Last 20 Lines
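
The disagreement in the inline comments above is about what the patched fall-through condition accepts. The following is a minimal sketch, not verifier code: `SuccCounts`, `patchedCheckReports`, and `combinedCheckReports` are hypothetical names, the first function only transcribes the two patched `else if` conditions shown in this hunk, and the second is an assumed alternative that is not part of the patch and presumes landing pads and indirect targets never overlap.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three counts the verifier has for one block.
struct SuccCounts {
  std::size_t Total;           // models MBB->succ_size()
  std::size_t LandingPads;     // models LandingPadSuccs.size()
  std::size_t IndirectTargets; // models IndirectTargetSuccs.size()
};

// Transcribes the two patched `else if` conditions of the fall-through case.
// Returns true when "doesn't have exactly one CFG successor" would be reported.
bool patchedCheckReports(const SuccCounts &C) {
  if (C.Total == C.LandingPads || C.Total == C.IndirectTargets)
    return false; // accepted as a block that does not really fall through
  return (C.LandingPads && C.Total != 1 + C.LandingPads) ||
         (C.IndirectTargets && C.Total != 1 + C.IndirectTargets);
}

// Assumed alternative, not part of the patch: count both kinds together.
// This presumes landing pads and indirect targets never overlap.
bool combinedCheckReports(const SuccCounts &C) {
  const std::size_t Special = C.LandingPads + C.IndirectTargets;
  if (C.Total == Special)
    return false;
  return C.Total != 1 + Special;
}
```

Under this model, a block with one fall-through successor, one landing pad, and one indirect target (`{3, 1, 1}`) is reported by the transcribed condition, which appears to match jyknight's remark about landing-pad successors. A block with two ordinary successors and neither kind (`{2, 0, 0}`) is no longer reported, although the pre-patch comparison against `1 + LandingPadSuccs.size()` would have flagged it.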

llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp

Show First 20 Lines • Show All 1,021 Lines • ▼ Show 20 Lines	for (const auto &InstrOrder : Orders) {
}		}
if (DLI == DLE)		if (DLI == DLE)
break;		break;

LastOrder = Order;		LastOrder = Order;
}		}
}		}

		// Split after an INLINEASM_BR block with outputs. This allows us to keep the
		// copy to/from register instructions from being between two terminator
		// instructions, which causes the machine instruction verifier agita.
		auto TI = llvm::find_if(*BB, [](const MachineInstr &MI){
		return MI.getOpcode() == TargetOpcode::INLINEASM_BR;
		});
		auto SplicePt = TI != BB->end() ? std::next(TI) : BB->end();
		if (TI != BB->end() && SplicePt != BB->end() &&
		TI->getOpcode() == TargetOpcode::INLINEASM_BR &&
		SplicePt->getOpcode() == TargetOpcode::COPY) {
		MachineBasicBlock *FallThrough = BB->getFallThrough();
		if (!FallThrough)
		for (const MachineOperand &MO : BB->back().operands())
		if (MO.isMBB()) {
MaskRay (inline comment, Done): `for (const MachineOperand &MO : Last.operands()) {`
		FallThrough = MO.getMBB();
		break;
		}
		assert(FallThrough && "Cannot find default dest block for callbr!");

		MachineBasicBlock *CopyBB = MF.CreateMachineBasicBlock(BB->getBasicBlock());
		MachineFunction::iterator BBI(*BB);
		MF.insert(++BBI, CopyBB);

		CopyBB->splice(CopyBB->begin(), BB, SplicePt, BB->end());
		CopyBB->setInlineAsmBrDefaultTarget();

		CopyBB->addSuccessor(FallThrough, BranchProbability::getOne());
		BB->addSuccessor(CopyBB, BranchProbability::getOne());

		// Mark all physical registers defined in the original block as being live
		// on entry to the copy block.
		for (const auto &MI : *CopyBB)
		for (const MachineOperand &MO : MI.operands())
		if (MO.isReg()) {
		Register reg = MO.getReg();
		if (Register::isPhysicalRegister(reg)) {
		CopyBB->addLiveIn(reg);
MaskRay (inline comment, Done): `for (const MachineOperand &MO : MI.operands()) {`
		break;
		}
		}

		// Bit of a hack: The copy block we created here exists only because we want
		// the CFG to work with the current system. However, the successors to the
		// block with the INLINEASM_BR instruction expect values to come from that
		// block, not this usurper block. Thus we steal its successors and add them
		// to the copy so that everyone is happy.
		for (auto *Succ : BB->successors())
nickdesaulniers (inline comment, Not Done): Isn't `Fallthrough` from above one of the potential successors? Do we have to skip it in the below conditional? What happens if we call `addSuccessor` with the same `MBB` twice?
void (author, Done): Yes. It's added to CopyBB above and the `!CopyBB->isSuccessor(Succ)` makes sure it's not re-added.
		if (Succ != CopyBB && !CopyBB->isSuccessor(Succ))
		CopyBB->addSuccessor(Succ, BranchProbability::getZero());

		for (auto *Succ : CopyBB->successors())
		if (BB->isSuccessor(Succ))
		BB->removeSuccessor(Succ);

		CopyBB->normalizeSuccProbs();
		BB->normalizeSuccProbs();

		BB->transferInlineAsmBrIndirectTargets(CopyBB);

		InsertPos = CopyBB->end();
		return CopyBB;
		}

InsertPos = Emitter.getInsertPos();		InsertPos = Emitter.getInsertPos();
return Emitter.getBlock();		return Emitter.getBlock();
}		}

/// Return the basic block label.		/// Return the basic block label.
std::string ScheduleDAGSDNodes::getDAGName() const {		std::string ScheduleDAGSDNodes::getDAGName() const {
return "sunit-dag." + BB->getFullName();		return "sunit-dag." + BB->getFullName();
}		}
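
The block split added to this file can be pictured with a toy model. This is a sketch under stated assumptions, not the scheduler code: `Opcode`, `Block`, and `splitAfterInlineAsmBr` are hypothetical names, and the model leaves out live-in marking, branch probabilities, and the transfer of indirect-target flags that the patch also performs.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical opcodes; only the ones the split cares about are modeled.
enum class Opcode { Other, InlineAsmBr, Copy, Jmp };

// Toy block: an instruction list plus successor pointers.
struct Block {
  std::vector<Opcode> Insts;
  std::vector<Block *> Succs;
};

// Moves everything after INLINEASM_BR into CopyBB when a COPY follows it.
// Returns false, leaving both blocks untouched, when no split is needed.
bool splitAfterInlineAsmBr(Block &BB, Block &CopyBB) {
  auto TI = std::find(BB.Insts.begin(), BB.Insts.end(), Opcode::InlineAsmBr);
  if (TI == BB.Insts.end())
    return false;
  auto SplicePt = std::next(TI);
  if (SplicePt == BB.Insts.end() || *SplicePt != Opcode::Copy)
    return false;

  // The copies and any trailing branch become the body of the new block.
  CopyBB.Insts.assign(SplicePt, BB.Insts.end());
  BB.Insts.erase(SplicePt, BB.Insts.end());

  // CopyBB takes over the old successors; BB now only reaches CopyBB.
  CopyBB.Succs = BB.Succs;
  BB.Succs = {&CopyBB};
  return true;
}
```

After a successful split the original block ends in the `INLINEASM_BR` terminator and the copies sit at the top of the new block, which is the shape the summary describes as keeping non-terminators out from between two terminators.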

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 2,854 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitCallBr(const CallBrInst &I) {
// have to do anything here to lower funclet bundles.		// have to do anything here to lower funclet bundles.
assert(!I.hasOperandBundlesOtherThan(		assert(!I.hasOperandBundlesOtherThan(
{LLVMContext::OB_deopt, LLVMContext::OB_funclet}) &&		{LLVMContext::OB_deopt, LLVMContext::OB_funclet}) &&
"Cannot lower callbrs with arbitrary operand bundles yet!");		"Cannot lower callbrs with arbitrary operand bundles yet!");

assert(isa<InlineAsm>(I.getCalledValue()) &&		assert(isa<InlineAsm>(I.getCalledValue()) &&
"Only know how to handle inlineasm callbr");		"Only know how to handle inlineasm callbr");
visitInlineAsm(&I);		visitInlineAsm(&I);
		CopyToExportRegsIfNeeded(&I);

// Retrieve successors.		// Retrieve successors.
MachineBasicBlock *Return = FuncInfo.MBBMap[I.getDefaultDest()];		MachineBasicBlock *Return = FuncInfo.MBBMap[I.getDefaultDest()];
		Return->setInlineAsmBrDefaultTarget();

// Update successor info.		// Update successor info.
addSuccessorWithProb(CallBrMBB, Return);		addSuccessorWithProb(CallBrMBB, Return);
for (unsigned i = 0, e = I.getNumIndirectDests(); i < e; ++i) {		for (unsigned i = 0, e = I.getNumIndirectDests(); i < e; ++i) {
MachineBasicBlock *Target = FuncInfo.MBBMap[I.getIndirectDest(i)];		MachineBasicBlock *Target = FuncInfo.MBBMap[I.getIndirectDest(i)];
addSuccessorWithProb(CallBrMBB, Target);		addSuccessorWithProb(CallBrMBB, Target);
		CallBrMBB->addInlineAsmBrIndirectTarget(Target);
}		}
CallBrMBB->normalizeSuccProbs();		CallBrMBB->normalizeSuccProbs();

// Drop into default successor.		// Drop into default successor.
DAG.setRoot(DAG.getNode(ISD::BR, getCurSDLoc(),		DAG.setRoot(DAG.getNode(ISD::BR, getCurSDLoc(),
MVT::Other, getControlRoot(),		MVT::Other, getControlRoot(),
DAG.getBasicBlock(Return)));		DAG.getBasicBlock(Return)));
}		}
▲ Show 20 Lines • Show All 5,290 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitInlineAsm(ImmutableCallSite CS) {

// First Pass: Calculate HasSideEffects and ExtraFlags (AlignStack,		// First Pass: Calculate HasSideEffects and ExtraFlags (AlignStack,
// AsmDialect, MayLoad, MayStore).		// AsmDialect, MayLoad, MayStore).
bool HasSideEffect = IA->hasSideEffects();		bool HasSideEffect = IA->hasSideEffects();
ExtraFlags ExtraInfo(CS);		ExtraFlags ExtraInfo(CS);

unsigned ArgNo = 0; // ArgNo - The argument of the CallInst.		unsigned ArgNo = 0; // ArgNo - The argument of the CallInst.
unsigned ResNo = 0; // ResNo - The result number of the next output.		unsigned ResNo = 0; // ResNo - The result number of the next output.
		unsigned NumMatchingOps = 0;
for (auto &T : TargetConstraints) {		for (auto &T : TargetConstraints) {
ConstraintOperands.push_back(SDISelAsmOperandInfo(T));		ConstraintOperands.push_back(SDISelAsmOperandInfo(T));
SDISelAsmOperandInfo &OpInfo = ConstraintOperands.back();		SDISelAsmOperandInfo &OpInfo = ConstraintOperands.back();

// Compute the value type for each operand.		// Compute the value type for each operand.
if (OpInfo.Type == InlineAsm::isInput ||		if (OpInfo.Type == InlineAsm::isInput ||
(OpInfo.Type == InlineAsm::isOutput && OpInfo.isIndirect)) {		(OpInfo.Type == InlineAsm::isOutput && OpInfo.isIndirect)) {
OpInfo.CallOperandVal = const_cast<Value *>(CS.getArgument(ArgNo++));		OpInfo.CallOperandVal = const_cast<Value *>(CS.getArgument(ArgNo++));

// Process the call argument. BasicBlocks are labels, currently appearing		// Process the call argument. BasicBlocks are labels, currently appearing
// only in asm's.		// only in asm's.
const Instruction *I = CS.getInstruction();		const Instruction *I = CS.getInstruction();
if (isa<CallBrInst>(I) &&		if (isa<CallBrInst>(I) &&
(ArgNo - 1) >= (cast<CallBrInst>(I)->getNumArgOperands() -		ArgNo - 1 >= (cast<CallBrInst>(I)->getNumArgOperands() -
cast<CallBrInst>(I)->getNumIndirectDests())) {		cast<CallBrInst>(I)->getNumIndirectDests() -
		NumMatchingOps) &&
		(NumMatchingOps == 0 ||
		ArgNo - 1 < (cast<CallBrInst>(I)->getNumArgOperands() -
		NumMatchingOps))) {
const auto *BA = cast<BlockAddress>(OpInfo.CallOperandVal);		const auto *BA = cast<BlockAddress>(OpInfo.CallOperandVal);
EVT VT = TLI.getValueType(DAG.getDataLayout(), BA->getType(), true);		EVT VT = TLI.getValueType(DAG.getDataLayout(), BA->getType(), true);
OpInfo.CallOperand = DAG.getTargetBlockAddress(BA, VT);		OpInfo.CallOperand = DAG.getTargetBlockAddress(BA, VT);
} else if (const auto *BB = dyn_cast<BasicBlock>(OpInfo.CallOperandVal)) {		} else if (const auto *BB = dyn_cast<BasicBlock>(OpInfo.CallOperandVal)) {
OpInfo.CallOperand = DAG.getBasicBlock(FuncInfo.MBBMap[BB]);		OpInfo.CallOperand = DAG.getBasicBlock(FuncInfo.MBBMap[BB]);
} else {		} else {
OpInfo.CallOperand = getValue(OpInfo.CallOperandVal);		OpInfo.CallOperand = getValue(OpInfo.CallOperandVal);
}		}
Show All 14 Lines	if (OpInfo.Type == InlineAsm::isInput ||
OpInfo.ConstraintVT =		OpInfo.ConstraintVT =
TLI.getSimpleValueType(DAG.getDataLayout(), CS.getType());		TLI.getSimpleValueType(DAG.getDataLayout(), CS.getType());
}		}
++ResNo;		++ResNo;
} else {		} else {
OpInfo.ConstraintVT = MVT::Other;		OpInfo.ConstraintVT = MVT::Other;
}		}

		if (OpInfo.hasMatchingInput())
		++NumMatchingOps;

if (!HasSideEffect)		if (!HasSideEffect)
HasSideEffect = OpInfo.hasMemory(TLI);		HasSideEffect = OpInfo.hasMemory(TLI);

// Determine if this InlineAsm MayLoad or MayStore based on the constraints.		// Determine if this InlineAsm MayLoad or MayStore based on the constraints.
// FIXME: Could we compute this on OpInfo rather than T?		// FIXME: Could we compute this on OpInfo rather than T?

// Compute the constraint code and ConstraintType to use.		// Compute the constraint code and ConstraintType to use.
TLI.ComputeConstraintToUse(T, SDValue());		TLI.ComputeConstraintToUse(T, SDValue());
▲ Show 20 Lines • Show All 2,471 Lines • Show Last 20 Lines
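
The new label test in `visitInlineAsm` treats a window of argument indices as block addresses, shifted left by the number of tied inputs that follow the labels. Here is a minimal sketch of that arithmetic. `isLabelArg` is a hypothetical helper, and it assumes `NumMatchingOps` has already reached its final value by the time inputs are examined, whereas the patch counts it incrementally while walking the constraints.

```cpp
#include <cassert>

// Returns true when zero-based argument index ArgIdx is treated as a label.
// Mirrors the patched condition, with ArgIdx standing in for `ArgNo - 1`.
// Assumes NumArgs >= NumIndirectDests + NumMatchingOps (unsigned arithmetic).
bool isLabelArg(unsigned ArgIdx, unsigned NumArgs, unsigned NumIndirectDests,
                unsigned NumMatchingOps) {
  const unsigned FirstLabel = NumArgs - NumIndirectDests - NumMatchingOps;
  return ArgIdx >= FirstLabel &&
         (NumMatchingOps == 0 || ArgIdx < NumArgs - NumMatchingOps);
}
```

For the first `callbr` in `test2` of callbr-asm-outputs.ll (five arguments, two indirect destinations, two tied inputs), this model selects indices 1 and 2, which are the two `blockaddress` operands. The trailing tied inputs at indices 3 and 4 fall outside the window.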

llvm/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 2,527 Lines • ▼ Show 20 Lines	Assert(BI.getDestination(i)->getType()->isLabelTy(),
"Indirectbr destinations must all have pointer type!", &BI);		"Indirectbr destinations must all have pointer type!", &BI);

visitTerminator(BI);		visitTerminator(BI);
}		}

void Verifier::visitCallBrInst(CallBrInst &CBI) {		void Verifier::visitCallBrInst(CallBrInst &CBI) {
Assert(CBI.isInlineAsm(), "Callbr is currently only used for asm-goto!",		Assert(CBI.isInlineAsm(), "Callbr is currently only used for asm-goto!",
&CBI);		&CBI);
Assert(CBI.getType()->isVoidTy(), "Callbr return value is not supported!",
&CBI);
for (unsigned i = 0, e = CBI.getNumSuccessors(); i != e; ++i)		for (unsigned i = 0, e = CBI.getNumSuccessors(); i != e; ++i)
Assert(CBI.getSuccessor(i)->getType()->isLabelTy(),		Assert(CBI.getSuccessor(i)->getType()->isLabelTy(),
"Callbr successors must all have pointer type!", &CBI);		"Callbr successors must all have pointer type!", &CBI);
for (unsigned i = 0, e = CBI.getNumOperands(); i != e; ++i) {		for (unsigned i = 0, e = CBI.getNumOperands(); i != e; ++i) {
Assert(i >= CBI.getNumArgOperands() || !isa<BasicBlock>(CBI.getOperand(i)),		Assert(i >= CBI.getNumArgOperands() || !isa<BasicBlock>(CBI.getOperand(i)),
"Using an unescaped label as a callbr argument!", &CBI);		"Using an unescaped label as a callbr argument!", &CBI);
if (isa<BasicBlock>(CBI.getOperand(i)))		if (isa<BasicBlock>(CBI.getOperand(i)))
for (unsigned j = i + 1; j != e; ++j)		for (unsigned j = i + 1; j != e; ++j)
▲ Show 20 Lines • Show All 3,037 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/callbr-asm-outputs.ll

	; RUN: not llc -mtriple=i686-- < %s 2> %t			; RUN: llc -mtriple=i686-- -verify-machineinstrs < %s | FileCheck %s
				rnk (inline comment, Done): Please add -verify-machineinstrs to the tests, since that isn't usually on by default, and it may highlight some latent verifier issues I don't know about.
	; RUN: FileCheck %s < %t

	; CHECK: error: asm-goto outputs not supported			; A test for asm-goto output

	; A test for asm-goto output prohibition			; CHECK-LABEL: test1:
				; CHECK: movl 4(%esp), %eax
	define i32 @test(i32 %a) {			; CHECK-NEXT: addl $4, %eax
				; CHECK-NEXT: #APP
				; CHECK-NEXT: xorl %eax, %eax
				; CHECK-NEXT: jmp .Ltmp0
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB0_1:
				; CHECK-NEXT: retl
				; CHECK-LABEL: .Ltmp0: # Address of block that was removed by CodeGen
				define i32 @test1(i32 %x) {
	entry:			entry:
	%0 = add i32 %a, 4			%add = add nsw i32 %x, 4
	%1 = callbr i32 asm "xorl $1, $1; jmp ${1:l}", "=&r,r,X,~{dirflag},~{fpsr},~{flags}"(i32 %0, i8* blockaddress(@test, %fail)) to label %normal [label %fail]			%ret = callbr i32 asm "xorl $1, $0; jmp ${2:l}", "=r,r,X,~{dirflag},~{fpsr},~{flags}"(i32 %add, i8* blockaddress(@test1, %abnormal))
				to label %normal [label %abnormal]

	normal:			normal:
	ret i32 %1			ret i32 %ret

	fail:			abnormal:
	ret i32 1			ret i32 1
	}			}

				; CHECK-LABEL: test2:
				; CHECK: # %bb.1: # %if.then
				; CHECK-NEXT: #APP
				; CHECK-NEXT: testl %esi, %esi
				; CHECK-NEXT: testl %edi, %esi
				; CHECK-NEXT: jne .Ltmp1
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB1_2:
				; CHECK-NEXT: jmp .LBB1_4
				; CHECK-NEXT: .LBB1_3: # %if.else
				; CHECK-NEXT: #APP
				; CHECK-NEXT: testl %esi, %edi
				; CHECK-NEXT: testl %esi, %edi
				; CHECK-NEXT: jne .Ltmp2
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB1_4:
				; CHECK-NEXT: movl %esi, %eax
				; CHECK-NEXT: addl %edi, %eax
				; CHECK-NEXT: .Ltmp2:
				; CHECK-NEXT: # %bb.5: # %return
				; CHECK-LABEL: .Ltmp1: # Address of block that was removed by CodeGen
				define i32 @test2(i32 %out1, i32 %out2) {
				entry:
				%cmp = icmp slt i32 %out1, %out2
				br i1 %cmp, label %if.then, label %if.else

				if.then: ; preds = %entry
				%0 = callbr { i32, i32 } asm sideeffect "testl $0, $0; testl $1, $2; jne ${3:l}", "={si},={di},r,X,X,0,1,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i8* blockaddress(@test2, %label_true), i8* blockaddress(@test2, %return), i32 %out1, i32 %out2)
				to label %if.end [label %label_true, label %return]

				if.else: ; preds = %entry
				%1 = callbr { i32, i32 } asm sideeffect "testl $0, $1; testl $2, $3; jne ${5:l}", "={si},={di},r,r,X,X,0,1,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i32 %out2, i8* blockaddress(@test2, %label_true), i8* blockaddress(@test2, %return), i32 %out1, i32 %out2)
				to label %if.end [label %label_true, label %return]

				if.end: ; preds = %if.else, %if.then
				%.sink11 = phi { i32, i32 } [ %0, %if.then ], [ %1, %if.else ]
				%asmresult3 = extractvalue { i32, i32 } %.sink11, 0
				%asmresult4 = extractvalue { i32, i32 } %.sink11, 1
				%add = add nsw i32 %asmresult4, %asmresult3
				br label %return

				label_true: ; preds = %if.else, %if.then
				br label %return

				return: ; preds = %if.then, %if.else, %label_true, %if.end
				%retval.0 = phi i32 [ %add, %if.end ], [ -2, %label_true ], [ -1, %if.else ], [ -1, %if.then ]
				ret i32 %retval.0
				}

				; CHECK-LABEL: test3:
				; CHECK: # %bb.1: # %true
				; CHECK-NEXT: #APP
				; CHECK-NEXT: .short %esi
				; CHECK-NEXT: .short %edi
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB2_2:
				; CHECK-NEXT: movl %edi, %eax
				; CHECK-NEXT: jmp .LBB2_5
				; CHECK-NEXT: .LBB2_3: # %false
				; CHECK-NEXT: #APP
				; CHECK-NEXT: .short %eax
				; CHECK-NEXT: .short %edx
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB2_4:
				; CHECK-NEXT: movl %edx, %eax
				; CHECK-NEXT: .LBB2_5: # %asm.fallthrough
				; CHECK-LABEL: .Ltmp3: # Address of block that was removed by CodeGen
				define i32 @test3(i1 %cmp) {
				entry:
				br i1 %cmp, label %true, label %false

				true:
				%0 = callbr { i32, i32 } asm sideeffect ".word $0, $1", "={si},={di},X" (i8* blockaddress(@test3, %indirect)) to label %asm.fallthrough [label %indirect]

				false:
				%1 = callbr { i32, i32 } asm sideeffect ".word $0, $1", "={ax},={dx},X" (i8* blockaddress(@test3, %indirect)) to label %asm.fallthrough [label %indirect]

				asm.fallthrough:
				%vals = phi { i32, i32 } [ %0, %true ], [ %1, %false ]
				%v = extractvalue { i32, i32 } %vals, 1
				ret i32 %v

				indirect:
				ret i32 42
				}

				; Test 4 - asm-goto with output constraints.
				; CHECK-LABEL: test4:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: movl $-1, %eax
				; CHECK-NEXT: movl 4(%esp), %ecx
				; CHECK-NEXT: #APP
				; CHECK-NEXT: testl %ecx, %ecx
				; CHECK-NEXT: testl %edx, %ecx
				; CHECK-NEXT: jne .Ltmp4
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB3_1:
				; CHECK-NEXT: #APP
				; CHECK-NEXT: testl %ecx, %edx
				; CHECK-NEXT: testl %ecx, %edx
				; CHECK-NEXT: jne .Ltmp5
				; CHECK-NEXT: #NO_APP
				; CHECK-NEXT: .LBB3_2:
				; CHECK-NEXT: addl %edx, %ecx
				; CHECK-NEXT: movl %ecx, %eax
				; CHECK-NEXT: .Ltmp5:
				; CHECK-NEXT: # %bb.3: # %return
				; CHECK-NEXT: retl
				; CHECK-LABEL: .Ltmp4: # Address of block that was removed by CodeGen
				define i32 @test4(i32 %out1, i32 %out2) {
				entry:
				%0 = callbr { i32, i32 } asm sideeffect "testl $0, $0; testl $1, $2; jne ${3:l}", "=r,=r,r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i8* blockaddress(@test4, %label_true), i8* blockaddress(@test4, %return))
				to label %asm.fallthrough [label %label_true, label %return]

				asm.fallthrough: ; preds = %entry
				%asmresult = extractvalue { i32, i32 } %0, 0
				%asmresult1 = extractvalue { i32, i32 } %0, 1
				%1 = callbr { i32, i32 } asm sideeffect "testl $0, $1; testl $2, $3; jne ${5:l}", "=r,=r,r,r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %asmresult, i32 %asmresult1, i8* blockaddress(@test4, %label_true), i8* blockaddress(@test4, %return))
				to label %asm.fallthrough2 [label %label_true, label %return]

				asm.fallthrough2: ; preds = %asm.fallthrough
				%asmresult3 = extractvalue { i32, i32 } %1, 0
				%asmresult4 = extractvalue { i32, i32 } %1, 1
				%add = add nsw i32 %asmresult3, %asmresult4
				br label %return

				label_true: ; preds = %asm.fallthrough, %entry
				br label %return

				return: ; preds = %entry, %asm.fallthrough, %label_true, %asm.fallthrough2
				%retval.0 = phi i32 [ %add, %asm.fallthrough2 ], [ -2, %label_true ], [ -1, %asm.fallthrough ], [ -1, %entry ]
				ret i32 %retval.0
				}

llvm/test/CodeGen/X86/callbr-asm.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-- -O3 | FileCheck %s			; RUN: llc < %s -mtriple=i686-- -O3 -verify-machineinstrs | FileCheck %s
				rnk (inline comment, Done): Please add -verify-machineinstrs.

	; Tests for using callbr as an asm-goto wrapper			; Tests for using callbr as an asm-goto wrapper

	; Test 1 - fallthrough label gets removed, but the fallthrough code that is			; Test 1 - fallthrough label gets removed, but the fallthrough code that is
	; unreachable due to asm ending on a jmp is still left in.			; unreachable due to asm ending on a jmp is still left in.
	define i32 @test1(i32 %a) {			define i32 @test1(i32 %a) {
	; CHECK-LABEL: test1:			; CHECK-LABEL: test1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	▲ Show 20 Lines • Show All 151 Lines • Show Last 20 Lines

Diff 246358
