This is an archive of the discontinued LLVM Phabricator instance.

Differential D76961

[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent
Closed, Public

Authored by nickdesaulniers on Mar 27 2020, 5:25 PM.

Details

Reviewers

void
craig.topper
efriedma

Commits

rG5bc291be7154: [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent

Summary

A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two. This requires careful
handling of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handling of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated. The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
  MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
  if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
    continue;
  PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was taking the `continue` because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.
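
To make the skipped-PHI case concrete, here is a minimal standalone sketch. It does not use LLVM: ToyBlock, ToyPhi and addPhiOperandsFrom are hypothetical stand-ins for MachineBasicBlock, a PHI MachineInstr and the loop quoted above, and it assumes (as this summary describes) that the first half of the split block is the one that keeps the indirect target as a successor.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock: only the successor list is modelled.
struct ToyBlock {
  std::vector<const ToyBlock *> Succs;
  bool isSuccessor(const ToyBlock *B) const {
    return std::find(Succs.begin(), Succs.end(), B) != Succs.end();
  }
};

// Hypothetical stand-in for a PHI: a list of (virtual register, incoming block) pairs.
struct ToyPhi {
  const ToyBlock *Parent;
  std::vector<std::pair<unsigned, const ToyBlock *>> Incoming;
};

// Same shape as the quoted loop: a PHI only receives an entry from `From`
// when the PHI's parent block is a successor of `From`.
void addPhiOperandsFrom(const ToyBlock &From,
                        std::vector<std::pair<ToyPhi *, unsigned>> &ToUpdate) {
  for (auto &Entry : ToUpdate) {
    ToyPhi *Phi = Entry.first;
    if (!From.isSuccessor(Phi->Parent))
      continue; // This is the branch that was being taken for the second half.
    Phi->Incoming.emplace_back(Entry.second, &From);
  }
}
```

In this model, calling addPhiOperandsFrom with only the second half of the split block leaves a PHI in the indirect target without an entry. Calling it again with the first half fills the entry in, which is roughly the job this summary assigns to SelectionDAGBuilder::UpdateSplitBlock().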

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block, CopyBB, which updates FuncInfo->MBB (i.e., the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nickdesaulniers created this revision. Mar 27 2020, 5:25 PM

Herald added a project: Restricted Project. Mar 27 2020, 5:25 PM

Herald added subscribers: llvm-commits, hiraditya.

An INLINEASM_BR edge should count as a predecessor, yes; if that edge is missing, something went wrong.

Good catch.

This revision is now accepted and ready to land. Mar 27 2020, 6:09 PM

Harbormaster failed remote builds in B50753: Diff 253259! Mar 27 2020, 6:45 PM

An INLINEASM_BR edge should count as a predecessor, yes; if that edge is missing, something went wrong.

Ok, that's what I suspected (should be no difference for direct vs indirect references), thanks for confirming.

I'm going to sit on this change; the test cases are helpful and expose a current issue, but the fix feels like more duct tape and baling wire to me, and I'm not certain it's the right fix.

From here, my plan is:

1. implement separate machine verifier checks for:
   a. all MachineBasicBlocks with INLINEASM_BR terminators should have the targets of the INLINEASM_BR in their list of successors.
   b. all MachineBasicBlocks that are targets of INLINEASM_BR should have the INLINEASM_BR's parent MachineBasicBlock in their list of predecessors. This is the invariant that's violated in the above test case.
2. find and fix whatever currently unidentified pass is breaking those invariants.
3. revisit landing this test case, maybe without the changes I've made here to BranchFolding.

(Then landing in the order 2, 1, 3).
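
As a rough illustration of the two invariants in the plan above, here is a standalone sketch that models them on a toy CFG. It is not MachineVerifier code: ToyBlock, AsmBrTargets and both check functions are hypothetical names, and the sketch assumes every recorded target really is an indirect destination of the INLINEASM_BR.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock.
struct ToyBlock {
  std::string Name;
  std::vector<ToyBlock *> Succs;
  std::vector<ToyBlock *> Preds;
  // Indirect targets of an INLINEASM_BR that terminates this block (assumed known).
  std::vector<ToyBlock *> AsmBrTargets;
};

static bool contains(const std::vector<ToyBlock *> &V, const ToyBlock *B) {
  return std::find(V.begin(), V.end(), B) != V.end();
}

// First invariant: every INLINEASM_BR target is a successor of the parent block.
bool targetsAreSuccessors(const ToyBlock &Parent) {
  for (const ToyBlock *T : Parent.AsmBrTargets)
    if (!contains(Parent.Succs, T))
      return false;
  return true;
}

// Second invariant: every INLINEASM_BR target lists the parent block as a predecessor.
bool parentIsPredecessorOfTargets(const ToyBlock &Parent) {
  for (const ToyBlock *T : Parent.AsmBrTargets)
    if (!contains(T->Preds, &Parent))
      return false;
  return true;
}
```

A block whose only successor is its fallthrough, while its INLINEASM_BR targets some other block, fails both checks in this model.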

I had to beef up the verification of callbr at the Instruction level to help find previous breakages, so this feels like a similar plan, albeit at a lower IR. I'll cc folks when I have those ready, will be my focus this week.

In D76961#1950390, @nickdesaulniers wrote:

implement separate machine verifier checks for:

all MachineBasicBlocks with INLINEASM_BR terminators should have the targets of the INLINEASM_BR in their list of successors.

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor; when inline asm passes the address of a label ("computed goto") as input to an asm statement, but not in the label list, that doesn't make it a valid branch target (though theoretically, the inline asm could still jump to it, which is super undefined).

But with a basic check:

diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 72da33c92346..94b1bd31a4e9 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -887,6 +887,17 @@ void MachineVerifier::verifyInlineAsm(const MachineInstr *MI) {
     if (!MO.isReg() || !MO.isImplicit())
       report("Expected implicit register after groups", &MO, OpNo);
   }
+
+  if (MI->getOpcode() == TargetOpcode::INLINEASM_BR) {
+    // Check that the targets are listed as successors to this MI's parent MBB.
+    SmallPtrSet<const BasicBlock*, 2> successors;
+    for (const MachineBasicBlock* MBB : MI->getParent()->successors())
+      successors.insert(MBB->getBasicBlock());
+    for (const MachineOperand& MO : MI->operands())
+      if (MO.isBlockAddress())
+        if (!successors.count(MO.getBlockAddress()->getBasicBlock()))
+          report("Expected INLINEASM_BR blockaddress operand not in successor list of parent MBB", MI);
+  }
 }

This fails on our test suite in llvm/test/CodeGen/X86/callbr-asm-outputs.ll's first test case. Right off the bat, I can see we messed this up in ISEL:

# *** IR Dump After Finalize ISel and expand pseudo-instructions ***:
...
bb.0.entry:
  successors: %bb.3(0x80000000); %bb.3(100.00%)
...
  INLINEASM_BR &"xorl $1, $0; jmp ${2:l}" [attdialect], $0:[regdef:GR32], def %2:gr32, $1:[reguse:GR32], %3:gr32, $2:[imm], blockaddress(@test1, %ir-block.abnormal), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags
bb.3.entry:
; predecessors: %bb.0
...
bb.2.abnormal (address-taken):
; predecessors: %bb.3
...

We can see we have an INLINEASM_BR MachineInstr with a blockaddress MachineOperand, but the successor list of the INLINEASM_BR's parent MachineBasicBlock (bb.0) does not include the MachineBasicBlock to which the blockaddress MachineOperand refers (bb.2).

(Also, it's un-ergonomic that blockaddresses at the MachineInstr level seem to refer to BasicBlocks, and not MachineBasicBlocks. You have to call MachineBasicBlock#getBasicBlock(). I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?)

I'm going to check the other invariant I specified above now, and will report back on that.

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor

Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that.

I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?

In general, blockaddress can refer to a block in a different function. So for various reasons, it makes sense to represent that as a "blockaddress". It can't be an MBB because in general the MBB doesn't exist in memory. That really only applies for indirectbr, though.

For INLINEASM_BR, we could use MachineBasicBlock operands to represent the successors.

In D76961#1950390, @nickdesaulniers wrote:

all MachineBasicBlocks that are targets of INLINEASM_BR should have the INLINEASM_BR's parent MachineBasicBlock in their list of predecessors. This is the invariant that's violated in the above test case.

In D76961#1950917, @nickdesaulniers wrote:

This fails on our test suite in llvm/test/CodeGen/X86/callbr-asm-outputs.ll's first test case. Right off the bat, I can see we messed this up in ISEL:

# *** IR Dump After Finalize ISel and expand pseudo-instructions ***:
...
bb.0.entry:
  successors: %bb.3(0x80000000); %bb.3(100.00%)
...
  INLINEASM_BR &"xorl $1, $0; jmp ${2:l}" [attdialect], $0:[regdef:GR32], def %2:gr32, $1:[reguse:GR32], %3:gr32, $2:[imm], blockaddress(@test1, %ir-block.abnormal), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags
bb.3.entry:
; predecessors: %bb.0
...
bb.2.abnormal (address-taken):
; predecessors: %bb.3
...

We can see the second invariant already violated post ISEL here as well. FWIW, a check might look like:

diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 72da33c92346..871c61e76326 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -887,6 +887,28 @@ void MachineVerifier::verifyInlineAsm(const MachineInstr *MI) {
     if (!MO.isReg() || !MO.isImplicit())
       report("Expected implicit register after groups", &MO, OpNo);
   }
+
+  if (MI->getOpcode() == TargetOpcode::INLINEASM_BR) {
+    for (const MachineOperand& MO : MI->operands())
+      if (MO.isBlockAddress()) {
+        const BasicBlock* BB = MO.getBlockAddress()->getBasicBlock();
+        const MachineBasicBlock* Target = nullptr;
+        for (const MachineBasicBlock& MBB : *MI->getParent()->getParent()) {
+           if (MBB.getBasicBlock() == BB)
+             Target = &MBB;
+        }
+        if (Target) {
+          SmallPtrSet<const MachineBasicBlock*, 2> Preds;
+          for (const MachineBasicBlock* Pred : Target->predecessors()) {
+            Preds.insert(Pred);
+          }
+          if (!Preds.count(MI->getParent()))
+            report("MBB target of INLINEASM_BR, but missing INLINEASM_BR's parent MBB from predecessor list", Target);
+        } else {
+          report("could not find target MBB", MI);
+        }
+      }
+  }

So I think we need to fix up ISEL to properly set successors and predecessors for MachineBasicBlocks that are terminated by INLINEASM_BR MachineInstrs. Does that sound right, @void @efriedma ?

In D76961#1951036, @nickdesaulniers wrote:

So I think we need to fix up ISEL to properly set successors and predecessors for MachineBasicBlocks that are terminated by INLINEASM_BR MachineInstrs. Does that sound right, @void @efriedma ?

Probably in InstrEmitter::EmitSpecialNode.

In D76961#1950958, @efriedma wrote:

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor

Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that.

Yeah, we could technically make asm goto work even if you didn't use the goto part...

I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?

In general, blockaddress can refer to a block in a different function. So for various reasons, it makes sense to represent that as a "blockaddress". It can't be an MBB because in general the MBB doesn't exist in memory. That really only applies for indirectbr, though.

For INLINEASM_BR, we could use MachineBasicBlock operands to represent the successors.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

I think the interaction between blockaddresses and indirectbr is fine. The indirection of blockaddress constants is actually helpful when you're dealing with random instructions that aren't indirectbr/callbr.

For callbr, the indirection isn't so helpful, so yes, it would make sense to just use BasicBlock/MachineBasicBlock directly.

In D76961#1951241, @efriedma wrote:

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

I think this is what's going on in ScheduleDAGSDNodes::EmitSchedule; the results of the before/after split don't look quite right in terms of pred/succ lists, but I've got to dig more into this tomorrow.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

I think the interaction between blockaddresses and indirectbr is fine. The indirection of blockaddress constants is actually helpful when you're dealing with random instructions that aren't indirectbr/callbr.

For callbr, the indirection isn't so helpful, so yes, it would make sense to just use BasicBlock/MachineBasicBlock directly.

That's a yak shave for another day, but one I really want to do since I think it makes callbr and transforms on it less brittle. (Particularly ones that use get/setOperand to modify one operand, when technically they should be modifying two).

In D76961#1951329, @nickdesaulniers wrote:

In D76961#1951241, @efriedma wrote:

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

I think this is what's going on in ScheduleDAGSDNodes::EmitSchedule; the results of the before/after split don't look quite right in terms of pred/succ lists, but I've got to dig more into this tomorrow.

Yeah, this is what's going on. Printing the MachineBasicBlocks before and after the relevant INLINEASM_BR part of ScheduleDAGSDNodes::EmitSchedule shows:

Before:

bb.0 (%ir-block.2):
  successors: %bb.1(0x80000000), %bb.4(0x00000000); %bb.1(100.00%), %bb.4(0.00%)
  INLINEASM_BR &"jmp ${1:l}" [attdialect], $0:[regdef:GR32], def %5:gr32, $1:[imm], blockaddress(@main, %ir-block.11), $2:[clobber], implicit-def early-clobber $df, $3:[clobber], implicit-def early-clobber $fpsw, $4:[clobber], implicit-def early-clobber $eflags, !2
  %0:gr32 = COPY %5:gr32
  JMP_1 %bb.1

bb.1 (%ir-block.4):
; predecessors: %bb.0
...
bb.4 (%ir-block.11, address-taken):
; predecessors: %bb.0

Everything looks good. I don't understand why INLINEASM_BR isn't the terminal instruction of bb.0. Isn't it marked a terminator in llvm/include/llvm/Target/Target.td line 1021? I also don't understand why the MachineBasicBlock has a terminators method, plural. I thought all blocks have 1 and only 1 terminal instruction?

Afterwards, things look bad:

bb.0 (%ir-block.2):
  successors: %bb.6(0x80000000); %bb.6(100.00%)

  INLINEASM_BR &"jmp ${1:l}" [attdialect], $0:[regdef:GR32], def %5:gr32, $1:[imm], blockaddress(@main, %ir-block.11), $2:[clobber], implicit-def early-clobber $df, $3:[clobber], implicit-def early-clobber $fpsw, $4:[clobber], implicit-def early-clobber $eflags, !2

bb.6 (%ir-block.2):
; predecessors: %bb.0
  successors: %bb.1(0x80000000), %bb.4(0x00000000); %bb.1(100.00%), %bb.4(0.00%)

  %0:gr32 = COPY %5:gr32
  JMP_1 %bb.1

bb.1 (%ir-block.4):
; predecessors: %bb.6
...
bb.4 (%ir-block.11, address-taken):
; predecessors: %bb.6

Specifically:

bb.0 has one successor, bb.6, which isn't right as the INLINEASM_BR could jump to bb.4. bb.0 should have two successors (fallthrough: bb.6, indirect target: bb.4).
The successor list of bb.6 (the CopyBB in ScheduleDAGSDNodes::EmitSchedule) is wrong; bb.1 is right, but bb.6 cannot get to bb.4 as it unconditionally jumps to bb.1 only.

When we split the initial MachineBasicBlock, we need to "take" only the fallthrough successor, I suspect.
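
One possible reading of "take only the fallthrough successor" is sketched below on a toy CFG. This is a standalone model, not the scheduler code: ToyBlock, addSuccessor, removeSuccessor and splitAfterAsmBr are hypothetical helpers that simply keep successor and predecessor lists in sync.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock.
struct ToyBlock {
  int Id;
  std::vector<ToyBlock *> Succs;
  std::vector<ToyBlock *> Preds;
};

// Adding an edge updates both the successor and the predecessor list.
void addSuccessor(ToyBlock &From, ToyBlock &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}

// Removing an edge likewise updates both lists.
void removeSuccessor(ToyBlock &From, ToyBlock &To) {
  From.Succs.erase(std::remove(From.Succs.begin(), From.Succs.end(), &To),
                   From.Succs.end());
  To.Preds.erase(std::remove(To.Preds.begin(), To.Preds.end(), &From),
                 To.Preds.end());
}

// Split BB after its INLINEASM_BR: CopyBB takes over only the fallthrough
// edge, while BB keeps its indirect targets and gains CopyBB as a successor.
void splitAfterAsmBr(ToyBlock &BB, ToyBlock &CopyBB, ToyBlock &FallThrough) {
  addSuccessor(CopyBB, FallThrough);
  removeSuccessor(BB, FallThrough);
  addSuccessor(BB, CopyBB);
}
```

Applied to the dump above (bb.0 with successors bb.1 and bb.4, split into bb.0 and bb.6), this leaves bb.0 with successors bb.4 and bb.6, bb.6 with successor bb.1, and bb.4 with bb.0 as its only predecessor.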

In D76961#1951350, @nickdesaulniers wrote:

Everything looks good. I don't understand why INLINEASM_BR isn't the terminal instruction of bb.0. Isn't it marked a terminator in llvm/include/llvm/Target/Target.td line 1021? I also don't understand why the MachineBasicBlock has a terminators method, plural. I thought all blocks have 1 and only 1 terminal instruction?

A colleague today explained to me that this happens with conditional jumps, and later I found such a case with multiple conditional jumps with an unconditional jump being the final instruction.

I have a fix for my test case, but it regresses one of the existing test cases. I need to spend more time on it tomorrow. Essentially, ISel is now emitting a messed up PHI node that's missing cases for some of the predecessors, which the machine verifier complains about. I suppose this may be the agita (sic) referred to in the comment above ScheduleDAGSDNodes::EmitSchedule's handling of INLINEASM_BR. :P

Just a status update; yesterday I found/observed how the phi nodes get emitted from selectiondag. This morning I've found a fix for the mangled phi node, but that regresses other phi nodes in the test case. I think I understand why, and will continue to investigate further, but I'm going for a smoke to contemplate. I didn't smoke before this bug.

Ok, I think I have a fix in hand. It's still regressing the existing test, but I think that's more along the lines of the above changes regarding "Address of block that was removed by CodeGen" vs "Block address taken"; I don't have the below diff in this new branch I'm working out of yet, but I need to double check. It is now producing valid predecessors and successors, and verifies that.

I need to:

clean up my spaghetti
fix up the existing test case
retest kernel builds with this
write up an extra detailed commit message, while I still briefly understand a glimpse/sliver of selection dag, instruction selection, instruction scheduling, phis, and virtual registers.

Hopefully I can have that out by EOD, but no promises.

EDIT:
The fix wasn't too bad, more of a PITA to debug everything and a firehose to drink from.

Ok, everything is green now, kernel boots. (There's probably more verification I can do, too).

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it would make it easier to maintain this test in the future and to see what changed with my change.

In D76961#1958358, @nickdesaulniers wrote:

Ok, everything is green now, kernel boots. (There's probably more verification I can do, too).

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it would make it easier to maintain this test in the future and to see what changed with my change.

Yeah, go for it.

In D76961#1958516, @void wrote:

In D76961#1958358, @nickdesaulniers wrote:

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it would make it easier to maintain this test in the future and to see what changed with my change.

Yeah, go for it.

huh, so it seems that ./llvm/utils/update_llc_test_checks.py llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll doesn't actually do anything :( so I don't think I can preprocess it. I think the output needs to change; I don't understand how you write a test that update_llc_test_checks.py is happy with right off the bat.

Worse is that arc seems messed up, so I'm not able to post the fix to phabricator.

$ arc diff
...
 Exception 
Error while loading file "/android1/arcanist/arcanist/src/lint/linter/xhpast/rules/ArcanistPHPCompatibilityXHPASTLinterRule.php": "continue" targeting switch is equivalent to "break". Did you mean to use "continue 2"?
(Run with `--trace` for a full exception trace.)

not sure if it auto updated on my host or what. Is there a way to post to phabricator without arc, maybe just git?

FWIW: this is the current state of the patch: https://github.com/ClangBuiltLinux/llvm-project/commit/21a14be632072013a2800f8747154ca1a271d3b6

I'll properly clean up the tests tomorrow. (Man can we plz move to github pull requests).

nickdesaulniers mentioned this in D77356: [test] preformat test with update_llc_test_checks.py NFC. Apr 2 2020, 7:00 PM

redo everything

This revision is now accepted and ready to land. Apr 2 2020, 7:12 PM

Herald added a subscriber: MatzeB. Apr 2 2020, 7:12 PM

nickdesaulniers requested review of this revision. Apr 2 2020, 7:16 PM

nickdesaulniers retitled this revision from [BranchFolder] don't remove MBB's that have their address taken to [SelectionDAG] fix predeccessor list for INLINEASM_BRs' parent.

nickdesaulniers edited the summary of this revision. (Show Details)

nickdesaulniers added a reviewer: efriedma.

nickdesaulniers added a child revision: D77356: [test] preformat test with update_llc_test_checks.py NFC.

nickdesaulniers removed a child revision: D77356: [test] preformat test with update_llc_test_checks.py NFC. Apr 2 2020, 7:18 PM

nickdesaulniers added a parent revision: D77356: [test] preformat test with update_llc_test_checks.py NFC.

nickdesaulniers edited the summary of this revision. (Show Details) Apr 2 2020, 7:22 PM

In D76961#1958582, @nickdesaulniers wrote:

In D76961#1958516, @void wrote:

In D76961#1958358, @nickdesaulniers wrote:

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it would make it easier to maintain this test in the future and to see what changed with my change.

Yeah, go for it.

huh, so it seems that ./llvm/utils/update_llc_test_checks.py llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll doesn't actually do anything :( so I don't think I can preprocess it. I think the output needs to change; I don't understand how you write a test that update_llc_test_checks.py is happy with right off the bat.

Special talent. :-)

not sure if it auto updated on my host or what. Is there a way to post to phabricator without arc, maybe just git?

Yes. Just do something like git diff -U99999 and take the diff and add it via the "Update diff" link at the top right.

Harbormaster failed remote builds in B51577: Diff 254676! Apr 2 2020, 8:04 PM

nickdesaulniers added subscribers: fhahn, hfinkel. Apr 3 2020, 11:46 AM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/MachineVerifier.cpp
893 ↗	(On Diff #254676)	@efriedma, so I'm hesitant about this added verification check. As written, it's checking each operand of `INLINEASM_BR` that's a `blockaddress`, which again may or may not be the indirect target of `asm goto` (it could just be a vanilla input). This goes back to our discussion in https://reviews.llvm.org/D64101#1569218 with @fhahn and @hfinkel.

At this point, I think I agree with your previous comment on this review (https://reviews.llvm.org/D76961#1950958), i.e., "Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that." I think we should just make `callbr` assume that any `blockaddress` operand is a possible successor. The implication then is that it might even be safe to use a vanilla asm statement (rather than `asm goto`), with labels as inputs, and get the same functionality as `asm goto`. `asm goto` is kind of just a distinct syntax for what feels like could have been fixes to codegen for the original syntax. I don't really want to discuss such a change here, but it's something for us to think more about and maybe discuss at the next LLVM developer meeting, or on the mailing list.

I'm more curious about @efriedma's thoughts on how to proceed:
1. drop this verification pass, defer decisions to the future, or
2. keep this pass, and add the additional check I commented earlier in this thread.

I like the idea of 2, but I'm concerned that if we add a verification pass, and then someone tries to use blockaddress operands in callbr that are not indirect successors, we probably won't set the successor list correctly, and this added verification check will be too aggressive and error. So maybe it's better to drop the pass and revisit this another day, when we're confident in the direction and changes we'd like to make. Thoughts?
llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
39	@void I recognize the irony of preprocessing the other test I modified, in child revision https://reviews.llvm.org/D77356, then adding a new test that doesn't do the same formatting. I tend to write comments in my tests about what's being tested; that way, in the future, when someone else needs to change my test, they have some sense of what's important to the test and what's not, and they feel more empowered to change it. Should I preprocess the test with update_llc_test_checks.py and then remove these comments? I don't mind deleting anything, and trust your judgement, but I wanted to highlight this irony during code review.

nickdesaulniers retitled this revision from [SelectionDAG] fix predeccessor list for INLINEASM_BRs' parent to [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent. Apr 3 2020, 12:18 PM

nickdesaulniers added inline comments.

llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
26	s/predecessed/proceeded by/
36	ditto

efriedma added inline comments. Apr 3 2020, 1:04 PM

llvm/lib/CodeGen/MachineVerifier.cpp
893 ↗	(On Diff #254676)	I'm fine with just dropping the verifier check. Jump table jumps have implicit targets in a similar way, so unverifiable CFGs aren't really anything novel.

Way off-topic response to the general design of the feature:

At the IR level, at this point, I think that using blockaddress for callbr was a mistake. The list of valid destinations is already listed in the "indirect destinations" list of the callbr; we could just allow asm strings to refer to that label list directly. This would be a little less flexible from a source language perspective, since you couldn't pass the address of a label as a register operand, but at the C level I don't think anyone is actually passing &&LABEL to asm goto in practice.

"asm goto is kind of just a distinct syntax for what feels like could have been fixes to codegen for the original syntax."

If the syntax isn't distinct, we would be forced to impose indirect goto's "jump-into-scope" restrictions on all inline asm statements. Probably this would break someone's code. I guess if we were designing the feature from scratch, we could allow asm goto blocks to jump to any block in the function whose address is taken. But that isn't any more powerful than the feature we currently implement.
897 ↗	(On Diff #254676)	is_contained?

nickdesaulniers mentioned this in rG9d9b8a20a8b0: [test] preformat test with update_llc_test_checks.py NFC. Apr 3 2020, 2:08 PM

drop verifier checks, fix typos

nickdesaulniers edited the summary of this revision. (Show Details) Apr 3 2020, 3:04 PM

nickdesaulniers edited the summary of this revision. (Show Details)

efriedma added inline comments. Apr 3 2020, 3:46 PM

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2950	Maybe it would be better to move this into SelectionDAGISel::FinishBasicBlock? All the other similar code seems to be there. If we do keep it here, better to fix the comment to something like "SelectionDAGISel::FinishBasicBlock will add PHI operands for the successors of the fallthrough block. Here, we add PHI operands for the successors of the INLINEASM_BR block itself". (It's not obvious at first glance what "first" and "last" refer to.)

update comment

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2950	The relevant call chain looks like:

SelectionDAGISel::SelectAllBasicBlocks
SelectBasicBlock
CodeGenAndEmitDAG
ScheduleDAGSDNodes::EmitSchedule
SelectionDAGBuilder::UpdateSplitBlock
FinishBasicBlock

The issue is that `ScheduleDAGSDNodes::EmitSchedule` splits `FuncInfo->MBB` (the current `MachineBasicBlock` we're emitting a schedule for), then the return value is used to update `FuncInfo->MBB` in `CodeGenAndEmitDAG`, such that later in `FinishBasicBlock` we no longer have a reference to the block referred to as `First` in `SelectionDAGBuilder::UpdateSplitBlock`. `SelectionDAGBuilder::UpdateSplitBlock` is only called when `FuncInfo->MBB` is split via a call to `ScheduleDAGSDNodes::EmitSchedule`, so I'm certain it makes sense to perform special clean up there. In fact, the comment in `SelectionDAGISel::CodeGenAndEmitDAG` says:

// If the block was split, make sure we update any references that are used to
// update PHI nodes later on.
if (FirstMBB != LastMBB)
  SDB->UpdateSplitBlock(FirstMBB, LastMBB);

so again it makes sense for us to update the PHI nodes given a block split in `UpdateSplitBlock`. I've updated the comment.

add period to end of comment.

LGTM

This revision is now accepted and ready to land. Apr 3 2020, 5:25 PM

Closed by commit rG5bc291be7154: [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent (authored by nickdesaulniers). Apr 6 2020, 2:11 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                     Size
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp     14 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    10 lines
llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll    73 lines
llvm/test/CodeGen/X86/callbr-asm-outputs.ll              54 lines

Diff 255486

llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp

[1,048 lines hidden] if (TI != BB->end() && SplicePt != BB->end() &&
     MachineBasicBlock *CopyBB = MF.CreateMachineBasicBlock(BB->getBasicBlock());
     MachineFunction::iterator BBI(*BB);
     MF.insert(++BBI, CopyBB);

     CopyBB->splice(CopyBB->begin(), BB, SplicePt, BB->end());
     CopyBB->setInlineAsmBrDefaultTarget();

     CopyBB->addSuccessor(FallThrough, BranchProbability::getOne());
+    BB->removeSuccessor(FallThrough);
     BB->addSuccessor(CopyBB, BranchProbability::getOne());

     // Mark all physical registers defined in the original block as being live
     // on entry to the copy block.
     for (const auto &MI : *CopyBB)
       for (const MachineOperand &MO : MI.operands())
         if (MO.isReg()) {
           Register reg = MO.getReg();
           if (Register::isPhysicalRegister(reg)) {
             CopyBB->addLiveIn(reg);
             break;
           }
         }

-    // Bit of a hack: The copy block we created here exists only because we want
-    // the CFG to work with the current system. However, the successors to the
-    // block with the INLINEASM_BR instruction expect values to come from that
-    // block, not this usurper block. Thus we steal its successors and add them
-    // to the copy so that everyone is happy.
-    for (auto *Succ : BB->successors())
-      if (Succ != CopyBB && !CopyBB->isSuccessor(Succ))
-        CopyBB->addSuccessor(Succ, BranchProbability::getZero());
-
-    for (auto *Succ : CopyBB->successors())
-      if (BB->isSuccessor(Succ))
-        BB->removeSuccessor(Succ);
-
     CopyBB->normalizeSuccProbs();
     BB->normalizeSuccProbs();

     BB->transferInlineAsmBrIndirectTargets(CopyBB);

     InsertPos = CopyBB->end();
     return CopyBB;
   }
[9 lines hidden]
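
The successor bookkeeping in this hunk can be illustrated outside LLVM. The following is a minimal sketch in standalone C++, assuming a toy block type with explicit successor and predecessor vectors. `ToyBlock`, `addEdge`, `removeEdge`, `hasPred`, and `splitAfterAsmBr` are hypothetical names used only for illustration; they are not LLVM APIs. The sketch mirrors only the three edge updates that remain after this patch and leaves the indirect targets attached to the original block.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine basic block. This is NOT llvm::MachineBasicBlock.
struct ToyBlock {
  std::string name;
  std::vector<ToyBlock *> succs;
  std::vector<ToyBlock *> preds;
  explicit ToyBlock(std::string n) : name(std::move(n)) {}
};

// Add a CFG edge and keep both adjacency lists in sync.
void addEdge(ToyBlock &from, ToyBlock &to) {
  from.succs.push_back(&to);
  to.preds.push_back(&from);
}

// Remove a CFG edge from both adjacency lists.
void removeEdge(ToyBlock &from, ToyBlock &to) {
  from.succs.erase(std::remove(from.succs.begin(), from.succs.end(), &to),
                   from.succs.end());
  to.preds.erase(std::remove(to.preds.begin(), to.preds.end(), &from),
                 to.preds.end());
}

// True if `p` is recorded as a predecessor of `b`.
bool hasPred(const ToyBlock &b, const ToyBlock &p) {
  return std::find(b.preds.begin(), b.preds.end(), &p) != b.preds.end();
}

// Model of the split: `copy` takes over the fallthrough edge, `bb` now falls
// through to `copy`, and every other successor of `bb` (the indirect targets)
// is left untouched.
void splitAfterAsmBr(ToyBlock &bb, ToyBlock &copy, ToyBlock &fallThrough) {
  addEdge(copy, fallThrough);
  removeEdge(bb, fallThrough);
  addEdge(bb, copy);
}
```

Given a block `bb` with edges to a fallthrough block and an indirect target, calling `splitAfterAsmBr` leaves the indirect target with `bb` as its predecessor. An address-taken block whose predecessor list ends up empty is what BranchFolder removed in the original bug report.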

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

[2,938 lines hidden] void SelectionDAGBuilder::UpdateSplitBlock(MachineBasicBlock *First,
   for (unsigned i = 0, e = SL->JTCases.size(); i != e; ++i)
     if (SL->JTCases[i].first.HeaderBB == First)
       SL->JTCases[i].first.HeaderBB = Last;

   // Update BitTestCases.
   for (unsigned i = 0, e = SL->BitTestCases.size(); i != e; ++i)
     if (SL->BitTestCases[i].Parent == First)
       SL->BitTestCases[i].Parent = Last;
+
+  // SelectionDAGISel::FinishBasicBlock will add PHI operands for the
+  // successors of the fallthrough block. Here, we add PHI operands for the
+  // successors of the INLINEASM_BR block itself.
[inline comment, efriedma, Done] Maybe it would be better to move this into SelectionDAGISel::FinishBasicBlock? All the other…
[inline comment, nickdesaulniers (Author), Done] The relevant call chain looks like: SelectionDAGISel::SelectAllBasicBlocks…
+  if (First->getFirstTerminator()->getOpcode() == TargetOpcode::INLINEASM_BR)
+    for (std::pair<MachineInstr *, unsigned> &pair : FuncInfo.PHINodesToUpdate)
+      if (First->isSuccessor(pair.first->getParent()))
+        MachineInstrBuilder(*First->getParent(), pair.first)
+            .addReg(pair.second)
+            .addMBB(First);
 }

 void SelectionDAGBuilder::visitIndirectBr(const IndirectBrInst &I) {
   MachineBasicBlock *IndirectBrMBB = FuncInfo.MBB;

   // Update machine-CFG edges with unique successors.
   SmallSet<BasicBlock*, 32> Done;
   for (unsigned i = 0, e = I.getNumSuccessors(); i != e; ++i) {
[7,601 lines hidden]
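
The loop added to `UpdateSplitBlock` can be modeled outside LLVM. The following is a sketch under assumptions: `ToyBlock`, `ToyPhi`, `PendingPhis`, and `addPhiOperandsFor` are hypothetical stand-ins, and a PHI operand is reduced to a (register number, predecessor block) pair. It illustrates why operands for the successors of `First` are added at split time: a later pass that only knows about `Last` would add operands only for the successors of `Last`.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy block with a successor list only. This is NOT llvm::MachineBasicBlock.
struct ToyBlock {
  std::vector<const ToyBlock *> succs;
  bool isSuccessor(const ToyBlock *b) const {
    return std::find(succs.begin(), succs.end(), b) != succs.end();
  }
};

// Toy PHI: the block it lives in plus its (register, predecessor) operands.
struct ToyPhi {
  const ToyBlock *parent = nullptr;
  std::vector<std::pair<unsigned, const ToyBlock *>> incoming;
};

// Pending updates, shaped like the (PHI, register) pairs iterated in the patch.
using PendingPhis = std::vector<std::pair<ToyPhi *, unsigned>>;

// For every pending PHI that lives in a successor of `pred`, record an
// incoming operand that names `pred` as the source block.
void addPhiOperandsFor(const ToyBlock &pred, PendingPhis &pending) {
  for (auto &entry : pending)
    if (pred.isSuccessor(entry.first->parent))
      entry.first->incoming.emplace_back(entry.second, &pred);
}
```

Calling `addPhiOperandsFor(first, pending)` at split time covers PHIs in the indirect targets, and a later call with `last` covers PHIs reached through the fallthrough. Neither call touches the other's PHIs, because each one filters on the successor list of the block it is given.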

llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll

This file was added.

				; Tests that InstrEmitter::EmitMachineNode correctly sets predecessors and
				; successors.

				; RUN: llc -stop-after=finalize-isel -print-after=finalize-isel -mtriple=i686-- < %s 2>&1 | FileCheck %s

				; The block containing the INLINEASM_BR should have a fallthrough and its
				; indirect targets as its successors. The fallthrough is a block we synthesized
				; in InstrEmitter::EmitMachineNode. Fallthrough should have 100% branch weight,
				; while the indirect targets have 0%.
				; CHECK: bb.0 (%ir-block.2):
				; CHECK-NEXT: successors: %bb.4(0x00000000), %bb.6(0x80000000); %bb.4(0.00%), %bb.6(100.00%)

				; The fallthrough block is preceded by the block containing INLINEASM_BR,
				; and succeeded by the INLINEASM_BR's original fallthrough block pre-splitting.
				; CHECK: bb.6 (%ir-block.2):
				; CHECK-NEXT: predecessors: %bb.0
				; CHECK-NEXT: successors: %bb.1(0x80000000); %bb.1(100.00%)

				; Another block containing a second INLINEASM_BR. Check it has two successors,
				; and the probability for fallthrough is 100%. Predecessor check irrelevant.
				; CHECK: bb.1 (%ir-block.4):
				; CHECK: successors: %bb.2(0x00000000), %bb.7(0x80000000); %bb.2(0.00%), %bb.7(100.00%)

				; Check the synthesized fallthrough block for the second INLINEASM_BR is
				; preceded correctly, and has the original successor pre-splitting.
				; CHECK: bb.7 (%ir-block.4):
				[inline comment, nickdesaulniers (Author), Done] s/predecessed/proceeded by/
				; CHECK-NEXT: predecessors: %bb.1
				; CHECK-NEXT: successors: %bb.3(0x80000000); %bb.3(100.00%)

				; Check the second INLINEASM_BR target block is preceded by the block with the
				; second INLINEASM_BR.
				; CHECK: bb.2 (%ir-block.7, address-taken):
				; CHECK-NEXT: predecessors: %bb.1

				; Check the first INLINEASM_BR target block is preceded by the block with
				; the first INLINEASM_BR.
				[inline comment, nickdesaulniers (Author), Done] ditto
				; CHECK: bb.4 (%ir-block.11, address-taken):
				; CHECK-NEXT: predecessors: %bb.0

				[inline comment, nickdesaulniers (Author), Not Done] @void I recognize the irony of preprocessing the other test I modified, in child revision https://reviews.llvm.org/D77356, then adding a new test that doesn't do the same formatting. I tend to write comments in my tests describing what's being tested, so that in the future, when someone else needs to change my test, they have some sense of what's important to the test and what's not. That way they feel more empowered to change them. Should I preprocess the test with update_llc_test_checks.py and then remove these comments? I don't mind deleting anything, and trust your judgement, but I wanted to highlight this irony during code review.
				@.str = private unnamed_addr constant [26 x i8] c"inline asm#1 returned %d\0A\00", align 1
				@.str.2 = private unnamed_addr constant [26 x i8] c"inline asm#2 returned %d\0A\00", align 1
				@str = private unnamed_addr constant [30 x i8] c"inline asm#1 caused exception\00", align 1
				@str.4 = private unnamed_addr constant [30 x i8] c"inline asm#2 caused exception\00", align 1

				; Function Attrs: nounwind uwtable
				define dso_local i32 @main(i32 %0, i8** nocapture readnone %1) #0 {
				%3 = callbr i32 asm "jmp ${1:l}", "=r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@main, %11)) #3
				to label %4 [label %11]

				4: ; preds = %2
				%5 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([26 x i8], [26 x i8]* @.str, i64 0, i64 0), i32 %3)
				%6 = callbr i32 asm "jmp ${1:l}", "=r,X,~{dirflag},~{fpsr},~{flags}"(i8* blockaddress(@main, %7)) #3
				to label %9 [label %7]

				7: ; preds = %4
				%8 = tail call i32 @puts(i8* nonnull dereferenceable(1) getelementptr inbounds ([30 x i8], [30 x i8]* @str.4, i64 0, i64 0))
				br label %13

				9: ; preds = %4
				%10 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([26 x i8], [26 x i8]* @.str.2, i64 0, i64 0), i32 %6)
				br label %13

				11: ; preds = %2
				%12 = tail call i32 @puts(i8* nonnull dereferenceable(1) getelementptr inbounds ([30 x i8], [30 x i8]* @str, i64 0, i64 0))
				br label %13

				13: ; preds = %11, %9, %7
				%14 = phi i32 [ 1, %7 ], [ 0, %9 ], [ 1, %11 ]
				ret i32 %14
				}

				declare dso_local i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr #1
				declare i32 @puts(i8* nocapture readonly) local_unnamed_addr #2
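
The successor annotations in the CHECK lines above pair a hexadecimal value with a percentage, for example `%bb.6(0x80000000)` with `100.00%` and `%bb.4(0x00000000)` with `0.00%`. A small sketch of that conversion follows, assuming the printed value is a numerator over a fixed denominator of 2^31, which is consistent with the pairs shown in this test. `kDenominator` and `toPercent` are hypothetical names for illustration, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: probabilities are printed as a numerator over 2^31.
constexpr std::uint32_t kDenominator = 1u << 31;

// Convert a printed numerator such as 0x80000000 into a percentage.
constexpr double toPercent(std::uint32_t numerator) {
  return 100.0 * static_cast<double>(numerator) /
         static_cast<double>(kDenominator);
}
```

Under that assumption, `toPercent(0x80000000u)` is the fallthrough weight and `toPercent(0u)` is the weight of an indirect target in the expected output.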

llvm/test/CodeGen/X86/callbr-asm-outputs.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=i686-- -verify-machineinstrs < %s | FileCheck %s

 ; A test for asm-goto output

 define i32 @test1(i32 %x) {
 ; CHECK-LABEL: test1:
 ; CHECK: # %bb.0: # %entry
 ; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
 ; CHECK-NEXT: addl $4, %eax
 ; CHECK-NEXT: #APP
 ; CHECK-NEXT: xorl %eax, %eax
 ; CHECK-NEXT: jmp .Ltmp0
 ; CHECK-NEXT: #NO_APP
 ; CHECK-NEXT: .LBB0_1: # %normal
 ; CHECK-NEXT: retl
-; CHECK-NEXT: .Ltmp0: # Address of block that was removed by CodeGen
+; CHECK-NEXT: .Ltmp0: # Block address taken
+; CHECK-NEXT: .LBB0_2: # %abnormal
+; CHECK-NEXT: movl $1, %eax
+; CHECK-NEXT: retl
 entry:
   %add = add nsw i32 %x, 4
   %ret = callbr i32 asm "xorl $1, $0; jmp ${2:l}", "=r,r,X,~{dirflag},~{fpsr},~{flags}"(i32 %add, i8* blockaddress(@test1, %abnormal))
           to label %normal [label %abnormal]

 normal:
   ret i32 %ret

[16 lines hidden]
 ; CHECK-NEXT: cmpl %edi, %esi
 ; CHECK-NEXT: jge .LBB1_3
 ; CHECK-NEXT: # %bb.1: # %if.then
 ; CHECK-NEXT: #APP
 ; CHECK-NEXT: testl %esi, %esi
 ; CHECK-NEXT: testl %edi, %esi
 ; CHECK-NEXT: jne .Ltmp1
 ; CHECK-NEXT: #NO_APP
-; CHECK-NEXT: .LBB1_2:
-; CHECK-NEXT: jmp .LBB1_4
-; CHECK-NEXT: .LBB1_3: # %if.else
-; CHECK-NEXT: #APP
-; CHECK-NEXT: testl %esi, %edi
-; CHECK-NEXT: testl %esi, %edi
-; CHECK-NEXT: jne .Ltmp2
-; CHECK-NEXT: #NO_APP
-; CHECK-NEXT: .LBB1_4:
-; CHECK-NEXT: movl %esi, %eax
-; CHECK-NEXT: addl %edi, %eax
+; CHECK-NEXT: .LBB1_2: # %if.then
+; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: addl %esi, %eax
 ; CHECK-NEXT: .Ltmp2: # Block address taken
-; CHECK-NEXT: # %bb.5: # %return
+; CHECK-NEXT: .LBB1_6: # %return
 ; CHECK-NEXT: popl %esi
 ; CHECK-NEXT: .cfi_def_cfa_offset 8
 ; CHECK-NEXT: popl %edi
 ; CHECK-NEXT: .cfi_def_cfa_offset 4
 ; CHECK-NEXT: retl
-; CHECK-NEXT: .Ltmp1: # Address of block that was removed by CodeGen
+; CHECK-NEXT: .LBB1_3: # %if.else
+; CHECK-NEXT: .cfi_def_cfa_offset 12
+; CHECK-NEXT: #APP
+; CHECK-NEXT: testl %esi, %edi
+; CHECK-NEXT: testl %esi, %edi
+; CHECK-NEXT: jne .Ltmp2
+; CHECK-NEXT: #NO_APP
+; CHECK-NEXT: .LBB1_4: # %if.else
+; CHECK-NEXT: jmp .LBB1_2
+; CHECK-NEXT: .Ltmp1: # Block address taken
+; CHECK-NEXT: .LBB1_5: # %label_true
+; CHECK-NEXT: movl $-2, %eax
+; CHECK-NEXT: jmp .LBB1_6
 entry:
   %cmp = icmp slt i32 %out1, %out2
   br i1 %cmp, label %if.then, label %if.else

 if.then: ; preds = %entry
   %0 = callbr { i32, i32 } asm sideeffect "testl $0, $0; testl $1, $2; jne ${3:l}", "={si},={di},r,X,X,0,1,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i8* blockaddress(@test2, %label_true), i8* blockaddress(@test2, %return), i32 %out1, i32 %out2)
           to label %if.end [label %label_true, label %return]

[27 lines hidden]
 ; CHECK-NEXT: .cfi_offset %edi, -8
 ; CHECK-NEXT: testb $1, {{[0-9]+}}(%esp)
 ; CHECK-NEXT: je .LBB2_3
 ; CHECK-NEXT: # %bb.1: # %true
 ; CHECK-NEXT: #APP
 ; CHECK-NEXT: .short %esi
 ; CHECK-NEXT: .short %edi
 ; CHECK-NEXT: #NO_APP
-; CHECK-NEXT: .LBB2_2:
+; CHECK-NEXT: .LBB2_2: # %true
 ; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: jmp .LBB2_5
 ; CHECK-NEXT: .LBB2_3: # %false
 ; CHECK-NEXT: #APP
 ; CHECK-NEXT: .short %eax
 ; CHECK-NEXT: .short %edx
 ; CHECK-NEXT: #NO_APP
-; CHECK-NEXT: .LBB2_4:
+; CHECK-NEXT: .LBB2_4: # %false
 ; CHECK-NEXT: movl %edx, %eax
 ; CHECK-NEXT: .LBB2_5: # %asm.fallthrough
 ; CHECK-NEXT: popl %esi
 ; CHECK-NEXT: .cfi_def_cfa_offset 8
 ; CHECK-NEXT: popl %edi
 ; CHECK-NEXT: .cfi_def_cfa_offset 4
 ; CHECK-NEXT: retl
-; CHECK-NEXT: .Ltmp3: # Address of block that was removed by CodeGen
+; CHECK-NEXT: .Ltmp3: # Block address taken
+; CHECK-NEXT: .LBB2_6: # %indirect
+; CHECK-NEXT: .cfi_def_cfa_offset 12
+; CHECK-NEXT: movl $42, %eax
+; CHECK-NEXT: jmp .LBB2_5
 entry:
   br i1 %cmp, label %true, label %false

 true:
   %0 = callbr { i32, i32 } asm sideeffect ".word $0, $1", "={si},={di},X" (i8* blockaddress(@test3, %indirect)) to label %asm.fallthrough [label %indirect]

 false:
   %1 = callbr { i32, i32 } asm sideeffect ".word $0, $1", "={ax},={dx},X" (i8* blockaddress(@test3, %indirect)) to label %asm.fallthrough [label %indirect]
[19 lines hidden]
 ; CHECK-NEXT: jne .Ltmp4
 ; CHECK-NEXT: #NO_APP
 ; CHECK-NEXT: .LBB3_1: # %asm.fallthrough
 ; CHECK-NEXT: #APP
 ; CHECK-NEXT: testl %ecx, %edx
 ; CHECK-NEXT: testl %ecx, %edx
 ; CHECK-NEXT: jne .Ltmp5
 ; CHECK-NEXT: #NO_APP
-; CHECK-NEXT: .LBB3_2: # %asm.fallthrough2
+; CHECK-NEXT: .LBB3_2: # %asm.fallthrough
 ; CHECK-NEXT: addl %edx, %ecx
 ; CHECK-NEXT: movl %ecx, %eax
+; CHECK-NEXT: retl
+; CHECK-NEXT: .Ltmp4: # Block address taken
+; CHECK-NEXT: .LBB3_3: # %label_true
+; CHECK-NEXT: movl $-2, %eax
 ; CHECK-NEXT: .Ltmp5: # Block address taken
-; CHECK-NEXT: # %bb.3: # %return
+; CHECK-NEXT: .LBB3_4: # %return
 ; CHECK-NEXT: retl
-; CHECK-NEXT: .Ltmp4: # Address of block that was removed by CodeGen
 entry:
   %0 = callbr { i32, i32 } asm sideeffect "testl $0, $0; testl $1, $2; jne ${3:l}", "=r,=r,r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %out1, i8* blockaddress(@test4, %label_true), i8* blockaddress(@test4, %return))
           to label %asm.fallthrough [label %label_true, label %return]

 asm.fallthrough: ; preds = %entry
   %asmresult = extractvalue { i32, i32 } %0, 0
   %asmresult1 = extractvalue { i32, i32 } %0, 1
   %1 = callbr { i32, i32 } asm sideeffect "testl $0, $1; testl $2, $3; jne ${5:l}", "=r,=r,r,r,X,X,~{dirflag},~{fpsr},~{flags}"(i32 %asmresult, i32 %asmresult1, i8* blockaddress(@test4, %label_true), i8* blockaddress(@test4, %return))
[15 lines hidden]