Page MenuHomePhabricator

[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent
ClosedPublic

Authored by nickdesaulniers on Mar 27 2020, 5:25 PM.

Details

Summary

A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two. This requires careful
handing of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handing of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated. The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
  MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
  if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
    continue;
  PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was continueing because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block CopyBB which updates FuncInfo->MBB (ie. the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Diff Detail

Unit TestsFailed

TimeTest
100 msClang.Driver::Unknown Unit Message ("")
Script: -- : 'RUN: at line 7'; printf '/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/clang/test/Driver/cl-response-file.c\n' '/I/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/clang/test/Driver\Inputs\cl-response-file\ /DFOO=2' > /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/tools/clang/test/Driver/Output/cl-response-file.c.tmp.rsp
40 msLLVM.Linker::Unknown Unit Message ("")
Script: -- : 'RUN: at line 4'; echo "/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Outputy = type opaque @GV = external global /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Outputy*" | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/llvm-as > /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output/2003-01-30-LinkerTypeRename.ll.tmp.1.bc
20 msLLVM.Linker::Unknown Unit Message ("")
Script: -- : 'RUN: at line 4'; echo "/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output = type i32" | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/llvm-as > /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output/2003-04-26-NullPtrLinkProblem.ll.tmp.2.bc
20 msLLVM.Linker::Unknown Unit Message ("")
Script: -- : 'RUN: at line 1'; echo "/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output = type opaque" | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/llvm-as > /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output/2003-06-02-TypeResolveProblem.ll.tmp.2.bc
20 msLLVM.Linker::Unknown Unit Message ("")
Script: -- : 'RUN: at line 1'; echo "/mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output = type i32" | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/llvm-as > /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/test/Linker/Output/2003-06-02-TypeResolveProblem2.ll.tmp.1.bc
View Full Test Results (8 Failed)

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMar 27 2020, 5:25 PM

An INLINEASM_BR edge should count as a predecessor, yes; if that edge is missing, something went wrong.

void accepted this revision.Mar 27 2020, 6:09 PM

Good catch.

This revision is now accepted and ready to land.Mar 27 2020, 6:09 PM
nickdesaulniers planned changes to this revision.Mar 30 2020, 10:49 AM

An INLINEASM_BR edge should count as a predecessor, yes; if that edge is missing, something went wrong.

Ok, that's what I suspected (should be no difference for direct vs indirect references), thanks for confirming.

I'm going to sit on this change; the test cases are helpful and expose a current issue, but the fix feels like more duct tape and bailing wire to me, and I'm not certain it's the right fix.

From here, my plan is:

  1. implement separate machine verifier checks for:
    1. all MachineBasicBlocks with INLINEASM_BR terminators should have the targets of the INLINEASM_BR in their list of successors.
    2. all MachineBasicBlocks that are targets of INLINEASM_BR should have the INLINEASM_BR's parent MachineBasicBlock in their list of predacessors. This is the invariant that's violated in the above test case.
  2. fix whatever currently unidentified pass is breaking those invariants and fix them.
  3. revisit landing this test case, maybe without the changes I've made here to BranchFolding.

(Then landing in the order 2, 1, 3).

I had to beef up the verification of callbr at the Instruction level to help find previous breakages, so this feels like a similar plan, albeit at a lower IR. I'll cc folks when I have those ready, will be my focus this week.

  1. implement separate machine verifier checks for:
    1. all MachineBasicBlocks with INLINEASM_BR terminators should have the targets of the INLINEASM_BR in their list of successors.

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor; when inline asm passes the address of a label ("computed goto") as input to an asm statement, but not in the label list, that doesn't make a it valid branch target (though theoretically, the inline asm could still jump to it, which is super undefined).

But with a basic check:

diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 72da33c92346..94b1bd31a4e9 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -887,6 +887,17 @@ void MachineVerifier::verifyInlineAsm(const MachineInstr *MI) {
     if (!MO.isReg() || !MO.isImplicit())
       report("Expected implicit register after groups", &MO, OpNo);
   }
+
+  if (MI->getOpcode() == TargetOpcode::INLINEASM_BR) {
+    // Check that the targets are listed as successors to this MI's parent MBB.
+    SmallPtrSet<const BasicBlock*, 2> successors;
+    for (const MachineBasicBlock* MBB : MI->getParent()->successors())
+      successors.insert(MBB->getBasicBlock());
+    for (const MachineOperand& MO : MI->operands())
+      if (MO.isBlockAddress())
+        if (!successors.count(MO.getBlockAddress()->getBasicBlock()))
+          report("Expected INLINEASM_BR blockaddress operand not in successor list of parent MBB", MI);
+  }
 }

This fails on our test suite in llvm/test/CodeGen/X86/callbr-asm-outputs.ll's first test case. Right off the bat, I can see we messed this up in ISEL:

# *** IR Dump After Finalize ISel and expand pseudo-instructions ***:
...
bb.0.entry:
  successors: %bb.3(0x80000000); %bb.3(100.00%)
...
  INLINEASM_BR &"xorl $1, $0; jmp ${2:l}" [attdialect], $0:[regdef:GR32], def %2:gr32, $1:[reguse:GR32], %3:gr32, $2:[imm], blockaddress(@test1, %ir-block.abnormal), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags
bb.3.entry:
; predecessors: %bb.0
...
bb.2.abnormal (address-taken):
; predecessors: %bb.3
...

We can see we have an INLINEASM_BR MachineInstr with a blockaddress MachineOperand, but the parent MachineBasicBlock (bb.0) of the INLINEASM_BR MachineInstr's successor list does not include the MachineBasicBlock for which the blockaddress MachineOperand refers (bb.2).

(Also, it's un-ergonomic that blockaddresses at the MachineInstr level seem to refer to BasicBlocks, and not MachineBasicBlocks. You have to call MachineBasicBlock#getBasicBlock(). I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?)

I'm going to check the other invariant I specified above now, and will report back on that.

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor

Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that.

I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?

In general, blockaddress can refer to a block in a different function. So for various reasons reasons, it makes sense to represent that as a "blockaddress". It can't be an MBB because in general the MBB doesn't exist in memory. That really only applies for indirectbr, though.

For INLINEASM_BR, we could use MachineBasicBlock operands to represent the succesors.

  1. all MachineBasicBlocks that are targets of INLINEASM_BR should have the INLINEASM_BR's parent MachineBasicBlock in their list of predacessors. This is the invariant that's violated in the above test case.

This fails on our test suite in llvm/test/CodeGen/X86/callbr-asm-outputs.ll's first test case. Right off the bat, I can see we messed this up in ISEL:

# *** IR Dump After Finalize ISel and expand pseudo-instructions ***:
...
bb.0.entry:
  successors: %bb.3(0x80000000); %bb.3(100.00%)
...
  INLINEASM_BR &"xorl $1, $0; jmp ${2:l}" [attdialect], $0:[regdef:GR32], def %2:gr32, $1:[reguse:GR32], %3:gr32, $2:[imm], blockaddress(@test1, %ir-block.abnormal), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags
bb.3.entry:
; predecessors: %bb.0
...
bb.2.abnormal (address-taken):
; predecessors: %bb.3
...

We can see the second invariant already violated post ISEL here as well. FWIW, a check might look like:

diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp                                             
index 72da33c92346..871c61e76326 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -887,6 +887,28 @@ void MachineVerifier::verifyInlineAsm(const MachineInstr *MI) {
     if (!MO.isReg() || !MO.isImplicit())
       report("Expected implicit register after groups", &MO, OpNo);
   }
+
+  if (MI->getOpcode() == TargetOpcode::INLINEASM_BR) {
+    for (const MachineOperand& MO : MI->operands())
+      if (MO.isBlockAddress()) {
+        const BasicBlock* BB = MO.getBlockAddress()->getBasicBlock();
+        const MachineBasicBlock* Target = nullptr;
+        for (const MachineBasicBlock& MBB : *MI->getParent()->getParent()) {
+           if (MBB.getBasicBlock() == BB)
+             Target = &MBB;
+        }
+        if (Target) {
+          SmallPtrSet<const MachineBasicBlock*, 2> Preds;
+          for (const MachineBasicBlock* Pred : Target->predecessors()) {
+            Preds.insert(Pred);
+          }
+          if (!Preds.count(MI->getParent()))
+            report("MMB target of INLINEASM_BR, but missing INLINEASM_BR's parent MBB from predecessor list", Target);
+        } else {
+          report("could not find target MBB", MI);
+        }
+      }
+  }

So I think we need to fix up ISEL to properly set successors and predecessors for MachineBasicBlocks that are terminated by INLINEASM_BR MachineInstrs. Does that sound right, @void @efriedma ?

So I think we need to fix up ISEL to properly set successors and predecessors for MachineBasicBlocks that are terminated by INLINEASM_BR MachineInstrs. Does that sound right, @void @efriedma ?

Probably in InstrEmitter::EmitSpecialNode.

Ok, so I don't think this is generally the case, as we could have a MachineOperand to the INLINEASM_BR that is a blockaddress, yet is not a successor

Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that.

Yeah, we could technically make asm goto work even if you didn't use the goto part...

I honestly don't like blockaddresses and wonder if we could make these all just have BasicBlocks or MachineBasicBlocks as operands?

In general, blockaddress can refer to a block in a different function. So for various reasons reasons, it makes sense to represent that as a "blockaddress". It can't be an MBB because in general the MBB doesn't exist in memory. That really only applies for indirectbr, though.

For INLINEASM_BR, we could use MachineBasicBlock operands to represent the succesors.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

I think the interaction between blockaddresses and indirectbr is fine. The indirection of blockaddress constants is actually helpful when you're dealing with random instructions that aren't indirectbr/callbr.

For callbr, the indirection isn't so helpful, so yes, it would make sense to just use BasicBlock/MachineBasicBlock directly.

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

I think this is what's going on in ScheduleDAGSDNodes::EmitSchedule; the results of the before/after split don't look quite right in terms of pred/succ lists, but I've got to dig more into this tomorrow.

Maybe blockaddress Constants could be generated as late as possible. Instead, we'd pass BasicBlock operands around in the Instruction level IR, then lower to MachineBasicBlock operands at the MachineInstr level, then only generate blockaddress operands very late when lowering to MCInst level?

I think the interaction between blockaddresses and indirectbr is fine. The indirection of blockaddress constants is actually helpful when you're dealing with random instructions that aren't indirectbr/callbr.

For callbr, the indirection isn't so helpful, so yes, it would make sense to just use BasicBlock/MachineBasicBlock directly.

That's a yak shave for another day, but one I really want to do since I think it makes callbr and transforms on it less brittle. (Particularly ones that use get/setOperand to modify one operand, when technically they should be modifying two).

Probably in InstrEmitter::EmitSpecialNode.

We normally construct the MachineFunction CFG much earlier; see SelectionDAGBuilder::visitCallBr.

It looks like the problem is that we split the "block" into two MBBs, and the successors end up attached to the second block, instead of the first. Not sure where that happens, off the top of my head.

I think this is what's going on in ScheduleDAGSDNodes::EmitSchedule; the results of the before/after split don't look quite right in terms of pred/succ lists, but I've got to dig more into this tomorrow.

Yeah, this is what's going on. Printing the MachineBasicBlocks before and after the relevant INLINEASM_BR part of ScheduleDAGSDNodes::EmitSchedule shows:

Before:

bb.0 (%ir-block.2):
  successors: %bb.1(0x80000000), %bb.4(0x00000000); %bb.1(100.00%), %bb.4(0.00%)
  INLINEASM_BR &"jmp ${1:l}" [attdialect], $0:[regdef:GR32], def %5:gr32, $1:[imm], blockaddress(@main, %ir-block.11), $2:[clobber], implicit-def early-clobber $df, $3:[clobber], implicit-def early-clobber $fpsw, $4:[clobber], implicit-def early-clobber $eflags, !2
  %0:gr32 = COPY %5:gr32
  JMP_1 %bb.1

bb.1 (%ir-block.4):
; predecessors: %bb.0
...
bb.4 (%ir-block.11, address-taken):
; predecessors: %bb.0

Everything looks good. I don't understand why INLINEASM_BR isn't the terminal instruction of bb.0. Isn't it marked a terminator in llvm/include/llvm/Target/Target.td line 1021? I also also don't understand why the MachineBasicBlock has a terminators method plural? I thought all blocks have 1 and only 1 terminal instruction?

Afterwards, things look bad:

bb.0 (%ir-block.2):
  successors: %bb.6(0x80000000); %bb.6(100.00%)

  INLINEASM_BR &"jmp ${1:l}" [attdialect], $0:[regdef:GR32], def %5:gr32, $1:[imm], blockaddress(@main, %ir-block.11), $2:[clobber], implicit-def early-clobber $df, $3:[clobber], implicit-def early-clobber $fpsw, $4:[clobber], implicit-def early-clobber $eflags, !2

bb.6 (%ir-block.2):
; predecessors: %bb.0
  successors: %bb.1(0x80000000), %bb.4(0x00000000); %bb.1(100.00%), %bb.4(0.00%)

  %0:gr32 = COPY %5:gr32
  JMP_1 %bb.1

bb.1 (%ir-block.4):
; predecessors: %bb.6
...
bb.4 (%ir-block.11, address-taken):
; predecessors: %bb.6

Specifically:

  1. bb.0 has one successor, bb.6, which isn't right as the INLINEASM_BR could jump to bb.4. bb.0 should have two successors (fallthrough: bb.6, indirect target: bb.4)
  2. bb.6 (the CopyBB in ScheduleDAGSDNodes::EmitSchedule successor list is wrong; bb.1 is right, but bb.6 cannot get to bb.4 as it unconditionally jumps to bb.1 only.

When we split the initial MachineBasicBlock, we need to "take" only the fallthrough successor, I suspect.

Everything looks good. I don't understand why INLINEASM_BR isn't the terminal instruction of bb.0. Isn't it marked a terminator in llvm/include/llvm/Target/Target.td line 1021? I also also don't understand why the MachineBasicBlock has a terminators method plural? I thought all blocks have 1 and only 1 terminal instruction?

A colleague today explained to me that this happens with conditional jumps, and later I found such a case with multiple conditional jumps with an unconditional jump being the final instruction.

I have a fix for my test case, but it regresses one of the previous existing test cases. I need to spend more time on it tomorrow. Essentially, ISel is now emitting a messed up PHI node that's missing cases for some of the predecessors, which the machine verifier complains about. I suppose this may be the agita (sic) referred to in the comment above ScheduleDAGSDNodes::EmitSchedule's handling of `INLINEASM_BR. :P

Just a status update; yesterday I found/observed how the phi nodes get emitted from selectiondag. This morning I've found a fix for the mangled phi node, but that regresses other phi nodes in the test case. I think I understand why, and will continue to investigate further, but I'm going for a smoke to contemplate. I didn't smoke before this bug.

nickdesaulniers added a comment.EditedApr 2 2020, 3:32 PM

Ok, I think I have a fix in hand; it's still regressing the existing test, but I think it's more along the lines of the above changes regarding Address of block that was removed by CodeGen vs Block address taken and I don't have the below diff in this new branch I'm working out of yet, but need to double check. It is now producing valid predecessors and successors, and verifies that.

I need to:

  1. clean up my spaghetti
  2. fix up the existing test case
  3. retest kernel builds with this
  4. write up an extra detailed commit message, while I still briefly understand a glimpse/sliver of selection dag, instruction selection, instruction scheduling, phis, and virtual registers.

Hopefully I can have that out by EOD, but no promises.

EDIT:
The fix wasn't too bad, more of a PITA to debug everything and a firehose to drink from.

nickdesaulniers added a comment.EditedApr 2 2020, 4:22 PM

Ok, everything is green now, kernel boots. (There's probably more verification I can do, too).

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it make it easier to maintain this test in the future and see what changed easier with my change.

void added a comment.Apr 2 2020, 5:30 PM

Ok, everything is green now, kernel boots. (There's probably more verification I can do, too).

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it make it easier to maintain this test in the future and see what changed easier with my change.

Yeah, go for it.

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it make it easier to maintain this test in the future and see what changed easier with my change.

Yeah, go for it.

huh, so it seems that ./llvm/utils/update_llc_test_checks.py llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll doesn't actually do anything :( so I don't think I can preprocess it. I think the output needs to change; I don't understand how you write a test that update_llc_test_checks.py is happy with off of the bat.

Worse is that arc seems messed up, so I'm not able to post the fix to phabricator.

$ arc diff
...
 Exception 
Error while loading file "/android1/arcanist/arcanist/src/lint/linter/xhpast/rules/ArcanistPHPCompatibilityXHPASTLinterRule.php": "continue" targeting switch is equivalent to "break". Did you mean to use "continue 2"?
(Run with `--trace` for a full exception trace.)

not sure if it auto updated on my host or what. Is there a way to post to phabricator without arc, maybe just git?

nickdesaulniers added a comment.EditedApr 2 2020, 6:06 PM

FWIW: this is the current state of the patch: https://github.com/ClangBuiltLinux/llvm-project/commit/21a14be632072013a2800f8747154ca1a271d3b6

I'll properly clean up the tests tomorrow. (Man can we plz move to github pull requests).

  • redo everything
This revision is now accepted and ready to land.Apr 2 2020, 7:12 PM
nickdesaulniers requested review of this revision.Apr 2 2020, 7:16 PM
nickdesaulniers retitled this revision from [BranchFolder] don't remove MBB's that have their address taken to [SelectionDAG] fix predeccessor list for INLINEASM_BRs' parent.
nickdesaulniers edited the summary of this revision. (Show Details)
nickdesaulniers added a reviewer: efriedma.
nickdesaulniers edited the summary of this revision. (Show Details)Apr 2 2020, 7:22 PM
void added a comment.Apr 2 2020, 7:35 PM

@void would you be opposed to me posting a patch first that preprocessed llvm/test/CodeGen/X86/callbr-asm-outputs.ll with llvm/utils/update_llc_test_checks.py?

I find updating these tests with lots of checks to be a PITA, and it make it easier to maintain this test in the future and see what changed easier with my change.

Yeah, go for it.

huh, so it seems that ./llvm/utils/update_llc_test_checks.py llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll doesn't actually do anything :( so I don't think I can preprocess it. I think the output needs to change; I don't understand how you write a test that update_llc_test_checks.py is happy with off of the bat.

Special talent. :-)

not sure if it auto updated on my host or what. Is there a way to post to phabricator without arc, maybe just git?

Yes. Just do something like git diff -U99999 and take the diff and add it via the "Update diff" link at the top right.

nickdesaulniers added inline comments.
llvm/lib/CodeGen/MachineVerifier.cpp
893 ↗(On Diff #254676)

@efriedma , so I'm hesitant about this added verification check. As written, it's checking each operand of INLINEASM_BR that's a blockaddress, which again may or may not be the indirect target of asm goto (it could just be a vanilla input).

This goes back to our discussion in https://reviews.llvm.org/D64101#1569218 with @fhahn and @hfinkel .

At this point, I think I agree with your previous comment on this review (https://reviews.llvm.org/D76961#1950958), ie.

Even if there isn't any way to tell the difference from the INLINEASM_BR instruction at the moment, we could change that.

I think we should just make callbr assume that any blockaddress operand is a possible successor. The implication then is that it might then even be safe to use a vanilla asm statement (rather than asm goto), with labels as inputs, and get the same functionality as asm goto. asm goto is kind of just a distinct syntax for what feels like could have been fixes to codegen for the original syntax.

I don't really want to discuss such a change here, but it's something for us to think more about and maybe discuss at the next LLVM developer meeting, or on the mailing list.

I'm more curious about @efriedma 's thoughts on how to proceed:

  1. drop this verification pass, defer decisions to the future, or
  2. keep this pass, and add the additional check I commented earlier in this thread.

I like the idea of 2, but I'm concerned that if we add a verification pass, then someone tries to use blockaddress operands in callbr that are not indirect successors, we probably don't set the successor list correctly, and this added verification check will be too aggressive and error.

So maybe it's better to drop the the pass, and revisit this another day, when were confident in the direct and changes we'd like to make.

Thoughts?

llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
38 ↗(On Diff #254676)

@void I recognize the irony of preprocessing the other test I modified, in child revision https://reviews.llvm.org/D77356, then adding a new test that doesn't do the same formatting.

I tend to write comments in my test of what's being tested that way in the future when someone else needs to change my test, they have some sense of what's important to the test and what's not. That way they feel more empowered to change them.

Should I:

  1. preprocess the test with update_llc_test_checks.py and then
  2. remove these comments

I don't mind deleting anything, and trust your judgement, but I wanted to highlight this irony during code review.

nickdesaulniers retitled this revision from [SelectionDAG] fix predeccessor list for INLINEASM_BRs' parent to [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent.Apr 3 2020, 12:18 PM
nickdesaulniers added inline comments.
llvm/test/CodeGen/X86/callbr-asm-outputs-pred-succ.ll
25 ↗(On Diff #254676)

s/predecessed/proceeded by/

35 ↗(On Diff #254676)

ditto

efriedma added inline comments.Apr 3 2020, 1:04 PM
llvm/lib/CodeGen/MachineVerifier.cpp
893 ↗(On Diff #254676)

I'm fine with just dropping the verifier check. Jump table jumps have implicit targets in a similar way, so unverifiable CFGs aren't really anything novel.


Way off-topic response to the general design of the feature:

At the IR level, at this point, I think that using blockaddress for callbr was a mistake. The list of valid destinations is already listed in the "indirect destinations" list of the callbr; we could just allow asm strings to refer to that label list directly. This would be a little less flexible from a source language perspective, since you couldn't pass the address of a label as a register operand, but at the C level I don't think anyone is actually passing &&LABEL to asm goto in practice.

asm goto is kind of just a distinct syntax for what feels like could have been fixes to codegen for the original syntax.

If the syntax isn't distinct, we would be forced to impose indirect goto's "jump-into-scope" restrictions on all inline asm statements. Probably this would break someone's code.

I guess if we were designing the feature from scratch, we could allow asm goto blocks to jump to any block in the function whose address is taken. But that isn't any more powerful than the feature we currently implement.

897 ↗(On Diff #254676)

is_contained?

nickdesaulniers marked 5 inline comments as done.
  • drop verifier checks, fix typos
nickdesaulniers edited the summary of this revision. (Show Details)Apr 3 2020, 3:04 PM
nickdesaulniers edited the summary of this revision. (Show Details)
efriedma added inline comments.Apr 3 2020, 3:46 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2950 ↗(On Diff #254923)

Maybe it would be better to move this into SelectionDAGISel::FinishBasicBlock? All the other similar code seems to be there.

If we do keep it here, better to fix the comment to something like "SelectionDAGISel::FinishBasicBlock will add PHI operands for the successors of the fallthrough block. Here, we add PHI operands for the successors of the INLINEASM_BR block itself". (It's not obvious at first glance what "first" and "last" refer to.)

nickdesaulniers marked 2 inline comments as done.
  • update comment- update comment- update comment- update comment- update comment- update comment- update comment- update comment- update comment
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2950 ↗(On Diff #254923)

The relevant call chain look like:

SelectionDAGISel::SelectAllBasicBlocks
  SelectBasicBlock
    CodeGenAndEmitDAG
      ScheduleDAGSDNodes::EmitSchedule
      SelectionDAGBuilder::UpdateSplitBlock
  FinishBasicBlock

The issue is that ScheduleDAGSDNodes::EmitSchedule splits FuncInfo->MBB (the current MachineBasicBlock we're emitting a schedule for), then the return value is used to update FuncInfo->MBB in CodeGenAndEmitDAG, such that later in FinishBasicBlock we no longer have a reference to the block referred to as First in SelectionDAGBuilder::UpdateSplitBlock. SelectionDAGBuilder::UpdateSplitBlock is only called when FuncInfo->MBB is split via call to ScheduleDAGSDNodes::EmitSchedule, so I'm certain it make sense to perform special clean up there. In fact, the comment in SelectionDAGISel::CodeGenAndEmitDAG says:

// If the block was split, make sure we update any references that are used to
// update PHI nodes later on.
if (FirstMBB != LastMBB)
  SDB->UpdateSplitBlock(FirstMBB, LastMBB);

so again it makes sense for us to update the PHI nodes given a block split in UpdateSplitBlock.

I've updated the comment.

  • add period to end of comment.
efriedma accepted this revision.Apr 3 2020, 5:25 PM

LGTM

This revision is now accepted and ready to land.Apr 3 2020, 5:25 PM
This revision was automatically updated to reflect the committed changes.