Changeset View
Standalone View
llvm/lib/CodeGen/TailDuplicator.cpp
Show First 20 Lines • Show All 792 Lines • ▼ Show 20 Lines | if (PredBB->succ_size() > 1) | ||||
return false; | return false; | ||||
MachineBasicBlock *PredTBB = nullptr, *PredFBB = nullptr; | MachineBasicBlock *PredTBB = nullptr, *PredFBB = nullptr; | ||||
SmallVector<MachineOperand, 4> PredCond; | SmallVector<MachineOperand, 4> PredCond; | ||||
if (TII->analyzeBranch(*PredBB, PredTBB, PredFBB, PredCond)) | if (TII->analyzeBranch(*PredBB, PredTBB, PredFBB, PredCond)) | ||||
return false; | return false; | ||||
if (!PredCond.empty()) | if (!PredCond.empty()) | ||||
return false; | return false; | ||||
if (TailBB->isInlineAsmBrIndirectTarget()) | |||||
efriedma: If I'm understanding correctly, the issue here is that PredBB might contain an INLINEASM_BR. | |||||
return false; | |||||
return true; | return true; | ||||
} | } | ||||
Not Done ReplyInline ActionsIf I'm reading correctly, the 2nd condition is "the edge is from the indirect target list && TailBB is a fall-through block of PredBB". Correct me if I'm reading this wrong. How is a fall-through TailBB different from a non-fall-through TailBB when it comes to corrupted successor/predecessor list? (This is purely for my education and thanks for any enlightening point!) mingmingl: If I'm reading correctly, the 2nd condition is "the edge is from the indirect target list &&… | |||||
Not Done ReplyInline Actions
That's correct. If TailBB is a fallthrough block of PredBB, there is an edge from PredBB to TailBB. If PredBB contains an INLINEASM_BR, and TailBB is in the indirect target list, there is also an edge from PredBB to TailBB. TailDuplicator::tailDuplicate seems to make the assumption that MachineBasicBlocks can only have one edge from a predecessor to a successor, but that's not necessarily true, as my added test case demonstrates. But maybe there is some invariant that MachineBasicBlocks cannot have more than one edge between the same blocks? Maybe @MatzeB would know if such an invariant exists? While INLINEASM_BR is not a terminating instruction (today, it was, and might become so again), I can imagine a jcc .Ltmp1; jmp .Ltmp1; also getting messed up by TailDuplicator::tailDuplicate here (since there would also be 2 edges from pred to succ in that case as well).
INLINEASM_BR has an explicit fallthough MBB. This may differ from what follows immediately after that MCinstr; there may be an explicit branch to another MBB, or no instruction (which I'd also call fallthrough, though that's implicit fallthrough as compared to INLINEASM_BR's explicit fallthrough). This patch addresses a case of successor/predecessor list corruption stemming from TailDuplicator::tailDuplicate when TailBB is BOTH the explicit fallthrough of an INLINEASM_BR AND indirect target simultaneously.
Always! Never hesitate to ask; if there's a way I could modify the wording of the comment or commit message to make anything clearer; I'm open to suggestions. Otherwise I'll land this when I get back from lunch in ~1hr. Thanks for taking the time to look over my patch. nickdesaulniers: > If I'm reading correctly, the 2nd condition is "the edge is from the indirect target list &&… | |||||
Ah I was also assuming there is only one edge from Pred to each one of its Succ. This is helpful!
To confirm my understanding, when TailBB is both the explicit fall through and implicit fall-through, tail-duplicate eliminate both edges (from PredBB to TailBB, and from INLINEASM_BR to TailBB) while INLINEASM_BR edge should be kept, is that correct? (if yes, I think I understand the issue better!) I'm not a fan of terms, but just to clarify a little bit, does fall-through really mean the edge (that connects basic blocks) here? Ask since I originally thought fall-through indicates layout (e.g., TailBB is placed right after PredBB in the generated machine function) -> in that sense, whether TailBB is placed right after PredBB or not (i.e., whether TailBB is a layout fall-through or not), TailBB shouldn't be tail duplicated (otherwise, both edges are deleted and issue happen).
Your explanation is much appreciated! Thanks for taking the time to reply! It took me sometime to wrap my head and figure out what i missed, so don't bother with wording it again. mingmingl: > TailDuplicator::tailDuplicate seems to make the assumption that MachineBasicBlocks can only… | |||||
Not Done ReplyInline ActionsOh, I'm mixing up the MCInstr (lower level IR) with the Instruction (higher level IR). INLINEASM_BR (MCInstr) does not have an explicit fallthrough. callbr (Instruction) does.
That wasn't the specific case I was referring to, but I think the answer for that case would be still be "yes." i.e. %bb.4: INLINEASM_BR &"", 1 /* sideeffect attdialect */, 13 /* imm */, %bb.5 JMP_1 %bb.5 so there's 2 edges from Pred (%bb.4) to Succ (%bb.5). It's probably ok for taildup to subsume the contents of %bb.5 into %bb.4, removing the JMP, but it should not remove %bb.5 from the successor list, because INLINEASM_BR still exists and hasn't been updated. I'm not sure yet if that transform is even profitable or not.
Yes, I believe MachineBasicBlocks don't need explicit terminators and may implicitly fallthrough. I might be mis-remembering though. nickdesaulniers: Oh, I'm mixing up the MCInstr (lower level IR) with the Instruction (higher level IR). | |||||
Not Done ReplyInline Actions
Got it. Thanks again for the following up!! My previous confusion is from tie-ing 'fall-through' to block layout. Re if duplicating is profitable or not, TailDuplicator::canTailDuplicate is used by both MachineBlockPlacement.cpp and TailDuplication.cpp. MBP pass queries this method to see if 'TailBB' can be placed into all its (unplaced) predecessor blocks when placing all blocks in a chain, so an answer 'no' (from PredBB1) (when it's performance neutral in PredBB1 -> TailBB edge, and correct to do so) may prevent 'TailBB' being duplicated into 'PredBB2' (if that reduces dynamic branches taken).
This is true (e.g., anallyzeBranch depends on the implicit fall-through) mingmingl: > so there's 2 edges from Pred (%bb.4) to Succ (%bb.5). It's probably ok for taildup to subsume… | |||||
/// If it is profitable, duplicate TailBB's contents in each | /// If it is profitable, duplicate TailBB's contents in each | ||||
/// of its predecessors. | /// of its predecessors. | ||||
/// \p IsSimple result of isSimpleBB | /// \p IsSimple result of isSimpleBB | ||||
/// \p TailBB Block to be duplicated. | /// \p TailBB Block to be duplicated. | ||||
/// \p ForcedLayoutPred When non-null, use this block as the layout predecessor | /// \p ForcedLayoutPred When non-null, use this block as the layout predecessor | ||||
/// instead of the previous block in MF's order. | /// instead of the previous block in MF's order. | ||||
/// \p TDBBs A vector to keep track of all blocks tail-duplicated | /// \p TDBBs A vector to keep track of all blocks tail-duplicated | ||||
/// into. | /// into. | ||||
▲ Show 20 Lines • Show All 248 Lines • Show Last 20 Lines |
If I'm understanding correctly, the issue here is that PredBB might contain an INLINEASM_BR. We're checking TailBB->isInlineAsmBrIndirectTarget because we don't have any convenient way to check the relevant property of PredBB. This makes the code in callbr-asm-outputs.ll, but we can't easily avoid it. Is that correct?
Please add a comment to explain that. And maybe we should consider adding some more direct way to check if a block contains an INLINEASM_BR.