This is an archive of the discontinued LLVM Phabricator instance.

[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets
ClosedPublic

Authored by nickdesaulniers on Jul 19 2022, 2:55 PM.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 2:55 PM
nickdesaulniers requested review of this revision.Jul 19 2022, 2:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 19 2022, 2:55 PM

Going to throw a -1 on this; I think we should instead revert https://reviews.llvm.org/D129997 for now. This patch needs the test case from https://reviews.llvm.org/D129997#3663878 added anyways.

I also think it might be something else going wrong in TailDuplicator::tailDuplicate that's messing up the predaccessor list.

nickdesaulniers planned changes to this revision.Jul 19 2022, 2:57 PM

TODO: I should review the intent of llvm/test/CodeGen/X86/tail-dup-asm-goto.ll.

  • rebase, restore -filetype=obj tests, add previously crashing test case
nickdesaulniers edited the summary of this revision. (Show Details)Aug 23 2022, 4:54 PM
nickdesaulniers planned changes to this revision.Aug 23 2022, 4:56 PM
nickdesaulniers added inline comments.
llvm/test/CodeGen/X86/tail-dup-asm-goto.ll
74

I should test that this crashes prior to my change, since the reproducer was found on ppc, though I don't think the issue is target specific. If so, I will pre-commit the additional test, and rebase this so we can better see the resulting changes to the test.

llvm/test/CodeGen/X86/tail-dup-asm-goto.ll
74

Cool; yeah, after I rebase I should add -verify-machineinstrs to this test; it now asserts for this addition with the assertion I added recently in D130290.

  • remove fixme

Sorry for the churn/delays here. This is now ready for review (as is the parent patch D132609 and child patch D129997).

efriedma added inline comments.Aug 25 2022, 2:12 PM
llvm/lib/CodeGen/TailDuplicator.cpp
801

If I'm understanding correctly, the issue here is that PredBB might contain an INLINEASM_BR. We're checking TailBB->isInlineAsmBrIndirectTarget because we don't have any convenient way to check the relevant property of PredBB. This makes the code in callbr-asm-outputs.ll, but we can't easily avoid it. Is that correct?

Please add a comment to explain that. And maybe we should consider adding some more direct way to check if a block contains an INLINEASM_BR.

nickdesaulniers marked an inline comment as done.
nickdesaulniers marked an inline comment as not done.
  • fix typo in comment. :set spell
This revision is now accepted and ready to land.Aug 31 2022, 10:13 AM
mingmingl added inline comments.Aug 31 2022, 11:26 AM
llvm/lib/CodeGen/TailDuplicator.cpp
804–805

If I'm reading correctly, the 2nd condition is "the edge is from the indirect target list && TailBB is a fall-through block of PredBB". Correct me if I'm reading this wrong.

How is a fall-through TailBB different from a non-fall-through TailBB when it comes to corrupted successor/predecessor list? (This is purely for my education and thanks for any enlightening point!)

nickdesaulniers added inline comments.
llvm/lib/CodeGen/TailDuplicator.cpp
804–805

If I'm reading correctly, the 2nd condition is "the edge is from the indirect target list && TailBB is a fall-through block of PredBB". Correct me if I'm reading this wrong.

That's correct. If TailBB is a fallthrough block of PredBB, there is an edge from PredBB to TailBB. If PredBB contains an INLINEASM_BR, and TailBB is in the indirect target list, there is also an edge from PredBB to TailBB. TailDuplicator::tailDuplicate seems to make the assumption that MachineBasicBlocks can only have one edge from a predecessor to a successor, but that's not necessarily true, as my added test case demonstrates. But maybe there is some invariant that MachineBasicBlocks cannot have more than one edge between the same blocks? Maybe @MatzeB would know if such an invariant exists? While INLINEASM_BR is not a terminating instruction (today, it was, and might become so again), I can imagine a jcc .Ltmp1; jmp .Ltmp1; also getting messed up by TailDuplicator::tailDuplicate here (since there would also be 2 edges from pred to succ in that case as well).

How is a fall-through TailBB different from a non-fall-through TailBB when it comes to corrupted successor/predecessor list?

INLINEASM_BR has an explicit fallthough MBB. This may differ from what follows immediately after that MCinstr; there may be an explicit branch to another MBB, or no instruction (which I'd also call fallthrough, though that's implicit fallthrough as compared to INLINEASM_BR's explicit fallthrough).

This patch addresses a case of successor/predecessor list corruption stemming from TailDuplicator::tailDuplicate when TailBB is BOTH the explicit fallthrough of an INLINEASM_BR AND indirect target simultaneously.

This is purely for my education and thanks for any enlightening point!

Always! Never hesitate to ask; if there's a way I could modify the wording of the comment or commit message to make anything clearer; I'm open to suggestions. Otherwise I'll land this when I get back from lunch in ~1hr. Thanks for taking the time to look over my patch.

This revision was landed with ongoing or failed builds.Aug 31 2022, 1:07 PM
This revision was automatically updated to reflect the committed changes.
mingmingl added inline comments.Aug 31 2022, 1:36 PM
llvm/lib/CodeGen/TailDuplicator.cpp
804–805

TailDuplicator::tailDuplicate seems to make the assumption that MachineBasicBlocks can only have one edge from a predecessor to a successor, but that's not necessarily true, as my added test case demonstrates.

Ah I was also assuming there is only one edge from Pred to each one of its Succ. This is helpful!

This patch addresses a case of successor/predecessor list corruption stemming from TailDuplicator::tailDuplicate when TailBB is BOTH the explicit fallthrough of an INLINEASM_BR AND indirect target simultaneously.

To confirm my understanding, when TailBB is both the explicit fall through and implicit fall-through, tail-duplicate eliminate both edges (from PredBB to TailBB, and from INLINEASM_BR to TailBB) while INLINEASM_BR edge should be kept, is that correct? (if yes, I think I understand the issue better!)

I'm not a fan of terms, but just to clarify a little bit, does fall-through really mean the edge (that connects basic blocks) here? Ask since I originally thought fall-through indicates layout (e.g., TailBB is placed right after PredBB in the generated machine function) -> in that sense, whether TailBB is placed right after PredBB or not (i.e., whether TailBB is a layout fall-through or not), TailBB shouldn't be tail duplicated (otherwise, both edges are deleted and issue happen).

if there's a way I could modify the wording of the comment or commit message to make anything clearer; I'm open to suggestions.

Your explanation is much appreciated! Thanks for taking the time to reply!

It took me sometime to wrap my head and figure out what i missed, so don't bother with wording it again.

llvm/lib/CodeGen/TailDuplicator.cpp
804–805

Oh, I'm mixing up the MCInstr (lower level IR) with the Instruction (higher level IR). INLINEASM_BR (MCInstr) does not have an explicit fallthrough. callbr (Instruction) does.

when TailBB is both the explicit fall through and implicit fall-through, tail-duplicate eliminate both edges (from PredBB to TailBB, and from INLINEASM_BR to TailBB) while INLINEASM_BR edge should be kept, is that correct?

That wasn't the specific case I was referring to, but I think the answer for that case would be still be "yes."

i.e.

%bb.4:
INLINEASM_BR &"", 1 /* sideeffect attdialect */, 13 /* imm */, %bb.5
JMP_1 %bb.5

so there's 2 edges from Pred (%bb.4) to Succ (%bb.5). It's probably ok for taildup to subsume the contents of %bb.5 into %bb.4, removing the JMP, but it should not remove %bb.5 from the successor list, because INLINEASM_BR still exists and hasn't been updated. I'm not sure yet if that transform is even profitable or not.

does fall-through really mean the edge (that connects basic blocks) here?

Yes, I believe MachineBasicBlocks don't need explicit terminators and may implicitly fallthrough. I might be mis-remembering though.

mingmingl added inline comments.Aug 31 2022, 3:04 PM
llvm/lib/CodeGen/TailDuplicator.cpp
804–805

so there's 2 edges from Pred (%bb.4) to Succ (%bb.5). It's probably ok for taildup to subsume the contents of %bb.5 into %bb.4, removing the JMP, but it should not remove %bb.5 from the successor list

Got it. Thanks again for the following up!! My previous confusion is from tie-ing 'fall-through' to block layout.

Re if duplicating is profitable or not, TailDuplicator::canTailDuplicate is used by both MachineBlockPlacement.cpp and TailDuplication.cpp. MBP pass queries this method to see if 'TailBB' can be placed into all its (unplaced) predecessor blocks when placing all blocks in a chain, so an answer 'no' (from PredBB1) (when it's performance neutral in PredBB1 -> TailBB edge, and correct to do so) may prevent 'TailBB' being duplicated into 'PredBB2' (if that reduces dynamic branches taken).

Yes, I believe MachineBasicBlocks don't need explicit terminators and may implicitly fallthrough.

This is true (e.g., anallyzeBranch depends on the implicit fall-through)