This is an archive of the discontinued LLVM Phabricator instance.

MachineBasicBlock::updateTerminator now requires an explicit layout successor.
ClosedPublic

Authored by jyknight on May 7 2020, 3:24 PM.

Details

Summary

Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky propspect, given the
existence of successors that occur mid-block (such as invoke, and
potentially in the future, callbr).

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Diff Detail

Event Timeline

jyknight created this revision.May 7 2020, 3:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2020, 3:24 PM

(such as invoke, and potentially in the future, callbr).

Is the endgame here that we relax the restriction that updateTerminator() requires an analyzable branch?

llvm/include/llvm/CodeGen/MachineBasicBlock.h
519

fallthrough.

llvm/lib/CodeGen/TailDuplicator.cpp
955

Is this related to the rest of the patch? I don't see any obvious interaction.

jyknight marked 2 inline comments as done.May 12 2020, 1:27 PM

Is the endgame here that we relax the restriction that updateTerminator() requires an analyzable branch?

That wasn't my goal. My goal is to remove the assumption that it's possible to reconstruct which branches should be present at the end of the block, based purely on CFG successors. Once upon a time, that may have been a simple prospect -- e.g. if there's a conditional branch, there should be 2 CFG successors, and at least one must have an actual branch instruction, so you just add/remove branches to the other one, depending on whether it's the fallthrough block. But then we added exception handling, and the EHPad successors mess up that simplicity. But, we could at least filter out EHPad blocks, because those can't also be the target of a normal jump.

However, I'm now proposing to make inlineasm_br not be a terminator instruction, which means the block can have extra successors in a new way. And unlike the EHPad blocks, the targets of an INLINEASM_BR may _also_ be used as normal jump/fallthroughs of another block.

I split this out as a separate review, because I think it's actually better that this is less magical and more explicit, regardless of whether we decide to make that INLINEASM_BR change.

llvm/lib/CodeGen/TailDuplicator.cpp
955

This code previously left the block in a bogus state, where the new CFG was not properly reflected in the instructions in the block (e.g. the machine instructions could fallthrough to the next block, while the CFG said it should actually jump somewhere else).

In the past, other future calls to updateTerminator managed to reconstruct the desired state of things, and insert the correct jump, based on the CFG successor. That no longer happens magically, so this code now must ensure that it has inserted a jump at the end of the block if it intends there to be one.

Okay, that explanation makes sense.

llvm/lib/CodeGen/MachineBasicBlock.cpp
585

"or the end of the block is unreachable" makes me a little worried. If I'm understanding correctly, if do somehow end up in that situation, the layout successor won't actually be a successor of the BasicBlock, and assert(isSuccessor(PreviousLayoutSuccessor)) will fail.

llvm/lib/CodeGen/MachineBlockPlacement.cpp
2729

Commented out code?

Are you planning to land this or D79793 is the alternative approach?

jyknight marked 3 inline comments as done.May 13 2020, 9:42 AM

Are you planning to land this or D79793 is the alternative approach?

They're not alternatives, I plan to land both.

llvm/lib/CodeGen/MachineBasicBlock.cpp
585

Um, yup. I think you've identified a serious issue.

Identifying whether the instruction stream represents a fallthrough is another place where we look at successor list to help figure out what the instructions mean. That's something else I should fix -- we currently don't actually represent the tail of the block being unreachable in the instruction stream. IMO, this is a mistake, and we ought to have an UNREACHABLE_MARKER pseudo-inst to represent this situation, and stop depending on querying the successors list to figure this out.

For the purposes of this patch and making incremental progress, I believe it's sufficient to just delete the assert and go with if(!PreviousLayoutSuccessor || !isSuccessor(PreviousLayoutSuccessor)) return;. That won't fully achieve my goal, but I'll work on the above idea coupled with a cleanup of all the fallthrough handling, here and everywhere else, as a separate change.

efriedma added inline comments.May 13 2020, 12:19 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
585

Yes, I think that works.

aheejin added inline comments.May 13 2020, 5:10 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
626

Does this new patch handle when a given layout successor is an EH pad? I might be mistaken, but I think that case should be excluded. But while this code is deleted, I can't find a similar counterpart in the new CL.

llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp
168

Here for example.. this seems to takes the next layout block as a fallthrough, regardless of whether the layout successor is an EH pad or not... No?

efriedma added inline comments.May 13 2020, 5:47 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
626

There are three possibilities here:

  1. The end of the block is unreachable. This is case we discussed in the other comment.
  2. We have fallthrough. The original layout successor can't be an EHPad because it's illegal to fall through into an EHPad.
  3. We don't have fallthrough. In this case, the original layout successor is irrelevant.
efriedma added inline comments.May 13 2020, 5:50 PM
llvm/lib/CodeGen/MachineBasicBlock.cpp
585

Hmm. Actually, isSuccessor() isn't sufficient, I think. Consider the case where the end of the block is unreachable, and the layout successor is an EHPad.

There are many places in this patch that just gives the layout successor as an argument to updateTerminator, which does not consider EH pad possibility. I think the main reason we enumerated the successor iterator is to filter out EH pads, and I'm not sure if deleting that and delegating the task of giving the right layout successor to the caller side is a good idea, because each caller has to evaluate that separately.
Also I'm not very sure PreviousLayoutSuccessor stands for; updateTerminator does not change layout (= order of BBs), so the layout successor should stay the same before and after calling updateTerminator. Depending on branches, after calling updateTerminator, we may not end up falling through to the PreviousLayoutSuccessor variable.

llvm/include/llvm/CodeGen/MachineBasicBlock.h
523

I'm not very sure PreviousLayoutSuccessor stands for. updateTerminator does not change layout (= order of BBs), so the layout successor should stay the same before and after calling updateTerminator. Depending on branches created, after calling updateTerminator, we may not end up falling through to the PreviousLayoutSuccessor variable, but in that case it still succeeds MBB in layout. PreviousLayoutSuccessor, to me, sounds like the layout successor will change after calling this function. How about just LayoutSuccessor or something..?

llvm/lib/CodeGen/MachineBasicBlock.cpp
626

I'm not sure about case 2; the reason is here: https://reviews.llvm.org/D79605#2035699

jyknight marked 3 inline comments as done.May 15 2020, 5:21 PM
jyknight added inline comments.
llvm/include/llvm/CodeGen/MachineBasicBlock.h
523

In some (but not all) of the callers, the current actual block ordering of the function has been modified _before_ calling this function. Therefore, the argument passed as "PreviousLayoutSuccessor" is no longer the actual, current, layout successor, at the time of this call.

If there is currently a fallthrough at the end of the block, a branch may need to be inserted, and the target of that branch is PreviousLayoutSuccessor.

llvm/lib/CodeGen/MachineBasicBlock.cpp
585

Agreed.

626

The problem right now (discussed in the other comment) is that we cannot easily distinguish between actual fallthrough and unreachable block-end.

So, yes, right now we may end up in a situation that appears as though a fallthrough occurs into an EHPad, but, that's not actually possible. It's only because in that situation, the block-end must in fact have been unreachable.

jyknight updated this revision to Diff 264404.May 15 2020, 7:30 PM

Handle the case where the apparent fallthrough is not actually a
fallthrough, because the end of the block is unreachable.

efriedma accepted this revision.May 16 2020, 1:37 PM

LGTM with a couple minor comments.

llvm/lib/CodeGen/MachineBasicBlock.cpp
593

nit: please clang-format

622

Might as well assert(!PreviousLayoutSuccessor->isEHPad()).

This revision is now accepted and ready to land.May 16 2020, 1:37 PM
aheejin added inline comments.May 18 2020, 1:26 PM
llvm/include/llvm/CodeGen/MachineBasicBlock.h
523

Then the change in WebAssemblyCFGSort.cpp can be incorrect. What you gave as an argument there is the 'current' layout successor. The order change happens in the caller, SortBlocks(), and you have to track all changes there to figure out what the previous layout successor had been.

I'm still not sure if this approach, that pushing all the responsibilities of figuring out "previous layout successor" to the callers, is feasible or scalable. The order change can happen anywhere, sometimes not right before you call MachineBasicBlock::updateTerminator().

void added a comment.Jun 1 2020, 1:04 AM

Is this and D79793 ready for submission? :-)

void added a comment.Jun 3 2020, 1:36 PM

Is this and D79793 ready for submission? :-)

[Sorry to be a pest.] If there's anything I can do to help this and the related patch along, please let me know. It's fixes a major bug in an important feature. :-)

jyknight marked 3 inline comments as done.Jun 6 2020, 6:33 PM
jyknight added inline comments.
llvm/include/llvm/CodeGen/MachineBasicBlock.h
523

The change in WebAssemblyCFGSort.cpp is passing the original/correct successor, because it's looking it up based on the block id created in RenumberBlocks (at the top of SortBlocks), which is before the blocks are reordered.

This revision was automatically updated to reflect the committed changes.