This is an archive of the discontinued LLVM Phabricator instance.

[MachineOutliner] Don't outline sequences where x16/x17/nzcv are live across
ClosedPublic

Authored by paquette on Jun 1 2018, 2:31 PM.

Details

Summary

This is for bug 37573. It isn't safe to outline sequences of instructions where x16/x17/nzcv live across the sequence.

This teaches the outliner to check whether or not a specific canidate has x16/x17/nzcv live across it and discard the candidate in the case that that is true.

https://bugs.llvm.org/show_bug.cgi?id=37573

Diff Detail

Event Timeline

paquette created this revision.Jun 1 2018, 2:31 PM

It was mentioned on https://reviews.llvm.org/D47553 that outlining also needs special handling for atomic sequences (on AArch64, LDXR/STXR etc. can fail if there's an indirect branch between the two). I guess we don't model the exclusive lock as a register at the moment?

Looks fine otherwise.

lib/Target/AArch64/AArch64InstrInfo.cpp
4949

Please put the title of the document in question, instead of a link; links tend to go stale.

Oh, oops, there's still an issue: this checks whether x16 and x17 which are live after the outlined sequence, but not whether x16 and x17 are live into the outlined sequence.

paquette updated this revision to Diff 150781.Jun 11 2018, 10:00 AM

Updated patch to handle the live-in case. Updated the test to be more explicit as well.

There are now two LRUs maintained rather than one.

  • LRUIn: live ins to a candidate
  • LRUOut: live outs from a candidate

LRUIn handles the beginning of the block up to the beginning of the candidate. LRUOut handles the beginning of the candidate up to the end of the block.

We should probably use this functionality to allow outlining instructions which access/modify SP, in cases where we don't need to spill LR. But that's probably better to do as a followup; it could have some unexpected effects on the heuristics.

include/llvm/CodeGen/MachineOutliner.h
163

This could get expensive with a long basic block; I guess we can try to add some sort of cache later if it becomes an issue.

lib/Target/AArch64/AArch64InstrInfo.cpp
4977

We don't care if it's live out: we can ignore the ABI rules because we know the exact destination of the "bx lr".

We could save/restore x16/x17 in theory, instead of giving up, but probably not worth implementing.

5012

I guess there isn't any practical difference between LRUOut and LRUIn here because we don't outline instructions other than calls which read/write LR?

paquette added inline comments.Jun 12 2018, 9:21 AM
include/llvm/CodeGen/MachineOutliner.h
163

If it'd be possible to infer candidate liveness from the entire block's liveness, we could wrap the candidate's MBB in a struct like

struct CandidateMBB
{
    MachineBasicBlock *MBB;
    LiveRegUnits LRU;
    ...
};

Then we wouldn't risk recalculating liveness on one basic block over and over again in the case of, say, overlapping candidates.

lib/Target/AArch64/AArch64InstrInfo.cpp
4977

We don't care if it's live out: we can ignore the ABI rules because we know the exact destination of the "bx lr".

Oh, good, much simpler then. :)

We could save/restore x16/x17 in theory, instead of giving up, but probably not worth implementing.

Yeah, I think most outlined sequences are too short to make saving/restoring x16/x17 worth it. However, if it turned out that just *one* candidate required the save/restore, then we could possibly salvage the candidate and still outline. I haven't checked to see if that's a common case at all though.

5012

Yep. It's worth a comment explaining that though. LRUOut was effectively what was being used before for LR as well.

(It might be worth it to try allowing instructions that read/write LR in general though, and then discarding candidates that use them if they aren't part of a prospective tail call.)

paquette updated this revision to Diff 150976.Jun 12 2018, 10:00 AM
paquette marked 2 inline comments as done.
  • Added a comment for saving/restoring LR
  • Removed LRUOut, since it's not necessary here, and incurs unnecessary calculation
  • LRUIn -> LRU
paquette marked an inline comment as done.Jun 12 2018, 10:00 AM
efriedma added inline comments.Jun 12 2018, 1:59 PM
include/llvm/CodeGen/MachineOutliner.h
163

I don't think you can do anything useful at a per-block level, unless you want to try to compute some sort of approximation for long blocks. (I wouldn't suggest the approximation approach, though; probably just as easy to keep around one LiveRegUnits for every 100 instructions or so.)

lib/Target/AArch64/AArch64InstrInfo.cpp
4967

Outdated comment.

5011

ADRP doesn't use LR, as far as I know.

paquette updated this revision to Diff 151044.Jun 12 2018, 3:03 PM
paquette marked 2 inline comments as done.

Fixed comments.

Gerolf added a subscriber: Gerolf.Jun 13 2018, 2:25 PM
Gerolf added inline comments.
lib/Target/AArch64/AArch64InstrInfo.cpp
4957

Could this be abstracted behind a target-specific interface? Other platforms like x86 etc will have to take care of similar issues and an interface would make this explicit. This could be done in a follow-up commit though.

Ping! :)

lib/Target/AArch64/AArch64InstrInfo.cpp
4957

Yeah, I think it would be a good idea to have targets implement their own version of MachineOutliner.h in general. Then functions like this could be holed away in there. I'll do that in a follow-up.

efriedma added inline comments.Jun 21 2018, 2:10 PM
include/llvm/CodeGen/MachineOutliner.h
152

This isn't using stepBackward correctly... you have to start from the end of the block to get correct results. I'm surprised this somehow works with your testcases; you probably need a bit better test coverage.

paquette updated this revision to Diff 152547.Jun 22 2018, 3:16 PM

Fixed my mistake using LiveRegUnits. This just walks back across the entire MBB now, so that we know if a register is live across the candidate.

Also extended the test to handle some more edge-cases and complex behaviour, along with some explanation of what's tested.

efriedma added inline comments.Jun 22 2018, 4:10 PM
include/llvm/CodeGen/MachineOutliner.h
148

This is still not right; it's computing the values which are live at the beginning of the block, not the beginning of the outlined sequence.

You need to start at MBB->rbegin() (the end of the block), and end at ++(MachineBasicBlock::reverse_iterator)front() (the beginning of the outlined sequence).

Alternatively, I guess you could conservatively use LiveRegUnits::accumulate.

paquette updated this revision to Diff 152716.Jun 25 2018, 10:08 AM

Updated diff to actually fix the LRU stuff (oops).

Also updated the test because the change makes a couple (safe) cases that worked with the (wrong) fix no longer work. Added a comment explaining what's going on to the test to keep note of the fact that there are a few more cases we can probably catch.

Your results weren't making any sense, so I started debugging a bit... it looks like somehow the mechanism for removing candidates is broken. I'm seeing candidates which should have been erased in AArch64InstrInfo::getOutlininingCandidateInfo, showing up later in MachineOutliner::outline. I haven't tracked down exactly where the bug is, though; any ideas?

include/llvm/CodeGen/MachineOutliner.h
151

The addLiveIns call shouldn't be necessary; it's causing the code to reject valid candidates.

Nevermind, the issue is my fault; should be (MachineBasicBlock::reverse_iterator)front(), not ++(MachineBasicBlock::reverse_iterator)front(). The resulting issue is tricky to debug because MachineInstrs are a circular linked list.

Nevermind, the issue is my fault; should be (MachineBasicBlock::reverse_iterator)front(), not ++(MachineBasicBlock::reverse_iterator)front(). The resulting issue is tricky to debug because MachineInstrs are a circular linked list.

Ah, okay, I'll update the patch with that in a second.

Your results weren't making any sense, so I started debugging a bit... it looks like somehow the mechanism for removing candidates is broken.

Even though this wasn't the issue, I think that I ought to go back and refactor/rewrite the candidate removal mechanism anyway. It's pretty messy, and clearly a source of confusion. I'll start poking at that.

paquette updated this revision to Diff 153118.Jun 27 2018, 10:22 AM
paquette marked an inline comment as done.

Updated diff.

  • ++(MachineBasicBlock::reverse_iterator)front()->(MachineBasicBlock::reverse_iterator)front()
  • Removed addLiveIns from initLRU in the Candidate class
  • Removed FIXME case from the test, because after the fix, this now works correctly
This revision is now accepted and ready to land.Jun 27 2018, 10:41 AM
paquette closed this revision.Jun 27 2018, 10:49 AM

Thanks!

Committed in r335758 (https://reviews.llvm.org/rL335758).