Only calling getPristineRegs() is not sufficient, considering shrink wrapping puts
registers not saved in certain blocks. Use explicit isLiveIn instead.
This fixes pr32292.
Differential D31188
[AntiDepBreaker] Use liveins as well in StartBlock timshen on Mar 21 2017, 6:52 AM. Authored by
Details
Diff Detail
Event TimelineComment Actions About the test: I tried to reduce the test, but as soon as the complexity gets smaller (not by much), the symptom disappears. I deliberately didn't shrink the test, and posted here to see if you think it's Ok to keep. Comment Actions I see only 5 uses of getPristineRegs, two you're removing here from the anti-dep breakers, one in MachineVerifier, and two in asserts in ARM frame lowering. Should we remove it completely? Update it with the logic you're adding here?
Comment Actions As discussed in the bug, I'll preserve the use of getPristineRegs() in this patch. And many of your comments don't apply anymore. :)
Comment Actions Hi Tim, LGTM. I expect a .mir to be more robust against changes. Cheers, Comment Actions Hi Quentin, Oh you mean turn this IR test into a mir test? I don't know how to use mir to reduce the test for even more (I used bugpoint on IR test). I can turn the test to a mir test. Comment Actions
Yes. Use -stop-before <ExeDepPassDEBUG_TYPE> to produce the .mir input. Comment Actions Committed thusly: echristo@athyra ~/s/llvm> git svn dcommit and... here: echristo@athyra ~/s/llvm> git svn dcommit *sigh* Comment Actions FWIW: I am pretty sure "getPristineRegs is not accurately considering shrink wrapping puts registers not saved in certain blocks." as mentioned in the commit message is not true! getPristineRegs() doesn't have to do that! I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied). Comment Actions Totally happy to pull it back if you'd like, was just going on what Quentin said to do. -eric Comment Actions Sounds fair, I fixed the description.
This is based on my inderstanding on "ideal liveins()" I mentioned in the bug: /* Returns all live-ins, including the unspilled callee-saved registers. */ iterator_range<concat_iterator<...>> liveins(); The fix turns "Pristine.test(Reg)" into "Pristine.test(Reg) || BB.IsLiveIn(Reg)" as a refinement (to match the ideal definition of liveins()). It means that not only those callee-saved registers (we are iterating over only them) who don't have spill slots are considered, but also those who “have spill slots but not spilled”. Comment Actions Also, the new behavior matches the intention in the comment: In non-return this is any callee-saved register that is not saved in the prolog. To be fair I don't yet understand why it thinks this is the right thing to do, but I trust that the intention is accurrate, only the code was not. Comment Actions
Pristine.test(Reg) || BB.IsLiveIn(Reg) would make a lot of sense when searching for live-ins. But the comments around it indicate it is actually searching for live-outs? For live-outs you need to:
(you may referer to LivePhysRegs::addLiveOuts() / LivePHysRegs::addLiveIns() to see an alternative implementation of this). From reading the code here the AntiDepBreaker is already doing the right thing. I only read the code though, and did not actually run/play with it so I may have missed something. Comment Actions Not for all live-outs, but only for "live-out callee-saved registers". The code wants to do something on all non-spilled callee-saved registers that are live-outs of the current block. Without considering shrink wrapping, I picture that the correct behavior should be:
The behavior was matching my picture, only if all epilogue blocks are returning blocks. After considering shrink wrapping, things change. Ideally:
Let's see what information we have (I haven't verified yet):
I think my fix is wrong about prologues and epilogues, but correct for other blocks. I think that your suggestion is the correct fix, that is looking at the live-ins of the successors. I'll do that. Do you think that we should keep this patch in the mean time? This patch makes some of the correct cases overly-conservative (but not wrong), and fixes some of the wrong cases. It doesn't make any correct cases wrong. Feel free to revert it if you don't think so. I'm on vacation until April 15th, so I'll work on it after that. :) Comment Actions Just found the time to look at this a bit: The thing that is wrong here is the live-in list of bb.2. bb.2 is a return block without any restore instructions in it so it must have the %x29 as a live-in. So it seems the actual problem lies before the DepBreakers here. I tried reproducing the earlier passes by copying the IR out of the mir test and re-running it on that, but I only always seem to get a version with a single return block without shrink wrapping... Comment Actions Good catch! After some investigation, I found that line: https://github.com/llvm-mirror/llvm/blob/8c6e6605aa4b10ac5c3e3afc4e87361c2ac1ff9d/lib/CodeGen/BranchFolding.cpp#L391 computeLiveIns() doesn't add r29 and r30 into NewMBB's liveins. I believe that computeLiveIns is doing what it's intended to do (add liveins based on instructions), so we should change the caller side to propagate CSR live-ins.
Comment Actions I'm not sure how to do this - when splitting a block B to C and D, where C unconditionally jumps (or falls through) to D, how to compute D's liveins, including CSRs? B could be any block -
If the only information are the liveins(), it seems not easy to tell liveness at the split position. Do we want to use LiveIntervals in BranchFolding.cpp? Comment Actions Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this? Comment Actions You can hold off with this for now, somehow I broke some clang stage2 bots with that commit and I first have to investigate why. Comment Actions I re-landed the patch with a fix and it seems to be fine now. I also filed http://llvm.org/PR33182 to document a MachineVerifier shortcoming of catching a case like this. Comment Actions D32464 doesn't seem to fix this test case - I ran pass post-RA-sched on this test case, and got bb.2.if.then not having X29 as live-in. Can you take a look? Comment Actions The original testcase? The mir test in the repository already has the invalid livein lists from the earlier branch folding. |
Please do not use auto in context where the type is not immediately obvious. (A reader has to lookup that getFrameInfo() does return a MachineFrameInfo here, it is not immediately obvious). A few more of these below.