This is an archive of the discontinued LLVM Phabricator instance.

[AntiDepBreaker] Use liveins as well in StartBlock
AbandonedPublic

Authored by timshen on Mar 21 2017, 6:52 AM.

Details

Summary

Only calling getPristineRegs() is not sufficient, considering shrink wrapping puts
registers not saved in certain blocks. Use explicit isLiveIn instead.

This fixes pr32292.

Event Timeline

timshen created this revision.Mar 21 2017, 6:52 AM

About the test:

I tried to reduce the test, but as soon as the complexity gets smaller (not by much), the symptom disappears. I deliberately didn't shrink the test, and posted here to see if you think it's Ok to keep.

I see only 5 uses of getPristineRegs, two you're removing here from the anti-dep breakers, one in MachineVerifier, and two in asserts in ARM frame lowering. Should we remove it completely? Update it with the logic you're adding here?

llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
166

Should this return an Optional (or similar)? It seems unfortunate to check MF.getFrameInfo().isCalleeSavedInfoValid() here and also check MFI.isCalleeSavedInfoValid() in the if below.

timshen updated this revision to Diff 92488.Mar 21 2017, 7:31 AM

Don't do redundant check.

timshen added inline comments.Mar 21 2017, 7:32 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
166

Good catch! I turned the first isCalleeSavedInfoValid into an assert, since it's guarded by the second call.

MatzeB requested changes to this revision.Mar 21 2017, 8:46 AM
MatzeB added inline comments.
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
165

Please do not use auto in context where the type is not immediately obvious. (A reader has to lookup that getFrameInfo() does return a MachineFrameInfo here, it is not immediately obvious). A few more of these below.

175

I don't think this pass will (or should) be used before the prolog epilog inserter so isCalleeSavedInfoValid() should always be true and you could simply assert on it.

176

Adding isLiveIn() here looks wrong given that the code is already adding the live ins of the successors above (which is the way to compute the live-outs of this block). This should already be fine if the code is walking the block backwards afterwards.

Seeing this I would assume we walk the blocks backwards and starting with the live-outs so adding the live-ins would not make sense.

llvm/lib/CodeGen/CriticalAntiDepBreaker.cpp
83–84

see above.

This revision now requires changes to proceed.Mar 21 2017, 8:46 AM
MatzeB added inline comments.Mar 21 2017, 8:48 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
167–172

This looks like you are replacing the check for pristine regs with manually written code that does the same. Am I missing something?

timshen added inline comments.Mar 21 2017, 9:00 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
175

I didn't assert on this, only because of https://github.com/llvm-mirror/llvm/blob/6095a7948d32d712811e8c2ab2190acf8e66a8bc/lib/CodeGen/PrologEpilogInserter.cpp#L66

It appears that CSInfo is valid only if doSpillCalleeSavedRegs is executed.

Is it always the case where usesPhysRegsForPEI is necessary for post-RA scheduling?

176

Looking at https://github.com/llvm-project/llvm-project/blob/master/llvm/lib/CodeGen/PostRASchedulerList.cpp#L331

The blocks are just traversed without an explicit order.

I'll take a closer look at StartBlock. I was more or less confused by its intention.

timshen added inline comments.Mar 21 2017, 9:06 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
167–172

This is because of Quentin's comment in the bug:

I would just resort on live ins information, if at all possible.
If not possible I would work toward making that possible :).

I did it without fully understanding the big picture. :P

It'd be good to have you or someone explain the big picture. :)

MatzeB added inline comments.Mar 21 2017, 9:07 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
175

Ah, that is special behavior for webassembly which keeps using vregs long after register allocation is done. I don't think we should concern ourself with that here as this pass doesn't seem to be compatible with that anyway (a bunch of passes are not and webassembly just isn't using them).

timshen updated this revision to Diff 92510.Mar 21 2017, 10:20 AM
timshen edited edge metadata.

Preserve the use of getPristineRegs() for now.

As discussed in the bug, I'll preserve the use of getPristineRegs() in this patch. And many of your comments don't apply anymore. :)

qcolombet added inline comments.Mar 21 2017, 11:24 AM
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
167–172

Yeah what I was saying is that is we can use isLiveIn alone we should do that.
If we cannot, the long run should be to get rid of the pristine stuff (i.e., just resort on isLiveIn to convey the right information).
I don't expect this patch to fix that :).

llvm/test/CodeGen/PowerPC/pr32292.ll
2

Using a .mir test case you should be able to reduce the test even more.

2

Please use a more descriptive name for the filename. List the PR in the file if you want.

timshen marked an inline comment as done.Mar 22 2017, 12:19 AM
timshen added inline comments.
llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp
167–172

I see.

I was clear whether I need to do anything (e.g. remove this use site) or not. :)

timshen updated this revision to Diff 92612.Mar 22 2017, 3:17 AM
timshen marked an inline comment as done.

Reduce to a smaller test case.

timshen updated this revision to Diff 92615.Mar 22 2017, 3:35 AM

Rename the test.

timshen added inline comments.Mar 22 2017, 3:36 AM
llvm/test/CodeGen/PowerPC/pr32292.ll
2

I didn't use .mir, but got a much smaller test case.

2

Renamed.

timshen retitled this revision from [AntiDepBreaker] Do not use getPristineRegs for marking live registers. to [AntiDepBreaker] Use liveins as well in StartBlock.Mar 22 2017, 3:51 AM
qcolombet accepted this revision.Mar 27 2017, 10:01 AM

Hi Tim,

LGTM.
Out-of-curiosity, what is missing to use a .mir test instead of the .ll test?

I expect a .mir to be more robust against changes.

Cheers,
-Quentin

Hi Quentin,

Oh you mean turn this IR test into a mir test? I don't know how to use mir to reduce the test for even more (I used bugpoint on IR test).

I can turn the test to a mir test.

Oh you mean turn this IR test into a mir test?

Yes.

Use -stop-before <ExeDepPassDEBUG_TYPE> to produce the .mir input.
Then, you should be able to reproduce the problem with that input and -run-pass <ExeDepPassDEBUG_TYPE>

timshen updated this revision to Diff 93233.Mar 28 2017, 6:33 AM

Update the test to .mir.

Also fix a mir parser bug, parsing nulls as constants.

Committed thusly:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
M lib/AsmParser/LLParser.cpp
M lib/CodeGen/AggressiveAntiDepBreaker.cpp
M lib/CodeGen/CriticalAntiDepBreaker.cpp
Committed r299124

and... here:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
A test/CodeGen/PowerPC/pristine-and-livein.mir
Committed r299125

*sigh*

echristo accepted this revision.Mar 30 2017, 3:48 PM

Committed thusly:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
M lib/AsmParser/LLParser.cpp
M lib/CodeGen/AggressiveAntiDepBreaker.cpp
M lib/CodeGen/CriticalAntiDepBreaker.cpp
Committed r299124

and... here:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
A test/CodeGen/PowerPC/pristine-and-livein.mir
Committed r299125

*sigh*

FWIW: I am pretty sure "getPristineRegs is not accurately considering shrink wrapping puts registers not saved in certain blocks." as mentioned in the commit message is not true! getPristineRegs() doesn't have to do that!

I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied).

Totally happy to pull it back if you'd like, was just going on what Quentin said to do.

-eric

timshen added a comment.EditedApr 4 2017, 4:47 PM

Committed thusly:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
M lib/AsmParser/LLParser.cpp
M lib/CodeGen/AggressiveAntiDepBreaker.cpp
M lib/CodeGen/CriticalAntiDepBreaker.cpp
Committed r299124

and... here:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
A test/CodeGen/PowerPC/pristine-and-livein.mir
Committed r299125

*sigh*

FWIW: I am pretty sure "getPristineRegs is not accurately considering shrink wrapping puts registers not saved in certain blocks." as mentioned in the commit message is not true! getPristineRegs() doesn't have to do that!

Sounds fair, I fixed the description.

I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied).

This is based on my inderstanding on "ideal liveins()" I mentioned in the bug:

/* Returns all live-ins, including the unspilled callee-saved registers. */
iterator_range<concat_iterator<...>> liveins();

The fix turns "Pristine.test(Reg)" into "Pristine.test(Reg) || BB.IsLiveIn(Reg)" as a refinement (to match the ideal definition of liveins()).

It means that not only those callee-saved registers (we are iterating over only them) who don't have spill slots are considered, but also those who “have spill slots but not spilled”.

timshen edited the summary of this revision. (Show Details)Apr 4 2017, 4:56 PM
timshen added a comment.EditedApr 4 2017, 5:01 PM

Committed thusly:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
M lib/AsmParser/LLParser.cpp
M lib/CodeGen/AggressiveAntiDepBreaker.cpp
M lib/CodeGen/CriticalAntiDepBreaker.cpp
Committed r299124

and... here:

echristo@athyra ~/s/llvm> git svn dcommit
Committing to https://llvm.org/svn/llvm-project/llvm/trunk ...
A test/CodeGen/PowerPC/pristine-and-livein.mir
Committed r299125

*sigh*

FWIW: I am pretty sure "getPristineRegs is not accurately considering shrink wrapping puts registers not saved in certain blocks." as mentioned in the commit message is not true! getPristineRegs() doesn't have to do that!

I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied).

Also, the new behavior matches the intention in the comment:

In non-return this is any callee-saved register that is not saved in the prolog.

To be fair I don't yet understand why it thinks this is the right thing to do, but I trust that the intention is accurrate, only the code was not.

MatzeB added a comment.Apr 4 2017, 5:04 PM

I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied).

This is based on my inderstanding on "ideal liveins()" I mentioned in the bug:

/* Returns all live-ins, including the unspilled callee-saved registers. */
iterator_range<concat_iterator<...>> liveins();

The fix turns "Pristine.test(Reg)" into "Pristine.test(Reg) || BB.IsLiveIn(Reg)" as a refinement (to match the ideal definition of liveins()).

It means that not only those callee-saved registers (we are iterating over only them) who don't have spill slots are considered, but also those who “have spill slots but not spilled”.

Pristine.test(Reg) || BB.IsLiveIn(Reg) would make a lot of sense when searching for live-ins. But the comments around it indicate it is actually searching for live-outs?

For live-outs you need to:

  • Take the union of the successor live-ins + pristines
  • In case of the return block without successors take all callee saved register instead

(you may referer to LivePhysRegs::addLiveOuts() / LivePHysRegs::addLiveIns() to see an alternative implementation of this).

From reading the code here the AntiDepBreaker is already doing the right thing. I only read the code though, and did not actually run/play with it so I may have missed something.

I also don't understand the fix here: Adding live-in registers when the code here appears to be searching for live-out registers seem like an odd fix to me (I can see how it helps though as it will just mark more registers as occupied).

This is based on my inderstanding on "ideal liveins()" I mentioned in the bug:

/* Returns all live-ins, including the unspilled callee-saved registers. */
iterator_range<concat_iterator<...>> liveins();

The fix turns "Pristine.test(Reg)" into "Pristine.test(Reg) || BB.IsLiveIn(Reg)" as a refinement (to match the ideal definition of liveins()).

It means that not only those callee-saved registers (we are iterating over only them) who don't have spill slots are considered, but also those who “have spill slots but not spilled”.

Pristine.test(Reg) || BB.IsLiveIn(Reg) would make a lot of sense when searching for live-ins. But the comments around it indicate it is actually searching for live-outs?

Not for all live-outs, but only for "live-out callee-saved registers".

The code wants to do something on all non-spilled callee-saved registers that are live-outs of the current block. Without considering shrink wrapping, I picture that the correct behavior should be:

  1. Epilogue blocks and returning blocks have all callee-saved registers live-out. Handle all of them.
  2. Prologue blocks have all callee-saved registers that are spilled. Handle callee-saved registers that are pristine.
  3. For the rest of the blocks, a callee-saved register is either not live, or live-through. Handle live-in callee-saved registers.

The behavior was matching my picture, only if all epilogue blocks are returning blocks.

After considering shrink wrapping, things change. Ideally:

  1. All blocks before prologue, excluding prologue, have all callee-saved registers live-out.
  2. All blocks after epilogue, including epilogue, have all callee-saved registers live-out.
  3. For the rest of the blocks, a callee-saved register is either not live, or live-through; pristine registers are precisely live-out callee-saved registers.

Let's see what information we have (I haven't verified yet):

  1. For all blocks before prologue, including prologue, shrink wrapping marks all callee-saved registers live-in.
  2. For all blocks after epilogue, excluding epilogue, shrink wrapping marks all-callee-saved registers live-in.

I think my fix is wrong about prologues and epilogues, but correct for other blocks. I think that your suggestion is the correct fix, that is looking at the live-ins of the successors. I'll do that.

Do you think that we should keep this patch in the mean time? This patch makes some of the correct cases overly-conservative (but not wrong), and fixes some of the wrong cases. It doesn't make any correct cases wrong. Feel free to revert it if you don't think so.

I'm on vacation until April 15th, so I'll work on it after that. :)

Just found the time to look at this a bit:

The thing that is wrong here is the live-in list of bb.2. bb.2 is a return block without any restore instructions in it so it must have the %x29 as a live-in. So it seems the actual problem lies before the DepBreakers here.

I tried reproducing the earlier passes by copying the IR out of the mir test and re-running it on that, but I only always seem to get a version with a single return block without shrink wrapping...

Just found the time to look at this a bit:

The thing that is wrong here is the live-in list of bb.2. bb.2 is a return block without any restore instructions in it so it must have the %x29 as a live-in. So it seems the actual problem lies before the DepBreakers here.

Good catch!

After some investigation, I found that line:

https://github.com/llvm-mirror/llvm/blob/8c6e6605aa4b10ac5c3e3afc4e87361c2ac1ff9d/lib/CodeGen/BranchFolding.cpp#L391

computeLiveIns() doesn't add r29 and r30 into NewMBB's liveins. I believe that computeLiveIns is doing what it's intended to do (add liveins based on instructions), so we should change the caller side to propagate CSR live-ins.

I tried reproducing the earlier passes by copying the IR out of the mir test and re-running it on that, but I only always seem to get a version with a single return block without shrink wrapping...

so we should change the caller side to propagate CSR live-ins.

I'm not sure how to do this -

when splitting a block B to C and D, where C unconditionally jumps (or falls through) to D, how to compute D's liveins, including CSRs?

B could be any block -

  1. before save point, CSRs live-through.
  2. save point, CSRs live-in, dead-out.
  3. between save point and restore point, CSRs are considered normal registers.
  4. restore point, CSRs dead-in, live-out.
  5. after restore point, CSRs live-through.

If the only information are the liveins(), it seems not easy to tell liveness at the split position. Do we want to use LiveIntervals in BranchFolding.cpp?

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

This revision now requires changes to proceed.May 25 2017, 4:41 PM

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

You can hold off with this for now, somehow I broke some clang stage2 bots with that commit and I first have to investigate why.

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

You can hold off with this for now, somehow I broke some clang stage2 bots with that commit and I first have to investigate why.

I re-landed the patch with a fix and it seems to be fine now. I also filed http://llvm.org/PR33182 to document a MachineVerifier shortcoming of catching a case like this.

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

D32464 doesn't seem to fix this test case - I ran pass post-RA-sched on this test case, and got bb.2.if.then not having X29 as live-in. Can you take a look?

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

D32464 doesn't seem to fix this test case - I ran pass post-RA-sched on this test case, and got bb.2.if.then not having X29 as live-in. Can you take a look?

The original testcase? The mir test in the repository already has the invalid livein lists from the earlier branch folding.

Ok, now that r303938 / https://reviews.llvm.org/D32464 has landed could you please double check and then revert this?

D32464 doesn't seem to fix this test case - I ran pass post-RA-sched on this test case, and got bb.2.if.then not having X29 as live-in. Can you take a look?

The original testcase? The mir test in the repository already has the invalid livein lists from the earlier branch folding.

Yes you are right. Posted D33697 to revert.

timshen abandoned this revision.May 30 2017, 4:26 PM