This is an archive of the discontinued LLVM Phabricator instance.

[MachineSink] Fix CFG walk in clobber check (PR53990)
AbandonedPublic

Authored by nikic on Feb 22 2022, 8:09 AM.

Details

Reviewers
shchenz
qcolombet
Summary

MachineSink::hasStoreBetween() determines that the code is "straight-line" by checking that From dominates To and To post-dominates From. Then it does a DFS walk starting at To, but only considers MBBs post-dominated by From.

The critical part this misses, is while the code is, in a certain sense of the word, "straight-line", it may still be part of a loop. To give a simple example:

From ->  BB2
   \    /  A
    v  v   |
     To -> BB3

Here From dominates To, and To post-dominates From, and the current code would only check for a clobber in BB2, but not in BB3. (Unfortunately, I was not able to get MachineSink to trigger with a CFG that simple.)

What we actually want to determine is that no clobber is reverse-reachable from To (without going through From), so implement that, without the post-dominance check.

Fixes https://github.com/llvm/llvm-project/issues/53990.

Diff Detail

Event Timeline

nikic created this revision.Feb 22 2022, 8:09 AM
nikic requested review of this revision.Feb 22 2022, 8:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2022, 8:09 AM
jmorse added a subscriber: jmorse.Feb 22 2022, 8:17 AM

Can we check why the instruction is sunk from a shallower block From to a deeper block To? MachineSinking::isProfitableToSinkTo() should not allow this?

Can we check why the instruction is sunk from a shallower block From to a deeper block To? MachineSinking::isProfitableToSinkTo() should not allow this?

This happens because the loop is irreducible. The loop depth check is based on MachineLoopInfo, which only handles natural loops. So sinking into irreducible cycles is not prevented by the profitability check.

Can we check why the instruction is sunk from a shallower block From to a deeper block To? MachineSinking::isProfitableToSinkTo() should not allow this?

This happens because the loop is irreducible. The loop depth check is based on MachineLoopInfo, which only handles natural loops. So sinking into irreducible cycles is not prevented by the profitability check.

Is it possible to exclude this irreducible loop case in isProfitableToSinkTo and add some assertions/checks in hasStoreBetween? The posted patch may largely increase the compile time as now more blocks are checked. Compile-time issue happened before in https://reviews.llvm.org/D86864#2469289

nikic added a comment.Feb 23 2022, 2:17 AM

Can we check why the instruction is sunk from a shallower block From to a deeper block To? MachineSinking::isProfitableToSinkTo() should not allow this?

This happens because the loop is irreducible. The loop depth check is based on MachineLoopInfo, which only handles natural loops. So sinking into irreducible cycles is not prevented by the profitability check.

Is it possible to exclude this irreducible loop case in isProfitableToSinkTo and add some assertions/checks in hasStoreBetween?

It should be possible to do this by switching MachineSink from using MachineLoopInfo to MachineCycleInfo, which supports irreducible cycles. I think this allows a better profitability decision, but I'm not entirely sure that it would be a sufficient correctness condition, as irreducible cycles are DFS-order dependent. In any case, MachineCycleInfo is a new addition that is not actually used anywhere yet, so I don't think this would be appropriate for an LLVM 14 backport.

The posted patch may largely increase the compile time as now more blocks are checked. Compile-time issue happened before in https://reviews.llvm.org/D86864#2469289

This is still subject to the existing limits on the walks. If there are compile-time issues, then those limits are too large (which, looking at them now, may well be the case: the limits allow walking up to 20 * 2000 instructions, which does not seem like a sensible limit.)

shchenz added a comment.EditedFeb 24 2022, 12:56 AM

It should be possible to do this by switching MachineSink from using MachineLoopInfo to MachineCycleInfo, which supports irreducible cycles. I think this allows a better profitability decision, but I'm not entirely sure that it would be a sufficient correctness condition, as irreducible cycles are DFS-order dependent. In any case, MachineCycleInfo is a new addition that is not actually used anywhere yet, so I don't think this would be appropriate for an LLVM 14 backport.

Yes, I agree changing MachineLoopInfo to MachineCycleInfo would be a little risky for backport. Could we just simply mark it as not profitable in isProfitableToSinkTo if MBB or SuccToSinkTo is in a loop that contains IrreducibleCFG.

ShrinkWrap.cpp has an example check:

ReversePostOrderTraversal<MachineBasicBlock *> RPOT(&*MF.begin());
if (containsIrreducibleCFG<MachineBasicBlock *>(RPOT, *MLI)) {

For this case, IMO, not allowing the sink should be the right choice as block To is in a deeper loop? Apart from the load/store sink, we may sink some other instructions to a hot loop. And not sure other queries from MachineLoopInfo in this pass have the right behavior for an irreducible loop.

nikic added a comment.Feb 24 2022, 1:15 AM

It should be possible to do this by switching MachineSink from using MachineLoopInfo to MachineCycleInfo, which supports irreducible cycles. I think this allows a better profitability decision, but I'm not entirely sure that it would be a sufficient correctness condition, as irreducible cycles are DFS-order dependent. In any case, MachineCycleInfo is a new addition that is not actually used anywhere yet, so I don't think this would be appropriate for an LLVM 14 backport.

Yes, I agree changing MachineLoopInfo to MachineCycleInfo would be a little risky for backport. Could we just simply mark it as not profitable in isProfitableToSinkTo if MBB or SuccToSinkTo is in a loop that contains IrreducibleCFG.

ShrinkWrap.cpp has an example check:

ReversePostOrderTraversal<MachineBasicBlock *> RPOT(&*MF.begin());
if (containsIrreducibleCFG<MachineBasicBlock *>(RPOT, *MLI)) {

For this case, IMO, not allowing the sink should be the right choice as block To is in a deeper loop? Apart from the load/store sink, we may sink some other instructions to a hot loop. And not sure other queries from MachineLoopInfo in this pass have the right behavior for an irreducible loop.

This is possible, but without MachineCycleInfo, this would be a very crude check: We would completely disable sinking for any functions that contain irreducible control flow (I don't think containsIrreducibleCFG is legal to use on a sub-graph). This seems potentially problematic to me, because irreducible control-flow is more common in the machine back-end than the middle-end. Here's a quick test: https://gist.github.com/nikic/e45e179e9a84c18de86c041158d3067e

I'm generally okay with doing that though, excluding irreducible control-flow entirely is certainly the most conservative fix.

I've opened D120800 for the variant that disables sinking in the presence of irreducible control flow entirely.

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 1:27 AM
nikic abandoned this revision.Mar 2 2022, 9:35 AM

Abandoning this in favor of D120800.