This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Transitively propagate `unreachable` into predecessors
AbandonedPublic

Authored by lebedev.ri on Jul 2 2021, 11:39 AM.

Details

Summary

Currently we try to erase instructions only in the very block
that literally has unreachable terminator.
While this is conservatively correct, we can do better.

If the block consists only of said unreachable,
then it stops being a successor of it's predecessors,
and if for some predecessor that means it no longer has
any successors, then said predecessor earns an unreachable
terminator itself. And so on.

SimplifyCFG would do at least some of that,
but relying on that requires for the InstCombine pass
to be rerun after SimplifyCFG, which isn't optimal..

Diff Detail

Event Timeline

lebedev.ri created this revision.Jul 2 2021, 11:39 AM
lebedev.ri requested review of this revision.Jul 2 2021, 11:39 AM
nikic added a comment.Jul 3 2021, 1:18 AM

This is walking the edge of what is appropriate for InstCombine, so I'd like to understand why we want to try so hard to remove unreachable code here. As InstCombine does not introduce its own unreachable terminators, after recent changes to SimplifyCFG, running SimplifyCFG+InstCombine should produce an optimal result (as far as unreachable code is concerned). There is no need to iterate between them to reach a fixed point (again, as far as this transform is concerned).

InstCombine can introduce pseudo-unreachable code in the form of either the store null pattern, or branches on constants that render certain edges effectively dead. Handling those cases could cut down fixpoint iteration. Is extending towards such cases the plan here?

This is walking the edge of what is appropriate for InstCombine, so I'd like to understand why we want to try so hard to remove unreachable code here. As InstCombine does not introduce its own unreachable terminators, after recent changes to SimplifyCFG, running SimplifyCFG+InstCombine should produce an optimal result (as far as unreachable code is concerned). There is no need to iterate between them to reach a fixed point (again, as far as this transform is concerned).

I'm mainly trying to harden the handling of unreachable as much as possible before D104870,
in hope that some of the regressions caused by that patch are avoidable.

InstCombine can introduce pseudo-unreachable code in the form of either the store null pattern, or branches on constants that render certain edges effectively dead. Handling those cases could cut down fixpoint iteration. Is extending towards such cases the plan here?

Yes, dealing with that as much as reasonably possible in instcombine itself is something i'd like to see happen.

@spatel thoughts?

I haven't closely followed the related patches, so I'm probably not adding much here...
But if I run -simplifycfg on any of the tests with diffs here, they are completely zapped down to a single unreachable.
Can you add a test to demonstrate where we might fall through the cracks of both passes? (possibly a test for PhaseOrdering)

@spatel thoughts?

I haven't closely followed the related patches, so I'm probably not adding much here...
But if I run -simplifycfg on any of the tests with diffs here, they are completely zapped down to a single unreachable.
Can you add a test to demonstrate where we might fall through the cracks of both passes? (possibly a test for PhaseOrdering)

I'm not really sure if there is any pattern that fits your description.
The goal/question here is mostly: should we, the instcombine, be able to get rid of more known-dead code,
potentially allowing further folds, without deferring until the next time instcombine runs after simplifycfg runs after us?

@spatel thoughts?

I haven't closely followed the related patches, so I'm probably not adding much here...
But if I run -simplifycfg on any of the tests with diffs here, they are completely zapped down to a single unreachable.
Can you add a test to demonstrate where we might fall through the cracks of both passes? (possibly a test for PhaseOrdering)

I'm not really sure if there is any pattern that fits your description.
The goal/question here is mostly: should we, the instcombine, be able to get rid of more known-dead code,
potentially allowing further folds, without deferring until the next time instcombine runs after simplifycfg runs after us?

I'm leaning towards "no" then. It seems similar to me to arguments between CSE and InstCombine. For example, we could try harder to make InstCombine see through identical values via pattern matching, but we intentionally do not try that because we have passes dedicated to that functionality. Here, we have a pass (SimplifyCFG) that can and does have the logic to propagate unreachable code, so I don't think we should duplicate that effort for a potential small improvement in efficiency.

lebedev.ri abandoned this revision.Jul 22 2021, 5:47 AM