This is an archive of the discontinued LLVM Phabricator instance.

Expand llvm.is.constant earlier
AbandonedPublic

Authored by joerg on May 16 2019, 12:21 PM.

Details

Reviewers
chandlerc
void
Summary

At the moment, llvm.is.constant intrinsics with non-constant arguments get expanded
during SelectionDAG preparation in the default pass order. This means that they are
expanded are the last instance of SimplifyCFG and don't result in pruning of dead
branches. This breaks when the dead branches contain assembler statements that
depends on immediate arguments. This is the middle end side of the fixes for PR41027.

This patch extends the InstSimplify pass to do this expansion as it is the last generic
simplification pass before the final SimplifyCFG and already iterating all instructions
anyway.

Diff Detail

Event Timeline

joerg created this revision.May 16 2019, 12:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2019, 12:21 PM

2c:
Is it guaranteed that this transform (instsimplify) will *ONLY* *EVER* run once at the end of the pipeline?
*Every* pipeline ever created? That is the 'requirement' on this @llvm.is.constant expansion, i believe.
Would it not be better to not rely on this implicit, undocumented, untested behavior?

Regardless of that, this is missing tests in instsimplify directory.

I think this isn't quite right (if I've understood what it is doing correctly).

We want to fold llvm.is.constant to true as early as possible, but to false as late as possible. This pass is the right place to fold early (and often). But it is a very bad place to fold late. The only thing that should fold llvm.is.constant to false should be FastISel or SelectionDAG.

So you should change the code to *conditionally* fold here instead of unconditionally based on the above.

Then the test needs to show both sides of this, preferably with the specific desired use case of control flow that can be simplified when the intrinsic folds to true, and showing it is not simplified prior to inlining, but is simplified after inlining, where the inlining allows the argument to become a constant.

(For posterity, this also should be in the implementation of SimplifyInstruction itself, not in the pass. The pass only ever calls that.)

Joerg points out in IRC that we need control flow pruning to be symmetric w/ true and false.

In that case, I think we need SelectionDAG and FastISel to *reject* lowering llvm.is.constant and add a dedicated lowering pass like we have for lowering atomics. It is needed for the exact same reason as atomics -- to do control flow cleanup after folding these to false.

That pass should be a dedicated pass IMO that *only* folds these intrinsics, and then *only* remove dead code. I don't think this should happen inside another pass or rely on SimplifyCFG (which does way too much). And it should be a codegen pass over the IR, not part of the pass pipeline.

These should, of course, be two separate patches -- one to get early folding to true, and one to hoist the remainder out of ISel and into a pass that can prune the CFG.

Joerg points out in IRC that we need control flow pruning to be symmetric w/ true and false.

In that case, I think we need SelectionDAG and FastISel to *reject* lowering llvm.is.constant and add a dedicated lowering pass like we have for lowering atomics. It is needed for the exact same reason as atomics -- to do control flow cleanup after folding these to false.

That pass should be a dedicated pass IMO that *only* folds these intrinsics, and then *only* remove dead code. I don't think this should happen inside another pass or rely on SimplifyCFG (which does way too much). And it should be a codegen pass over the IR, not part of the pass pipeline.

On further thought, I think it's worth putting this into the main IR pass set, and just adding it in the codegen passes like the atomic lowering pass. But we might want to *additionally* add it to the pass pipeline right after inlining finishes (for the last time in LTO, a really important case). That way we can avoid confusing the vectorizer and LSR with this kind of control flow.

We already fold llvm.is.constant to true early, in ConstantFoldScalarCall.

The only thing that should fold llvm.is.constant to false should be FastISel or SelectionDAG.

The point of this patch is that we don't want to fold it quite that late. We want to try to hold off folding it until we've resolved all the information we can get through inlining etc, but we miss some very basic optimizations if we wait for SelectionDAG. PR41027 contains a case of an invalid inline asm block, which compiles with gcc because it throws away the asm before it tries to check the constraints.

That said, this patch is horribly confusing. It isn't related to the normal functionality of the InstSimplify pass. And even if InstSimplifyPass.cpp happens to run in the correct position at the moment, that's likely to change in the future.

joerg abandoned this revision.Jul 25 2019, 6:47 AM