This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] try to fold binop with phi operands
ClosedPublic

Authored by spatel on Jan 12 2022, 6:11 AM.

Details

Summary

This is an alternate version of D115914 that handles/tests all binary opcodes. I think it is close to the same logic, but I was having a hard time making sense of the diffs/comments in that patch, so I just wrote the patch from scratch.

I suspect that we don't see these patterns too often because -simplifycfg would convert the minimal cases into selects rather than leave them in phi form (note: instcombine has logic holes for combining the select patterns too though, so that's another potential patch).

We only create a new binop in a predecessor that unconditionally branches to the final block.
https://alive2.llvm.org/ce/z/C57M2F
https://alive2.llvm.org/ce/z/WHwAoU (not safe to speculate an sdiv for example)
https://alive2.llvm.org/ce/z/rdVUvW (but it is ok on this path)

Diff Detail

Event Timeline

spatel created this revision.Jan 12 2022, 6:11 AM
spatel requested review of this revision.Jan 12 2022, 6:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2022, 6:11 AM
This comment was removed by lebedev.ri.

What about the case where one PHI is fully constant: https://godbolt.org/z/16W8oGYv3 ?

What about the case where one PHI is fully constant: https://godbolt.org/z/16W8oGYv3 ?

Hmm...so we convert that into a binop with constant operand, and I thought we'd then get that pattern within InstCombinerImpl::foldOpIntoPhi(). But there's an extra clause in that code that is likely also needed here:

// If there is exactly one non-constant value, we can insert a copy of the
// operation in that block.  However, if this is a critical edge, we would be
// inserting the computation on some other paths (e.g. inside a loop).  Only
// do this if the pred block is unconditionally branching into the phi block.
// Also, make sure that the pred block is not dead code.
spatel updated this revision to Diff 399348.Jan 12 2022, 9:00 AM
spatel edited the summary of this revision. (Show Details)

Patch updated:

  1. Added the necessary clause that prevents hoisting into a conditional predecessor (and more tests).
  2. Added a TODO for handling phi with 2 constant incoming values (not sure yet if that's reliably handled by simplifycfg and/or other instcombines).

is it possible that we're hoisting a non-speculatable operation past something like a call in the same block?

llvm/test/Transforms/InstCombine/zext-or-icmp.ll
42

these tests need to be changed to preserve testing of https://reviews.llvm.org/D30781

is it possible that we're hoisting a non-speculatable operation past something like a call in the same block?

Indeed. If the binop is not freely speculatable, then we need to check that it is guaranteed to be executed: https://alive2.llvm.org/ce/z/VqvC_r

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1299–1301

This is too restrictive (please add a fixme?)
https://alive2.llvm.org/ce/z/kKuiW9 vs https://alive2.llvm.org/ce/z/SJ9oYU

mingmingl added inline comments.Jan 12 2022, 5:38 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1329–1334

I'm wondering if it's idiomatic to validate that this instruction could also be simplified away in later iterations of instruction-combine, and only proceed with the fold if yes.

e.g., in https://godbolt.org/z/fKfondqrn, %res = or i64 %retval.sroa.3.0.extract.shift, %phi.cast instruction is inserted into if.then block, and could be simplified away.

lebedev.ri requested changes to this revision.Jan 13 2022, 11:38 AM

(as per previous comments regarding UB introduction)

This revision now requires changes to proceed.Jan 13 2022, 11:38 AM
spatel added inline comments.Jan 13 2022, 12:35 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1329–1334

In general, instcombine tries to do the smallest transform that still results in an improvement. It can then chain together a series of minimal transforms to achieve the larger goal.

So if we can make this patch safe and still profitable while creating a new binop, then that's better than limiting a larger transform to the case where we eliminate the binop completely.

llvm/test/Transforms/InstCombine/zext-or-icmp.ll
42

I think these tests were already altered enough (there's no shift remaining in the existing CHECK lines) that they don't test the original problem (undefined shift). I'll try to come up with another way to cover that path.

spatel updated this revision to Diff 400014.Jan 14 2022, 8:14 AM
spatel marked 3 inline comments as done.

Patch updated:

  1. Added check for isGuaranteedToTransferExecutionToSuccessor() to make sure the binop is executed.
  2. Added TODO comment/test for binop in different block than phi.
lebedev.ri added inline comments.Jan 14 2022, 8:36 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1293–1297

the fixme

// TODO: This could miss matching a phi with 2 constant incoming values.

should be here, because uniform constant phi will be simplified into a plain constant

1328–1333

Don't we also need isSafeToSpeculativelyExecute() check?

spatel added inline comments.Jan 14 2022, 9:02 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1328–1333

That's what the TODO comment and "mul_const_incoming0_speculatable" test are referring to (if I'm understanding the question).
Ie, for correctness, I think we only need to bail out for opcodes like div/rem. But for performance (and as a conservative first step), we probably don't want to hoist anything on this path if it isn't going to be executed unconditionally. Do we know that some other pass will sink the op later if possible?

llvm/test/Transforms/InstCombine/zext-or-icmp.ll
42

Extra test added here:
417f5f4633e

42

lost a char:
f417f5f4633e

spatel updated this revision to Diff 400042.Jan 14 2022, 9:13 AM
spatel marked an inline comment as done.

Moved TODO comment higher to account for more potential patterns that could be handled here.

lebedev.ri added inline comments.Jan 14 2022, 11:25 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1297

Technically, this is wrong.

  1. for switch predecessor, we could have more than a single incoming edges
  2. This check is backwards. It's not really that it must have exactly two predecessors, but more that there must be at most a single predecessor with non-constant input value, otherwise we'd increase instruction count. And for constant incoming values we'll constant-fold.
lebedev.ri added inline comments.Jan 21 2022, 3:42 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1297

Can you please at least add a FIXME for these issues?

spatel marked 2 inline comments as done.Jan 21 2022, 6:27 AM
spatel added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
1297

Sorry - I got distracted from this patch. I agree that this should be more like foldOpIntoPhi().
But yes, I can add a FIXME to this and follow-up. We should still get the original motivating case even with this artificial limitation.

spatel updated this revision to Diff 401984.Jan 21 2022, 7:48 AM
spatel marked an inline comment as done.

Added TODO comment for follow-up enhancement - fix the incoming values check to look for max of 1 variable operand.

lebedev.ri accepted this revision.Jan 21 2022, 8:32 AM

This seems fine as a first step.
Thanks

This revision is now accepted and ready to land.Jan 21 2022, 8:32 AM
This revision was landed with ongoing or failed builds.Jan 22 2022, 12:01 PM
This revision was automatically updated to reflect the committed changes.