# [InstSimplify] ExpandBinOp should fold an expression only when it's safeAbandonedPublicActions

Authored by aqjune on Tue, Jul 21, 10:03 AM.

# Details

Reviewers
 craig.topper lebedev.ri spatel nlopes
Summary

ExpandBinOp folds (A op' B) op C by transforming it into
(A op C) op' (B op C) first, simplifying the two subexpressions
(say the result as L and R), and folding L op' R if possible.

However, if C contains undef, this results in making undef have two different
values whereas the original expression may not;
as a simplest case, transforming undef * (1 + 1) into undef + undef isn't
safe, because the former one is an even whereas the latter one is any number.

This patch makes ExpandBinOp fold only when it's safe.

1. If either L or R is special in that L op' ? or ? op' R always folds into a constant, it is safe to fold the expression to the constant.
1. If C is never undef or poison, it is safe.

After this patch, transformation https://reviews.llvm.org/D83360#2146539 does not happen
after D83360 is enabled.

At https://reviews.llvm.org/D83360#2146539 , I proposed making InstSimplify
remember to which value undefs are folded & use the cached value if the undef
is met again.
My plan was to implement this solution by adding a static ValueOrUse class
(so we can distinguish undefs by Use) and letting InstSimplify's functions

1. If InstSimplify fails to simplify values, the record of folded undefs should be rewinded at function returns. Doing this all over InstSimplify is error-prone and easy to miss.
1. Sometimes the value that undef is folded isn't just a constant. For example, if undef + 1 <= X is folded into true, the undef is folded into either -1 or a random number between [0, X-1]. Recording this isn't trivial.
1. Constant folding can fold undefs too. This causes changes in the relevant libraries.
1. Finally, if InstSimplify becomes smart enough, this still can cause miscompilation. For example, if InstSimplify skims through instructions in the same basic block and find equivalent expressions, a miscompilation can happen without explicit folding of undefs.

Another approach that I tried was to simply
stopping undef folds when called from ExpandBinOp recursively.
However, this also required a nontrivial change because two
UndefValue instances compare equal, leading to undesirable foldings like
undef - undef --> 0.
Also, it still does not resolve the possible miscompilation problem.

# Diff Detail

### Event Timeline

aqjune created this revision.Tue, Jul 21, 10:03 AM
Herald added a project: Restricted Project. Tue, Jul 21, 10:03 AM
aqjune edited the summary of this revision. (Show Details)Tue, Jul 21, 10:06 AM
aqjune added a comment.EditedTue, Jul 21, 10:09 AM

Running SPEC2017rate using testsuite with -O3, LTO Enabled shows negligible impact on the quality of generated code (less than 0.15% slowdown)

aqjune marked an inline comment as done.Tue, Jul 21, 10:10 AM
llvm/test/Transforms/InstCombine/and-or-icmps.ll
208

According to this comment, this test is simply for checking whether there is no assertion failure.

nikic added a subscriber: nikic.Tue, Jul 21, 11:39 AM

How about reducing the code duplication with a preliminary clean-up patch?

```diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 8fbcee84a15..22b1dac8d65 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -228,64 +228,54 @@ static bool valueDominatesPHI(Value *V, PHINode *P, const DominatorTree *DT) {
return false;
}

-/// Simplify "A op (B op' C)" by distributing op over op', turning it into
-/// "(A op B) op' (A op C)".  Here "op" is given by Opcode and "op'" is
-/// given by OpcodeToExpand, while "A" corresponds to LHS and "B op' C" to RHS.
-/// Also performs the transform "(A op' B) op C" -> "(A op C) op' (B op C)".
-/// Returns the simplified value, or null if no simplification was performed.
-static Value *ExpandBinOp(Instruction::BinaryOps Opcode, Value *LHS, Value *RHS,
-                          Instruction::BinaryOps OpcodeToExpand,
+/// Try to simplify a binary operator of form "V op OtherOp" where V is
+/// "(B0 opex B1)" by distributing 'op' across 'opex' as
+/// "(B0 op OtherOp) opex (B1 op OtherOp)".
+static Value *expandBinOp(Instruction::BinaryOps Opcode, Value *V,
+                          Value *OtherOp, Instruction::BinaryOps OpcodeToExpand,
const SimplifyQuery &Q, unsigned MaxRecurse) {
-  // Recursion is always used, so bail out at once if we already hit the limit.
-  if (!MaxRecurse--)
+  auto *B = dyn_cast<BinaryOperator>(V);
+  if (!B || B->getOpcode() != OpcodeToExpand)
+    return nullptr;
+  Value *B0 = B->getOperand(0), *B1 = B->getOperand(1);
+  Value *L = SimplifyBinOp(Opcode, B0, OtherOp, Q, MaxRecurse);
+  if (!L)
+    return nullptr;
+  Value *R = SimplifyBinOp(Opcode, B1, OtherOp, Q, MaxRecurse);
+  if (!R)
return nullptr;

-  // Check whether the expression has the form "(A op' B) op C".
-  if (BinaryOperator *Op0 = dyn_cast<BinaryOperator>(LHS))
-    if (Op0->getOpcode() == OpcodeToExpand) {
-      // It does!  Try turning it into "(A op C) op' (B op C)".
-      Value *A = Op0->getOperand(0), *B = Op0->getOperand(1), *C = RHS;
-      // Do "A op C" and "B op C" both simplify?
-      if (Value *L = SimplifyBinOp(Opcode, A, C, Q, MaxRecurse))
-        if (Value *R = SimplifyBinOp(Opcode, B, C, Q, MaxRecurse)) {
-          // They do! Return "L op' R" if it simplifies or is already available.
-          // If "L op' R" equals "A op' B" then "L op' R" is just the LHS.
-          if ((L == A && R == B) || (Instruction::isCommutative(OpcodeToExpand)
-                                     && L == B && R == A)) {
-            ++NumExpand;
-            return LHS;
-          }
-          // Otherwise return "L op' R" if it simplifies.
-          if (Value *V = SimplifyBinOp(OpcodeToExpand, L, R, Q, MaxRecurse)) {
-            ++NumExpand;
-            return V;
-          }
-        }
-    }
+  // Does the expanded pair of binops simplify to the existing binop?
+  if ((L == B0 && R == B1) ||
+      (Instruction::isCommutative(OpcodeToExpand) && L == B1 && R == B0)) {
+    ++NumExpand;
+    return B;
+  }

-  // Check whether the expression has the form "A op (B op' C)".
-  if (BinaryOperator *Op1 = dyn_cast<BinaryOperator>(RHS))
-    if (Op1->getOpcode() == OpcodeToExpand) {
-      // It does!  Try turning it into "(A op B) op' (A op C)".
-      Value *A = LHS, *B = Op1->getOperand(0), *C = Op1->getOperand(1);
-      // Do "A op B" and "A op C" both simplify?
-      if (Value *L = SimplifyBinOp(Opcode, A, B, Q, MaxRecurse))
-        if (Value *R = SimplifyBinOp(Opcode, A, C, Q, MaxRecurse)) {
-          // They do! Return "L op' R" if it simplifies or is already available.
-          // If "L op' R" equals "B op' C" then "L op' R" is just the RHS.
-          if ((L == B && R == C) || (Instruction::isCommutative(OpcodeToExpand)
-                                     && L == C && R == B)) {
-            ++NumExpand;
-            return RHS;
-          }
-          // Otherwise return "L op' R" if it simplifies.
-          if (Value *V = SimplifyBinOp(OpcodeToExpand, L, R, Q, MaxRecurse)) {
-            ++NumExpand;
-            return V;
-          }
-        }
-    }
+  // Otherwise, return "L op' R" if it simplifies.
+  Value *S = SimplifyBinOp(OpcodeToExpand, L, R, Q, MaxRecurse);
+  if (!S)
+    return nullptr;

+  ++NumExpand;
+  return S;
+}
+
+/// Try to simplify binops of form "A op (B op' C)" or the commuted variant by
+/// distributing op over op'.
+static Value *expandCommutativeBinOp(Instruction::BinaryOps Opcode,
+                                     Value *L, Value *R,
+                                     Instruction::BinaryOps OpcodeToExpand,
+                                     const SimplifyQuery &Q,
+                                     unsigned MaxRecurse) {
+  // Recursion is always used, so bail out at once if we already hit the limit.
+  if (!MaxRecurse--)
+    return nullptr;
+
+  if (Value *V = expandBinOp(Opcode, L, R, OpcodeToExpand, Q, MaxRecurse))
+    return V;
+  if (Value *V = expandBinOp(Opcode, R, L, OpcodeToExpand, Q, MaxRecurse))
+    return V;
return nullptr;
}

@@ -901,8 +891,8 @@ static Value *SimplifyMulInst(Value *Op0, Value *Op1, const SimplifyQuery &Q,
return V;

// Mul distributes over Add. Try some generic simplifications based on this.
-  if (Value *V = ExpandBinOp(Instruction::Mul, Op0, Op1, Instruction::Add,
-                             Q, MaxRecurse))
+  if (Value *V = expandCommutativeBinOp(Instruction::Mul, Op0, Op1,
return V;

// If the operation is with the result of a select instruction, check whether
@@ -2095,13 +2085,13 @@ static Value *SimplifyAndInst(Value *Op0, Value *Op1, const SimplifyQuery &Q,
return V;

// And distributes over Or.  Try some generic simplifications based on this.
-  if (Value *V = ExpandBinOp(Instruction::And, Op0, Op1, Instruction::Or,
-                             Q, MaxRecurse))
+  if (Value *V = expandCommutativeBinOp(Instruction::And, Op0, Op1,
+                                        Instruction::Or, Q, MaxRecurse))
return V;

// And distributes over Xor.  Try some generic simplifications based on this.
-  if (Value *V = ExpandBinOp(Instruction::And, Op0, Op1, Instruction::Xor,
-                             Q, MaxRecurse))
+  if (Value *V = expandCommutativeBinOp(Instruction::And, Op0, Op1,
+                                        Instruction::Xor, Q, MaxRecurse))
return V;

// If the operation is with the result of a select instruction, check whether
@@ -2253,8 +2243,8 @@ static Value *SimplifyOrInst(Value *Op0, Value *Op1, const SimplifyQuery &Q,
return V;

// Or distributes over And.  Try some generic simplifications based on this.
-  if (Value *V = ExpandBinOp(Instruction::Or, Op0, Op1, Instruction::And, Q,
-                             MaxRecurse))
+  if (Value *V = expandCommutativeBinOp(Instruction::Or, Op0, Op1,
+                                        Instruction::And, Q, MaxRecurse))
return V;

// If the operation is with the result of a select instruction, check whether```

@spatel Hi, yes, it sounds great.
I'm fine with updating this patch after the patch is landed.

@spatel Hi, yes, it sounds great.
I'm fine with updating this patch after the patch is landed.

Refactoring committed here:
rG7485e924121

We need some tests pre-committed within InstSimplify to show this working (disabling simplify) for each opcode?

llvm/lib/Analysis/InstructionSimplify.cpp
231–234

Could we adapt this existing analysis function?

```/// Return the absorbing element for the given binary
/// operation, i.e. a constant C such that X op C = C and C op X = C for
/// every X.  For example, this returns zero for integer multiplication.
/// It returns null if the operator doesn't have an absorbing element.
static Constant *getBinOpAbsorber(unsigned Opcode, Type *Ty);```
aqjune updated this revision to Diff 281578.Wed, Jul 29, 7:32 AM
• Use getBinOpAbsorber

I couldn't find a test that was previously optimized but now unoptimized,
except when & and ^ are combined.

aqjune marked an inline comment as done.Wed, Jul 29, 7:34 AM

I tried to find other examples where this could happen, and I can't find anything else either.
So I think this just needs some minor changes.
However, Alive2 does not show the "f_dontfold" test example or my reduced candidate test as incorrect - is that a bug for Alive2?
https://alive2.llvm.org/ce/z/dbGLzN

llvm/lib/Analysis/InstructionSimplify.cpp
231

Can say "This function returns..." to make it independent of the actual function name.
Use "///" to create documentation comment.

249

Formatting for function name: "Fold..." -> "fold..."

255–258

Is this dead code? The current callers of expandBinOp only have "OpcodeToExpand" possiblities for:
Instruction::And
Instruction::Or
Instruction::Xor

If that is correct, maybe we want to just make it a 'TODO' comment?
If not, we could still simplify the code to something like:

`if (BinaryOperator::isShift(Op) && match(L, m_Zero()))`
llvm/test/Transforms/InstSimplify/distribute.ll
4 ↗(On Diff #281578)

I was confused by this example because the 'x' in the code comment is not the same as the 'x' in the test.
How about reducing the example to:

```define i4 @expand_and_xor(i4 %x, i4 %y) {
%notx = xor i4 %x, -1
%a = and i4 %notx, %y
%nota = xor i4 %a, -1
%result = and i4 %x, %nota
ret i4 %result
}```

I tried to find other examples where this could happen, and I can't find anything else either.
So I think this just needs some minor changes.
However, Alive2 does not show the "f_dontfold" test example or my reduced candidate test as incorrect - is that a bug for Alive2?
https://alive2.llvm.org/ce/z/dbGLzN

f_dontfold simply shows that the patch prevents the folding; it is still correct to fold the expression. I'll leave TODO there.
I'll update this patch to reflect the comments.

aqjune updated this revision to Diff 282232.Fri, Jul 31, 8:42 AM

aqjune marked 4 inline comments as done.Fri, Jul 31, 8:45 AM

I tried to find other examples where this could happen, and I can't find anything else either.
So I think this just needs some minor changes.
However, Alive2 does not show the "f_dontfold" test example or my reduced candidate test as incorrect - is that a bug for Alive2?
https://alive2.llvm.org/ce/z/dbGLzN

f_dontfold simply shows that the patch prevents the folding; it is still correct to fold the expression. I'll leave TODO there.
I'll update this patch to reflect the comments.

I understood the question; the comments in distribute.ll were wrong. The transformations are valid even if x is undef. I fixed the comments.

nikic added a comment.Fri, Jul 31, 9:11 AM

Assuming D84792 lands, would performing the two SimplifyBinOp queries for L and R with CanUseUndef=false solve this problem as well? I would prefer that if it works.

spatel accepted this revision.Fri, Jul 31, 9:14 AM

It's not clear to me how these all of the undef safety features will come together, but this looks correct and unlikely to cause regressions, so LGTM.

This revision is now accepted and ready to land.Fri, Jul 31, 9:14 AM
aqjune added a comment.Sat, Aug 1, 6:00 AM

Assuming D84792 lands, would performing the two SimplifyBinOp queries for L and R with CanUseUndef=false solve this problem as well? I would prefer that if it works.

I think there is a trade-off between full correctness and performance. If InstSimplify can be arbitrarily smart in the future, simply blocking the transformation as shown in this patch seems to be the right way IMO.
For example, InstSimplify in the future can iterate over the BB and find equivalent instructions (as GVN does): in this case, a miscompilation can still happen, because distributivity can lead to replacing 2 * x with x + x.
Also, another concern that I had with bringing the can-undef-fold flag was that the change was global: whenever we add a new optimization to InstSimplify, it should consider whether CanUseUndef was correctly considered (maybe ConstantFolding, too?).

However, InstSimplify may not need to be very smart, and the cost of maintenance may not be very big in practice as well.
I am willing to follow the result of discussion.

kazu added a subscriber: kazu.Thu, Aug 6, 3:35 PM

Friendly ping.

Is there something left to improve in this patch? I'm asking this because a fix for https://bugs.llvm.org/show_bug.cgi?id=46940 builds on this fix, so I'd like to see this patch checked in if there is no pending issues. Thanks!

I think that we should revert the original patch for now since we have a
known miscompile and a chain of fixes are required to fix it.

As far as whether or not we can cause problems downstream from an
instcombine change - it's true, and unfortunately those need to be fixed
before we can have the instcombine as well if we find them.

Thanks!

aqjune added a comment.Fri, Aug 7, 7:22 AM

Should we go to D84792's direction?
If it is, I think we should mention that it is a workaround for the miscompilations including PR33165, and it can still be problematic in theory but chose to do it for performance.

spatel added a comment.Fri, Aug 7, 8:06 AM

Should we go to D84792's direction?
If it is, I think we should mention that it is a workaround for the miscompilations including PR33165, and it can still be problematic in theory but chose to do it for performance.

I don't have a preference, but others said they would like go with D84792, so let's do that.

aqjune abandoned this revision.Fri, Aug 7, 9:16 AM

I don't have a preference, but others said they would like go with D84792, so let's do that.

Well, okay. It is moving toward being more correct anyway, so it's good.

Is there something left to improve in this patch? I'm asking this because a fix for https://bugs.llvm.org/show_bug.cgi?id=46940 builds on this fix, so I'd like to see this patch checked in if there is no pending issues. Thanks!

Hello @kazu, sorry for making a confusion.
If D84792 lands, you can instead try calling SimplifyBinOp with giving false to SQ.CanUseUndef and check whether the bug disappears.