This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold binop of select and cast of select condition
ClosedPublic

Authored by antoniofrighetto on Jun 28 2023, 4:39 AM.

Details

Summary

InstCombine now simplifies binary operations, whose operands involve a select instruction and a cast of the select condition, into a select with folded arguments.

Fixes #63472.

Proofs: https://alive2.llvm.org/ce/z/c_JwwM

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 4:39 AM
antoniofrighetto requested review of this revision.Jun 28 2023, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 4:39 AM
nikic added inline comments.Jun 28 2023, 7:19 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1496

In this case, we don't have to require that False is a constant, right?

1502

And in this case, True can be non-constant.

nikic added inline comments.Jun 28 2023, 7:19 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1492

Redundant ().

llvm/test/Transforms/InstCombine/add.ll
3066 ↗(On Diff #535329)

Can you please add these additional tests:

  • Negative test where select and zext args are not the same.
  • Multi-use test.
  • Vector test.

Can you add alive2 link for this? I think it also works a lot more binops than just add. If so, can you split to a helper so it can be used elsewhere.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1488

I had a lot of trouble following this comment can you make say:

(select cond, (add X, 1), X)
   -> (add X, (zext cond))

then some format for sext.

antoniofrighetto retitled this revision from [InstCombine] Fold add of select and zext/sext of select condition to [InstCombine] Fold add of select and cast of select condition.
antoniofrighetto edited the summary of this revision. (Show Details)

Added proofs, added further tests, reviewed doc-comment, and added support for uitofp/sitofp conversion as well.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1496

Correct. This is not true when handling floating point though, for which it is required that both are constants.

antoniofrighetto retitled this revision from [InstCombine] Fold add of select and cast of select condition to [InstCombine] Fold binop of select and cast of select condition.Jul 3 2023, 6:12 AM
goldstein.w.n added inline comments.Jul 3 2023, 11:28 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
947

This entire function is in desperate need of comments.
Likewise the summary (commit message) should actually explain the transform being added.

Please comment this at the very least to make review simpler :)

llvm/test/Transforms/InstCombine/binop-select-cast-of-select-cond.ll
197

Can you split the tests to a seperate commit so we can see the diff generated by this patch?

nikic added inline comments.Jul 3 2023, 1:23 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
917

I don't think we should support uitofp/sitofp in this fold -- this seems way too contrived. It would be okay if this fell out of some generic handling, but not something we should spend lines of code on.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
917

@nikic, I do not have any strong opinions here, no problem removing them, I just thought it could make sense to support fp ops, as follows:

float fadd_select_uitofp(unsigned int a, bool b) {
    float f1 = b ? 2.0 : 1.0;
    float f2 = (float) b;
    return f1 + f2;
}

Considering that the patch is meant to support the following use case – amongst the others:

int add_select_zext(unsigned int a, bool b) {
    int f1 = b ? 2 : 1;
    int f2 = (int) b;
    return f1 + f2;
}

They seem like less than 30 LOCs to me, they should not really harm. Might be a bit contrived, but it is still valid C code.

947

Thanks for the note, apologize for that, I'm fixing it right away to introduce more comments.

llvm/test/Transforms/InstCombine/binop-select-cast-of-select-cond.ll
197

Do you mean to open a different patch for the tests (or land them directly)?

antoniofrighetto edited the summary of this revision. (Show Details)

Introduced further comments, dropped support at the granularity of floating point casts.

Split in two separate commits (cc/ @goldstein.w.n).

goldstein.w.n added inline comments.Jul 5 2023, 10:22 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
887

I don't see why the op must the Add or Sub. AFAICT the key is that (zext C) can be treated as 1 on the true arm and 0 on the false arm so we can just constant apply (select c, (binop 1, T), (binop 0, F)) which afaict works for just about any binop. (for sext its just -1 and 0).

911

I think cleaner would be:

Constant * CastConst = IsZExt ? Constant::getOne() : Constant::getAllOnes();
return Builder.CreateBinOp(Opc, C, CastConst);
antoniofrighetto updated this revision to Diff 538369.EditedJul 8 2023, 7:09 AM
antoniofrighetto edited the summary of this revision. (Show Details)

Should be a bit more on track now, thanks for the suggestion! Also polished up a bit, and updated proofs. I wonder if this should be extended to sdiv/udiv and have undef in the false arm (although, admittedly, I don't see much the point).

goldstein.w.n added a comment.EditedJul 8 2023, 9:47 AM

So I'm going to propose something slightly different for accomplishing this patch.
Instead of making a combine for binops, I think you can just arbitrarily
constant propagate any usage of cond to the true/false arms.

I would suggest something along the following lines (pseudo code):

if (TrueVal->hasOneUse()) {
  SmallVector<std::optional<bool>> TrueReplacements;
  bool DoReplace = false;
  for (Value *Op : TrueVal->Operands()) {
    if (Op == CondVal || match(Op, m_ZExt(m_Specific(CondVal)))) {
      TrueReplacements.emplace_back(true);
      DoReplace = true;
    } else if (match(Op, m_SExt(m_Specific(CondVal)))) {
      TrueReplacements.emplace_back(false);
      DoReplace = true;

    } else
      TrueReplacements.emplace_back(std::nullopt);
  }
  if (DoReplace) {
    SmallVector<Value *> TrueOps;
    for (unsigned I = 0; I < TrueVal->numOperands(); ++I) {
      if (TrueReplacements[I]) {
        TrueOps->emplace_back(*TrueReplacement[I] ? Constant::getOneValue(Ty)
                                                  : Constant::getAllOnes(Ty));
      } else {
              TrueOps->emplace_back(TrueVal->getOperand(I);
      }
    }
    NewTrue = Builder.Create(TrueVal->getOpcode(), TrueOps);
  }
}

For false its always replace with zero.
Maybe put this in a lambda and if you get new true/false vals
then create new instructions.

Edit:
You could probably skip the oneuse check at the top, and wait to do that check until
after you see how many non-constant operands. Its possible you will be able to entirely
constant-fold some arms inwhich case any use-count is okay.

antoniofrighetto added a comment.EditedJul 10 2023, 8:28 AM

So I'm going to propose something slightly different for accomplishing this patch.
Instead of making a combine for binops, I think you can just arbitrarily
constant propagate any usage of cond to the true/false arms.

I guess with TrueVal->Operands() we actually mean the (unique) user of TrueVal? I'm not exactly following how we would benefit from this, where the operands of the select are constants in most of our use cases, here's just an example:

define i64 @src(i1 %c) {
  %sel = select i1 %c, i64 64, i64 1
  %ext = zext i1 %c to i64
  %add = add i64 %sel, %ext
  ret i64 %add
}

Conversely, wouldn't the proposed change be trying to catch the following orthogonal test case?

define i64 @src(i1 %c, i64 %a, i64 %b, i64 %d) {
  %ext = zext i1 %c to i64
  %add = add i64 %a, %ext
  %sub = sub i64 %b, %ext
  %sel = select i1 %c, i64 %add, i64 %sub
  ret i64 %sel
}

Where %add and %sub may be optimized respectively with %1 = add i64 %a, 1 and %2 = sub i64 %b, 0.

So I'm going to propose something slightly different for accomplishing this patch.
Instead of making a combine for binops, I think you can just arbitrarily
constant propagate any usage of cond to the true/false arms.

I guess with TrueVal->Operands() we actually mean the (unique) user of TrueVal? I'm not exactly following how we would benefit from this, where the operands of the select are constants in most of our use cases, here's just an example:

No I mean the actual operands. The point is the select condition can be constant propagated to instruction on the true/false arm (usually). You can't do this for instructions that are not speculatable (although same is true for binops in your current patch).

define i64 @src(i1 %c) {
  %sel = select i1 %c, i64 64, i64 1
  %ext = zext i1 %c to i64
  %add = add i64 %sel, %ext
  ret i64 %add
}

Conversely, wouldn't the proposed change be trying to catch the following orthogonal test case?

define i64 @src(i1 %c, i64 %a, i64 %b, i64 %d) {
  %ext = zext i1 %c to i64
  %add = add i64 %a, %ext
  %sub = sub i64 %b, %ext
  %sel = select i1 %c, i64 %add, i64 %sub
  ret i64 %sel
}

Where %add and %sub may be optimized respectively with %1 = add i64 %a, 1 and %2 = sub i64 %b, 0.

Yes, what I'm proposing covers this case, it just doesn't stop at binops.
Basically for any instruction that we may speculate that uses the condition (or (ext %condition))
we can just arbitrarily replace %condition with 1/0 depending if its from the true/false
arm.

So instead of matching BinOps, just take any speculatable instruction, iterate through its
operands and replace all references to %condition with the constant. Its just a more
general implementation.

nikic added a comment.Jul 10 2023, 9:28 AM

@goldstein.w.n Where would the transform be rooted in that case?

@goldstein.w.n Where would the transform be rooted in that case?

that case? Not sure what you are saying.

@goldstein.w.n The transform needs to be called from somewhere. Currently it is called for a number of binops. The same transform is indeed possible for non-binop instructions, but I'm not sure it makes sense to plaster it everywhere.

The main sensible way of doing that I can see is to extend FoldOpIntoSelect() to handle this. We could extend constantFoldOperationIntoSelectOperand() to determine the constant for zext/sext of the condition, similar to the special case for icmps it currently has. However, this would also require relaxing various checks where we currently only call FoldOpIntoSelect() if we have a constant operand.

I guess if there is no compile-time impact to calling FoldOpIntoSelect() for non-constants, then doing that would be a pretty clean and general way to go about it.

@goldstein.w.n The transform needs to be called from somewhere. Currently it is called for a number of binops. The same transform is indeed possible for non-binop instructions, but I'm not sure it makes sense to plaster it everywhere.

I see, thats a fair point.

The main sensible way of doing that I can see is to extend FoldOpIntoSelect() to handle this. We could extend constantFoldOperationIntoSelectOperand() to determine the constant for zext/sext of the condition, similar to the special case for icmps it currently has. However, this would also require relaxing various checks where we currently only call FoldOpIntoSelect() if we have a constant operand.

I guess if there is no compile-time impact to calling FoldOpIntoSelect() for non-constants, then doing that would be a pretty clean and general way to go about it.

Okay, guess we can leave that as a todo for the future. I may look into when I have the time.

@antoniofrighetto for now I think its okay to keep this only for binops, but please add a todo that
if FoldOpIntoSelect is updated, we should remove the binop specific function.

antoniofrighetto updated this revision to Diff 539038.EditedJul 11 2023, 5:35 AM

TODO added, polished up the code. @goldstein.w.n, thanks for pointing that out.

goldstein.w.n added inline comments.Jul 11 2023, 9:57 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
897

Is there a reason RHS must be the ext and LHS must be the select?

llvm/test/Transforms/InstCombine/binop-select-cast-of-select-cond.ll
157

Can you add the following tests:

  1. where select is on the RHS of the binop.
  2. Where the select is multi-use
  3. Where the select has two non-const sides.
197

Open a different patch.

antoniofrighetto updated this revision to Diff 540068.EditedJul 13 2023, 9:20 AM
antoniofrighetto edited the summary of this revision. (Show Details)

@goldstein.w.n, added multiuse select and non-const sides tests (last three tests), proofs updated (https://alive2.llvm.org/ce/z/m_vGWd). Tests are on a dedicated commit.

It seems RHS ext & LHS sel operands are canonicalized as such in SimplifyAssociativeOrCommutative for add/mul. When the binop is a subtraction, being non-commutative, the semantic of the operation will change if the operands are swapped. I tried extending the helper to support swapped operands too (LHS ext & RHS sel), yet, the transformation we obtain appears to be correct only when the select has both const sides. With non-const sides, the transformation fails (proof: https://alive2.llvm.org/ce/z/BZU9cR, 2 transformations failed out of 4). I don't believe we should introduce specialization in the helper for const-only subs.

@goldstein.w.n, added multiuse select and non-const sides tests (last three tests), proofs updated (https://alive2.llvm.org/ce/z/m_vGWd). Tests are on a dedicated commit.

It seems RHS ext & LHS sel operands are canonicalized as such in SimplifyAssociativeOrCommutative for add/mul.

Okay thats fine (also other binops).

When the binop is a subtraction, being non-commutative, the semantic of the operation will change if the operands are swapped. I tried extending the helper to support swapped operands too (LHS ext & RHS sel), yet, the transformation we obtain appears to be correct only when the select has both const sides. With non-const sides, the transformation fails (proof: https://alive2.llvm.org/ce/z/BZU9cR, 2 transformations failed out of 4). I don't believe we should introduce specialization in the helper for const-only subs.

non-const is fine AFAICT, youre proof is just a little buggy: https://alive2.llvm.org/ce/z/4yWy3R
Its okay to skip for now, but please add TODO for sub.

antoniofrighetto updated this revision to Diff 540720.EditedJul 15 2023, 11:59 AM

@goldstein.w.n, thanks for pointing that out, updated to support swapped operands too, added tests for it.

(EDIT: Updated proofs as well)

antoniofrighetto edited the summary of this revision. (Show Details)Jul 17 2023, 12:33 AM
goldstein.w.n added inline comments.Jul 17 2023, 9:20 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
902

Can you put this duplicate logic for RHS/LHS in a lambda?

907

Since you capture the entire scope I think the IsCastOpRHS and IsZExt can just be computed in the lambda which makes the callsites clearer.

921

Imo cleaner as a whole is to seperate the decision about the constant from creating the binop i.e

Constant * C;
if(IsTrueArm) {
  C = 0
}
else if(IsZext) {
  C = 1
}
else {
  C = AllOnes
}

return IsConstOpRHS ? BinOp(Opc, V, C) : BinOp(Opc, C, V);
llvm/test/Transforms/InstCombine/binop-select-cast-of-select-cond.ll
197

still outstanding.

Simplified, and opened a patch for tests at: https://reviews.llvm.org/D155510, thanks!

We should see diffs to the tests. This patch should be rebased ontop of D155510. Once you set D155510 as the child of this commit, when you checkout this one, it will grab both. Then you can rebase + resubmit both of them and we will be able to see the actual changes this commit causes.

goldstein.w.n added inline comments.Jul 17 2023, 1:22 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
936

Think the follow is cleaner:

if(CondVal == A || match(...)) {
bool IsInvertedByNot = CondVal != A;
return SelectInst::Create(CondVal, NewFoldedConst(IsInvertedByNot, TrueVal),
                              NewFoldedConst(IsInvertedByNot, FalseVal));
}

@goldstein.w.n, thank you, I think it should work out properly now.

@goldstein.w.n, thank you, I think it should work out properly now.

We don't see any test changes as a result of the impl.
You need to do the following:

  1. Update the tests in the test patch with LLVM compiled WITHOUT your change.
  2. Update the tests in this patch with LLVM compiled WITH your change.

That way we can see exact what your change actually changes..

@goldstein.w.n, totally makes sense, my bad, for some reason I thought the diff would show up automatically when chaining reviews by setting the child. We should be on track now.

@goldstein.w.n, it seems like the order of the two simplified instructions for the lasts two tests on Windows is reversed compared to the ones it should expect, although I'm not able to reproduce the issue locally. Any idea?

@goldstein.w.n, it seems like the order of the two simplified instructions for the lasts two tests on Windows is reversed compared to the ones it should expect, although I'm not able to reproduce the issue locally. Any idea?

Sorry dont really understand what you mean. But personally have no idea about the subtle differences between LLVM on windows vs linux.

@goldstein.w.n, looking at the CI failure on Windows, it seems like that for test sub_select_sext_op_swapped_non_const_args it expects sub i6 0, [[ARGF]] to be the first instruction, as opposed to the one on Linux, where that sub appears as second instruction. That may be the reason why FileCheck fails.

@goldstein.w.n, looking at the CI failure on Windows, it seems like that for test sub_select_sext_op_swapped_non_const_args it expects sub i6 0, [[ARGF]] to be the first instruction, as opposed to the one on Linux, where that sub appears as second instruction. That may be the reason why FileCheck fails.

Oh, I wouldn't worry about that. I don't think the CI for phabricator is really used. AFAIK we only do post commit testing for LLVM (atlthough some sub projects do pre commit testing).

This revision is now accepted and ready to land.Jul 19 2023, 1:39 PM
dyung added a subscriber: dyung.Jul 22 2023, 6:09 PM

Hi @antoniofrighetto, this change seems to be causing the compiler to hit an assertion failure in one of our internal tests. I have put the details in bug #64040, can you take a look?