This is an archive of the discontinued LLVM Phabricator instance.

Add select simplifications
ClosedPublic

Authored by mcberg2017 on Aug 22 2017, 12:57 PM.

Details

Summary

In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional expression.
We then create two arithmetic operations of the same type and feed a final
select operation using the result of the true arithmetic for the true operand
and the result of the false arithmetic for the false operand and reuse the original
conditionl expression. The arithmetic operations are naturally folded as a
consequence, leaving only the newly formed select to replace the old arithmetic operation.

Diff Detail

Repository
rL LLVM

Event Timeline

mcberg2017 created this revision.Aug 22 2017, 12:57 PM
majnemer added inline comments.Aug 22 2017, 1:23 PM
lib/Transforms/InstCombine/InstCombineAddSub.cpp
1408–1409 ↗(On Diff #112210)

Shouldn't these copy the fast math flags from the original add?

majnemer edited edge metadata.Aug 22 2017, 1:23 PM

Please combine your two files into one and reduce the testcases a little bit. No need for allocas, loads or stores.

Is this potentially better handled by extracting the code from the bottom of SimplifyUsingDistributiveLaws that deals with selects on both arms of a binop. The fact that simplifyUsingDistributiveLaws isn't called on fadd/fsub/fmul is causing this to be handled differently that integer add/sub/mul.

mcberg2017 marked an inline comment as done.

I will add the fastmath flag copy in as well, as tryFactorization does not do that.

Updated for fast math flags

Should i add a fsub case in the tests?

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
749 ↗(On Diff #112210)

Ditto here regarding fastmath

I don't think that's right for the fastmath flags. In your example case the returned instruction is a select. That's not the instruction that should be getting the fastmath flags. I'm suprised that doesn't assert.

Is SimplifyUsingDistributiveLaws safe for the non constant cases of fmul/fadd/fsub when not using fast math?

So after looking in depth at SimplifyUsingDistributiveLaws, its pretty clear
that it is not safe when fast math is not available for the cases of
fadd/fsub/fmul. So I have reverted the code to the first form, while placing
the fast math flag copies on instructions that might not fold, which
should only happen rarely. I also added a test case for fsub as well.

mcberg2017 marked an inline comment as done.Aug 22 2017, 5:41 PM

I was only asking about using this portion of simplifyUsingDistributiveLaws

// (op (select (a, c, b)), (select (a, d, b))) -> (select (a, (op c, d), 0))
// (op (select (a, b, c)), (select (a, b, d))) -> (select (a, 0, (op c, d)))
if (auto *SI0 = dyn_cast<SelectInst>(LHS)) {
  if (auto *SI1 = dyn_cast<SelectInst>(RHS)) {
    if (SI0->getCondition() == SI1->getCondition()) {
      Value *SI = nullptr;
      if (Value *V =
              SimplifyBinOp(TopLevelOpcode, SI0->getFalseValue(),
                            SI1->getFalseValue(), SQ.getWithInstruction(&I)))
        SI = Builder.CreateSelect(SI0->getCondition(),
                                  Builder.CreateBinOp(TopLevelOpcode,
                                                      SI0->getTrueValue(),
                                                      SI1->getTrueValue()),
                                  V);
      if (Value *V =
              SimplifyBinOp(TopLevelOpcode, SI0->getTrueValue(),
                            SI1->getTrueValue(), SQ.getWithInstruction(&I)))
        SI = Builder.CreateSelect(
            SI0->getCondition(), V,
            Builder.CreateBinOp(TopLevelOpcode, SI0->getFalseValue(),
                                SI1->getFalseValue()));
      if (SI) {
        SI->takeName(&I);
        return SI;
      }
    }
  }
}

And since that's defering all of the intelligence to InstSimplify, I really hope its safe without fastmath. And it would never create a new instruction because InstSimplify isn't alllowed to. I believe it would also handle "select C, 0, B + select C, A, 0 -> select C, A, B"

craig.topper added inline comments.Aug 22 2017, 5:57 PM
lib/Transforms/InstCombine/InstCombineAddSub.cpp
1410 ↗(On Diff #112272)

I don't think this can ever be an Instruction. If you started with two Constants, at worst you'd get back a ConstantExpr here and you need to set the fast math flags on that.

These changes support the fast math flag and the contexts according to the last note
on using the code snippet in InstCombining under SimplifyUsingDistributiveLaws, which I made a
helper function with support for handling fast math if required. The changes
pass make check-llvm.

Craig/David: Are there any issues that need to be resolved for this code review to conclude and move towards check in?

qcolombet added inline comments.Sep 11 2017, 3:46 PM
lib/Transforms/InstCombine/InstCombineAddSub.cpp
1410 ↗(On Diff #112272)

In theory, one could decide not to use the constant folder in the builder so it could happen, couldn't it?

lib/Transforms/InstCombine/InstructionCombining.cpp
721 ↗(On Diff #112407)

The comment seems weird to me.
Shouldn't it be:
(op (select (a, c, b)), (select (a, d, b))) -> (select (a, (op c, d), (op d, b)))?

Also I don't get why there is two of them :).

734 ↗(On Diff #112407)

When SI != nullptr, shouldn't we use SI instead of CreateBinOp for B1, B2, here?

test/Transforms/InstCombine/select_arithmetic.ll
2 ↗(On Diff #112407)

Could you add a comment on what this test case is checking?

6 ↗(On Diff #112407)

Please use opt -instnamer to get rid of the implicit variables.

qcolombet added inline comments.Sep 11 2017, 3:50 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
721 ↗(On Diff #112407)

I screwed up the variable naming but you get it! (op something something instead of 0)

qcolombet added inline comments.Sep 11 2017, 4:03 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
734 ↗(On Diff #112407)

I meant the previous V (the one we got from the call of SimplifyBinOp B1 B2).
(Gosh I need more coffee!)

mcberg2017 marked 3 inline comments as done.

Updated for latest review.

qcolombet accepted this revision.Sep 12 2017, 2:40 PM

LGTM.

Thanks Michael.

This revision is now accepted and ready to land.Sep 12 2017, 2:40 PM

As I do not yet have check-in privalges, can someone check-in this code

lib/Transforms/InstCombine/InstructionCombining.cpp
721 ↗(On Diff #112407)

I opted for this comment:

// (op (select (a, b, c)), (select (a, d, e))) -> (select (a, (op b, d), (op c, e)))

and augmented the code to fit the comment

@majnemer
Hi David,

Does the patch LG to you before I commit on Michael's behalf?

Cheers,
-Quentin

majnemer accepted this revision.Sep 18 2017, 11:04 AM

LGTM with nits addressed.

lib/Transforms/InstCombine/InstructionCombining.cpp
714 ↗(On Diff #114883)

I'd rename it something a little more unique/descriptive like SimplifySelectsFeedingBinaryOp

727 ↗(On Diff #114883)

Could you assign this on the prior line? I don't think you save anything by doing the assignment in the if statement because the variable is used later.

Post review update.

majnemer requested changes to this revision.Sep 18 2017, 10:18 PM
majnemer added inline comments.
lib/Transforms/InstCombine/InstructionCombining.cpp
735 ↗(On Diff #115740)

I think you always pass true for CarryFastMathFlags. Is the goal to avoid invalid access to getFastMathFlags ? If so, I'd instead do something like if (isa<FPMathOperator>(&I)).

736 ↗(On Diff #115740)

Forgive me if I am wrong but wouldn't this mutate the state of the builder beyond your intended use? I think you need a IRBuilder<>::FastMathFlagGuard.

740 ↗(On Diff #115740)

I think the common custom in LLVM is to simply say if (V1).

test/Transforms/InstCombine/select_arithmetic.ll
41–47 ↗(On Diff #115740)

I don't think these add to the test, I'd remove them.

This revision now requires changes to proceed.Sep 18 2017, 10:18 PM
mcberg2017 edited edge metadata.

Ok, that is a nice way to guard FP ops, subsequently I have removed the flag argument in this patch to rely on that and on whatever the BinaryOperator has set on it. The test file is also updated and a subsequent test run is clean.

Increased scope for fastmath guard on builder and separated scope for non fp ops

mcberg2017 marked 8 inline comments as done.Sep 19 2017, 10:20 AM
majnemer added inline comments.Sep 19 2017, 10:27 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
735–749 ↗(On Diff #115860)

I think you can reduce the duplication by doing something like:

BuilderTy::FastMathFlagGuard Guard(Builder);
if (isa<FPMathOperator>(&I))
  Builder.setFastMathFlags(I.getFastMathFlags());

Value *V1 = SimplifyBinOp(Opcode, C, E, SQ.getWithInstruction(&I));
SI = V1 ? Builder.CreateSelect(A1, Builder.CreateBinOp(Opcode, B, D), V1) : SI;
if (Value *V2 = SimplifyBinOp(Opcode, B, D, SQ.getWithInstruction(&I)))
  SI = V1 ? Builder.CreateSelect(A1, V2, V1) : Builder.CreateSelect(A1, V2, Builder.CreateBinOp(Opcode, C, E));

if (SI) {
  SI->takeName(&I);
  return SI;
}

The guard will harmlessly no-op when we don't have an FP operator.

Ok, and the guard is of no effect when the fastmath flags are not set. Testing is still clean.

mcberg2017 marked an inline comment as done.Sep 19 2017, 10:41 AM
majnemer added inline comments.Sep 19 2017, 10:51 AM
lib/Transforms/InstCombine/InstructionCombining.cpp
731–733 ↗(On Diff #115864)

You can simplify this a little using m_Specific:

if (match(LHS, m_Select(m_Value(A), m_Value(B), m_Value(C))) &&
    match(RHS, m_Select(m_Specific(A), m_Value(D), m_Value(E)))) {

This also lets us remove a level of indentation.

739 ↗(On Diff #115864)

Are there situations where we'd unnecessarily CreateSelect?

My reading of the code is that we will throw away this SelectInst if both V1 and V2 end up being true.

Maybe something like the following would work:

Value *SI = nullptr;
Value *V1 = SimplifyBinOp(Opcode, C, E, SQ.getWithInstruction(&I));
Value *V2 = SimplifyBinOp(Opcode, B, D, SQ.getWithInstruction(&I));
if (V1 && V2) {
  SI = Builder.CreateSelect(A1, V2, V1);
} else if (V2) {
  SI = Builder.CreateSelect(A1, V2, Builder.CreateBinOp(Opcode, C, E));
} else if (V1) {
  SI = Builder.CreateSelect(A1, Builder.CreateBinOp(Opcode, B, D), V1);
}

if (SI) {
  SI->takeName(&I);
  return SI;
}
739–741 ↗(On Diff #115864)

It is uncommon in LLVM to parenthesize the condition of a ternary expression.

Simplified to reduce lexical depth, removed a return and removed ternaries.

mcberg2017 marked an inline comment as done.Sep 19 2017, 11:22 AM
mcberg2017 marked an inline comment as done.
This revision is now accepted and ready to land.Sep 19 2017, 11:42 AM
This revision was automatically updated to reflect the committed changes.

Unfortunately, we're seeing a 4.7% regression in SPEC2006/lbm due to this change. The number of dynamic instructions is increased by ~5.5%.

We're running on AArch64/Falkor with the following flags: -O3 -fno-math-errno -ffp-contract=fast -fomit-frame-pointer

All of the code changes are in LBM_performStreamCollide(), which accounts for ~98% of execution.

I'll keep digging so I can provide more details, but I wanted to report the regression.

mcrosier added a comment.EditedSep 26 2017, 12:11 PM

Unfortunately, we're seeing a 4.7% regression in SPEC2006/lbm due to this change. The number of dynamic instructions is increased by ~5.5%.

We're running on AArch64/Falkor with the following flags: -O3 -fno-math-errno -ffp-contract=fast -fomit-frame-pointer

All of the code changes are in LBM_performStreamCollide(), which accounts for ~98% of execution.

I'll keep digging so I can provide more details, but I wanted to report the regression.

Balaram posted a candidate fix here: https://reviews.llvm.org/D38288