In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional expression.
We then create two arithmetic operations of the same type and feed a final
select operation using the result of the true arithmetic for the true operand
and the result of the false arithmetic for the false operand and reuse the original
conditionl expression. The arithmetic operations are naturally folded as a
consequence, leaving only the newly formed select to replace the old arithmetic operation.
Details
Diff Detail
Event Timeline
lib/Transforms/InstCombine/InstCombineAddSub.cpp | ||
---|---|---|
1410–1411 | Shouldn't these copy the fast math flags from the original add? |
Please combine your two files into one and reduce the testcases a little bit. No need for allocas, loads or stores.
Is this potentially better handled by extracting the code from the bottom of SimplifyUsingDistributiveLaws that deals with selects on both arms of a binop. The fact that simplifyUsingDistributiveLaws isn't called on fadd/fsub/fmul is causing this to be handled differently that integer add/sub/mul.
Should i add a fsub case in the tests?
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp | ||
---|---|---|
752 | Ditto here regarding fastmath |
I don't think that's right for the fastmath flags. In your example case the returned instruction is a select. That's not the instruction that should be getting the fastmath flags. I'm suprised that doesn't assert.
Is SimplifyUsingDistributiveLaws safe for the non constant cases of fmul/fadd/fsub when not using fast math?
So after looking in depth at SimplifyUsingDistributiveLaws, its pretty clear
that it is not safe when fast math is not available for the cases of
fadd/fsub/fmul. So I have reverted the code to the first form, while placing
the fast math flag copies on instructions that might not fold, which
should only happen rarely. I also added a test case for fsub as well.
I was only asking about using this portion of simplifyUsingDistributiveLaws
// (op (select (a, c, b)), (select (a, d, b))) -> (select (a, (op c, d), 0)) // (op (select (a, b, c)), (select (a, b, d))) -> (select (a, 0, (op c, d))) if (auto *SI0 = dyn_cast<SelectInst>(LHS)) { if (auto *SI1 = dyn_cast<SelectInst>(RHS)) { if (SI0->getCondition() == SI1->getCondition()) { Value *SI = nullptr; if (Value *V = SimplifyBinOp(TopLevelOpcode, SI0->getFalseValue(), SI1->getFalseValue(), SQ.getWithInstruction(&I))) SI = Builder.CreateSelect(SI0->getCondition(), Builder.CreateBinOp(TopLevelOpcode, SI0->getTrueValue(), SI1->getTrueValue()), V); if (Value *V = SimplifyBinOp(TopLevelOpcode, SI0->getTrueValue(), SI1->getTrueValue(), SQ.getWithInstruction(&I))) SI = Builder.CreateSelect( SI0->getCondition(), V, Builder.CreateBinOp(TopLevelOpcode, SI0->getFalseValue(), SI1->getFalseValue())); if (SI) { SI->takeName(&I); return SI; } } } }
And since that's defering all of the intelligence to InstSimplify, I really hope its safe without fastmath. And it would never create a new instruction because InstSimplify isn't alllowed to. I believe it would also handle "select C, 0, B + select C, A, 0 -> select C, A, B"
lib/Transforms/InstCombine/InstCombineAddSub.cpp | ||
---|---|---|
1412 | I don't think this can ever be an Instruction. If you started with two Constants, at worst you'd get back a ConstantExpr here and you need to set the fast math flags on that. |
These changes support the fast math flag and the contexts according to the last note
on using the code snippet in InstCombining under SimplifyUsingDistributiveLaws, which I made a
helper function with support for handling fast math if required. The changes
pass make check-llvm.
Craig/David: Are there any issues that need to be resolved for this code review to conclude and move towards check in?
lib/Transforms/InstCombine/InstCombineAddSub.cpp | ||
---|---|---|
1412 | In theory, one could decide not to use the constant folder in the builder so it could happen, couldn't it? | |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
721 ↗ | (On Diff #112407) | The comment seems weird to me. Also I don't get why there is two of them :). |
734 ↗ | (On Diff #112407) | |
test/Transforms/InstCombine/select_arithmetic.ll | ||
3 | Could you add a comment on what this test case is checking? | |
7 | Please use opt -instnamer to get rid of the implicit variables. |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
721 ↗ | (On Diff #112407) | I screwed up the variable naming but you get it! (op something something instead of 0) |
As I do not yet have check-in privalges, can someone check-in this code
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
721 ↗ | (On Diff #112407) | I opted for this comment: // (op (select (a, b, c)), (select (a, d, e))) -> (select (a, (op b, d), (op c, e))) and augmented the code to fit the comment |
LGTM with nits addressed.
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
714 ↗ | (On Diff #114883) | I'd rename it something a little more unique/descriptive like SimplifySelectsFeedingBinaryOp |
727 ↗ | (On Diff #114883) | Could you assign this on the prior line? I don't think you save anything by doing the assignment in the if statement because the variable is used later. |
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
735 ↗ | (On Diff #115740) | I think you always pass true for CarryFastMathFlags. Is the goal to avoid invalid access to getFastMathFlags ? If so, I'd instead do something like if (isa<FPMathOperator>(&I)). |
736 ↗ | (On Diff #115740) | Forgive me if I am wrong but wouldn't this mutate the state of the builder beyond your intended use? I think you need a IRBuilder<>::FastMathFlagGuard. |
740 ↗ | (On Diff #115740) | I think the common custom in LLVM is to simply say if (V1). |
test/Transforms/InstCombine/select_arithmetic.ll | ||
42–48 | I don't think these add to the test, I'd remove them. |
Ok, that is a nice way to guard FP ops, subsequently I have removed the flag argument in this patch to rely on that and on whatever the BinaryOperator has set on it. The test file is also updated and a subsequent test run is clean.
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
735–749 ↗ | (On Diff #115860) | I think you can reduce the duplication by doing something like: BuilderTy::FastMathFlagGuard Guard(Builder); if (isa<FPMathOperator>(&I)) Builder.setFastMathFlags(I.getFastMathFlags()); Value *V1 = SimplifyBinOp(Opcode, C, E, SQ.getWithInstruction(&I)); SI = V1 ? Builder.CreateSelect(A1, Builder.CreateBinOp(Opcode, B, D), V1) : SI; if (Value *V2 = SimplifyBinOp(Opcode, B, D, SQ.getWithInstruction(&I))) SI = V1 ? Builder.CreateSelect(A1, V2, V1) : Builder.CreateSelect(A1, V2, Builder.CreateBinOp(Opcode, C, E)); if (SI) { SI->takeName(&I); return SI; } The guard will harmlessly no-op when we don't have an FP operator. |
Ok, and the guard is of no effect when the fastmath flags are not set. Testing is still clean.
lib/Transforms/InstCombine/InstructionCombining.cpp | ||
---|---|---|
731–733 ↗ | (On Diff #115864) | You can simplify this a little using m_Specific: if (match(LHS, m_Select(m_Value(A), m_Value(B), m_Value(C))) && match(RHS, m_Select(m_Specific(A), m_Value(D), m_Value(E)))) { This also lets us remove a level of indentation. |
739 ↗ | (On Diff #115864) | Are there situations where we'd unnecessarily CreateSelect? My reading of the code is that we will throw away this SelectInst if both V1 and V2 end up being true. Maybe something like the following would work: Value *SI = nullptr; Value *V1 = SimplifyBinOp(Opcode, C, E, SQ.getWithInstruction(&I)); Value *V2 = SimplifyBinOp(Opcode, B, D, SQ.getWithInstruction(&I)); if (V1 && V2) { SI = Builder.CreateSelect(A1, V2, V1); } else if (V2) { SI = Builder.CreateSelect(A1, V2, Builder.CreateBinOp(Opcode, C, E)); } else if (V1) { SI = Builder.CreateSelect(A1, Builder.CreateBinOp(Opcode, B, D), V1); } if (SI) { SI->takeName(&I); return SI; } |
739–741 ↗ | (On Diff #115864) | It is uncommon in LLVM to parenthesize the condition of a ternary expression. |
Unfortunately, we're seeing a 4.7% regression in SPEC2006/lbm due to this change. The number of dynamic instructions is increased by ~5.5%.
We're running on AArch64/Falkor with the following flags: -O3 -fno-math-errno -ffp-contract=fast -fomit-frame-pointer
All of the code changes are in LBM_performStreamCollide(), which accounts for ~98% of execution.
I'll keep digging so I can provide more details, but I wanted to report the regression.
Shouldn't these copy the fast math flags from the original add?