This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Simplify and correct folding fcmps with the same children
Closed, Public

Authored by timshen on Jun 27 2016, 5:11 PM.

Details

Reviewers

Commits

rGaec68b263dc1: [InstCombine] Simplify and correct folding fcmps with the same children
rL274156: [InstCombine] Simplify and correct folding fcmps with the same children

Summary

Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.
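To make the bit-pattern idea concrete, here is a minimal standalone sketch. It assumes the FCmpInst::Predicate numbering (FCMP_FALSE = 0 through FCMP_TRUE = 15) and mirrors it in a local enum. The names Pred, relationBit, evalPred, andOfPreds and orOfPreds are hypothetical and are not part of LLVM or of this patch.

```cpp
#include <cassert>
#include <cmath>

// Local mirror of the assumed FCmpInst::Predicate values.
// Bit meaning: 8 = unordered (U), 4 = less (L), 2 = greater (G), 1 = equal (E).
enum Pred : unsigned {
  P_FALSE = 0, P_OEQ = 1, P_OGT = 2,  P_OGE = 3,  P_OLT = 4,  P_OLE = 5,
  P_ONE = 6,   P_ORD = 7, P_UNO = 8,  P_UEQ = 9,  P_UGT = 10, P_UGE = 11,
  P_ULT = 12,  P_ULE = 13, P_UNE = 14, P_TRUE = 15
};

// Exactly one of the four relation bits holds for any pair of doubles.
inline unsigned relationBit(double X, double Y) {
  if (std::isnan(X) || std::isnan(Y))
    return 8; // unordered
  if (X < Y)
    return 4; // less
  if (X > Y)
    return 2; // greater
  return 1;   // equal
}

// A predicate is true when its mask contains the bit of the actual relation.
inline bool evalPred(unsigned P, double X, double Y) {
  assert(P <= P_TRUE && "mask must fit in four bits");
  return (P & relationBit(X, Y)) != 0;
}

// Because predicates are sets of relations, logical and/or of two compares on
// the same operands is bitwise and/or of the masks.
inline unsigned andOfPreds(unsigned P, unsigned Q) { return P & Q; }
inline unsigned orOfPreds(unsigned P, unsigned Q) { return P | Q; }
```

Under these assumptions, andOfPreds(P_UEQ, P_ORD) yields P_OEQ, which is the fold the existing @t1 test in and-fcmp.ll expects, and orOfPreds(P_OEQ, P_UNE) yields P_TRUE.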

Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.
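A small sketch of why that fold is wrong, modelling the two predicates with plain C++ comparisons on doubles rather than with LLVM itself. All function names here are hypothetical helpers for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// fcmp ogt: ordered and greater-than. C++ '>' is already false when either
// operand is NaN.
inline bool fcmpOGT(double X, double Y) { return X > Y; }

// fcmp ord: neither operand is NaN.
inline bool fcmpORD(double X, double Y) {
  return !std::isnan(X) && !std::isnan(Y);
}

// The expression before folding: (fcmp ogt x, y) | (fcmp ord x, y).
inline bool originalExpr(double X, double Y) {
  return fcmpOGT(X, Y) || fcmpORD(X, Y);
}

// What the old fold produced, and what the expression really simplifies to.
inline bool oldFold(double X, double Y) { return fcmpOGT(X, Y); }
inline bool fixedFold(double X, double Y) { return fcmpORD(X, Y); }

inline void demonstrateMiscompile() {
  // x < y: the operands are ordered, so the original is true, yet ogt is false.
  assert(originalExpr(1.0, 2.0));
  assert(!oldFold(1.0, 2.0));
  assert(fixedFold(1.0, 2.0));

  // With a NaN operand both the original and the corrected fold are false.
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  assert(!originalExpr(NaN, 1.0));
  assert(!fixedFold(NaN, 1.0));
}
```

Any ordered pair with x <= y exposes the difference: the original expression is true there, while ogt alone is false.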

Diff Detail

Event Timeline

timshen updated this revision to Diff 62046. Jun 27 2016, 5:11 PM

timshen retitled this revision to [InstCombine] Simplify and correct folding fcmps with the same children.

timshen updated this object.

timshen added a reviewer: spatel.

timshen added subscribers: echristo, iteratee, llvm-commits.

This looks fantastic...if it's true. :)

So that's my problem: before we even get to this patch, we only have ~13 tests covering the 256 possible (16 * 16 predicates * 2 (and+or) / 2 for commutation) combinations of predicates.

Is there some clever way to get more coverage from those tests? If there's nothing obvious, we might as well just copy/paste the entire set into the existing regression test files. It's cheap. That way, we're certain that we're choosing the right predicate in all cases. The regression tests would then also serve as an unofficial assert that the enum values never change from their perfectly chosen values. IMO, an official assert and some loud comments in the code are warranted too.
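One way to gain confidence in every combination outside the regression tests is a brute-force check. The sketch below is a standalone program fragment, not LLVM code. It assumes the predicate numbering 0 (false) through 15 (true), evaluates each predicate from its documented meaning, and compares that against the bitwise and/or of the predicate numbers. The helper names evalFCmp and countFoldMismatches are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Evaluates predicate number P directly from its meaning, without relying on
// the bit trick. The numbering is assumed to match FCmpInst::Predicate.
inline bool evalFCmp(unsigned P, double X, double Y) {
  assert(P < 16 && "predicate number out of range");
  const bool Uno = std::isnan(X) || std::isnan(Y);
  switch (P) {
  case 0:  return false;           // false
  case 1:  return !Uno && X == Y;  // oeq
  case 2:  return !Uno && X > Y;   // ogt
  case 3:  return !Uno && X >= Y;  // oge
  case 4:  return !Uno && X < Y;   // olt
  case 5:  return !Uno && X <= Y;  // ole
  case 6:  return !Uno && X != Y;  // one
  case 7:  return !Uno;            // ord
  case 8:  return Uno;             // uno
  case 9:  return Uno || X == Y;   // ueq
  case 10: return Uno || X > Y;    // ugt
  case 11: return Uno || X >= Y;   // uge
  case 12: return Uno || X < Y;    // ult
  case 13: return Uno || X <= Y;   // ule
  case 14: return Uno || X != Y;   // une
  default: return true;            // true (15)
  }
}

// Counts the cases where folding by bitwise and/or of the predicate numbers
// disagrees with evaluating the two compares separately.
inline int countFoldMismatches() {
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  // One sample per possible relation: less, greater, equal, unordered.
  const double Samples[4][2] = {{1.0, 2.0}, {2.0, 1.0}, {1.0, 1.0}, {NaN, 1.0}};
  int Mismatches = 0;
  for (unsigned P = 0; P < 16; ++P)
    for (unsigned Q = 0; Q < 16; ++Q)
      for (const auto &S : Samples) {
        const bool A = evalFCmp(P, S[0], S[1]);
        const bool B = evalFCmp(Q, S[0], S[1]);
        if (evalFCmp(P & Q, S[0], S[1]) != (A && B))
          ++Mismatches;
        if (evalFCmp(P | Q, S[0], S[1]) != (A || B))
          ++Mismatches;
      }
  return Mismatches;
}
```

A result of zero means the numbering has the required property for all 16 * 16 predicate pairs. This complements the regression tests rather than replacing them, since it does not exercise InstCombine itself.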

Whether or not we agree that we should have the full regression test coverage, please note that there's a script to generate exact checks for regression tests:
$ utils/update_test_checks.py --help

This removes the need for CHECK-NOTs.

I've updated the old tests using that script in rL274046 and rL274047 .

If there's nothing obvious, we might as well just copy/paste the entire set into the existing regression test files.

I guess I missed a bit of context. Copy from where? By saying regression test files do you mean test/Transforms/InstCombine/and-fcmp.ll and test/Transforms/InstCombine/or-fcmp.ll?

Besides, auto-generating the CHECKs (assertions?) won't help with LLVM code coverage, will it? If not, should we generate all 512 cases (without deduplicating commuted pairs), since that's not many? But then we have to test the generator... :P.

In D21775#469307, @timshen wrote:

If there's nothing obvious, we might as well just copy/paste the entire set into the existing regression test files.

I guess I missed a bit of context. Copy from where? By saying regression test files do you mean test/Transforms/InstCombine/and-fcmp.ll and test/Transforms/InstCombine/or-fcmp.ll?

Yes, the files under the 'test' dir are the regression tests:
http://llvm.org/docs/TestingGuide.html#llvm-testing-infrastructure-organization

Sorry, 'copy/paste' wasn't the best phrase. I just meant we should stamp out all N of the logical combinations of predicates. Right now, we don't really know what the existing behavior for each combination looks like.

Besides, auto-generating the CHECKs (assertions?) won't help with LLVM code coverage, will it? If not, should we generate all 512 cases (without deduplicating commuted pairs), since that's not many? But then we have to test the generator... :P.

I don't understand what you mean by 'test the generator'?

Here's how I would proceed:

1. Create all N tests in the existing test files. I don't think there's much value in the commuted variants, so this would actually be N = 16*17 = 272 I think.
2. Run the script to generate the CHECK lines.
3. Commit these test changes to trunk.
4. Now, we know what predicates the existing code produces (hopefully there are no existing bugs).
5. Apply your code patch.
6. Regenerate the CHECK lines using the script.
7. Update the patch here with those diffs, so we can see any functional differences from your change in logic.
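The first step could be scripted along these lines. This is only a sketch of a hypothetical generator, not the tool actually used for this review. The function name generateTests, the test names and their numbering are assumptions, and the CHECK lines are left for utils/update_test_checks.py to fill in.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Predicate spellings in the assumed FCmpInst::Predicate order (0..15).
static const char *const PredNames[16] = {
    "false", "oeq", "ogt", "oge", "olt", "ole", "one", "ord",
    "uno",   "ueq", "ugt", "uge", "ult", "ule", "une", "true"};

// Emits one test function per unordered predicate pair (P <= Q) for the given
// opcode ("and" or "or"), returns the IR text and stores the number of
// functions in Count.
inline std::string generateTests(const std::string &Opcode, int &Count) {
  assert((Opcode == "and" || Opcode == "or") && "unexpected opcode");
  std::ostringstream OS;
  Count = 0;
  for (int P = 0; P < 16; ++P)
    for (int Q = P; Q < 16; ++Q) {
      OS << "define i1 @auto_gen_" << Count << "(double %a, double %b) {\n"
         << "  %cmp = fcmp " << PredNames[P] << " double %a, %b\n"
         << "  %cmp1 = fcmp " << PredNames[Q] << " double %a, %b\n"
         << "  %retval = " << Opcode << " i1 %cmp, %cmp1\n"
         << "  ret i1 %retval\n"
         << "}\n\n";
      ++Count;
    }
  return OS.str();
}
```

Skipping commuted variants gives 16 * 17 / 2 = 136 functions per opcode, so 272 across and-fcmp.ll and or-fcmp.ll, which matches the N quoted above.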

Create all N tests in the existing test files. I don't think there's much value in the commuted variants, so this would actually be N = 16*17 = 272 I think.

Run the script to generate the CHECK lines.

Commit these test changes to trunk.

I prefer not to commit this to the trunk, but only locally, since @t6 in or-fcmp.ll is testing an actual existing bug (sorry for not stating that explicitly). It'd be weird to check in something that is known to be wrong, and make it pass later.

How about I compare the auto-generated CHECKs and post the functional differences here in the comments, we inspect them, and then check in the patch with all N tests (with correct CHECKs) together?

timshen updated this object. Jun 28 2016, 5:40 PM

timshen edited edge metadata.

The functional change is here:
https://gist.github.com/timshen91/51b209c8d1be22ef96a1a9fd4131595d#file-fcmp-diff-diff

I verified the differences, and considered them valid. PTAL. Specifically, there are bug fixes from line 2586 to line 2643; others are just optimizations.

Updated the patch with all 272 tests.

Sounds good. Thanks for checking.

Updated comments, and moved the changed (actually perfectly handled) cases ahead of the other heuristics. The move catches two more optimizations; see this local diff:

diff --git a/test/Transforms/InstCombine/and-fcmp.ll b/test/Transforms/InstCombine/and-fcmp.ll
index 8ad6779..fa9e441 100644
--- a/test/Transforms/InstCombine/and-fcmp.ll
+++ b/test/Transforms/InstCombine/and-fcmp.ll
@@ -546,10 +546,8 @@ bb:
 define i1 @auto_gen_35(double %a, double %b) {
 ; CHECK-LABEL: @auto_gen_35(
 ; CHECK-NEXT:  bb:
-; CHECK-NEXT:    [[CMP:%.*]] = fcmp ord double %a, %b
-; CHECK-NEXT:    [[CMP1:%.*]] = fcmp ord double %a, %b
-; CHECK-NEXT:    [[RETVAL:%.*]] = and i1 [[CMP]], [[CMP1]]
-; CHECK-NEXT:    ret i1 [[RETVAL]]
+; CHECK-NEXT:    [[TMP0:%.*]] = fcmp ord double %a, %b
+; CHECK-NEXT:    ret i1 [[TMP0]]
 ;
 bb:
   %cmp = fcmp ord double %a, %b
diff --git a/test/Transforms/InstCombine/or-fcmp.ll b/test/Transforms/InstCombine/or-fcmp.ll
index f961cfa..9d743c7 100644
--- a/test/Transforms/InstCombine/or-fcmp.ll
+++ b/test/Transforms/InstCombine/or-fcmp.ll
@@ -1579,10 +1579,8 @@ bb:
 define i1 @auto_gen_119(double %a, double %b) {
 ; CHECK-LABEL: @auto_gen_119(
 ; CHECK-NEXT:  bb:
-; CHECK-NEXT:    [[CMP:%.*]] = fcmp uno double %a, %b
-; CHECK-NEXT:    [[CMP1:%.*]] = fcmp uno double %a, %b
-; CHECK-NEXT:    [[RETVAL:%.*]] = or i1 [[CMP]], [[CMP1]]
-; CHECK-NEXT:    ret i1 [[RETVAL]]
+; CHECK-NEXT:    [[TMP0:%.*]] = fcmp uno double %a, %b
+; CHECK-NEXT:    ret i1 [[TMP0]]
 ;
 bb:
   %cmp = fcmp uno double %a, %b

In D21775#469456, @timshen wrote:

Create all N tests in the existing test files. I don't think there's much value in the commuted variants, so this would actually be N = 16*17 = 272 I think.

Run the script to generate the CHECK lines.

Commit these test changes to trunk.

I prefer not to commit this to the trunk, but only locally, since @t6 in or-fcmp.ll is testing an actual existing bug (sorry for not stating that explicitly). It'd be weird to check in something that is known to be wrong, and make it pass later.

How about I compare the auto-generated CHECKs and post the functional differences here in the comments, we inspect them, and then check in the patch with all N tests (with correct CHECKs) together?

I started looking over the diffs in the gist, but I'm going to suggest again that you check in the tests with current output as a preliminary step to this patch. IMO, the fact that we have miscompiles is more reason to do it that way. The advantages are:

There's a direct before/after comparison in the commit log of the functional changes from this patch.
Post-commit reviewers can easily parse those diffs inline in email.
It's also easier for us to review the test diffs here in Phab in one shot alongside the code changes.

There should be no concern about documenting buggy behavior in a regression test; I do this all the time. Then there's no guesswork about the old codegen. Just add "; FIXME: miscompile" or "; FIXME: missed fold" before tests that you know are misbehaving.

To substantially reduce the size of the patch: remove the "bb:" label from each test and the corresponding CHECK line. These are all single basic-block tests, so there's no value in those lines.

SGTM.

timshen added a parent revision: D21844: [InstCombine] Add full tests for FoldAndOfFCmps and FoldOrOfFCmps. Jun 29 2016, 10:26 AM

spatel added inline comments. Jun 29 2016, 11:21 AM

test/Transforms/InstCombine/and-fcmp.ll
2–3	As you noted, these renaming changes in the scripted output are caused because the script uses the variable names in the IR to create FileCheck variable names. The reason the IR names change with your patch is because the old code would reuse existing instructions, for example: if (Op0CC == FCmpInst::FCMP_TRUE) return RHS; ...but the new code always creates a new instruction. So the old checks show that the instruction has the same variable name as the original test case, but the new checks show a new temp name (%1). For the sake of cleanliness (and I think Eric was suggesting the same thing), can you remove these diffs? This could be done by either checking in the NFC changes before this patch or removing these hunks from this patch.

Updated to rebase onto D21775

timshen added a parent revision: D21855: [InstCombine, NFC] Change the generated variable names by creating new instructions. Jun 29 2016, 11:58 AM

if (Op0CC == FCmpInst::FCMP_TRUE)
return RHS;

I see. I created D21855.

Some good boolean logic puzzles in these changes. :)
See inline comments for some nits, then LGTM.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
46	Take the advantage -> Take advantage
49	I would remove the message parameter from all of these static_asserts to tighten it up. If someone tries to change these values, it should be obvious why we don't want that. On that note, can you add a comment to the definition of CmpInst::Predicate in InstrTypes.h that says something like, "These enum values have the magical property of matching the logical and/or of the predicates. Any changes to these values will require changes to InstCombine."
62–63	...or a constant true/false value.
1091–1099	This comment and the similar one below didn't make anything clearer to me. I would leave it out, but that's just my suggestion.
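The static_asserts discussed in these comments are not visible in Diff 62046 as shown on this page, so the following is only a sketch of what pinning the enum values at compile time could look like. It uses a hypothetical local enum in place of FCmpInst::Predicate, and the helper names are assumptions. An empty message string is kept so the code compiles before C++17; from C++17 on the message can be dropped entirely.

```cpp
#include <cassert>

// Hypothetical stand-in for the predicate enum; the values mirror the assumed
// FCmpInst::Predicate numbering.
enum FCmpPredicate : unsigned {
  FCMP_FALSE = 0, FCMP_OEQ = 1, FCMP_OGT = 2,  FCMP_OGE = 3,
  FCMP_OLT = 4,   FCMP_OLE = 5, FCMP_ONE = 6,  FCMP_ORD = 7,
  FCMP_UNO = 8,   FCMP_UEQ = 9, FCMP_UGT = 10, FCMP_UGE = 11,
  FCMP_ULT = 12,  FCMP_ULE = 13, FCMP_UNE = 14, FCMP_TRUE = 15
};

// Bits, most to least significant: U (unordered), L (less), G (greater),
// E (equal). Any renumbering of the enum fails to compile here.
//                                     U L G E
static_assert(FCMP_FALSE == 0, "");  // 0 0 0 0
static_assert(FCMP_OEQ == 1, "");    // 0 0 0 1
static_assert(FCMP_OGT == 2, "");    // 0 0 1 0
static_assert(FCMP_OGE == 3, "");    // 0 0 1 1
static_assert(FCMP_OLT == 4, "");    // 0 1 0 0
static_assert(FCMP_OLE == 5, "");    // 0 1 0 1
static_assert(FCMP_ONE == 6, "");    // 0 1 1 0
static_assert(FCMP_ORD == 7, "");    // 0 1 1 1
static_assert(FCMP_UNO == 8, "");    // 1 0 0 0
static_assert(FCMP_UEQ == 9, "");    // 1 0 0 1
static_assert(FCMP_UGT == 10, "");   // 1 0 1 0
static_assert(FCMP_UGE == 11, "");   // 1 0 1 1
static_assert(FCMP_ULT == 12, "");   // 1 1 0 0
static_assert(FCMP_ULE == 13, "");   // 1 1 0 1
static_assert(FCMP_UNE == 14, "");   // 1 1 1 0
static_assert(FCMP_TRUE == 15, "");  // 1 1 1 1

// With the values pinned, the code is simply the enum value.
constexpr unsigned getFCmpCode(FCmpPredicate P) {
  return static_cast<unsigned>(P);
}

// Inverse mapping, with a runtime range check on the four-bit code.
inline FCmpPredicate predicateFromCode(unsigned Code) {
  assert(Code <= FCMP_TRUE && "Unexpected FCmp code!");
  return static_cast<FCmpPredicate>(Code);
}
```

The design point is that the fold relies on specific numeric values, so a compile-time failure is a louder signal than a comment alone.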

This revision is now accepted and ready to land. Jun 29 2016, 12:27 PM

Updated comments as suggested.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1091–1099	The boolean logic deduction is quite different though. :) I removed the first duplicated paragraph in FoldOrOfFCmps.

Closed by commit rL274156: [InstCombine] Simplify and correct folding fcmps with the same children (authored by timshen). Jun 29 2016, 1:17 PM

This revision was automatically updated to reflect the committed changes.

timshen mentioned this in rL274155: [InstCombine, NFC] Change the generated variable names by creating new….

Revision Contents

Path                                                  Size
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp    124 lines
test/Transforms/InstCombine/and-fcmp.ll               12 lines
test/Transforms/InstCombine/or-fcmp.ll                23 lines

Diff 62046

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

@@ ... @@ static inline Value *dyn_castNotVal(Value *V) {
   // Constants can be considered to be not'ed values...
   if (ConstantInt *C = dyn_cast<ConstantInt>(V))
     return ConstantInt::get(C->getType(), ~C->getValue());
   return nullptr;
 }

 /// Similar to getICmpCode but for FCmpInst. This encodes a fcmp predicate into
-/// a three bit mask. It also returns whether it is an ordered predicate by
-/// reference.
-static unsigned getFCmpCode(FCmpInst::Predicate CC, bool &isOrdered) {
-  isOrdered = false;
-  switch (CC) {
-  case FCmpInst::FCMP_ORD: isOrdered = true; return 0;  // 000
-  case FCmpInst::FCMP_UNO:                   return 0;  // 000
-  case FCmpInst::FCMP_OGT: isOrdered = true; return 1;  // 001
-  case FCmpInst::FCMP_UGT:                   return 1;  // 001
-  case FCmpInst::FCMP_OEQ: isOrdered = true; return 2;  // 010
-  case FCmpInst::FCMP_UEQ:                   return 2;  // 010
-  case FCmpInst::FCMP_OGE: isOrdered = true; return 3;  // 011
-  case FCmpInst::FCMP_UGE:                   return 3;  // 011
-  case FCmpInst::FCMP_OLT: isOrdered = true; return 4;  // 100
-  case FCmpInst::FCMP_ULT:                   return 4;  // 100
-  case FCmpInst::FCMP_ONE: isOrdered = true; return 5;  // 101
-  case FCmpInst::FCMP_UNE:                   return 5;  // 101
-  case FCmpInst::FCMP_OLE: isOrdered = true; return 6;  // 110
-  case FCmpInst::FCMP_ULE:                   return 6;  // 110
-    // True -> 7
-  default:
-    // Not expecting FCMP_FALSE and FCMP_TRUE;
-    llvm_unreachable("Unexpected FCmp predicate!");
-  }
+/// a four bit mask.
+static unsigned getFCmpCode(FCmpInst::Predicate CC) {
+  assert(FCmpInst::FIRST_FCMP_PREDICATE <= CC &&
+         CC <= FCmpInst::LAST_FCMP_PREDICATE && "Unexpected FCmp predicate!");
+  return CC;
 }

 /// This is the complement of getICmpCode, which turns an opcode and two
 /// operands into either a constant true or false, or a brand new ICmp
 /// instruction. The sign is passed in to determine which kind of predicate to
 /// use in the new icmp instruction.
 static Value *getNewICmpValue(bool Sign, unsigned Code, Value *LHS, Value *RHS,
                               InstCombiner::BuilderTy *Builder) {
   ICmpInst::Predicate NewPred;
   if (Value *NewConstant = getICmpValue(Sign, Code, LHS, RHS, NewPred))
     return NewConstant;
   return Builder->CreateICmp(NewPred, LHS, RHS);
 }

 /// This is the complement of getFCmpCode, which turns an opcode and two
-/// operands into either a FCmp instruction. isordered is passed in to determine
-/// which kind of predicate to use in the new fcmp instruction.
-static Value *getFCmpValue(bool isordered, unsigned code,
-                           Value *LHS, Value *RHS,
+/// operands into either a FCmp instruction.
+static Value *getFCmpValue(unsigned Code, Value *LHS, Value *RHS,
                            InstCombiner::BuilderTy *Builder) {
-  CmpInst::Predicate Pred;
-  switch (code) {
-  default: llvm_unreachable("Illegal FCmp code!");
-  case 0: Pred = isordered ? FCmpInst::FCMP_ORD : FCmpInst::FCMP_UNO; break;
-  case 1: Pred = isordered ? FCmpInst::FCMP_OGT : FCmpInst::FCMP_UGT; break;
-  case 2: Pred = isordered ? FCmpInst::FCMP_OEQ : FCmpInst::FCMP_UEQ; break;
-  case 3: Pred = isordered ? FCmpInst::FCMP_OGE : FCmpInst::FCMP_UGE; break;
-  case 4: Pred = isordered ? FCmpInst::FCMP_OLT : FCmpInst::FCMP_ULT; break;
-  case 5: Pred = isordered ? FCmpInst::FCMP_ONE : FCmpInst::FCMP_UNE; break;
-  case 6: Pred = isordered ? FCmpInst::FCMP_OLE : FCmpInst::FCMP_ULE; break;
-  case 7:
-    if (!isordered)
-      return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 1);
-    Pred = FCmpInst::FCMP_ORD; break;
-  }
+  const auto Pred = static_cast<FCmpInst::Predicate>(Code);
+  assert(FCmpInst::FIRST_FCMP_PREDICATE <= Pred &&
+         Pred <= FCmpInst::LAST_FCMP_PREDICATE && "Unexpected FCmp predicate!");
+  if (Pred == FCmpInst::FCMP_FALSE)
+    return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 0);
+  if (Pred == FCmpInst::FCMP_TRUE)
+    return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 1);
   return Builder->CreateFCmp(Pred, LHS, RHS);
 }

 /// \brief Transform BITWISE_OP(BSWAP(A),BSWAP(B)) to BSWAP(BITWISE_OP(A, B))
 /// \param I Binary operator to transform.
 /// \return Pointer to node that must replace the original binary operator, or
 /// null pointer if no transformation was made.
 Value *InstCombiner::SimplifyBSwap(BinaryOperator &I) {
@@ ... @@ if (LHS->getPredicate() == FCmpInst::FCMP_ORD &&
     if (LHS->getOperand(0)->getType() != RHS->getOperand(0)->getType())
       return nullptr;

     // (fcmp ord x, c) & (fcmp ord y, c) -> (fcmp ord x, y)
     if (ConstantFP *LHSC = dyn_cast<ConstantFP>(LHS->getOperand(1)))
       if (ConstantFP *RHSC = dyn_cast<ConstantFP>(RHS->getOperand(1))) {
         // If either of the constants are nans, then the whole thing returns
         // false.
         if (LHSC->getValueAPF().isNaN() || RHSC->getValueAPF().isNaN())
           return Builder->getFalse();
         return Builder->CreateFCmpORD(LHS->getOperand(0), RHS->getOperand(0));
       }

     // Handle vector zeros. This occurs because the canonical form of
     // "fcmp ord x,x" is "fcmp ord x, 0".
     if (isa<ConstantAggregateZero>(LHS->getOperand(1)) &&
         isa<ConstantAggregateZero>(RHS->getOperand(1)))
       return Builder->CreateFCmpORD(LHS->getOperand(0), RHS->getOperand(0));
     return nullptr;
   }

   Value *Op0LHS = LHS->getOperand(0), *Op0RHS = LHS->getOperand(1);
   Value *Op1LHS = RHS->getOperand(0), *Op1RHS = RHS->getOperand(1);
   FCmpInst::Predicate Op0CC = LHS->getPredicate(), Op1CC = RHS->getPredicate();


   if (Op0LHS == Op1RHS && Op0RHS == Op1LHS) {
     // Swap RHS operands to match LHS.
     Op1CC = FCmpInst::getSwappedPredicate(Op1CC);
     std::swap(Op1LHS, Op1RHS);
   }

-  if (Op0LHS == Op1LHS && Op0RHS == Op1RHS) {
-    // Simplify (fcmp cc0 x, y) & (fcmp cc1 x, y).
-    if (Op0CC == Op1CC)
-      return Builder->CreateFCmp((FCmpInst::Predicate)Op0CC, Op0LHS, Op0RHS);
-    if (Op0CC == FCmpInst::FCMP_FALSE || Op1CC == FCmpInst::FCMP_FALSE)
-      return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 0);
-    if (Op0CC == FCmpInst::FCMP_TRUE)
-      return RHS;
-    if (Op1CC == FCmpInst::FCMP_TRUE)
-      return LHS;
-
-    bool Op0Ordered;
-    bool Op1Ordered;
-    unsigned Op0Pred = getFCmpCode(Op0CC, Op0Ordered);
-    unsigned Op1Pred = getFCmpCode(Op1CC, Op1Ordered);
-    // uno && ord -> false
-    if (Op0Pred == 0 && Op1Pred == 0 && Op0Ordered != Op1Ordered)
-      return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 0);
-    if (Op1Pred == 0) {
-      std::swap(LHS, RHS);
-      std::swap(Op0Pred, Op1Pred);
-      std::swap(Op0Ordered, Op1Ordered);
-    }
-    if (Op0Pred == 0) {
-      // uno && ueq -> uno && (uno || eq) -> uno
-      // ord && olt -> ord && (ord && lt) -> olt
-      if (!Op0Ordered && (Op0Ordered == Op1Ordered))
-        return LHS;
-      if (Op0Ordered && (Op0Ordered == Op1Ordered))
-        return RHS;
-
-      // uno && oeq -> uno && (ord && eq) -> false
-      if (!Op0Ordered)
-        return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 0);
-      // ord && ueq -> ord && (uno || eq) -> oeq
-      return getFCmpValue(true, Op1Pred, Op0LHS, Op0RHS, Builder);
-    }
-  }
+  // Simplify (fcmp cc0 x, y) & (fcmp cc1 x, y).
+  if (Op0LHS == Op1LHS && Op0RHS == Op1RHS)
+    return getFCmpValue(getFCmpCode(Op0CC) & getFCmpCode(Op1CC), Op0LHS, Op0RHS,
+                        Builder);

   return nullptr;
 }

 /// Match De Morgan's Laws:
 /// (~A & ~B) == (~(A | B))
 /// (~A | ~B) == (~(A & B))
 static Instruction *matchDeMorgansLaws(BinaryOperator &I,
@@ ... @@ Value *InstCombiner::FoldOrOfFCmps(FCmpInst *LHS, FCmpInst *RHS) {
   Value *Op1LHS = RHS->getOperand(0), *Op1RHS = RHS->getOperand(1);
   FCmpInst::Predicate Op0CC = LHS->getPredicate(), Op1CC = RHS->getPredicate();

   if (Op0LHS == Op1RHS && Op0RHS == Op1LHS) {
     // Swap RHS operands to match LHS.
     Op1CC = FCmpInst::getSwappedPredicate(Op1CC);
     std::swap(Op1LHS, Op1RHS);
   }
-  if (Op0LHS == Op1LHS && Op0RHS == Op1RHS) {
-    // Simplify (fcmp cc0 x, y) | (fcmp cc1 x, y).
-    if (Op0CC == Op1CC)
-      return Builder->CreateFCmp((FCmpInst::Predicate)Op0CC, Op0LHS, Op0RHS);
-    if (Op0CC == FCmpInst::FCMP_TRUE || Op1CC == FCmpInst::FCMP_TRUE)
-      return ConstantInt::get(CmpInst::makeCmpResultType(LHS->getType()), 1);
-    if (Op0CC == FCmpInst::FCMP_FALSE)
-      return RHS;
-    if (Op1CC == FCmpInst::FCMP_FALSE)
-      return LHS;
-    bool Op0Ordered;
-    bool Op1Ordered;
-    unsigned Op0Pred = getFCmpCode(Op0CC, Op0Ordered);
-    unsigned Op1Pred = getFCmpCode(Op1CC, Op1Ordered);
-    if (Op0Ordered == Op1Ordered) {
-      // If both are ordered or unordered, return a new fcmp with
-      // or'ed predicates.
-      return getFCmpValue(Op0Ordered, Op0Pred|Op1Pred, Op0LHS, Op0RHS, Builder);
-    }
-  }
+  // Simplify (fcmp cc0 x, y) | (fcmp cc1 x, y).
+  if (Op0LHS == Op1LHS && Op0RHS == Op1RHS)
+    return getFCmpValue(getFCmpCode(Op0CC) | getFCmpCode(Op1CC), Op0LHS, Op0RHS,
+                        Builder);
   return nullptr;
 }

 /// This helper function folds:
 ///
 /// ((A | B) & C1) | (B & C2)
 ///
 /// into:

test/Transforms/InstCombine/and-fcmp.ll

 ; RUN: opt < %s -instcombine -S | FileCheck %s

 define zeroext i8 @t1(float %x, float %y) nounwind {
   %a = fcmp ueq float %x, %y
   %b = fcmp ord float %x, %y
   %c = and i1 %a, %b
   %retval = zext i1 %c to i8
   ret i8 %retval
 ; CHECK: t1
 ; CHECK: fcmp oeq float %x, %y
 ; CHECK-NOT: fcmp ueq float %x, %y
@@ ... @@ define <2 x i1> @t9(<2 x float> %a, <2 x double> %b) {
   %cmp = fcmp ord <2 x float> %a, zeroinitializer
   %cmp1 = fcmp ord <2 x double> %b, zeroinitializer
   %and = and <2 x i1> %cmp, %cmp1
   ret <2 x i1> %and
 ; CHECK: t9
 ; CHECK: fcmp ord
 ; CHECK: fcmp ord
 }
+
+; CHECK-LABEL: @t10(
+define i1 @t10(double %a, double %b) {
+bb:
+  %cmp = fcmp oeq double %a, %b
+  %cmp1 = fcmp une double %a, %b
+; CHECK-NOT: fcmp oeq
+; CHECK-NOT: fcmp une
+; CHECK: ret i1 false
+  %retval = and i1 %cmp, %cmp1
+  ret i1 %retval
+}

test/Transforms/InstCombine/or-fcmp.ll

@@ ... @@ define zeroext i8 @t5(float %x, float %y) nounwind {
   %b = fcmp oge float %x, %y ; <i1> [#uses=1]
   %c = or i1 %a, %b
 ; CHECK-NOT: fcmp olt
 ; CHECK-NOT: fcmp oge
 ; CHECK: fcmp ord
   %retval = zext i1 %c to i8
   ret i8 %retval
 }
+
+; CHECK-LABEL: @t6(
+define i1 @t6(double %a, double %b) {
+bb:
+  %cmp = fcmp ogt double %a, %b
+  %cmp1 = fcmp ord double %a, %b
+; CHECK-NOT: fcmp ogt
+; CHECK: fcmp ord
+  %retval = or i1 %cmp, %cmp1
+  ret i1 %retval
+}
+
+; CHECK-LABEL: @t7(
+define i1 @t7(double %a, double %b) {
+bb:
+  %cmp = fcmp oeq double %a, %b
+  %cmp1 = fcmp une double %a, %b
+; CHECK-NOT: fcmp oeq
+; CHECK-NOT: fcmp une
+; CHECK: ret i1 true
+  %retval = or i1 %cmp, %cmp1
+  ret i1 %retval
+}