
Details

Reviewers

chandlerc
majnemer
nicholas
• rafael
bkramer

Commits

rG562fd7534c1c: Added instruction combine to transform few more negative values addition to…
rL211266: Added instruction combine to transform few more negative values addition to…

Summary

This patch enables the following transforms:

x + (~(y & c) + 1)   -->   x - (y & c)
x + (~((y >> z) & c) + 1)   -->   x - ((y >> z) & c)

x + (~(y | c) + 1)   -->   x - (y | c)  if c is odd
x + (~((y >> z) | c) + 1)   -->   x - ((y >> z) | c)  if c is odd

Z3 verification code:

http://rise4fun.com/Z3/ZA06
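
As a rough complement to the Z3 proof, the identity behind all four transforms is two's-complement negation: ~v + 1 equals -v, so x + (~v + 1) equals x - v. The sketch below checks that with plain 32-bit unsigned arithmetic. The helper names are made up for illustration and are not part of the patch. The arithmetic identity itself holds for any c; the parity conditions in the summary appear to concern which canonical form InstCombine has already produced, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation: ~v + 1 == -v (mod 2^32), so x + (~v + 1) == x - v.
// Unsigned arithmetic wraps, which mirrors a plain add/sub without nsw/nuw.
inline bool addOfNegatedEqualsSub(uint32_t x, uint32_t v) {
  return x + (~v + 1u) == x - v;
}

// Checks the four shapes from the summary for one choice of inputs.
// `z` is a shift amount and is assumed to be < 32.
inline bool summaryTransformsHold(uint32_t x, uint32_t y, uint32_t z,
                                  uint32_t c) {
  return addOfNegatedEqualsSub(x, y & c) &&
         addOfNegatedEqualsSub(x, (y >> z) & c) &&
         addOfNegatedEqualsSub(x, y | c) &&
         addOfNegatedEqualsSub(x, (y >> z) | c);
}
```

Calling summaryTransformsHold(7, 0xDEADBEEF, 3, 0x55555555) exercises all four shapes with the constant used in the test file.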

Diff Detail

Event Timeline

dinesh.d updated this revision to Diff 9327. May 12 2014, 3:56 PM

dinesh.d retitled this revision to Added instruction combine to transform few more negative values addition to subtraction.

dinesh.d updated this object.

dinesh.d edited the test plan for this revision.

dinesh.d added reviewers: • rafael, bkramer.

dinesh.d added a subscriber: Unknown Object (MLST).

dinesh.d added a reviewer: majnemer. May 12 2014, 4:00 PM

Removed hasOneUse() check and updated test cases. This patch takes care of PR14365.

bkramer added inline comments. May 14 2014, 8:49 AM

lib/Transforms/InstCombine/InstCombineAddSub.cpp
929	Can you return the fully transformed value? I find it confusing to have this split in half. You can create both instructions with IRBuilder and use ReplaceInstUsesWith at the call site.

Updated patch to return fully transformed value.

Added additional transforms

x + (~(y | c) + 1)   -->   x - (y | c)  if c is even
x + (~((y >> z) | c) + 1)   -->   x - ((y >> z) | c)  if c is even

Is it OK to lose instruction names during these transforms? This patch returns instructions named %1, %2.

Updated patch to use APInts and simplified a few checks.

gentle ping.

Would it help the review if I broke this patch into parts?

This is to answer Philip's comments

Both your comment and function name here seem misleading. As far as I
can tell from your transforms, none are actual predicated on an argument
being negative. I also see no use of subtraction.

Then for this one, I'm probably missing something obvious, but where are
you initializing C1 and C2? In general, the code structure here is a
bit hard to read.

If the name of the function is misleading, I don't mind using another name. This
function tries to find whether any of the operands is negative and uses that to
convert ADD to SUB and remove the negation. I have not put ADD in the function
name because I plan to use this function to convert SUB to ADD too.

I have found the 3 patterns mentioned in the comment, which LLVM does not optimize
any further. This function tries to identify those patterns and optimize them.
I have added comments for one of the patterns in the code. The other patterns are
implemented in a similar fashion. Let me know if it is still not clear.

// Checks if any operand is negative and we can convert add to sub.
// This function checks for following patterns
//   ADD(XOR(OR(Z, NOT(C)), C), 1) == NEG(AND(Z, C))
//   ADD(XOR(AND(Z, ~C), ~C), 1)   == NEG(OR(Z, C)) if C is even
//   XOR(AND(Z, ~C), (~C + 1))     == NEG(OR(Z, C)) if C is odd

Value *checkForNegativeOperand(...) {

...

Here I am looking for the first 2 patterns, i.e. ADD(X, 1) where X is XOR-AND / XOR-OR,
which can occur on either side. The following 2 lines check whether the pattern is on the RHS;
if so, swapping LHS and RHS puts it on the LHS. This is just a pointer swap and does
not change anything else. This approach avoids code duplication: without it,
I would need to do all the checks first for the LHS and then for the RHS.

  // if ONE is on other side, swap
  if (match(RHS, m_Add(m_Value(X), m_One())))
    std::swap(LHS, RHS);

So if LHS = B and RHS = (X + 1), after the above lines it becomes LHS = (X + 1), RHS = B.

  if (match(LHS, m_Add(m_Value(X), m_One()))) {

After this check, we can be sure that I has (X + 1) on one side and that LHS points to that
operand, since match returns true only if the pattern matches. Because of m_Value(X), Value *X
points to the X part of the (X + 1) pattern.

    // if XOR on other side, swap
    if (match(RHS, m_Xor(m_Value(Y), m_APInt(C1))))
      std::swap(X, RHS);

The above 2 lines check for XOR and swap (X, RHS) if the pattern is LHS = (X + 1), RHS = B with the
XOR on the RHS.

    if (match(X, m_Xor(m_Value(Y), m_APInt(C1)))) {

Once the above check is true, we can be sure that there is a XOR(Y, Constant) instruction and that X
points to it. Again, the swap above is there to avoid code duplication. Further, because of m_APInt(C1),
if match returns true, C1 points to the constant part of the XOR.

      // ADD(XOR(OR(Z, NOT(C)), C), 1) == NEG(AND(Z, C))
      if (match(Y, m_Or(m_Value(Z), m_APInt(C2))) && (*C1 == ~(*C2))) {

In the same way, I check whether Y is of the form OR(Z, C2) where C2 == NOT(C1). If the above match is
true, we can convert the initial ADD(X+1, B) instruction to SUB(B, AND(Z, C1)), because the complete
'ADD(XOR(OR(Z, NOT(C1)), C1), 1)' instruction can be converted to NEG(AND(Z, C1)), and
ADD(NEG(AND(Z, C1)), B) is the same as SUB(B, AND(Z, C1)).

Value *NewAnd = Builder->CreateAnd(Z, *C1);

The above line creates AND(Z, C1),

        return Builder->CreateSub(RHS, NewAnd, "", IHasNUW, IHasNSW);

and this line creates the SUB instruction and returns it to the caller, where it can be used
to replace the ADD instruction.

The function returns nullptr if I's operands do not match any of the patterns.

return nullptr;

}
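
To make the three identities from the function comment easier to trust, here is a minimal brute-force sketch over all 8-bit values of Z and C. It uses plain integers rather than LLVM IR, and the function names are hypothetical. In this model the first two identities hold for every C, and only the third needs C to be odd; the "if C is even" note on the second pattern presumably reflects when InstCombine leaves the expression in that form, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Truncate to 8 bits so the arithmetic models an i8 value.
inline uint8_t trunc8(unsigned v) { return static_cast<uint8_t>(v & 0xFFu); }

// Returns true if all three identities hold for every 8-bit Z and C
// (the third one is only checked for odd C).
inline bool negativeOperandPatternsHold() {
  for (unsigned z = 0; z < 256; ++z) {
    for (unsigned c = 0; c < 256; ++c) {
      const unsigned notC = ~c & 0xFFu;
      // ADD(XOR(OR(Z, NOT(C)), C), 1) == NEG(AND(Z, C))
      if (trunc8(((z | notC) ^ c) + 1u) != trunc8(0u - (z & c)))
        return false;
      // ADD(XOR(AND(Z, ~C), ~C), 1) == NEG(OR(Z, C))
      if (trunc8(((z & notC) ^ notC) + 1u) != trunc8(0u - (z | c)))
        return false;
      // XOR(AND(Z, ~C), (~C + 1)) == NEG(OR(Z, C)), odd C only
      if ((c & 1u) &&
          trunc8((z & notC) ^ trunc8(notC + 1u)) != trunc8(0u - (z | c)))
        return false;
    }
  }
  return true;
}

// Counterexample for the third pattern with even C: for C == 0 and Z == 1
// the left side is 1 while NEG(OR(Z, C)) is 0xFF.
inline bool thirdPatternFailsForEvenC() {
  const unsigned z = 1, c = 0, notC = 0xFFu;
  return trunc8((z & notC) ^ trunc8(notC + 1u)) != trunc8(0u - (z | c));
}
```

Because the loops cover every (Z, C) pair, the check does not depend on C being a compile-time constant.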

Hi Philip,

I have added a comment at http://reviews.llvm.org/D3733 replying to your queries. Could you please
check it and let me know whether it is clear?

Regards
Dinesh Dwivedi

Original Message -------

Sender : Philip Reames<listmail@philipreames.com>
Date : May 27, 2014 22:14 (GMT+05:30)
Title : Re: [PATCH] Added instruction combine to transform few more negative
values addition to subtraction

Glancing at your patch, a few quick comments.

// Checks if any operand is negative and we can convert add to sub.
// This function checks for following patterns
//   ADD(XOR(OR(Z, NOT(C)), C), 1) == NEG(AND(Z, C))
//   ADD(XOR(AND(Z, ~C), ~C), 1)   == NEG(OR(Z, C)) if C is even
//   XOR(AND(Z, ~C), (~C + 1))     == NEG(OR(Z, C)) if C is odd
Value *checkForNegativeOperand(BinaryOperator &I,
                               InstCombiner::BuilderTy *Builder) { ...}

Both your comment and function name here seem misleading. As far as I
can tell from your transforms, none are actual predicated on an argument
being negative. I also see no use of subtraction.

Then for this one, I'm probably missing something obvious, but where are
you initializing C1 and C2? In general, the code structure here is a
bit hard to read.

Philip

This patch creates 2 instructions to replace the ADD. Updated the patch to check
that at least one of LHS or RHS has one use, to ensure the transform is beneficial.

Breaking the patch into parts

Is this the first one? Which patch is the second one?

Cheers,
Rafael

I have broken up my previous patch to reduce its size. This is the first part. As
the other parts depend on these changes, I will put them up for review once this
patch is in [I actually don't know how to submit dependent patches].

I have added TODOs for those patches.

Updated comments

gentle ping

Dinesh, I'm sorry, but I can not provide further review. I can not spot any obvious problems with your code, but after trying to justify the correctness of your change to myself several times, I can't quite do so. Rather than continue to waste your time - since you'd need to find another reviewer with commit rights anyways - I'm simply going to step aside.

Thanks Philip.

If there is any specific part of the patch that is bothering you and whose
correctness you are not sure of, I can explain why I have written it that way.

And please don't think of it as wasting time. I have mentioned this before too: I like
to hear from you, so please continue reviewing my patches if time permits.

It would be great if you could have a look.

gentle ping

Sadness. Did you know that gcc gets this even at -O0? Not this pattern mind you, but the original testcase. It's that simple, at least before we transform to xor(or) form. GCC doesn't get it in xor(or) form either.

I looked for a better way to address this optimizer deficiency and didn't find one. The one thing I didn't check is whether we could get it by reordering our optimizations inside instcombine and do something else before it gets into xor(or) form. Even if that did fix the testcase in PR14365 it would still leave the case where someone actually does write "((a|~c) ^ c) + (a+1)".
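
For the hand-written form quoted above, a small sketch can confirm what the transform is expected to produce. It assumes 32-bit wrapping unsigned arithmetic and uses made-up helper names. It compares ((z | ~c) ^ c) + (x + 1) against x - (z & c), which is the case where the XOR sits on the opposite operand from the "+ 1".

```cpp
#include <cassert>
#include <cstdint>

// Left side as written in the review comment: ((z | ~c) ^ c) + (x + 1).
// (z | ~c) ^ c is the same bit pattern as ~(z & c).
inline uint32_t xorOrForm(uint32_t x, uint32_t z, uint32_t c) {
  return ((z | ~c) ^ c) + (x + 1u);
}

// Right side after the transform: x - (z & c).
inline uint32_t subForm(uint32_t x, uint32_t z, uint32_t c) {
  return x - (z & c);
}

// With z == x (the "a" in the comment), the sub folds further to a & ~c.
inline uint32_t maskedForm(uint32_t a, uint32_t c) { return a & ~c; }
```

With the same value on both sides, as in the quoted expression, xorOrForm(a, a, c) should equal maskedForm(a, c).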

lib/Transforms/InstCombine/InstCombineAddSub.cpp
928	"atleast" --> "at least"
951	Optional: *C2 == ~(*C1) might be more efficient as !C2->intersects(*C1). It isn't, but in theory we could improve intersects to not compute an intermediate value (ie., it doesn't need to look at all the bits once it finds a pair are set in both).
test/Transforms/InstCombine/add2.ll
130	Please make sure there's a newline at the end of the file.

This revision is now accepted and ready to land. Jun 18 2014, 10:58 PM

Hi Nick,

m_Not checks for the pattern XOR(X, -1) and will not work in the current scenario,
i.e. it can check whether we have an XOR(X, -1) pattern but cannot identify that
0x55555555 and 0xAAAAAAAA are the NOT of each other.

Am I missing something?

Regards
Dinesh Dwivedi

Original Message -------

Sender : Nick Lewycky<nicholas@mxc.ca>
Date : Jun 19, 2014 11:50 (GMT+05:30)
Title : Re: [PATCH] Added instruction combine to transform few more negative
values addition to subtraction

Nick Lewycky wrote:

Sadness. Did you know that gcc gets this even at -O0? Not this pattern mind you, but the original testcase. It's that simple, at least before we transform to xor(or) form. GCC doesn't get it in xor(or) form either.

I looked for a better way to address this optimizer deficiency and didn't find one. The one thing I didn't check is whether we could get it by reordering our optimizations inside instcombine and do something else before it gets into xor(or) form. Even if that did fix the testcase in PR14365 it would still leave the case where someone actually does write "((a|~c) ^ c) + (a+1)".

Actually, I was just thinking about this some more. There's no reason
'C' needs to be a constant at all, is there? As long as it's X and ~X
(use m_Value(X) then m_Not(m_Specific(X))) the transform should be
correct, right?

Nick

Comment at: lib/Transforms/InstCombine/InstCombineAddSub.cpp:928
@@ +927,3 @@
+
+ // This function creates 2 instructions to replace ADD, we need atleast one of

+ // LHS or RHS to have one use to ensure benefit in transform.

"atleast" --> "at least"

Comment at: lib/Transforms/InstCombine/InstCombineAddSub.cpp:951
@@ +950,3 @@
+ if (match(X, m_Xor(m_Value(Y), m_APInt(C1)))) {
+ if (match(Y, m_Or(m_Value(Z), m_APInt(C2)))&& (*C2 == ~(*C1))) {

+ Value *NewAnd = Builder->CreateAnd(Z, *C1);

Optional: *C2 == ~(*C1) might be more efficient as !C2->intersects(*C1). It isn't, but in theory we could improve intersects to not compute an intermediate value (ie., it doesn't need to look at all the bits once it finds a pair are set in both).

Comment at: test/Transforms/InstCombine/add2.ll:142
@@ +141,1 @@
+}
No newline at end of file

Please make sure there's a newline at the end of the file.

http://reviews.llvm.org/D3733

Updated as per comments

dinesh.d added inline comments. Jun 19 2014, 3:03 AM

lib/Transforms/InstCombine/InstCombineAddSub.cpp
928	updated
951	problem with intersects() is that it just ensure that same bit is not set in both C1 and C2 but does not ensure that if one but is set in one, it will not set in second. We can compute xor and then check if result is all zero but I am not sure if that is more efficient than current code, it will sure reduce readability though.
test/Transforms/InstCombine/add2.ll
130	updated.

fixed typo in previous comment

lib/Transforms/InstCombine/InstCombineAddSub.cpp
951	Problem with intersects() is that it just ensure that same bit is not set in both C1 and C2 but does not ensure that if one bit is not set in one, it will be set in second.
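
The difference between the two checks can be shown with a pair of constants that share no bits but are still not complements. This is a minimal sketch on 32-bit integers; intersects below only models what APInt::intersects tests, and the function names are illustrative rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models the APInt::intersects test on 32-bit values: true if some bit
// position is set in both operands.
inline bool intersects(uint32_t a, uint32_t b) { return (a & b) != 0; }

// The check the patch uses: the constants are exact bitwise complements.
inline bool isComplement(uint32_t c1, uint32_t c2) { return c2 == ~c1; }

// XOR-based alternative: complements XOR to all ones.
inline bool isComplementViaXor(uint32_t c1, uint32_t c2) {
  return (c1 ^ c2) == 0xFFFFFFFFu;
}
```

For example, 0x1 and 0x2 do not intersect, yet neither is the complement of the other, so !intersects alone would accept a pair the transform must reject.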

Closed by commit rL211266 (authored by dinesh).

Diff 9333

lib/Transforms/InstCombine/InstCombineAddSub.cpp

@@ ... @@ bool InstCombiner::WillNotOverflowSignedAdd(Value *LHS, Value *RHS) {
   // sign bit) the ripple may go up to and fill the zero, but won't change the
   // sign. For example, (X & ~4) + 1.

   // TODO: Implement.

   return false;
 }

+// Checks for NOT and NEG patterns
+Value *checkIfValIsNegative(Value *Val, InstCombiner::BuilderTy *Builder) {
+  Value *X = nullptr, *Y = nullptr, *Z = nullptr;
+  ConstantInt *C1 = nullptr, *C2 = nullptr;
+
+  // ADD(XOR(OR(Z, NOT(C)), C), 1) == NEG(AND(Z, C))
+  if (match(Val, m_Add(m_Value(X), m_One())) &&
+      match(X, m_Xor(m_Value(Y), m_ConstantInt(C1))) &&
+      match(Y, m_Or(m_Value(Z), m_ConstantInt(C2))) &&
+      C1->getValue() == ~C2->getValue()) {
+    return Builder->CreateAnd(Z, C1);
+  }
+
+  // XOR(AND(Z, ~C), (~C + 1)) == NEG(OR(Z, C)) if C is odd
+  if (match(Val, m_Xor(m_Value(Y), m_ConstantInt(C1))) &&
+      match(Y, m_And(m_Value(Z), m_ConstantInt(C2))) &&
+      C1->getValue() == C2->getValue() + 1 &&
+      !C1->getValue().countTrailingZeros()) {
+    return Builder->CreateOr(Z, ConstantExpr::getNot(C2));
+  }
+
+  // TODO: Check for other patterns [visitXorInst's DeMorgan's Law conversion]
+  // ADD(XOR(AND(Z, ~C), ~C) + 1)== NEG(OR(Z, C)) if C is even
+
+  return nullptr;
+}
+
 Instruction *InstCombiner::visitAdd(BinaryOperator &I) {
   bool Changed = SimplifyAssociativeOrCommutative(I);
   Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);

   if (Value *V = SimplifyAddInst(LHS, RHS, I.hasNoSignedWrap(),
                                  I.hasNoUnsignedWrap(), DL))
     return ReplaceInstUsesWith(I, V);

   // (A*B)+(A*C) -> A*(B+C) etc
   if (Value *V = SimplifyUsingDistributiveLaws(I))
     return ReplaceInstUsesWith(I, V);

   if (ConstantInt *CI = dyn_cast<ConstantInt>(RHS)) {
     // X + (signbit) --> X ^ signbit
@@ ... @@ if (Value *LHSV = dyn_castNegVal(LHS)) {
     return BinaryOperator::CreateSub(RHS, LHSV);
   }

   // A + -B --> A - B
   if (!isa<Constant>(RHS))
     if (Value *V = dyn_castNegVal(RHS))
       return BinaryOperator::CreateSub(LHS, V);

+  if (Value *NegVal = checkIfValIsNegative(LHS, Builder))
+    return BinaryOperator::CreateSub(RHS, NegVal);
+
+  if (Value *NegVal = checkIfValIsNegative(RHS, Builder))
+    return BinaryOperator::CreateSub(LHS, NegVal);
+
   {
     Constant *C2;
     if (Value *X = dyn_castFoldableMul(LHS, C2)) {
       if (X == RHS) // X*C + X --> X * (C+1)
         return BinaryOperator::CreateMul(RHS, AddOne(C2));

       // X*C1 + X*C2 --> X * (C1+C2)

test/Transforms/InstCombine/add2.ll

@@ ... @@
 define <2 x i64> @test8(<2 x i64> %A) {
   %xor = xor <2 x i64> %A, <i64 -1, i64 -1>
   %add = add <2 x i64> %xor, <i64 2, i64 3>
   ret <2 x i64> %add
 ; CHECK-LABEL: @test8(
 ; CHECK-NEXT: %add = sub <2 x i64> <i64 1, i64 2>, %A
 ; CHECK-NEXT: ret <2 x i64> %add
 }

+define i32 @test9(i32 %x) {
+  %and = and i32 %x, 1431655765
+  %neg = xor i32 %and, -1
+  %add = add nsw i32 %neg, 1
+  %add1 = add nsw i32 %x, %add
+  ret i32 %add1
+; CHECK-LABEL: @test9(
+; CHECK-NEXT: [[AND:%[a-z0-9]+]] = and i32 %x, -1431655766
+; CHECK-NEXT: ret i32 [[AND]]
+}
+
+define i32 @test10(i32 %x) {
+  %shr = ashr i32 %x, 3
+  %and = and i32 %shr, 1431655765
+  %neg = xor i32 %and, -1
+  %add = add nsw i32 %neg, 1
+  %add1 = add nsw i32 %x, %add
+  ret i32 %add1
+; CHECK-LABEL: @test10(
+; CHECK-NEXT: [[SHR:%[a-z0-9]+]] = ashr i32 %x, 3
+; CHECK-NEXT: [[AND:%[a-z0-9]+]] = and i32 [[SHR]], 1431655765
+; CHECK-NEXT: [[SUB:%[a-z0-9]+]] = sub i32 %x, [[AND]]
+; CHECK-NEXT: ret i32 [[SUB]]
+}
+
+define i32 @test11(i32 %x) {
+  %or = or i32 %x, 1431655765
+  %neg = xor i32 %or, -1
+  %add = add nsw i32 %neg, 1
+  %add1 = add nsw i32 %x, %add
+  ret i32 %add1
+; CHECK-LABEL: @test11(
+; CHECK-NEXT: [[OR:%[a-z0-9]+]] = or i32 %x, 1431655765
+; CHECK-NEXT: [[SUB:%[a-z0-9]+]] = sub i32 %x, [[OR]]
+; CHECK-NEXT: ret i32 [[SUB]]
+}
+
+define i32 @test12(i32 %x) {
+  %shr = ashr i32 %x, 3
+  %or = or i32 %shr, 1431655765
+  %neg = xor i32 %or, -1
+  %add = add nsw i32 %neg, 1
+  %add1 = add nsw i32 %x, %add
+  ret i32 %add1
+; CHECK-LABEL: @test12(
+; CHECK-NEXT: [[SHR:%[a-z0-9]+]] = ashr i32 %x, 3
+; CHECK-NEXT: [[AND:%[a-z0-9]+]] = or i32 [[SHR]], 1431655765
+; CHECK-NEXT: [[SUB:%[a-z0-9]+]] = sub i32 %x, [[AND]]
+; CHECK-NEXT: ret i32 [[SUB]]
+}
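
The expected outputs in test9 through test12 can be cross-checked numerically. The sketch below is an assumption-laden model, not part of the test suite: it treats i32 values as wrapping uint32_t, ignores the nsw flags, and uses hypothetical helper names. It also shows why test9 collapses to a single and: x - (x & m) equals x & ~m, and ~1431655765 is -1431655766 as a signed 32-bit value.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kMask = 1431655765u;  // 0x55555555, the constant in the tests

// Arithmetic shift right by 3 on a 32-bit pattern, written without relying on
// signed right-shift behaviour: copy the sign bit into the top three bits.
inline uint32_t ashr3(uint32_t x) {
  return (x >> 3) | ((x & 0x80000000u) ? 0xE0000000u : 0u);
}

// Shape shared by the four inputs: x + ((v ^ -1) + 1), with wrapping adds.
inline uint32_t inputForm(uint32_t x, uint32_t v) {
  return x + ((v ^ 0xFFFFFFFFu) + 1u);
}

// test9: the sub folds further because x - (x & m) == x & ~m.
inline bool test9Matches(uint32_t x) {
  return inputForm(x, x & kMask) == (x & 0xAAAAAAAAu);  // -1431655766 as i32
}

// test10..test12: the expected output is a plain sub of the and/or value.
inline bool test10Matches(uint32_t x) {
  const uint32_t v = ashr3(x) & kMask;
  return inputForm(x, v) == x - v;
}
inline bool test11Matches(uint32_t x) {
  const uint32_t v = x | kMask;
  return inputForm(x, v) == x - v;
}
inline bool test12Matches(uint32_t x) {
  const uint32_t v = ashr3(x) | kMask;
  return inputForm(x, v) == x - v;
}
```

Running the four predicates over a handful of inputs, including values with the sign bit set, is enough to catch a mistyped constant in a CHECK line.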

This is an archive of the discontinued LLVM Phabricator instance.

Added instruction combine to transform few more negative values addition to subtraction
ClosedPublic
