This is an archive of the discontinued LLVM Phabricator instance.

Added instruction combine to transform few more negative values addition to subtraction
ClosedPublic

Authored by dinesh.d on May 12 2014, 3:56 PM.

Details

Summary

This patch enable following transform

(x + (~(y & c) + 1)   -->   x - (y & c)
(x + (~((y >> z) & c) + 1)   -->   x - ((y>>z) & c)
 
(x + (~(y | c) + 1)   -->   x - (y | c) if c is odd
(x + (~((y >> z) | c) + 1)   -->   x - ((y>>z) | c)  if c is odd

Z3 verification code:

http://rise4fun.com/Z3/ZA06

Diff Detail

Event Timeline

dinesh.d updated this revision to Diff 9327.May 12 2014, 3:56 PM
dinesh.d retitled this revision from to Added instruction combine to transform few more negative values addition to subtraction.
dinesh.d updated this object.
dinesh.d edited the test plan for this revision. (Show Details)
dinesh.d added reviewers: rafael, bkramer.
dinesh.d added a subscriber: Unknown Object (MLST).
dinesh.d updated this revision to Diff 9333.May 13 2014, 12:01 AM

Removed hasOneUse() check and updated test cases. This patch takes care of PR14365

bkramer added inline comments.May 14 2014, 8:49 AM
lib/Transforms/InstCombine/InstCombineAddSub.cpp
929

Can you return the fully transformed value? I find it confusing to have this split in half. You can create both instructions with IRBuilder and use ReplaceInstUsesWith at the call site.

dinesh.d updated this revision to Diff 9444.May 15 2014, 8:10 AM

Updated patch to return fully transformed value.

Added additional transforms

(x + (~(y | c) + 1)   -->   x - (y | c) if c is even
(x + (~((y >> z) | c) + 1)   -->   x - ((y>>z) | c)  if c is even

Is it ok to loose names of instruction during transforms. This patch returns instruction names as %1, %2

dinesh.d updated this revision to Diff 9524.May 19 2014, 1:18 AM

Updated patch to use APInts and simplified few checks

gentle ping.

Will it help in reviewing if I break this patch in parts.

dinesh.d added a comment.EditedMay 27 2014, 11:47 PM

This is to answer Philip's comments

Both your comment and function name here seem misleading. As far as I
can tell from your transforms, none are actual predicated on an argument
being negative. I also see no use of subtraction.

Then for this one, I'm probably missing something obvious, but where are
you initializing C1 and C2? In general, the code structure here is a
bit hard to read.

If name of the function is misleading, I don't mind any other name. This
function tries to find if any of the operand is negative and use them to
convert ADD to SUB and remove negation. I have not added ADD in function
name as I am looking to use this function to convert SUB to ADD too.

I have found 3 patterns mentioned in comment which llvm is not optimizing
any further. This functions tries to identify those patterns and optimise them.
I have put comments for one of the pattern in code. Other patterns are
implemented in similar fashion. Let me know if it is still not clear.

// Checks if any operand is negative and we can convert add to sub.
// This function checks for following patterns
//   ADD(XOR(OR(Z, NOT(C)), C)), 1) == NEG(AND(Z, C))
//   ADD(XOR(AND(Z, ~C), ~C), 1)    == NEG(OR(Z, C)) if C is even
//   XOR(AND(Z, ~C), (~C + 1))      == NEG(OR(Z, C)) if C is odd

Value *checkForNegativeOperand(...) {

...

Here I am looking for first 2 patterns i.e. ADD(X, 1) where X is XOR-AND/ XOR-OR
which can occur on either side, following 2 lines to check if pattern is on RHS,
then swapping LHS and RHS will put it in LHS. This is just pointer swap and will
not change anything else. This approach will help in avoiding code duplication as
without this, I need to do all checks for first LHS and then RHS.

  // if ONE is on other side, swap
  if (match(RHS, m_Add(m_Value(X), m_One())))
    std::swap(LHS, RHS);

So if LHS = B and RHS = (X + 1), after above lines, it will become LHS = (X + 1), RHS = B.

    if (match(LHS, m_Add(m_Value(X), m_One()))) {

After this check, we can be sure that I has (X + 1) on any side and LHS is pointing to that
operand as match will true only of pattern matches and because of m_Value(X), Value *X will
be pointing to X part of (X + 1) pattern.

    // if XOR on other side, swap
    if (match(RHS, m_Xor(m_Value(Y), m_APInt(C1))))
      std::swap(X, RHS);

Above 2 lines is to check for XOR and swap (X, RHS) if pattern is LHS = (X + 1), RHS = B and
XOR on RHS.

	  
    if (match(X, m_Xor(m_Value(Y), m_APInt(C1)))) {

Once above check is true, we will be sure that there is XOR(Y, Constant) instruction and X is
pointing to that. Again, above swap is to avoid code duplication. Further, because of m_APInt(C1),
if match returns true, C1 will point to constant part of XOR.

// ADD(XOR(OR(Z, NOT(C)), C)), 1) == NEG(AND(Z, C))
if (match(Y, m_Or(m_Value(Z), m_APInt(C2))) && (*C1 == ~(*C2))) {

Same way, I am checking if X is in OR(Z, C2) form where C2 == NOT(C1). if above match is true, we
can convert initial ADD(X+1, B) instruction to SUB(B, AND(Z, C1)) as 'ADD(XOR(OR(Z, NOT(C1)), C1)), 1)'
complete instruction can be converted to NEG(AND(Z, C1)) and ADD(NEG(AND(Z, C1)), B) is same as
SUB(B, AND(Z, C1))

Value *NewAnd = Builder->CreateAnd(Z, *C1);

Above line create AND(Z, C1)

		  
        return Builder->CreateSub(RHS, NewAnd, "", IHasNUW, IHasNSW);

and this line create SUB instruction and return to caller, where this SUB instruction can be used
to replace ADD instruction.

		  
      }
		
    ...

Returns nullptr, if I's operand does not match to any of the pattern.

return nullptr;

}

Hi Philip,

I have added one comment at http://reviews.llvm.org/D3733 to reply your queries. Will you please
check it and let me know if it is clear or not ?

Regards
Dinesh Dwivedi

  • Original Message -------

Sender : Philip Reames<listmail@philipreames.com>
Date : May 27, 2014 22:14 (GMT+05:30)
Title : Re: [PATCH] Added instruction combine to transform few more negative
values addition to subtraction

Glancing at your patch, a few quick comments.

Checks if any operand is negative and we can convert add to sub.
This function checks for following patterns
ADD(XOR(OR(Z, NOT(C)), C)), 1) == NEG(AND(Z, C))
ADD(XOR(AND(Z, ~C), ~C), 1) == NEG(OR(Z, C)) if C is even
// XOR(AND(Z, ~C), (~C + 1)) == NEG(OR(Z, C)) if C is odd
Value *checkForNegativeOperand(BinaryOperator &I,
InstCombiner::BuilderTy *Builder) { ...}

Both your comment and function name here seem misleading. As far as I
can tell from your transforms, none are actual predicated on an argument
being negative. I also see no use of subtraction.

Then for this one, I'm probably missing something obvious, but where are
you initializing C1 and C2? In general, the code structure here is a
bit hard to read.

Philip

dinesh.d updated this revision to Diff 9989.May 31 2014, 1:38 PM

This patch creates 2 instructions to replace ADD. Updated patch to check
if atleast one of LHS or RHS to have one use to ensure benefit in transform.

dinesh.d updated this revision to Diff 10057.Jun 3 2014, 1:32 PM

Breaking patch in to parts

dinesh.d added a comment.EditedJun 3 2014, 2:07 PM

Is this the first one? Which patch is the second one?

Cheers,
Rafael

I have broken my previous patch to reduce size of patch. This is first one. As
other parts are depended on these changes, I will put them for review once this
patch is in [actually don't know how to submit dependent patches]

I have added TODO for those patches.

dinesh.d updated this revision to Diff 10064.Jun 3 2014, 2:28 PM

Updated comments

reames added a subscriber: reames.EditedJun 10 2014, 10:53 AM

Dinesh, I'm sorry, but I can not provide further review. I can not spot any obvious problems with your code, but after trying to justify the correctness of your change to myself several times, I can't quite do so. Rather than continue to waste your time - since you'd need to find another reviewer with commit rights anyways - I'm simply going to step aside.

Thanks Philip.

If there is any specific part of the patch which is bothering you and you
are not sure for its correctness. I can explain why I have written it like that

and please don't think about wasting time. I mentioned this before too. I like
to hear from you so please continue review my patches if time permits.

It will be gr8 if you can have a look.

nicholas accepted this revision.Jun 18 2014, 10:58 PM
nicholas added a reviewer: nicholas.
nicholas added a subscriber: nicholas.

Sadness. Did you know that gcc gets this even at -O0? Not this pattern mind you, but the original testcase. It's that simple, at least before we transform to xor(or) form. GCC doesn't get it in xor(or) form either.

I looked for a better way to address this optimizer deficiency and didn't find one. The one thing I didn't check is whether we could get it by reordering our optimizations inside instcombine and do something else before it gets into xor(or) form. Even if that did fix the testcase in PR14365 it would still leave the case where someone actually does write "((a|~c) ^ c) + (a+1)".

lib/Transforms/InstCombine/InstCombineAddSub.cpp
928

"atleast" --> "at least"

951

Optional: *C2 == ~(*C1) might be more efficient as !C2->intersects(*C1). It isn't, but in theory we could improve intersects to not compute an intermediate value (ie., it doesn't need to look at all the bits once it finds a pair are set in both).

test/Transforms/InstCombine/add2.ll
130

Please make sure there's a newline at the end of the file.

This revision is now accepted and ready to land.Jun 18 2014, 10:58 PM
dinesh.d added a comment.EditedJun 19 2014, 2:56 AM

Hi Nick,

m_Not checks for pattern XOR(X, -1) and will not work in current scenario.
i.e. it can check if we have XOR(X, -1) pattern but can not identify that
0x55555555 and 0xAAAAAAAA are NOT of each other.

Am I missing something?

Regards
Dinesh Dwivedi

  • Original Message -------

Sender : Nick Lewycky<nicholas@mxc.ca>
Date : Jun 19, 2014 11:50 (GMT+05:30)
Title : Re: [PATCH] Added instruction combine to transform few more negative
values addition to subtraction

Nick Lewycky wrote:

Sadness. Did you know that gcc gets this even at -O0? Not this pattern mind you, but the original testcase. It's that simple, at least before we transform to xor(or) form. GCC doesn't get it in xor(or) form either.

I looked for a better way to address this optimizer deficiency and didn't find one. The one thing I didn't check is whether we could get it by reordering our optimizations inside instcombine and do something else before it gets into xor(or) form. Even if that did fix the testcase in PR14365 it would still leave the case where someone actually does write "((a|~c) ^ c) + (a+1)".

Actually, I was just thinking about this some more. There's no reason
'C' needs to be a constant at all, is there? As long as it's X and ~X
(use m_Value(X) then m_Not(m_Specific(X))) the transform should be
correct, right?

Nick

Comment at: lib/Transforms/InstCombine/InstCombineAddSub.cpp:928
@@ +927,3 @@
+
+ // This function creates 2 instructions to replace ADD, we need atleast one of

+ // LHS or RHS to have one use to ensure benefit in transform.

"atleast" --> "at least"

Comment at: lib/Transforms/InstCombine/InstCombineAddSub.cpp:951
@@ +950,3 @@
+ if (match(X, m_Xor(m_Value(Y), m_APInt(C1)))) {
+ if (match(Y, m_Or(m_Value(Z), m_APInt(C2)))&& (*C2 == ~(*C1))) {

+ Value *NewAnd = Builder->CreateAnd(Z, *C1);

Optional: *C2 == ~(*C1) might be more efficient as !C2->intersects(*C1). It isn't, but in theory we could improve intersects to not compute an intermediate value (ie., it doesn't need to look at all the bits once it finds a pair are set in both).

Comment at: test/Transforms/InstCombine/add2.ll:142
@@ +141,1 @@
+}
No newline at end of file


Please make sure there's a newline at the end of the file.

http://reviews.llvm.org/D3733

dinesh.d updated this revision to Diff 10620.Jun 19 2014, 3:02 AM
dinesh.d edited edge metadata.

Updated as per comments

dinesh.d added inline comments.Jun 19 2014, 3:03 AM
lib/Transforms/InstCombine/InstCombineAddSub.cpp
928

updated

951

problem with intersects() is that it just ensure that same bit is not set in both C1 and C2 but does not ensure that if one but is set in one, it will not set in second.

We can compute xor and then check if result is all zero but I am not sure if that is more efficient than current code,
it will sure reduce readability though.

test/Transforms/InstCombine/add2.ll
130

updated.

fixed typo in previous comment

lib/Transforms/InstCombine/InstCombineAddSub.cpp
951

Problem with intersects() is that it just ensure that same bit is not set in both C1 and C2 but does not ensure that if one bit is not set in one, it will be set in second.

dinesh.d closed this revision.Jun 19 2014, 3:45 AM
dinesh.d updated this revision to Diff 10622.

Closed by commit rL211266 (authored by dinesh).