This is an archive of the discontinued LLVM Phabricator instance.

[InstCombineAddSub] Added the transformation - "sub (or A B) (xor A B) -> (and A B)"
ClosedPublic

Authored by ankur29.garg on Oct 9 2014, 11:12 PM.

Details

Reviewers

majnemer
suyog
dexonsmith

Commits

rG312c3e5f3989: InstCombine: (sub (or A B) (xor A B)) --> (and A B)
rL220163: InstCombine: (sub (or A B) (xor A B)) --> (and A B)

Summary

Hi,
The following patch implements the transformation sub (or A B) (xor A B) -> (and A B). It is similar to the one added in visitAdd().

Z3 Link: http://rise4fun.com/Z3/bNCG

Please help in reviewing it.

Thanks.
Ankur
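
As a complement to the Z3 proof linked in the summary, the identity can also be checked by brute force at a small bit width. It holds because the set bits of (A xor B) and (A and B) are disjoint and together make up (A or B), so (A or B) = (A xor B) + (A and B). The following is a minimal sketch and is not part of the patch; the function names are hypothetical, and the 8-bit width is an assumption chosen only to keep the search small.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): checks
//   (a | b) - (a ^ b) == (a & b)
// for one pair of 8-bit values. The subtraction wraps modulo 2^8,
// mirroring an LLVM `sub` without nsw/nuw flags.
bool subOrXorEqualsAnd(uint8_t a, uint8_t b) {
  uint8_t orVal = static_cast<uint8_t>(a | b);
  uint8_t xorVal = static_cast<uint8_t>(a ^ b);
  uint8_t diff = static_cast<uint8_t>(orVal - xorVal);
  return diff == static_cast<uint8_t>(a & b);
}

// Exhaustively checks all 256 * 256 pairs of 8-bit operands.
bool holdsForAllI8() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (!subOrXorEqualsAnd(static_cast<uint8_t>(a),
                             static_cast<uint8_t>(b)))
        return false;
  return true;
}
```

Calling holdsForAllI8() from a small driver or unit test is enough to exercise every 8-bit pair. An exhaustive check at one width is weaker than the SMT proof, which covers the bit-vector identity symbolically.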

Event Timeline

ankur29.garg updated this revision to Diff 14700. Oct 9 2014, 11:12 PM

ankur29.garg retitled this revision from to [InstCombineAddSub] Added the transformation - "sub (or A B) (xor A B) -> (and A B)".

ankur29.garg updated this object.

ankur29.garg edited the test plan for this revision.

ankur29.garg added reviewers: dexonsmith, majnemer, suyog.

ankur29.garg set the repository for this revision to rL LLVM.

ankur29.garg added a subscriber: Unknown Object (MLST).

Gentle Ping!

This canonicalization seems unlikely to ever fire on code in the wild.

Hi David,

Actually, a similar transformation is already present in the visitAdd() function for the add operation.

InstCombine is not supposed to handle every possible case just because it is theoretically possible. It's engineered to be a series of engineering tradeoffs. I don't know what the history of the similar transform in visitAdd is; it's possible that it wasn't added in a principled way.

If we were to arbitrarily add patterns, we would eventually make InstCombine cripplingly slow.

Hi David,

Thanks for the comments.
You are right. I think I should run the LLVM test suite and find out how many times this case actually arises.

Thanks.

----- Original Message -----

From: "David Majnemer" <david.majnemer@gmail.com>
To: "ankur29 garg" <ankur29.garg@samsung.com>, dexonsmith@apple.com, "suyog sarda" <suyog.sarda@samsung.com>, "david majnemer" <david.majnemer@gmail.com>
Cc: llvm-commits@cs.uiuc.edu
Sent: Saturday, October 18, 2014 4:47:12 AM
Subject: Re: [PATCH] [InstCombineAddSub] Added the transformation - "sub (or A B) (xor A B) -> (and A B)"

InstCombine is not supposed to handle every possible case just because
it is theoretically possible. It's engineered to be a series of
engineering tradeoffs. I don't know what the history of the similar
transform in visitAdd is; it's possible that it wasn't added in a
principled way.

If we were to arbitrarily add patterns, we would eventually make
InstCombine cripplingly slow.

To be honest, I think this position is unhelpful. A few thoughts:

In my experience (derived from my work on an instruction-level superoptimizer), it is really hard to predict what patterns will occur in practice. Knowledge of certain bits from switch statements, enums (range metadata), etc., combined with LLVM's canonicalization in terms of large integers and several other factors, leads to many patterns that I would consider strange, and these show up not uncommonly in performance-sensitive regions of code.
We (obviously) should not engage in premature optimization -- I've seen no evidence that matching these kinds of patterns in InstCombine contributes a significant portion of its running time. And until such evidence is presented, we should not make statements about expense (there is no complexity argument here). In any case, looking at combinations of three binary operators, are there any more than a few hundred that could simplify in any reasonable sense? Probably not even that many. I see no reason we should not handle them all.
For simplifications that require certain bits to be known, calls into ValueTracking can be expensive, but making duplicate calls into ValueTracking is a separate issue that can be fixed if it becomes important.
I feel that, in the past, the real engineering tradeoff was slightly different: it used to be difficult to prove the transformations correct. As a result, there was a significant possibility of bugs being introduced by each of these transformations, and it made more sense to limit them. As you know, this is no longer anywhere near as true as it was: we now have good tools to verify these transformations.

Thanks again,
Hal

http://reviews.llvm.org/D5719

llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

----- Original Message -----

From: "Hal Finkel" <hfinkel@anl.gov>
To: reviews+D5719+public+3088893b38cafcf5@reviews.llvm.org
Cc: "ankur29 garg" <ankur29.garg@samsung.com>, "suyog sarda" <suyog.sarda@samsung.com>, llvm-commits@cs.uiuc.edu
Sent: Saturday, October 18, 2014 11:37:40 PM
Subject: Re: [PATCH] [InstCombineAddSub] Added the transformation - "sub (or A B) (xor A B) -> (and A B)"

To add another anecdote, just when I thought I had classified the kinds of patterns that were occurring in practice, I started looking at instrumented code (address sanitizer, etc.), and I learned that I had a lot left to do.

-Hal

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

LGTM

This revision is now accepted and ready to land. Oct 19 2014, 1:34 AM

Committed as rL220162.

Hi David,
Thanks for reviewing the patch.

However, the link to commit rL220162 seems to point to another patch. Please update it.

Thanks.

Revision Contents

Path                                              Size
lib/Transforms/InstCombine/InstCombineAddSub.cpp  17 lines
test/Transforms/InstCombine/sub.ll                10 lines

Diff 14700

lib/Transforms/InstCombine/InstCombineAddSub.cpp

...
     if (match(LHS, m_Select(m_Value(C1), m_Value(A1), m_Value(B1))) &&
...
       if (C1 == C2) {
         Constant *Z1=nullptr, *Z2=nullptr;
         Value *A, *B, *C=C1;
         if (match(A1, m_AnyZero()) && match(B2, m_AnyZero())) {
           Z1 = dyn_cast<Constant>(A1); A = A2;
           Z2 = dyn_cast<Constant>(B2); B = B1;
         } else if (match(B1, m_AnyZero()) && match(A2, m_AnyZero())) {
           Z1 = dyn_cast<Constant>(B1); B = B2;
           Z2 = dyn_cast<Constant>(A2); A = A1;
         }

         if (Z1 && Z2 &&
             (I.hasNoSignedZeros() ||
              (Z1->isNegativeZeroValue() && Z2->isNegativeZeroValue()))) {
           return SelectInst::Create(C, A, B);
         }
       }
     }
   }

   if (I.hasUnsafeAlgebra()) {
...
     if (match(Op1, m_Add(m_Specific(Op0), m_Value(Y))) ||
         match(Op1, m_Add(m_Value(Y), m_Specific(Op0))))
       return BinaryOperator::CreateNeg(Y);

     // (X-Y)-X == -Y
     if (match(Op0, m_Sub(m_Specific(Op1), m_Value(Y))))
       return BinaryOperator::CreateNeg(Y);
   }

+  // (sub (or A, B) (xor A, B)) --> (and A, B)
+  {
+    Value *A = nullptr, *B = nullptr;
+    if (match(Op1, m_Xor(m_Value(A), m_Value(B))) &&
+        (match(Op0, m_Or(m_Specific(A), m_Specific(B))) ||
+         match(Op0, m_Or(m_Specific(B), m_Specific(A)))))
+      return BinaryOperator::CreateAnd(A, B);
+  }
+
   if (Op1->hasOneUse()) {
     Value *X = nullptr, *Y = nullptr, *Z = nullptr;
     Constant *C = nullptr;
     Constant *CI = nullptr;

     // (X - (Y - Z)) --> (X + (Z - Y)).
     if (match(Op1, m_Sub(m_Value(Y), m_Value(Z))))
       return BinaryOperator::CreateAdd(Op0,
...
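
The added block first binds A and B from the xor on the right-hand side of the sub, then accepts the or on the left-hand side with those same operands in either order. A toy model of that commuted match is sketched below. It is only an illustration: the Node type and both function names are hypothetical stand-ins and do not use LLVM's Value class or its PatternMatch API.

```cpp
#include <cassert>

// Toy expression node; a hypothetical stand-in, not LLVM's Value class.
enum class Op { Leaf, Or, Xor, And, Sub };

struct Node {
  Op op;
  const Node *lhs;  // null for leaves
  const Node *rhs;  // null for leaves
};

// True when `n` is `op` applied to exactly {a, b}, in either operand order.
bool isCommutedBinOp(const Node *n, Op op, const Node *a, const Node *b) {
  return n->op == op &&
         ((n->lhs == a && n->rhs == b) || (n->lhs == b && n->rhs == a));
}

// If `sub` has the shape (sub (or A, B), (xor A, B)), stores the xor's
// operands in `a` and `b` and returns true; otherwise returns false and
// leaves `a` and `b` untouched.
bool matchSubOrXor(const Node *sub, const Node *&a, const Node *&b) {
  if (sub->op != Op::Sub)
    return false;
  const Node *x = sub->rhs;
  if (x->op != Op::Xor)
    return false;
  if (!isCommutedBinOp(sub->lhs, Op::Or, x->lhs, x->rhs))
    return false;
  a = x->lhs;
  b = x->rhs;
  return true;
}
```

Operands are compared by pointer identity here, which loosely mirrors how m_Specific compares against an already-bound value. On a successful match, a rewrite would replace the sub with an And node over the two bound operands.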

test/Transforms/InstCombine/sub.ll

...

 define i32 @test44(i32 %x) {
   %sub = sub nsw i32 %x, 32768
   ret i32 %sub
 ; CHECK-LABEL: @test44(
 ; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 %x, -32768
 ; CHECK: ret i32 [[ADD]]
 }

+define i32 @test45(i32 %x, i32 %y) {
+  %or = or i32 %x, %y
+  %xor = xor i32 %x, %y
+  %sub = sub i32 %or, %xor
+  ret i32 %sub
+; CHECK-LABEL: @test45(
+; CHECK-NEXT: %sub = and i32 %x, %y
+; CHECK: ret i32 %sub
+}