This is an archive of the discontinued LLVM Phabricator instance.

Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized well
ClosedPublic

Authored by suyog on May 29 2014, 8:26 AM.

Download Raw Diff

Details

Reviewers

majnemer
suyog
• rafael
bkramer

Summary

This patch implements missed optimization stated in PR19753
This will work for 'ashr exact' only for now in the 'instcombine' pass.

Test Case :

define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -30, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

Result after applying patch :

define i1 @exact_ashr_eq_false(i32 %a) {

%cmp = icmp eq i32 %a, 1
ret i1 %cmp

}

It calculates the quotient of division of the 2 constants and takes log2, and then puts this value for comparison in icmp instruction.

Added the test case given in the bug report.

Please help in reviewing the patch.

Diff Detail

Event Timeline

suyog updated this revision to Diff 9922.May 29 2014, 8:26 AM

suyog retitled this revision from to Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized well.

suyog updated this object.

suyog edited the test plan for this revision. (Show Details)

suyog added reviewers: bkramer, majnemer, • rafael.

suyog added a subscriber: Unknown Object (MLST).

+ PR19753:
+ (icmp (ashr exact const1, A), const2) -> icmp A, Log2(const1/const2)
+ {
+ ConstantInt *CI2;
+ Value *V;
+ if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(V))) &&
+ (cast<BinaryOperator>(Op0)->isExact())) {
+ APInt Quotient = CI2->getValue().sdiv(CI->getValue());

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

+ unsigned shift = Quotient.logBase2();

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?

Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

+ return new ICmpInst(I.getPredicate(), V,
+ ConstantInt::get(V->getType(), shift));
+ }
+ }
+
// If we have an icmp le or icmp ge instruction, turn it into the
// appropriate icmp lt or icmp gt instruction.  This allows us to rely on
// them being folded in the code below.  The SimplifyICmpInst code has
Index: test/Transforms/InstCombine/icmp.ll

test/Transforms/InstCombine/icmp.ll

+++ test/Transforms/InstCombine/icmp.ll
@@ -1365,3 +1365,11 @@
%2 = icmp slt i32 %1, -10
ret i1 %2
}
+
+; CHECK-LABEL: @exact_ashr_eq_false
+; CHECK-NEXT: icmp eq i32 %a, 1
+define i1 @exact_ashr_eq_false(i32 %a) {
+ %shr = ashr exact i32 -30, %a
+ %cmp = icmp eq i32 %shr, -15
+ ret i1 %cmp
+}

Hi Rafael,
Thanks a lot for the review.

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

:) Actually, CI already exist in the code, hence i named the new constant as CI2. I can change the comment or change the variable name from CI2 to CI1. Let me know what is good.

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

Yups. But interestingly, there is a call to 'SimplifyICmpInst' before my code, which figures out if there is exact division or not.
If there is no exact division, then it returns false and it never hits my code.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -34, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-34 doesn't divide -15 exactly)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

This is also handled by 'SimplifyICmpInst' call.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -90, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-15 divides -90 exactly but log2(-90/-15) > log2(6) is not exact)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

So for both the cases - not exactly divisible and not exact log - it never reaches my code, 'SimplifyICmpInst' call previous to my code takes care of both cases.
Should i still add the two checks?

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?
Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

We can handle this also. But i will try to implement it another patch. i will add FIXME/TODO for this.

Please let me know regarding the two checks. I will make the changes accordingly.

Thanks.

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

:) Actually, CI already exist in the code, hence i named the new constant as CI2. I can change the comment or change the variable name from CI2 to CI1. Let me know what is good.

Just the comment is fine.

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

Yups. But interestingly, there is a call to 'SimplifyICmpInst' before my code, which figures out if there is exact division or not.
If there is no exact division, then it returns false and it never hits my code.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {
%shr = ashr exact i32 -34, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp
}

(-34 doesn't divide -15 exactly)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {
ret i1 false
}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {
ret i1 false
}

I see, thanks.

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

This is also handled by 'SimplifyICmpInst' call.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {
%shr = ashr exact i32 -90, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp
}

(-15 divides -90 exactly but log2(-90/-15) > log2(6) is not exact)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {
ret i1 false
}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {
ret i1 false
}

Awesome :-)

So for both the cases - not exactly divisible and not exact log - it never reaches my code, 'SimplifyICmpInst' call previous to my code takes care of both cases.
Should i still add the two checks?

No, it is fine. Now that I think of it better, code that can produce
just a constant (like the above) should live in instsimplify, not
instcombine.

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?
Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

We can handle this also. But i will try to implement it another patch. i will add FIXME/TODO for this.

Please let me know regarding the two checks. I will make the changes accordingly.

LGTM with

FIXME/TODO about the lhsr
The comment variable names update in the comment.
A comment about the inexact div and log being handled in instsimplify

Cheers,
Rafael

Hi Rafael,

Thanks for the review!

I have made changes as suggested by you :

FIXME/TODO about the lhsr
The comment variable names update in the comment.
A comment about the inexact div and log being handled in instsimplify

If this looks good to you, can you please commit this on my behalf.

Thanks once again for reviews and help.

I committed it as r209903.

This could also be extended to handle vectors

Thanks

This revision is now accepted and ready to land.Jun 4 2014, 12:20 AM

suyog closed this revision.Jun 4 2014, 12:21 AM

Revision Contents

Path

Size

lib/

Transforms/

InstCombine/

InstCombineCompares.cpp

17 lines

test/

Transforms/

InstCombine/

icmp.ll

8 lines

Diff 9961

lib/Transforms/InstCombine/InstCombineCompares.cpp

Show First 20 Lines • Show All 2,433 Lines • ▼ Show 20 Lines	if (ConstantInt *CI = dyn_cast<ConstantInt>(Op1)) {

// (icmp ne/eq (sub A B) 0) -> (icmp ne/eq A, B)		// (icmp ne/eq (sub A B) 0) -> (icmp ne/eq A, B)
if (I.isEquality() && CI->isZero() &&		if (I.isEquality() && CI->isZero() &&
match(Op0, m_Sub(m_Value(A), m_Value(B)))) {		match(Op0, m_Sub(m_Value(A), m_Value(B)))) {
// (icmp cond A B) if cond is equality		// (icmp cond A B) if cond is equality
return new ICmpInst(I.getPredicate(), A, B);		return new ICmpInst(I.getPredicate(), A, B);
}		}

		// PR19753:
		// (icmp (ashr exact const2, A), const1) -> icmp A, Log2(const2/const1)
		// Cases where const1 doesn't divide const2 exactly or Quotient is not
		// exact of log2 are handled by SimplifyICmpInst call above where we
		// return false.
		// TODO: Handle this for lshr exact with udiv.
		{
		ConstantInt *CI2;
		if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(A))) &&
		(cast<BinaryOperator>(Op0)->isExact())) {
		APInt Quotient = CI2->getValue().sdiv(CI->getValue());
		unsigned shift = Quotient.logBase2();
		return new ICmpInst(I.getPredicate(), A,
		ConstantInt::get(A->getType(), shift));
		}
		}

// If we have an icmp le or icmp ge instruction, turn it into the		// If we have an icmp le or icmp ge instruction, turn it into the
// appropriate icmp lt or icmp gt instruction. This allows us to rely on		// appropriate icmp lt or icmp gt instruction. This allows us to rely on
// them being folded in the code below. The SimplifyICmpInst code has		// them being folded in the code below. The SimplifyICmpInst code has
// already handled the edge cases for us, so we just assert on them.		// already handled the edge cases for us, so we just assert on them.
switch (I.getPredicate()) {		switch (I.getPredicate()) {
default: break;		default: break;
case ICmpInst::ICMP_ULE:		case ICmpInst::ICMP_ULE:
assert(!CI->isMaxValue(false)); // A <=u MAX -> TRUE		assert(!CI->isMaxValue(false)); // A <=u MAX -> TRUE
▲ Show 20 Lines • Show All 1,190 Lines • Show Last 20 Lines

test/Transforms/InstCombine/icmp.ll

	Show First 20 Lines • Show All 1,359 Lines • ▼ Show 20 Lines
	; CHECK-LABEL: @icmp_neg_cst_slt			; CHECK-LABEL: @icmp_neg_cst_slt
	; CHECK-NEXT: [[CMP:%[a-z0-9]+]] = icmp sgt i32 %a, 10			; CHECK-NEXT: [[CMP:%[a-z0-9]+]] = icmp sgt i32 %a, 10
	; CHECK-NEXT: ret i1 [[CMP]]			; CHECK-NEXT: ret i1 [[CMP]]
	define i1 @icmp_neg_cst_slt(i32 %a) {			define i1 @icmp_neg_cst_slt(i32 %a) {
	%1 = sub nsw i32 0, %a			%1 = sub nsw i32 0, %a
	%2 = icmp slt i32 %1, -10			%2 = icmp slt i32 %1, -10
	ret i1 %2			ret i1 %2
	}			}

				; CHECK-LABEL: @exact_ashr_eq_false
				; CHECK-NEXT: icmp eq i32 %a, 1
				define i1 @exact_ashr_eq_false(i32 %a) {
				%shr = ashr exact i32 -30, %a
				%cmp = icmp eq i32 %shr, -15
				ret i1 %cmp
				}

This is an archive of the discontinued LLVM Phabricator instance.

Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized wellClosedPublic

Details

Diff Detail

Event Timeline

Index: test/Transforms/InstCombine/icmp.ll

Revision Contents

Diff 9961

lib/Transforms/InstCombine/InstCombineCompares.cpp

test/Transforms/InstCombine/icmp.ll

Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized well
ClosedPublic