This is an archive of the discontinued LLVM Phabricator instance.

Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized well
ClosedPublic

Authored by suyog on May 29 2014, 8:26 AM.

Details

Summary

This patch implements missed optimization stated in PR19753
This will work for 'ashr exact' only for now in the 'instcombine' pass.

Test Case :

define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -30, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

Result after applying patch :

define i1 @exact_ashr_eq_false(i32 %a) {

%cmp = icmp eq i32 %a, 1
ret i1 %cmp

}

It calculates the quotient of division of the 2 constants and takes log2, and then puts this value for comparison in icmp instruction.

Added the test case given in the bug report.

Please help in reviewing the patch.

Diff Detail

Event Timeline

suyog updated this revision to Diff 9922.May 29 2014, 8:26 AM
suyog retitled this revision from to Patch for PR19753 InstCombine: constant comparison involving "ashr exact" not optimized well.
suyog updated this object.
suyog edited the test plan for this revision. (Show Details)
suyog added reviewers: bkramer, majnemer, rafael.
suyog added a subscriber: Unknown Object (MLST).
rafael edited edge metadata.May 29 2014, 3:04 PM

+ PR19753:
+
(icmp (ashr exact const1, A), const2) -> icmp A, Log2(const1/const2)
+ {
+ ConstantInt *CI2;
+ Value *V;
+ if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(V))) &&
+ (cast<BinaryOperator>(Op0)->isExact())) {
+ APInt Quotient = CI2->getValue().sdiv(CI->getValue());

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

+ unsigned shift = Quotient.logBase2();

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?

Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

+ return new ICmpInst(I.getPredicate(), V,
+ ConstantInt::get(V->getType(), shift));
+ }
+ }
+

// If we have an icmp le or icmp ge instruction, turn it into the
// appropriate icmp lt or icmp gt instruction.  This allows us to rely on
// them being folded in the code below.  The SimplifyICmpInst code has

Index: test/Transforms/InstCombine/icmp.ll

  • test/Transforms/InstCombine/icmp.ll

+++ test/Transforms/InstCombine/icmp.ll
@@ -1365,3 +1365,11 @@

%2 = icmp slt i32 %1, -10
ret i1 %2

}
+
+; CHECK-LABEL: @exact_ashr_eq_false
+; CHECK-NEXT: icmp eq i32 %a, 1
+define i1 @exact_ashr_eq_false(i32 %a) {
+ %shr = ashr exact i32 -30, %a
+ %cmp = icmp eq i32 %shr, -15
+ ret i1 %cmp
+}

suyog added a comment.May 30 2014, 2:20 AM

Hi Rafael,
Thanks a lot for the review.

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

:) Actually, CI already exist in the code, hence i named the new constant as CI2. I can change the comment or change the variable name from CI2 to CI1. Let me know what is good.

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

Yups. But interestingly, there is a call to 'SimplifyICmpInst' before my code, which figures out if there is exact division or not.
If there is no exact division, then it returns false and it never hits my code.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -34, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-34 doesn't divide -15 exactly)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

This is also handled by 'SimplifyICmpInst' call.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -90, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-15 divides -90 exactly but log2(-90/-15) > log2(6) is not exact)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

So for both the cases - not exactly divisible and not exact log - it never reaches my code, 'SimplifyICmpInst' call previous to my code takes care of both cases.
Should i still add the two checks?

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?
Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

We can handle this also. But i will try to implement it another patch. i will add FIXME/TODO for this.

Please let me know regarding the two checks. I will make the changes accordingly.

Thanks.

const1 in the comment is CI2 and const2 is CI, which is a bit confusing :-)

:) Actually, CI already exist in the code, hence i named the new constant as CI2. I can change the comment or change the variable name from CI2 to CI1. Let me know what is good.

Just the comment is fine.

If const2 doesn't exactly divide const1 I think we can produce a
undef, which is probably more profitable, right?

Yups. But interestingly, there is a call to 'SimplifyICmpInst' before my code, which figures out if there is exact division or not.
If there is no exact division, then it returns false and it never hits my code.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -34, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-34 doesn't divide -15 exactly)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

I see, thanks.

Some thing if the log is not exact. That means that the exact flag in
the ashr was wrong.

This is also handled by 'SimplifyICmpInst' call.

e.g
define i1 @exact_ashr_eq_false(i32 %a) {

%shr = ashr exact i32 -90, %a
%cmp = icmp eq i32 %shr, -15
ret i1 %cmp

}

(-15 divides -90 exactly but log2(-90/-15) > log2(6) is not exact)

Output without my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Output with my patch:

define i1 @exact_ashr_eq_false(i32 %a) {

ret i1 false

}

Awesome :-)

So for both the cases - not exactly divisible and not exact log - it never reaches my code, 'SimplifyICmpInst' call previous to my code takes care of both cases.
Should i still add the two checks?

No, it is fine. Now that I think of it better, code that can produce
just a constant (like the above) should live in instsimplify, not
instcombine.

Could lshr be handled in a similar (udiv instead of sdiv) way if it
has an exact flag?
Not sure if you want to handle these now or in another patch, but at
least add a FIXME comment.

We can handle this also. But i will try to implement it another patch. i will add FIXME/TODO for this.

Please let me know regarding the two checks. I will make the changes accordingly.

LGTM with

  • FIXME/TODO about the lhsr
  • The comment variable names update in the comment.
  • A comment about the inexact div and log being handled in instsimplify

Cheers,
Rafael

suyog updated this revision to Diff 9961.May 30 2014, 8:25 AM
suyog edited edge metadata.

Hi Rafael,

Thanks for the review!

I have made changes as suggested by you :

  • FIXME/TODO about the lhsr
  • The comment variable names update in the comment.
  • A comment about the inexact div and log being handled in instsimplify

If this looks good to you, can you please commit this on my behalf.

Thanks once again for reviews and help.

I committed it as r209903.

arsenm added a subscriber: arsenm.May 30 2014, 11:34 AM

This could also be extended to handle vectors

suyog accepted this revision.Jun 4 2014, 12:20 AM
suyog added a reviewer: suyog.

Thanks

This revision is now accepted and ready to land.Jun 4 2014, 12:20 AM
suyog closed this revision.Jun 4 2014, 12:21 AM