This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] optimize icmp-ugt-ashr
AcceptedPublic

Authored by nadav on Jan 14 2022, 4:05 PM.

Details

Summary

This diff optimizes the sequence icmp-ugt(ashr,C_1) C_2. InstCombine already implements this optimization for sgt, and this patch adds support ugt.

@craig.topper came up with the idea and proof.

This patch adds the check for UGT, and also simplifies the check for SGT because Craig's proof shows that the comparison to min_int is not necessary.

define i1 @src(i8 %x, i8 %y, i8 %c) {
  %cp1 = add i8 %c, 1
  %i = shl i8 %cp1, %y
  %i.2 = ashr i8 %i, %y
  %cmp = icmp eq i8 %cp1, %i.2
  ;Assume: C + 1 == (((C + 1) << y) >> y)
  call void @llvm.assume(i1 %cmp)

  ; uncomment for the sgt case
  %j = shl i8 %cp1, %y 
  %j.2 = sub i8 %j, 1
  %cmp2 = icmp ne i8 %j.2, 127
  ;Assume (((c + 1 ) << y) - 1) != 127
  call void @llvm.assume(i1 %cmp2)

  %s = ashr i8 %x, %y
  %r = icmp sgt i8 %s, %c
  ret i1 %r
}

define i1 @tgt(i8 %x, i8 %y, i8 %c) {
  %cp1 = add i8 %c, 1
  %j = shl i8 %cp1, %y
  %j.2 = sub i8 %j, 1

  %r = icmp sgt i8 %x, %j.2
  ret i1 %r
}

declare void @llvm.assume(i1)

This change is related to the optimizations in D117252.

Diff Detail

Event Timeline

nadav requested review of this revision.Jan 14 2022, 4:05 PM
nadav created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2022, 4:05 PM
nadav updated this revision to Diff 400177.Jan 14 2022, 4:08 PM

Could you post the general proof, not for a single constant?

lib/Transforms/InstCombine/InstCombineCompares.cpp
2242 ↗(On Diff #400177)

As i have said in D117252, this should be

if (IsExact || Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_ULT) {

Could you post the general proof, not for a single constant?

I think this is the general proof. It requires less pre-conditions than the sgt case.

https://alive2.llvm.org/ce/z/tFAcZt

lib/Transforms/InstCombine/InstCombineCompares.cpp
2253 ↗(On Diff #400177)

I believe the !C.isMaxSignedValue() is an unneeded condition for the signed case. If C.isMaxSignedValue() is true then C+1 is 0x80 for i8. (ShiftedC + 1).ashr(ShAmtVal) == (C + 1) could only possibly be true when ShAmtVal is 0. 0x80 only has 1 sign bit, any other ShAmtVal would mean that (C+1) would have more than 1 sign bit.

I'm not suggesting we remove it in this patch. Just pointing it out for alive proof purposes.

nadav updated this revision to Diff 400942.Jan 18 2022, 12:10 PM
nadav edited the summary of this revision. (Show Details)
craig.topper added inline comments.Jan 18 2022, 3:34 PM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
2253

I think the !(C + 1).shl(ShAmtVal).isMinSignedValue() term is required for the signed case isn't it? It's the commented out code in my alive proof.

nadav added a comment.Jan 19 2022, 7:48 AM

@craig.topper Thank you for the review, and for writing the proof, of course. This is my understanding of the code and of your proof.

Your proof has two checks, which I added as comments.

;Assume: C + 1 == ((C+1 << y) >> y)
;Assume ((c +1 ) << y) != 127      -- only needed for the sgt case

The first check is implemented with the code:

APInt ShiftedC = (C + 1).shl(ShAmtVal) - 1;
if ((ShiftedC + 1).ashr(ShAmtVal) == (C + 1))

And the second one with:

!C.isMaxSignedValue() && (ShiftedC + 1).ashr(ShAmtVal) == (C + 1)

The code also has this check, which I removed, because the proof does not have this check:

!(C + 1).shl(ShAmtVal).isMinSignedValue()

Is this also your understanding, or did I miss something.

nadav added a comment.Jan 19 2022, 7:51 AM

Ah, I did miss something. I'll rewrite the second check: "Assume ((c +1 ) << y) != 127"

nadav updated this revision to Diff 401265.Jan 19 2022, 8:56 AM
nadav edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Jan 19 2022, 2:42 PM