This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1
ClosedPublic

Authored by spatel on Jan 6 2017, 11:41 AM.

Diff Detail

Event Timeline

spatel updated this revision to Diff 83394.Jan 6 2017, 11:41 AM
spatel retitled this revision from to [InstCombine] icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1.
spatel updated this object.
spatel added reviewers: nlopes, bryant, majnemer.
spatel added a subscriber: llvm-commits.
bryant edited edge metadata.Jan 17 2017, 10:29 AM

This looks like the signed analogue of the unsigned case on line 1965. Since the unsigned case handles both ugt and ule (but makes the substitutions for ult), perhaps you could handle sle as well:

Name: sle
%c = shl nsw i8 %x, C1
%d = icmp sle %c, C0
  =>
%d = icmp sle %x, (C0 >> C1)

----------------------------------------
Optimization: sle
Done: 1
Optimization is correct!

http://rise4fun.com/Alive/j

This looks like the signed analogue of the unsigned case on line 1965. Since the unsigned case handles both ugt and ule (but makes the substitutions for ult), perhaps you could handle sle as well:

Sure - I was planning to expand this for sle (slt) and the missed eq/ne folds. It will make the code patch slightly bigger to handle sgt/sle in one step, but if that's preferred to match the existing code, I think it's ok.

To match LLVM's canonicalization to slt, I think the Alive code becomes:
Pre: C0 != -128
%a = shl nsw i8 %x, C1
%b = icmp slt %a, C0

=>

%b = icmp slt %x, ((C0 - 1) >> C1) + 1

spatel updated this revision to Diff 84742.Jan 17 2017, 2:05 PM
spatel edited the summary of this revision. (Show Details)

Patch updated:
Handle sgt, sle (slt), eq/ne in one patch to match the existing and related 'shl nuw' folds.
I added tests for sle with rL292248 and refactored with rL292260.

bryant added inline comments.Jan 17 2017, 11:28 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

Cmp.isEquality() || Pred == ICmpInst::ICMP_UGT looks equivalent to Pred == ICMP_UGE. From memory, the unsigned counterparts only work with ugt and ule. It doesn't work with uge.

Name: ugt
%a = shl nsw i8 %x, C1
%b = icmp ugt %a, C0
  =>
%b = icmp ugt %x, (C0 >> C1)

----------------------------------------
Optimization: ugt
Done: 1
Optimization is correct! 

Name: eq
%a = shl nsw i8 %x, C1
%b = icmp eq %a, C0
  =>
%b = icmp eq %x, (C0 >> C1)

----------------------------------------
Optimization: eq
ERROR: Mismatch in values of i1 %b

http://rise4fun.com/Alive/D1

bryant added inline comments.Jan 17 2017, 11:40 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1915

Just working by analogy (see below), I'm guessing that this isn't true for eq.

spatel added inline comments.Jan 18 2017, 7:30 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

Except that Cmp.isEquality() has nothing to do with ICMP_UGE:

static bool isEquality(Predicate P) {
  return P == ICMP_EQ || P == ICMP_NE;
}

I'll take this as a suggestion to improve readability by spelling out the relevant predicates instead of using isEquality, but there is no functional change.

bryant added inline comments.Jan 18 2017, 7:33 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

It doesn't work with uge because it doesn't work with eq. Line 1942 would allow the transform to happen if Pred == ICmpInst::ICMP_EQ, right? Am I mis-reading?

spatel added inline comments.Jan 18 2017, 7:45 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

Sorry, I sent that reply too soon. I see now why there is confusion. You're correct (as Alive shows) that eq/ne don't work in general:

http://rise4fun.com/Alive/Yu

What's missing from the equation is the precondition / assertion that the shifted compare constant hasn't dropped any bits:
http://rise4fun.com/Alive/Ee

Known bits analysis should guarantee that cases like that are already folded to true/false, so there is no need for a compare. I'll add asserts to the patch. Makes sense?

spatel updated this revision to Diff 84846.Jan 18 2017, 8:58 AM

Patch updated:
Add asserts for eq/ne compares (both for the existing NUW path and the NSW path that's being added here) to make it clear what the pre-conditions for these folds are.

I'm not seeing any test cases for nsw/nuw eq/ne. Should/Can they be added?

lib/Transforms/InstCombine/InstCombineCompares.cpp
1913

Should this go into its own patch?

1917

Can this branch be merged with ugt?

if (Cmp.isEquality() || Pred == ICMP_UGT) {
  assert(!Cmp.isEquality || C->lshr(*ShiftAmt).shl(*ShiftAmt) == *C &&
             "Compare known true or false was not folded");
  // do the transform
}

I'm not seeing any test cases for nsw/nuw eq/ne. Should/Can they be added?

This is why I started with a minimal patch. :)
This patch does not intend to change any functionality for the nuw path, so either those tests already exist or should be added separately. The removal of the trailing check for shl nuw + icmp eq/ne could have been part of r292260, so I can remove that separately if you think that's better. Ie, we're now handling all of the nuw transforms in the earlier block.

Tests that change with this patch and are commented with "Known bits analysis turns this into an equality predicate" are effectively tests of nsw+eq/ne, but I can add explicit checks.

lib/Transforms/InstCombine/InstCombineCompares.cpp
1917

It could, but I think this is easier to read, and I'd really prefer to leave refactoring for a follow-up if that seems worthwhile.

spatel updated this revision to Diff 84885.Jan 18 2017, 2:31 PM

Patch updated:
Removed nuw distractions from this patch (rL292433, rL292440) and added extra tests for ne/eq (rL292441).

LGTM with just one nit.

lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

shl nsw is also shifting out zeroes, no? It's ashr that shifts in sign bits.

bryant added inline comments.Jan 18 2017, 10:05 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

What I mean is (also for the ugt summary): It's because shl shifts in zeroes that warrants this transform. In other words, in the un-transformed icmp, we're comparing against known zero bits introduced by the shl.

spatel added inline comments.Jan 19 2017, 6:57 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1912–1913

It's the nuw or nsw property of the shift that allows these transforms. Ie, any shift-left *shifts in* zeroes to the low bits. The fact that we've guaranteed *shifting out* sign-bits or zeroes from the high bits is what allows us to safely lshr/ashr the constant.

I'll try to make that clearer in the comment.

This revision was automatically updated to reflect the committed changes.