This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Transform A & (L - 1) u< L --> L != 0
ClosedPublic

Authored by sanjoy on Aug 20 2015, 1:29 PM.

Details

Summary

This transform is never a pessimization at the IR level (since it
replaces an icmp with another), and has potentiall payoffs:

  1. It may make the icmp fold away or become loop invariant.
  2. It may make the A & (L - 1) computation dead.

This shows up in Java, in range checks generated by array accesses of
the form a[i & (a.length - 1)].

Diff Detail

Event Timeline

sanjoy updated this revision to Diff 32730.Aug 20 2015, 1:29 PM
sanjoy retitled this revision from to [InstCombine] Transform A & (L - 1) u< L --> L != 0.
sanjoy updated this object.
sanjoy added reviewers: majnemer, reames.
sanjoy added a subscriber: llvm-commits.
majnemer added inline comments.Aug 20 2015, 1:42 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
3495–3498

Isn't X - 1 always canonicalized to X + -1 ? Your transform is permitted to assume canonicalization occurred earlier.

3504

Might be more concise as auto *Zero = Constant::getNullValue(BO0->getType());

reames added inline comments.Aug 20 2015, 1:43 PM
test/Transforms/InstCombine/ult-and-bitwise-and.ll
5

Add a version of this test case with sub 1.

I think you need the fact that %lim is >= 0 for this to hold. Consider %lim with 0x101 and %val of 0x110. The original check fails, yours succeeds.

Also, I think that if the add is nsw, we can infer lim is > 0 and thus directly fold to true.

majnemer edited edge metadata.Aug 20 2015, 1:44 PM

Can we add the testcase to icmp.ll?

sanjoy updated this revision to Diff 32742.Aug 20 2015, 2:01 PM
sanjoy marked 2 inline comments as done.
sanjoy edited edge metadata.
  • address review
sanjoy added inline comments.Aug 20 2015, 2:02 PM
test/Transforms/InstCombine/ult-and-bitwise-and.ll
5

Add a version of this test case with sub 1.

Good idea, done (+ moved to icmp.ll as David suggested)

I think you need the fact that %lim is >= 0 for this to hold. Consider %lim with 0x101 and %val of 0x110. The original check fails, yours succeeds.

The original succeeds also -- the check is with val & (lim - 1) == 0x110 & 0x100 = 0x100 u< lim. val & (lim - 1) will always have a lesser or equal number of ones in its binary representation than lim - 1 and hence always will be ULE than lim - 1. lim - 1 is ULT than lim iff lim != 0. Thus val & (lim - 1) ULT lim iff lim != 0.

Also, I think that if the add is nsw, we can infer lim is > 0 and thus directly fold to true.

add nsw i32 0, -1 is not poison. I could do that inference if the instruction was sub nuw %lim, 1, but the sub will get canonicalized to add, and the nuw will then have to be dropped.

In any case, I'd rather not be aggressive about exploiting nuw and nsw unless there is good reason showing that it matters.

majnemer added inline comments.Aug 20 2015, 2:18 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
3503

I'd use Constant:: instead of ConstantInt:: to make it a little more clear. It is possible to get back a vector constant.

sanjoy updated this revision to Diff 32750.Aug 20 2015, 2:30 PM
sanjoy marked an inline comment as done.
  • address review
sanjoy updated this revision to Diff 32751.Aug 20 2015, 2:32 PM
  • run through clang-format
majnemer accepted this revision.Aug 20 2015, 2:34 PM
majnemer edited edge metadata.

LGTM

This revision is now accepted and ready to land.Aug 20 2015, 2:34 PM
This revision was automatically updated to reflect the committed changes.
escha added a subscriber: escha.Aug 21 2015, 12:57 AM
escha added inline comments.
llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp
3495 ↗(On Diff #32760)

Is it just me, or does this never even look at the RHS of the comparison?

It seems to be matching the following, despite the comment: A & (L - 1) ult M -> L != 0 for any M.

sanjoy added a subscriber: sanjoy.Aug 21 2015, 1:12 AM

Yeah, the code is just wrong; I should have been more careful here.

Thanks for reverting, I'll fix this tomorrow.

Sent from a mobile device, please excuse typos.

Hi All,

I've checked in a fixed version in http://reviews.llvm.org/rL245753.
I was able to locally do a 3 stage bootstrap of clang with this
change, so hopefully there should be no more issues, but I'll keep an
eye on http://bb.pgr.jp/builders/clang-3stage-i686-linux till it sees
my change.

  • Sanjoy