This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking] Add KnownBits patterns `xor(x, x - 1)` and `and(x, -x)` for knowing upper bits to be zero
ClosedPublic

Authored by goldstein.w.n on Jan 20 2023, 9:10 PM.

Details

Summary

These two BMI pattern will clear the upper bits of result past the
first set bit. So if we know a single bit in x is set, we know that
results[bitwidth - 1, log2(x) + 1] = 0.

Alive2:
blsmsk: https://alive2.llvm.org/ce/z/a397BS
blsi: https://alive2.llvm.org/ce/z/tsbQhC

Diff Detail

Event Timeline

goldstein.w.n created this revision.Jan 20 2023, 9:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2023, 9:10 PM
goldstein.w.n requested review of this revision.Jan 20 2023, 9:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2023, 9:10 PM
craig.topper added inline comments.
llvm/lib/Analysis/ValueTracking.cpp
1124–1125

The description of blsmsk says "Sets all the lower bits of the destination operand to “1” up to and including lowest set bit (=1) in the source operand".

craig.topper added inline comments.Jan 20 2023, 9:21 PM
llvm/lib/Analysis/ValueTracking.cpp
1089

that sounds more like reassociating than commuting.

goldstein.w.n added inline comments.Jan 20 2023, 9:54 PM
llvm/lib/Analysis/ValueTracking.cpp
1089

that sounds more like reassociating than commuting.

Yup, yet again I get my properties wrong :(

1124–1125

The description of blsmsk says "Sets all the lower bits of the destination operand to “1” up to and including lowest set bit (=1) in the source operand".

Err, yeah should be 'clear all bits above the lowest set bit'. Will fix.

Fix spellings / grammars

goldstein.w.n marked 2 inline comments as done.Jan 21 2023, 12:25 AM
nikic added a comment.Jan 23 2023, 8:16 AM

I believe this is missing negative tests (e.g. x & -y, x ^ (x + 2)). We should also have tests for commuted variants and basic vector tests.

llvm/lib/Analysis/ValueTracking.cpp
1086
1097

APInt::getBitsSetFrom(BitWidth, MinBit + 1)

1099

Redundant parentheses.

1134

I'd suggest to combine these two matches as match(I, m_c_BinOp(m_Value(X), m_Add(m_Deferred(X), m_AllOnes()))).

1138

APInt::getBitsSetFrom here again.

foad added a comment.Jan 23 2023, 8:30 AM

I suggest implementing this in new helper functions KnownBits::blsi and KnownBits::blsmsk (if we're happy with those names), plus comprehensive test coverage in unittests/Support/KnownBitsTest.cpp.

llvm/lib/Analysis/ValueTracking.cpp
1124–1125

'clear all bits above the lowest set bit' isn't enough. It also sets all bits below the lowest set bit.

goldstein.w.n added inline comments.Jan 23 2023, 12:01 PM
llvm/lib/Analysis/ValueTracking.cpp
1124–1125

'clear all bits above the lowest set bit' isn't enough. It also sets all bits below the lowest set bit.

If we know all bits below the first known set bit, then yes we can full evaluate (although this is already handled by the normal constant propegation of bits through the xor/sub.

But if we only know a single bit, we don't know if there is a bit below it that is also set, so I don't see a way we can start setting the bits below it.

You are right though that we can always set Known.One.setBit(0). In fact we can make it a bit more generic:

(xor (x, (sub x, OddC))) -> isOdd

https://alive2.llvm.org/ce/z/aaABdS

Will add that.

goldstein.w.n added a comment.EditedJan 23 2023, 1:52 PM

I suggest implementing this in new helper functions KnownBits::blsi and KnownBits::blsmsk (if we're happy with those names), plus comprehensive test coverage in unittests/Support/KnownBitsTest.cpp.

So started doing this. I realized, however, that what we really want is something like computeKnownBitsFromAndXor.

I think though doing all that in one patch is too much. Think it makes sense here to do the following:

  1. Add the BMI patterns (this patch)
  2. Factor our and/xor into common function (this will cleanup the BMI logics) + get the odd bit pattern for XOR
  3. Going to apply this in simplifydemanded bits: https://reviews.llvm.org/D142300

So for this patch would prefer skipping adding into new functions as it will just be temporary anyways (unless @nikic you are okay with me bloating this patch with addition of factoring and / xor logic into common function and combining there logic, otherwise will make it follow up).

craig.topper added inline comments.Jan 23 2023, 2:00 PM
llvm/lib/Analysis/ValueTracking.cpp
1094

I'm not sure about this min. The -x operand will potentially find less trailing bits due to the recursion limit in computeKnownBits. It has one more node to walk through. Using the tailing zeros bits based on which operand is x seems more reliable.

goldstein.w.n marked 2 inline comments as not done.Jan 23 2023, 2:11 PM
goldstein.w.n added inline comments.
llvm/lib/Analysis/ValueTracking.cpp
1094

I'm not sure about this min. The -x operand will potentially find less trailing bits due to the recursion limit in computeKnownBits. It has one more node to walk through. Using the tailing zeros bits based on which operand is x seems more reliable.

I think its possible for -x to find more bits if, for example, there is an llvm.assume on -x but but not on x.

Improve test coverage + fix nits

goldstein.w.n marked 6 inline comments as done.Jan 23 2023, 5:37 PM
goldstein.w.n added inline comments.Jan 23 2023, 5:44 PM
llvm/test/Transforms/InstCombine/ctpop-pow2.ll
150

This regression is fixed in: https://reviews.llvm.org/D142429

What is happening is computeKnownBits is allowing instsimplify to optimize out the ctpop b.c its ctpop(x & 1), but since SimplifyDemandedUseBits doesn't have this logic yet, it never gets optimized out. D142429 starts using computeKnownBits in SimplifyDemandedUseBits which fixed this.

foad added a comment.Jan 24 2023, 6:56 AM

I suggest implementing this in new helper functions KnownBits::blsi and KnownBits::blsmsk (if we're happy with those names), plus comprehensive test coverage in unittests/Support/KnownBitsTest.cpp.

I was thinking of something like D142469. Is this any use to you?

goldstein.w.n added a comment.EditedJan 24 2023, 11:28 AM

I suggest implementing this in new helper functions KnownBits::blsi and KnownBits::blsmsk (if we're happy with those names), plus comprehensive test coverage in unittests/Support/KnownBitsTest.cpp.

I was thinking of something like D142469. Is this any use to you?

I see, yeah that makes sense although I would design it as a static API b.c we can technically get more information from -x or x - 1 than x itself if it happens to match some assume.
I.e:

static KnownBits blsi(KnowBits X, KnownBits NegX) const;
static KnownBits blsmsk(KnowBits X, KnownBits XMinus1) const;

(Also realized there is a bug in the current impl as XMinus1.One.countTrailingZeros() can't be used for low bound).

Do you want to make that an independent patch or want me to just combine it with this?

foad added a comment.Jan 24 2023, 11:48 AM

I see, yeah that makes sense although I would design it as a static API b.c we can technically get more information from -x or x - 1 than x itself if it happens to match some assume.

Is that really going to happen in practice? To me it seems so unlikely as to be not worth worrying about, for the sake a a nice simple API.

goldstein.w.n added a comment.EditedJan 24 2023, 12:51 PM

I see, yeah that makes sense although I would design it as a static API b.c we can technically get more information from -x or x - 1 than x itself if it happens to match some assume.

Is that really going to happen in practice? To me it seems so unlikely as to be not worth worrying about, for the sake a a nice simple API.

Personally in favor of completeness. Or at least writing an API that makes it easy to make complete at a later time with a TODO.

Rebase and use blsi/blsmsk api

I see, yeah that makes sense although I would design it as a static API b.c we can technically get more information from -x or x - 1 than x itself if it happens to match some assume.

Is that really going to happen in practice? To me it seems so unlikely as to be not worth worrying about, for the sake a a nice simple API.

This is how I'd do it: D142519. Keeps you code but adds overloads for specifying the exact knownbits for -x / x-C.

nikic added inline comments.Jan 26 2023, 5:37 AM
llvm/lib/Analysis/ValueTracking.cpp
1093

Add a comment that as -(-x) == x, we can view this as a blsi operation on either operand, and pick whichever produces the best result.

1096

break here? Don't think we want to fall through.

1134

m_APInt(C) -> m_AllOnes().

1136

Shouldn't this be ? Known : Known2? We want the known bits of X, not of X-1.

nikic added inline comments.Jan 26 2023, 5:45 AM
llvm/lib/Analysis/ValueTracking.cpp
1136

Ignore this comment, the variable naming is a bit confusing here.

goldstein.w.n marked 5 inline comments as done.

Rebase + update comments + allones > apint

NB; Applied all relevant changes here to D142427

llvm/lib/Analysis/ValueTracking.cpp
1096

break here? Don't think we want to fall through.

Fallthrough here has no issue. While I think the blsi case is mutually exclusive with the other transform, I generally think breaking early in this type of code (where you have back-to-back cases) can make things unclear as the assumption is generally we will try them all.

1134

m_APInt(C) -> m_AllOnes().

1134

m_APInt(C) -> m_AllOnes().

Sure, had changed to m_APInt(C) for the todo to stress that C could be any constant but guess no real reason for it.

1136

Ignore this comment, the variable naming is a bit confusing here.

nikic accepted this revision.Jan 29 2023, 2:50 AM

LGTM

llvm/lib/Analysis/ValueTracking.cpp
1096

At least with this current code structure, falling through is pretty confusing. Known was previously the known bits of the second operand (that is either X or -X) and now it will be the known bits of X & -X. I guess this is fine in practice because & is idempotent, so we end up computing something like X & X & -X.

The new code structure in D142427 avoids this problem, but for this patch I would not do the fall through.

This revision is now accepted and ready to land.Jan 29 2023, 2:50 AM
foad added inline comments.Jan 29 2023, 4:08 AM
llvm/lib/Analysis/ValueTracking.cpp
1093

A less "if"fy way to implement this would be:

Known = Known.blsi();
Known2 = Known2.blsi();
// should probably have a helper for the following two lines, similar to KnownBits::commonBits
Known.Zero |= Known2.Zero;
Known.One |= Known2.One;

(But I'm still sceptical that it gives any practical benefit.)

Add Break

llvm/lib/Analysis/ValueTracking.cpp
1093

A less "if"fy way to implement this would be:

Known = Known.blsi();
Known2 = Known2.blsi();
// should probably have a helper for the following two lines, similar to KnownBits::commonBits
Known.Zero |= Known2.Zero;
Known.One |= Known2.One;

I'm not sure this is cleaner / clearer / faster.

(But I'm still sceptical that it gives any practical benefit.)

Agreeds its not a big deal, but it can change the result and we don't need any extra expensive calls to handle it so seems like "why not" (would agree with you if we would need an extra computeKnownBits call or something). I added two new tests: blsi_differing_lowbits and blsi_differing_lowbits2 which only work if we pick the minimum.

1096

At least with this current code structure, falling through is pretty confusing. Known was previously the known bits of the second operand (that is either X or -X) and now it will be the known bits of X & -X. I guess this is fine in practice because & is idempotent, so we end up computing something like X & X & -X.

The new code structure in D142427 avoids this problem, but for this patch I would not do the fall through.

Okay added the break, guess its only temporary anyways and doesn't change any of the tests.

nikic added a comment.Feb 18 2023, 6:26 AM

Any reason not to commit the first three patches in this stack yet?

goldstein.w.n added a comment.EditedFeb 18 2023, 9:42 AM

Any reason not to commit the first three patches in this stack yet?

I generally like to keep series together, but not strong reason.

Are the following 4 problematic?

But not issue pushing first 3 if youd prefer

This revision was landed with ongoing or failed builds.Feb 18 2023, 11:31 AM
This revision was automatically updated to reflect the committed changes.
goldstein.w.n added a comment.EditedFeb 18 2023, 11:32 AM

Any reason not to commit the first three patches in this stack yet?

Pushed first three, but the following four not be forgotten? They have been sitting around for a while (likewise will push for two of the series starting at D142827, but the last 4 have also been sitting around for a while. Can they be looked at?)