This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Analysis/ValueTracking.cpp
701	I believe the placement of this call was intentional, because it is more expansive than the icmp operand checks. That said, I don't see a compile-time impact on CTMark, though that is not exactly an assume-heavy workload. We did address various inefficiencies in isValidAssumeForContext() over time though, so I would be fine with trying this change. It should be split out into a separate NFC patch though.
948	It looks like we are missing this canonicalization: https://alive2.llvm.org/ce/z/MtveLU With that done, this becomes `v & b == 0` and is covered by existing handling.
952	Move this computeKnownBits() call into the isKnownToBeAPowerOfTwo() branch.
959	This should just check whether A is zero, no need to compute known bits.

goldstein.w.n added inline comments. Jan 4 2023, 9:24 AM

llvm/lib/Analysis/ValueTracking.cpp
948	> It looks like we are missing this canonicalization: https://alive2.llvm.org/ce/z/MtveLU With that done, this becomes `v & b == 0` and is covered by existing handling.
	So would a better approach be to handle the `v & b != a` canonicalization in `InstCombine` and drop this case, or leave this as is?

nikic added a reviewer: spatel. Jan 4 2023, 9:35 AM

nikic added inline comments.

llvm/lib/Analysis/ValueTracking.cpp
948	Handling this in InstCombine would be preferred, it reduces the number of patterns other passes see.

nikic mentioned this in D140849: [InstCombine] Add tests for binops with conditions/assume constraints. Jan 4 2023, 9:51 AM

goldstein.w.n added inline comments. Jan 4 2023, 10:10 AM

llvm/lib/Analysis/ValueTracking.cpp
948	> Handling this in InstCombine would be preferred, it reduces the number of patterns other passes see.
	Do you know where I should look in `InstCombineCompares` for where to put this?
959	> This should just check whether A is zero, no need to compute known bits.
	How can I check if a value is known zero without `computeKnownBits`? I see `isKnownNonZero` but not the inverse.

goldstein.w.n marked 2 inline comments as not done. Jan 4 2023, 10:12 AM

Remove case from computeKnownBits ICMP_NE and canonicalize instead

goldstein.w.n marked 4 inline comments as done. Jan 4 2023, 4:13 PM

goldstein.w.n added inline comments.

llvm/lib/Analysis/ValueTracking.cpp
701	> I believe the placement of this call was intentional, because it is more expansive than the icmp operand checks. That said, I don't see a compile-time impact on CTMark, though that is not exactly an assume-heavy workload. We did address various inefficiencies in isValidAssumeForContext() over time though, so I would be fine with trying this change. It should be split out into a separate NFC patch though.
	Fair enough. Once this is all through I'll make a patch and we can discuss it there.

Harbormaster completed remote builds in B205796: Diff 486423. Jan 4 2023, 5:15 PM

spatel added inline comments. Jan 5 2023, 9:03 AM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

4832–4834 ↗

(On Diff #486423)

This should be split into an independent patch with several tests.

Double-check to make sure this is the same logic, but I think it'd be shorter and easier to read without the loop:

// If Op1 is a power-of-2 (has exactly one bit set):
// (Op1 & X) == Op1 --> (Op1 & X) != 0
// (Op1 & X) != Op1 --> (Op1 & X) == 0
if (match(Op0, m_c_And(m_Specific(Op1), m_Value())) &&
    isKnownToBeAPowerOfTwo(Op1, false, 0, &I)) {
  return new ICmpInst(CmpInst::getInversePredicate(Pred), Op0,
                      ConstantInt::getNullValue(Op0->getType()));
}
// If Op0 is a power-of-2 (has exactly one bit set):
// Op0 == (Op0 & X) --> (Op0 & X) != 0
// Op0 != (Op0 & X) --> (Op0 & X) == 0
if (match(Op1, m_c_And(m_Specific(Op0), m_Value())) &&
    isKnownToBeAPowerOfTwo(Op0, false, 0, &I)) {
  return new ICmpInst(CmpInst::getInversePredicate(Pred), Op1,
                      ConstantInt::getNullValue(Op0->getType()));
}
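
The fold suggested above depends on the mask having exactly one bit set. As a rough sanity check outside of LLVM, the following standalone sketch brute-forces all 8-bit values and confirms that `(x & p) == p` and `(x & p) != 0` agree when `p` is a power of two, and disagree for some `x` otherwise. The helper names (`isPowerOfTwo`, `foldHoldsForMask`, `foldHoldsForAllPowersOfTwo`) are hypothetical and not part of LLVM; this is only an illustration of the arithmetic, not the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p has exactly one bit set.
inline bool isPowerOfTwo(uint8_t p) { return p != 0 && (p & (p - 1)) == 0; }

// Checks, for every 8-bit x, that ((x & p) == p) agrees with ((x & p) != 0).
// This is the equivalence the proposed fold relies on.
inline bool foldHoldsForMask(uint8_t p) {
  for (unsigned x = 0; x < 256; ++x) {
    bool original = (x & p) == p; // comparison against the mask itself
    bool folded = (x & p) != 0;   // comparison against zero
    if (original != folded)
      return false;
  }
  return true;
}

// Checks the equivalence for every 8-bit power-of-two mask.
inline bool foldHoldsForAllPowersOfTwo() {
  for (unsigned p = 1; p < 256; p <<= 1) {
    assert(isPowerOfTwo(static_cast<uint8_t>(p)));
    if (!foldHoldsForMask(static_cast<uint8_t>(p)))
      return false;
  }
  return true;
}
```

With a mask such as 3, `x = 1` already breaks the equivalence (`(1 & 3) == 3` is false while `(1 & 3) != 0` is true), which is why the suggested code guards the fold with `isKnownToBeAPowerOfTwo`.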

goldstein.w.n marked an inline comment as done. Jan 5 2023, 9:19 AM

goldstein.w.n added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

4832–4834 ↗

(On Diff #486423)

> This should be split into an independent patch with several tests.

Sure. If I split it off, will I be able to just revise the parent/child status of the surrounding reviews to link it in, or will I need to make 5 new ones?

> Double-check to make sure this is the same logic, but I think it'd be shorter and easier to read without the loop:
> [...]

Will take a look for next version.

spatel added inline comments. Jan 5 2023, 9:30 AM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
4832–4834 ↗	(On Diff #486423)	It's independent of anything else, so there's no need to update the other patches up for review? Just pull out this hunk and regenerate test diffs if needed. I just added some tests here: ad14cef1d530

nikic added inline comments. Jan 5 2023, 10:50 AM

llvm/lib/Analysis/ValueTracking.cpp
959	`match(A, m_Zero())` would be the usual way.

goldstein.w.n added inline comments. Jan 5 2023, 10:52 AM

llvm/lib/Analysis/ValueTracking.cpp
959	> `match(A, m_Zero())` would be the usual way.
	Won't that only match `Constant` types? That could miss a case, no?

goldstein.w.n removed a child revision: D140852: [Patch 4/4]: Use cannoical patterns `(A > C1 && A < C2)` and `(A & B != C)` in `isKnownNonZero`. Jan 5 2023, 1:23 PM

goldstein.w.n added a child revision: D140852: [Patch 4/4]: Use cannoical patterns `(A > C1 && A < C2)` and `(A & B != C)` in `isKnownNonZero`.

Propagating changes from rebase

goldstein.w.n marked 2 inline comments as done. Jan 5 2023, 2:33 PM

goldstein.w.n added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

4832–4834 ↗

(On Diff #486423)

> This should be split into an independent patch with several tests.
>
> Double-check to make sure this is the same logic, but I think it'd be shorter and easier to read without the loop:
> [...]

Done (and used your code, it's much less verbose). Added you as a reviewer; see: https://reviews.llvm.org/D141090

Harbormaster completed remote builds in B205991: Diff 486680. Jan 5 2023, 3:51 PM

nikic added inline comments. Jan 6 2023, 5:19 AM

llvm/lib/Analysis/AssumptionCache.cpp
131–132	nit: Drop newline
135	Could add `match(B, m_Zero())` here and drop the case below, no longer relevant.
llvm/lib/Analysis/ValueTracking.cpp
944	and drop comments below, merge conditions, etc. We're only handling the one case now.
959	If all bits are known zero, uses should be replaced by a zero value. There is no need to handle the non-constant case here. In fact, you should also drop the `isKnownToBeAPowerOfTwo` and `computeKnownBits` calls on B, because we don't just need a power of two, we need to know which power of two it is: it has to be constant. We can just use `match(B, m_Power2(Pow2))` instead.

goldstein.w.n marked 6 inline comments as done. Jan 6 2023, 11:19 AM

Remove unnecessary code because of earlier canonicalizations / only using constants

Harbormaster completed remote builds in B206174: Diff 486953. Jan 6 2023, 12:42 PM

The implementation looks basically fine. I think it would be easiest to land this if it's detached from the patch stack, with some dedicated tests -- which would probably be pretty much just these?

declare void @llvm.assume(i1)

define i32 @pow2(i32 %x) {
  %and = and i32 %x, 4
  %cmp = icmp ne i32 %and, 0
  call void @llvm.assume(i1 %cmp)
  %and2 = and i32 %x, 4
  ret i32 %and2
}

define i32 @not_pow2(i32 %x) {
  %and = and i32 %x, 3
  %cmp = icmp ne i32 %and, 0
  call void @llvm.assume(i1 %cmp)
  %and2 = and i32 %x, 3
  ret i32 %and2
}

llvm/lib/Analysis/ValueTracking.cpp
943	Huh, I would have expected this to require `{}` due to the variable declaration.
946	The `c_`s here can be dropped due to canonicalization (constant on the right).
947	`!BPow2->isZero()` is not necessary, `m_Power2` only matches power of twos (unlike `m_Power2OrZero`).
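
To make the difference between the `@pow2` and `@not_pow2` tests concrete, here is a minimal standalone sketch of the known-bits reasoning. `KnownBits32` is a hypothetical stand-in for `llvm::KnownBits`, and `applyAssumeAndNonZero`, `foldAnd`, and `orIsNoOp` are illustrative names that do not exist in LLVM. The sketch assumes a 32-bit value and a constant mask, and only mirrors the idea of `Known.One |= Pow2`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits on 32-bit values.
struct KnownBits32 {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1
};

inline bool isPowerOfTwo(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Records what assume((x & mask) != 0) tells us about x.
// Only a single-bit mask pins down a specific bit of x; with several
// bits in the mask we know one of them is set, but not which one.
inline void applyAssumeAndNonZero(KnownBits32 &known, uint32_t mask) {
  if (isPowerOfTwo(mask))
    known.One |= mask;
}

// Tries to fold (x & mask) to a constant. Succeeds only if every bit
// selected by the mask is known (either zero or one).
inline bool foldAnd(const KnownBits32 &known, uint32_t mask, uint32_t &result) {
  if ((mask & ~(known.Zero | known.One)) != 0)
    return false;
  result = mask & known.One;
  return true;
}

// (x | mask) is just x if every bit of the mask is already known one.
inline bool orIsNoOp(const KnownBits32 &known, uint32_t mask) {
  return (mask & ~known.One) == 0;
}
```

Under these assumptions, an assume on mask 4 lets `x & 4` fold to 4 and makes `x | 4` a no-op, while an assume on mask 3 leaves the known bits unchanged, so `x & 3` cannot be folded.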

Herald added a subscriber: StephenFan. Jan 10 2023, 8:22 AM

Separate from series and add independent tests

goldstein.w.n removed a parent revision: D140850: [InstCombine] Add optimizations for icmp eq/ne (mul(X, Y), 0). Jan 10 2023, 10:59 AM

goldstein.w.n removed a child revision: D140852: [Patch 4/4]: Use cannoical patterns `(A > C1 && A < C2)` and `(A & B != C)` in `isKnownNonZero`.

goldstein.w.n added a parent revision: D141412: [InstCombine]: Add tests for icmp ne non-zero power of 2; NFC.

Harbormaster completed remote builds in B206856: Diff 487889. Jan 10 2023, 10:59 AM

goldstein.w.n retitled this revision from [Patch 3/4]: Add cases for assume (X & Y != {0|Y}) to [InstCombine]: Add cases for assume (X & Y != {0|Y}). Jan 10 2023, 10:59 AM

goldstein.w.n marked 3 inline comments as done.

goldstein.w.n added inline comments.

llvm/lib/Analysis/ValueTracking.cpp
943	Added braces.

In D140851#4040038, @nikic wrote:

> The implementation looks basically fine. I think it would be easiest to land this if it's detached from the patch stack, with some dedicated tests -- which would probably be pretty much just these?

Done, test patch is: https://reviews.llvm.org/D141412

> declare void @llvm.assume(i1)
> [...]

Added those tests along with some other variations,
some of which will be "fixed" in D140852 (the _br cases).

Also added some tests for when the power of 2 is non-constant but
we still "should" be able to prove we don't need the operation (will
look into that more later).

My plan is:

1. Get this in.
2. Split D140852 from the series and submit that (it will clean up some cases here).
3. Split D140850 and D140849 into independent patches (just the mul case).
4. Non-constant cases.

Does that sound reasonable?

Also, if you have a moment, about an unrelated thing: I'm working on a patch
for SimplifyLibCalls to see if there is a previous dominating slen = strlen(s) call
that will allow the transformation str*(s, ...) -> mem*(s, ..., slen) even if the string is non-constant.
To do this, it needs to check that between the strlen call and the lib-call the memory
*s can't have been clobbered. I think the "right" way to do this is memoryssa but
it seems ridiculous (and overly consuming of compile time) to add memoryssa to all
of instcombine just for this one relatively unimportant pass. I saw we have AAResults
already computed so started with that, but it only works on a single basic-block (and since
we don't have a postdom tree there doesn't seem to be any cheap way to collect all possible
affected basic-blocks between the strlen call and the lib-call). My thought for how to handle
this was either:

1. Bite the bullet and use memoryssa throughout instcombine.
2. Only handle strlen calls in the same basic-block as the lib-call and use AAResults.
3. Add a file to "canonicalize" lib calls before EarlyCSE by making ALL str*(s,...) functions mem*(s, ..., strlen(s)), let EarlyCSE remove all duplicate strlen calls, then make the SimplifyLibCall in instcombine replace mem*(s, ..., strlen(s)) with the str* equivalent.

Do you have any advice about which of these (if any) makes the most sense? And if none,
how might you approach it?

Propagate test changes

Harbormaster completed remote builds in B206924: Diff 487978. Jan 10 2023, 7:00 PM

nikic mentioned this in rG8c9f13bd4c0f: [InstCombine] Add tests for icmp ne non-zero power of 2; NFC. Jan 11 2023, 1:02 AM

LGTM

In D140851#4040862, @goldstein.w.n wrote:

> Also, if you have a moment about an unrelated thing. I'm working on a patch
> for SimplifyLibCalls to see if there is a previous dominating slen = strlen(s) call
> that will allow transformation for str*(s, ...) -> mem*(s, ..., slen) even if the string is non-constant.
> [...]

> Bite the bullet and use memoryssa throughout instcombine.

Extremely unlikely. Both because of compile-time impact, and because this would add implementation complexity to all other transforms, that now need to keep MSSA up to date.

> Only handle strlen calls in the same basic-block as the lib-call and use AAResults

This is possible -- probably easiest if it covers your motivating cases.

> Add a file to "canonicalize" lib calls before EarlyCSE by making ALL str*(s,...) functions mem*(s, ..., strlen(s)), let EarlyCSE remove all duplicate strlen calls, then make the SimplifyLibCall in instcombine replace mem*(s, ..., strlen(s)) with the str* equivalent.

Handling this in EarlyCSE/GVN does seem most principled. I'm not sure it's necessary to actually perform the conversion from str to mem+strlen in advance -- shouldn't we be able to look up whether we have an available strlen call, and perform the transform from str to mem if we do? Of course, it would be a bit of an awkward special case...
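
As an illustration of the str* to mem* idea discussed above, the sketch below shows one such rewrite at the source level: `strchr(s, c)` expressed as `memchr` over the string plus its terminator, reusing a length that was already computed. The helper name `strchrWithKnownLen` is hypothetical, and the `slen + 1` bound (so that searching for `'\0'` still works) is an assumption of this example rather than something stated in the thread. The rewrite is only valid if `slen == strlen(s)` and the memory at `s` has not been modified since the length was computed, which is exactly the clobber check the thread is about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: strchr(s, c) rewritten as a memchr over the string
// and its terminating NUL, reusing an already computed length.
// Precondition: slen == strlen(s), and *s is unchanged since slen was computed.
inline const char *strchrWithKnownLen(const char *s, int c, std::size_t slen) {
  // The + 1 includes the terminator, because strchr(s, '\0') must find it.
  return static_cast<const char *>(std::memchr(s, c, slen + 1));
}
```

For example, with `s = "hello"` and `slen = std::strlen(s)`, the helper returns the same pointer as `std::strchr` for a character that is present, for one that is absent (null in both cases), and for the terminator.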

This revision is now accepted and ready to land. Jan 11 2023, 1:17 AM

This revision was landed with ongoing or failed builds. Jan 11 2023, 1:21 AM

Closed by commit rGcc845e9de8c8: [InstCombine] Handle assume(X & Pow2 != 0) in computeKnownBits() (authored by goldstein.w.n, committed by nikic).

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rGcc845e9de8c8: [InstCombine] Handle assume(X & Pow2 != 0) in computeKnownBits().

In D140851#4042727, @nikic wrote:

> LGTM
> [...]
> Handling this in EarlyCSE/GVN does seem most principled. I'm not sure it's necessary to actually perform the conversion from str to mem+strlen in advance -- shouldn't we be able to look up whether we have an available strlen call, and perform the transform from str to mem if we do? Of course, it would be a bit of an awkward special case...

Thank you for the advice :)

The canonicalization wouldn't be necessary; it would just be a simple way to avoid having to check the strlen case for each function (although
then again we would need to be able to reverse it later, which may have the same level of complexity). Just searching for a dominating + valid
strlen call and using that isn't too much of a special case as is; most of the functions already check for a constant-time strlen, so it's really just
a matter of adding an extra branch in existing infrastructure.

Think I'll give EarlyCSE/GVN a shot (I was leaning towards EarlyCSE); if it doesn't work out I'll just use the AAResults method.

Revision Contents

Path                                                Size
llvm/lib/Analysis/AssumptionCache.cpp               25 lines
llvm/lib/Analysis/ValueTracking.cpp                 8 lines
llvm/test/Transforms/InstCombine/icmp-ne-pow2.ll    9 lines

Diff 488113

llvm/lib/Analysis/AssumptionCache.cpp

[99 lines not shown; context: if (Pred == ICmpInst::ICMP_EQ) {]
            V = A;
          }

          Value *B;
          // (A & B) or (A | B) or (A ^ B).
          if (match(V, m_BitwiseLogic(m_Value(A), m_Value(B)))) {
            AddAffected(A);
            AddAffected(B);
          // (A << C) or (A >>_s C) or (A >>_u C) where C is some constant.
          } else if (match(V, m_Shift(m_Value(A), m_ConstantInt()))) {
            AddAffected(A);
          }
        };

        AddAffectedFromEq(A);
        AddAffectedFromEq(B);
+     } else if (Pred == ICmpInst::ICMP_NE) {
+       Value *X, *Y;
+       // Handle (a & b != 0). If a/b is a power of 2 we can use this
+       // information.
+       if (match(A, m_And(m_Value(X), m_Value(Y))) && match(B, m_Zero())) {
+         AddAffected(X);
+         AddAffected(Y);
        }
+     } else if (Pred == ICmpInst::ICMP_ULT) {
        Value *X;
        // Handle (A + C1) u< C2, which is the canonical form of A > C3 && A < C4,
        // and recognized by LVI at least.
-       if (Pred == ICmpInst::ICMP_ULT &&
-           match(A, m_Add(m_Value(X), m_ConstantInt())) &&
+       if (match(A, m_Add(m_Value(X), m_ConstantInt())) &&
            match(B, m_ConstantInt()))
          AddAffected(X);
      }
+   }

    if (TTI) {
      const Value *Ptr;
      unsigned AS;
      std::tie(Ptr, AS) = TTI->getPredicatedAddrSpace(Cond);
      if (Ptr)
        AddAffected(const_cast<Value *>(Ptr->stripInBoundsOffsets()));
    }
  }

  void AssumptionCache::updateAffectedValues(AssumeInst *CI) {
[207 lines not shown]

llvm/lib/Analysis/ValueTracking.cpp

[692 lines not shown; context: for (auto &AssumeVH : Q.AC->assumptionsFor(V)) {]
      if (Depth == MaxAnalysisRecursionDepth)
        continue;

      ICmpInst *Cmp = dyn_cast<ICmpInst>(Arg);
      if (!Cmp)
        continue;

      // We are attempting to compute known bits for the operands of an assume.
      // Do not try to use other assumptions for those recursive calls because
      // that can lead to mutual recursion and a compile-time explosion.
      // An example of the mutual recursion: computeKnownBits can call
      // isKnownNonZero which calls computeKnownBitsFromAssume (this function)
      // and so on.
      Query QueryNoAC = Q;
      QueryNoAC.AC = nullptr;

      // Note that ptrtoint may change the bitwidth.
[225 lines not shown; context: case ICmpInst::ICMP_ULT:]
          // Whatever high bits in c are zero are known to be zero (if c is a power
          // of 2, then one more).
          if (isKnownToBeAPowerOfTwo(A, false, Depth + 1, QueryNoAC))
            Known.Zero.setHighBits(RHSKnown.countMinLeadingZeros() + 1);
          else
            Known.Zero.setHighBits(RHSKnown.countMinLeadingZeros());
        }
        break;
+     case ICmpInst::ICMP_NE: {
+       // assume (v & b != 0) where b is a power of 2
+       const APInt *BPow2;
+       if (match(Cmp, m_ICmp(Pred, m_c_And(m_V, m_Power2(BPow2)), m_Zero())) &&
+           isValidAssumeForContext(I, Q.CxtI, Q.DT)) {
+         Known.One |= BPow2->zextOrTrunc(BitWidth);
+       }
+     } break;
      }

    // If assumptions conflict with each other or previous known bits, then we
    // have a logical fallacy. It's possible that the assumption is not reachable,
    // so this isn't a real bug. On the other hand, the program may have undefined
    // behavior, or we might have a bug in the compiler. We can't assert/crash, so
    // clear out the known bits, try to warn the user, and hope for the best.
    if (Known.Zero.intersects(Known.One)) {
      Known.resetAll();

      if (Q.ORE)
        Q.ORE->emit([&]() {
          auto *CxtI = const_cast<Instruction *>(Q.CxtI);
          return OptimizationRemarkAnalysis("value-tracking", "BadAssumption",
                                            CxtI)
                 << "Detected conflicting code assumptions. Program may "
[6,565 lines not shown]

Inline suggestion (nikic, done), on the ICMP_NE case:

  case ICmpInst::ICMP_NE:
-   // assume (v & b != a)
+   // assume (v & b != 0) where b is a power of two
    if (match(Cmp, m_c_ICmp(Pred, m_c_And(m_V, m_Value(B)), m_Value(A))) &&

and drop comments below, merge conditions, etc. We're only handling the one case now.

llvm/test/Transforms/InstCombine/icmp-ne-pow2.ll

  ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
  ; RUN: opt -passes=instcombine -S < %s | FileCheck %s

  declare void @llvm.assume(i1)
  declare i32 @llvm.ctpop.i32(i32)

  define i32 @pow2_32_assume(i32 %x) {
  ; CHECK-LABEL: @pow2_32_assume(
  ; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], 4
  ; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[AND]], 0
  ; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
- ; CHECK-NEXT: [[AND2:%.*]] = and i32 [[X]], 4
- ; CHECK-NEXT: ret i32 [[AND2]]
+ ; CHECK-NEXT: ret i32 4
  ;
    %and = and i32 %x, 4
    %cmp = icmp ne i32 %and, 0
    call void @llvm.assume(i1 %cmp)
    %and2 = and i32 %x, 4
    ret i32 %and2
  }

[12 lines not shown]
    ret i32 %and2
  }

  define i64 @pow2_64_assume(i64 %x) {
  ; CHECK-LABEL: @pow2_64_assume(
  ; CHECK-NEXT: [[AND:%.*]] = and i64 [[X:%.*]], 1
  ; CHECK-NEXT: [[CMP:%.*]] = icmp ne i64 [[AND]], 0
  ; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
- ; CHECK-NEXT: [[OR:%.*]] = or i64 [[X]], 1
- ; CHECK-NEXT: ret i64 [[OR]]
+ ; CHECK-NEXT: ret i64 [[X]]
  ;
    %and = and i64 %x, 1
    %cmp = icmp ne i64 %and, 0
    call void @llvm.assume(i1 %cmp)
    %or = or i64 %x, 1
    ret i64 %or
  }

[12 lines not shown]
    ret i64 %or
  }

  define i16 @pow2_16_assume(i16 %x) {
  ; CHECK-LABEL: @pow2_16_assume(
  ; CHECK-NEXT: [[AND:%.*]] = and i16 [[X:%.*]], 16384
  ; CHECK-NEXT: [[CMP:%.*]] = icmp ne i16 [[AND]], 0
  ; CHECK-NEXT: call void @llvm.assume(i1 [[CMP]])
- ; CHECK-NEXT: [[AND2:%.*]] = and i16 [[X]], 16384
- ; CHECK-NEXT: ret i16 [[AND2]]
+ ; CHECK-NEXT: ret i16 16384
  ;
    %and = and i16 %x, 16384
    %cmp = icmp eq i16 %and, 16384
    call void @llvm.assume(i1 %cmp)
    %and2 = and i16 %x, 16384
    ret i16 %and2
  }

[472 lines not shown]


[InstCombine]: Add cases for assume (X & Y != {0|Y}) (Closed, Public)
