This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Transforms/InstCombine/
-
Transforms/
-
InstCombine/
7/29
InstCombineCompares.cpp
-
test/Transforms/InstCombine/
-
Transforms/
-
InstCombine/
1/1
icmp.ll

Differential D18841

[InstCombine] Canonicalize icmp instructions based on dominating conditions.
ClosedPublic

Authored by bmakam on Apr 6 2016, 12:53 PM.

Download Raw Diff

Details

Reviewers

t.p.northover
reames
majnemer
gberry
sanjoy
llvm-commits
hfinkel
mcrosier

Commits

rGcf3bcb2625a7: [InstCombine] Canonicalize icmp instructions based on dominating conditions.
rL268521: [InstCombine] Canonicalize icmp instructions based on dominating conditions.

Summary

This patch canonicalizes conditions based on the constant range information
of the dominating branch condition.
For example:

  %cmp = icmp slt i64 %a, 0
  br i1 %cmp, label %land.lhs.true, label %lor.rhs
  lor.rhs:
    %cmp2 = icmp sgt i64 %a, 0

Would now be canonicalized into:

  %cmp = icmp slt i64 %a, 0
  br i1 %cmp, label %land.lhs.true, label %lor.rhs
  lor.rhs:
    %cmp2 = icmp ne i64 %a, 0

Diff Detail

Event Timeline

bmakam updated this revision to Diff 52839.Apr 6 2016, 12:53 PM

bmakam retitled this revision from to [InstCombine] Canonicalize icmp instructions based on dominating conditions..

bmakam updated this object.

bmakam added reviewers: mcrosier, gberry, t.p.northover, llvm-commits.

Herald added a subscriber: mcrosier. · View Herald TranscriptApr 6 2016, 12:53 PM

Is this the best pass for this change to be in? ISTM that something like jump threading would be more appropriate.

In D18841#393646, @majnemer wrote:

Is this the best pass for this change to be in? ISTM that something like jump threading would be more appropriate.

I see no reason to only solve this one way. This could be handled at instsimplify or early-cse or jumpthreading or instcombine or gvn. I felt instcombine to be better place because we have dominator tree info and there already exists a function isSignBitCheck that checks if the comparison only checks the sign bit.

Fixed a bug. We need to guarantee that the dominating branch condition holds at the compare instruction, so it is not enough to check the immediate dominator. We also need to ensure that the edge dominates the compare instruction.

Addressed few minor nits to reduce the amount of work done and also added another test for an interesting case found in spec2006/povray.
FWIW, this change has 8% speedup in spec2006/soplex on Kryo and some small wins in other CINT2006 benchmarks.

Ping?

Ping #2?
I plan to improve this patch to also handle PR23333, but wanted to get consensus on the current approach. I believe this can be also solved in JumpThreading too but it will have to depend on LazyValueInfo which is known to have compile-time implications. Please let me know if this approach is reasonable.

bmakam added reviewers: reames, sanjoy.Apr 14 2016, 4:00 AM

bmakam added a reviewer: hfinkel.Apr 14 2016, 4:16 AM

bmakam mentioned this in D18905: [ValueTracking] Improve isImpliedCondition for conditions with matching operands..Apr 15 2016, 10:40 AM

Hi,

I think this is an overtly specific solution. Ideally we'd do
something that would generalize to cases like

if (a >= 5)
  if (a < 6) {
    FOO;
  }

if (a == 5) {
  FOO;
}

as well.

IMO the right layering for all the predicate logic you have here is in
ConstantRange. Just like we have makeAllowedICmpRegion and
makeSatisfyingICmpRegion, we should have a makeExactICmpRegion
where the RHS is not a ConstantRange but an APInt and have it
return *exactly* the set of values that satisfy the predicate (i.e. a
value satisfies the predicate *iff* it is in that set). Then the
current algorithm could become:

For a dominating icmp instruction "Var Pred RHS",
    compute R_Dom = makeAllowedICmpRegion(Pred, getRange(RHS));
If the current icmp is of the form "Var Pred ConstRHS" then
    compute R_Self = makeExactICmpRegion(Pred, ConstRHS)
    if R_Self.intersect(R_Dom) is empty:
      replace current icmp with false
    if R_Self.intersect(R_Dom) has exactly one element E:
      replace current icmp with "Var == E"
    if R_Self.intersect(R_Dom).complement() has exact one element E:
      replace current icmp with "Var != E"

lib/Transforms/InstCombine/InstCombineCompares.cpp
3263	If you're going to restrict yourself to this case, I don't think you should depend on `DT`. Instead you can just check that the only predecessor of `Parent` is `TrueBB` or `FalseBB` (depending on `TrueIfSigned`) and be done with it, since the only value of `IDom` you'll succeed with here is `Parent->getSinglePredecessor()`.

Thanks for the review Sanjoy. Please let me know if this looks reasonable.

sanjoy requested changes to this revision.Apr 21 2016, 11:37 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3341	I think you also need to check that `TrueBB` and `FalseBB` are not equal.
3345	I think what you want is `makeAllowedICmpRegion(getInversePredicate(Pred), CI2)`. In this case (where `Other` is a singleton set) what you have is fine, but generally (i.e. if the RHS was not a constant integer, but a range) it is not. Actually, it might be clearer to introduce a new helper: `ConstantRange::makeExactICmpRegion` that takes an `APInt` as RHS, and returns a set that is equal to both the "allowed" and "satisfying" icmp range.
3347	This is confusingly named -- by "RHS", I'd normally understand `CI` or `CI2`; but what you mean here is the dominated LHS. How about calling it just that: `DominatedLHSRange`?
3352	Why do you care about `isSignBit`?
test/Transforms/InstCombine/icmp.ll
2015	Can you add a dominated check with, say, `10` or something just to demonstrate that we're general here?

This revision now requires changes to proceed.Apr 21 2016, 11:37 PM

Addressed Sanjoy's comments and added more tests.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3341	I think we won't match this pattern if TrueBB == FalseBB because instcombine will already have transformed such IR into br i1 undef, label %dest, label %dest. I will add an assert to make sure they aren't equal.
3345	I am not sure if I follow this correctly, could you please give an example?. If the new helper takes an APInt as RHS isn't it a singleton set in this case? Taking a general example if Pred is ult and Other is i8 [2, 5) makeAllowedICmpRegion would return [0, 4) and make SatisfyingICmpRegion would return [0, 2). Isn't a set that is equal to both same as the "satisfying" icmp range i.e [0, 2) in this example?
3347	How about using CR and DominatingCR corresponding to CI and CI2?
3352	If it is a SignBit, canonicalizing could transform for example an icmp slt 0 into icmp eq/ne 0 which we want to avoid in situations where the icmp is only used as input to a branch because it will pessimize the codegen by generating a compare and branch on zero instruction (cbz) instead of a test and branch(tbz). tbz has better branch displacement than a cbz. If the icmp is an input to anything other than branches such as a select we would be better of canonicalizing it to eq/ne. Added a helper function isZBranchCheck to restrict it further for this case and added testcases.

sanjoy requested changes to this revision.Apr 22 2016, 8:13 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
117	I'd use an architecture neutral name here, if possible.
3341	My bias is that it is better to have these transforms be correct in isolation, and not rely on ordering across a large pass like -instcombine. But if you're sure that the "branch on undef" optimization is guaranteed to kick in before this optimization (i.e. that borders around "obvious"), then feel free to add an assert instead of checking and bailing out.
3345	Yeah, I wasn't as clear as I should have been. The "problem" here is that the logic is structured in a way that will not generalize to RHS being a non-singleton range. For instance, if we have if (lhs u< "[2, 5)") { ... } else { // This is what you're trying to optimize if (lhs u< 3) print("hello"); } then currently you will infer the range of `lhs` on the else branch to be `[0, 4).inverse()` == `[4, 0)`, and fold `lhs u< 3` to `false`. But `lhs u< 3` can be true also, if the RHS (the value we know is in `[2, 5)`) was `2` and `lhs` was `2`. In other words, in the above IR you know that in the "then" case `lhs` cannot be unsigned-greater than 4, but that does not mean that in the else case `lhs` has to be unsigned greater than 4. Now I think this won't be an observable bug if RHS is always a singleton set, but I'd like to have some measures in place that make it clear what's going on. There are two ways we can do that: For a singleton RHS, I think (but please verify) `makeAllowedICmpRegion` and `makeSatisfyingICmpRegion` will return the same exact set, which contains _exactly_ the value that the predicate is true for. If this assumption is true, my original suggestion was to add a `makeExactICmpRegion` method as: makeExactICmpRegion(...) { assert(makeAllowedICmpRegion(...) == makeSatisfyingICmpRegion(...)); return makeAllowedICmpRegion(...); } and to use that here instead of `makeAllowedICmpRegion`. It would take an `APInt` as RHS, and if someone comes along and tries to generalize this code to work on non-singleton RHSs, they'll immediately know that it isn't a 100% straightforward. The second possibility is to generalize this logic to work with a non-singleton RHS range. To do that, we can start by allowing the RHS to be a load instruction with `!range` metadata, and eventually move over to even more sophisticated analyses.
3347	SGTM

This revision now requires changes to proceed.Apr 22 2016, 8:13 PM

Addressed Sanjoy's comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
117	Does 'isBranchOnSignBitCheck' sound good?
3341	Since instcombine visits the instruction in topological order, we will always visit the dominating branch before visting the icmp instruction and so it is guranteed that the branch condition is undef when the condition is irrelevant.
3345	For a singleton RHS, I think (but please verify) makeAllowedICmpRegion and makeSatisfyingICmpRegion will return the same exact set, which contains _exactly_ the value that the predicate is true for. I think this assumption is not always true, see my previous example. then currently you will infer the range of lhs on the else branch to be [0, 4).inverse() == [4, 0) I see your point here, I have already modified this to 'makeAllowedICmpRegion(getInversePredicate(Pred), CI2)' based on your earlier suggestion. The second possibility is to generalize this logic to work with a non-singleton RHS range. I have no idea if non-singleton RHS ranges occur enough to be worthwhile to generalize. Do you mind if I defer this suggestion?

correctly update the use of isBranchOnSignBitCheck()

Hi Balaram,

I've replied to your inline comment, but it is possible that we're just talking past each other. I'm usually on #llvm as sanjoyd on PDT daytime, we can chat there in realtime if you think that'll help.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3342	Add a comment here on why that is true.
3345	I think this assumption is not always true, see my previous example. Are you talking about the "Taking a general example if Pred is ult and Other is i8 [2, 5) makeAllowedICmpRegion would return [0, 4) and make SatisfyingICmpRegion would return [0, 2). Isn't a set that is equal to both same as the "satisfying" icmp range i.e [0, 2) in this example?" bit? In that case isn't the RHS (i.e. `Other`) `[2, 5)`, specifically not a singleton set? If you're talking about another example, then I must have missed it. I have already modified this to 'makeAllowedICmpRegion(getInversePredicate(Pred), CI2)' based on your earlier suggestion. That's not the only place where the current code will not generalize to a non-singleton RHS. For instance, the `Intersection.isSingleElement()` check won't generalize either -- you cannot transform if (a u< "[2, 5)") { // AllowedRegion = [0, 4) if (a >=u "[3, 5)") { // AllowedRegion = [3, 0) print(); } else abort(); } to if (a u< "[2, 5)") if (a == 3) print(); else abort(); since for `"[2, 5)"` == `4` and `"[3, 5)"` == `4` and `a` == `3`, the first program would have called `abort()` but the new program will call `print()`. On the other hand, for the `Intersection.isEmptySet()` you do need to call `AllowedICmpRegion` on both the conditions. There is a general pattern here -- `makeAllowedICmpRegion` only lets you "rule out" certain values (e.g. if `a >=u "[3, 5)"` then `a` definitely cannot be in {`0`, `1`, `2`}, but it can be anything else), but it does not guarantee that the comparison will fold one way or the other. IOW, it is the "only if" part of the constraint (the comparison succeeds only if the LHS in the returned set), but not the "if" part (just because LHS is in the returned set does not mean the comparison will succeed). I have no idea if non-singleton RHS ranges occur enough to be worthwhile to generalize. Do you mind if I defer this suggestion? Sure, but in that case we need to have the logic phrased in a way that makes it obvious that the current logic is not fully general (i.e. with a `makeExactICmpRegion`). I don't think non-singletom RHS ranges are too rare: if (a u< 100) if (b u< a) if (b >u 500) In the inner condition `b u< a`, you know the RHS is in `[0, 100)`, and you can use that information to fold the innermost condition to `false`. However, it is fine to not handle these cases as a starting point.

majnemer added inline comments.Apr 25 2016, 11:38 PM

lib/Transforms/InstCombine/InstCombineCompares.cpp
3341	This is only true for the very first iteration of InstCombine. InstCombine adds the users of all instructions that it modifies to a worklist, this worklist is not toplogically sorted.

In D18841#411704, @sanjoy wrote:

Hi Balaram,

I've replied to your inline comment, but it is possible that we're just talking past each other. I'm usually on #llvm as sanjoyd on PDT daytime, we can chat there in realtime if you think that'll help.

Thanks Sanjoy, sure I will be available on #llvm as bmakam. I think I have misinterpreted what you mean by non-singleton RHS. However, I think the current logic as is should be able to generalize to what you define as non-singleton RHS ranges.

since for "[2, 5)" == 4 and "[3, 5)" == 4 and a == 3, the first program would have called abort() but the new program will call print().

The condition here is u< [2,5) so the allowed range is [0,4) which means a can never be == 4 for the first/dominating if condition and the program will abort() as expected.

If it helps to clarify, I think of the if conditions allowed ranges as set A for the dominating if condition and set B for the second(Parent) if condition. If dominating if condition is true, then the second if condition is true iff A ∩ B is true. So if {A ∩ B} == {ε} then it is sufficient to canonicalize the second if condition to icmp eq ε. OTOH, the second if condition is false iff A\B (aka relative complement) is true. So if {A\B} == {ε} then it is sufficient to canonicalize the second if condition to icmp ne ε. Similarly, if the dominating if condition is false, then the second if condition is true iff ~A ∩ B is true and the second if condition is false iff ~A\B is true. For non-singleton RHS ranges, the satisfying range is ⊆ allowed range, so it is sufficient to check for allowed range.

In D18841#412078, @bmakam wrote:

In D18841#411704, @sanjoy wrote:

Hi Balaram,

I've replied to your inline comment, but it is possible that we're just talking past each other. I'm usually on #llvm as sanjoyd on PDT daytime, we can chat there in realtime if you think that'll help.

Thanks Sanjoy, sure I will be available on #llvm as bmakam. I think I have misinterpreted what you mean by non-singleton RHS. However, I think the current logic as is should be able to generalize to what you define as non-singleton RHS ranges.

since for "[2, 5)" == 4 and "[3, 5)" == 4 and a == 3, the first program would have called abort() but the new program will call print().

The condition here is u< [2,5) so the allowed range is [0,4) which
means a can never be == 4 for the first/dominating if condition and
the program will abort() as expected.

I didn't mean to say a was 4. By "[2, 5)" == 4 I meant the
RHS we know is in "[2, 5)" is 4 (similar for the other time I used
that notation). a itself is 3 in the above example.

[Edit: added this example]IOW, at runtime the ranged values resolve as

if (a u< 4) {
  // AllowedRegion = [0, 4)
  if (a >=u 4") {
    // AllowedRegion = [3, 0)
    print();
  } else
    abort();
}

If it helps to clarify, I think of the if conditions allowed ranges
as set A for the dominating if condition and set B for the
second(Parent) if condition. If dominating if condition is true,
then the second if condition is true iff A ∩ B is true.

As I said in my previous email, the second condition is true only
if A ∩ B is true, not iff A ∩ B is true (follows from the
fact that the second condition is true only if B is true).
I.e. the second condition implies A ∩ B but not the other way around.

makeAllowedICmpRegion gives you a set with the only if property.
makeSatisfyingICmpRegion gives you a set with the if property.
If they return equal sets, then you have iff on that set (or if
you take an intersection of the sets they return).

makeAllowedICmpRegion gives you a set with the only if property.
makeSatisfyingICmpRegion gives you a set with the if property.
If they return equal sets, then you have iff on that set (or if
you take an intersection of the sets they return).

Add a helper makeExactICmpRegion that will return the region if makeSatisfyingICmpRegion is exactly same as makeAllowedICmpRegion.

bmakam marked an inline comment as done.Apr 26 2016, 12:51 PM

bmakam added inline comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3355	if Difference is empty then the inner condition is always true.

bmakam added inline comments.Apr 26 2016, 12:56 PM

lib/Transforms/InstCombine/InstCombineCompares.cpp
3341	This is only true for the very first iteration of InstCombine. InstCombine adds the users of all instructions that it modifies to a worklist, this worklist is not toplogically sorted. Since it isn't very obvious, to be on the safer side I have changed the assert back into a check.

sanjoy requested changes to this revision.Apr 27 2016, 3:19 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

lib/IR/ConstantRange.cpp
132 ↗	(On Diff #55071)	Can you add a comment on why this assert is valid (i.e. that we're relying on knowledge of how the two `makeAllowedXXXRegion` s are implemented).
lib/Transforms/InstCombine/InstCombineCompares.cpp
117	We no longer repeat function names in comments.
118	This isn't an exploded icmp right?
3356	Add a comment on why you need `isBranchOnSignBitCheck` here.
3357	These are better written as: if (auto AI = Intersection.getSingleElement()) return new ICmpInst(..., Builder->getInt(AI)); etc.

This revision now requires changes to proceed.Apr 27 2016, 3:19 PM

Thanks Sanjoy, please let me know if the comments added make sense.

bmakam marked 3 inline comments as done.Apr 28 2016, 6:00 AM

bmakam added inline comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
118	Right, I have corrected the comment.

Generally speaking, InstCombine canonicalization helps out down stream analysis by simplifying the IR or exposes further opportunities for canonicalization. i'm not entirely sure that this transform fits that bill.

I wonder if a more appropriate place for this to live is in CodeGenPrepare seeing as how sensitive this transform is to the peculiarities of the target ISA, perhaps with a TargetLowering check which verifies that it would make sense to perform on such a target.

In D18841#415959, @majnemer wrote:

Generally speaking, InstCombine canonicalization helps out down stream analysis by simplifying the IR or exposes further opportunities for canonicalization. i'm not entirely sure that this transform fits that bill.

FWIW, the examples listed in the testcases demonstrate that it helps out by simplifying the IR. In my tests on spec2006 benchmarks it has reduced more than 3000 static instructions in xalancbmk and 100s of static instructions in gcc and several other benchmarks. This is similar to the transformations done in FoldAndOfICmps except that here we consider dominating condition as the LHS. I have constrained this only to the first predecessor to find the dominating icmp and could still remove several static instructions which should improve the code quality and compile time.

I wonder if a more appropriate place for this to live is in CodeGenPrepare seeing as how sensitive this transform is to the peculiarities of the target ISA, perhaps with a TargetLowering check which verifies that it would make sense to perform on such a target.

I am not totally against moving this to the backend and infact I started off by implementing it at the very end of the pipeline in D18572, but realized that it would catch more such cases and might help other downstream optimizations when we do this transform in instcombine. What do you think?

In D18841#416085, @bmakam wrote:

In D18841#415959, @majnemer wrote:

Generally speaking, InstCombine canonicalization helps out down stream analysis by simplifying the IR or exposes further opportunities for canonicalization. i'm not entirely sure that this transform fits that bill.

FWIW, the examples listed in the testcases demonstrate that it helps out by simplifying the IR. In my tests on spec2006 benchmarks it has reduced more than 3000 static instructions in xalancbmk and 100s of static instructions in gcc and several other benchmarks. This is similar to the transformations done in FoldAndOfICmps except that here we consider dominating condition as the LHS. I have constrained this only to the first predecessor to find the dominating icmp and could still remove several static instructions which should improve the code quality and compile time.

Does this result in instructions removed at the LLVM IR or does it result in fewer instructions emitted in the backend?

If it results in fewer IR instructions, can you determine why this is the case? If it results in the same number of IR instructions, then performing this transform might make more sense in CodeGenPrepare.

I wonder if a more appropriate place for this to live is in CodeGenPrepare seeing as how sensitive this transform is to the peculiarities of the target ISA, perhaps with a TargetLowering check which verifies that it would make sense to perform on such a target.

I am not totally against moving this to the backend and infact I started off by implementing it at the very end of the pipeline in D18572, but realized that it would catch more such cases and might help other downstream optimizations when we do this transform in instcombine. What do you think?

I would have done this optimization a little earlier at around the CodeGenPrepare level.

Does this result in instructions removed at the LLVM IR or does it result in fewer instructions emitted in the backend?

If it results in fewer IR instructions, can you determine why this is the case? If it results in the same number of IR instructions, then performing this transform might make more sense in CodeGenPrepare.

I generated the IR after LTO using -Wl,plugin-opt=emit-llvm for xalancbmk and with this patch the number of IR instructions were 1059942. Compared to the baseline without my patch the number of IR instructions were 1067179. So there was a net reduction of 7237 IR instructions which translated into 3531 fewer instructions emitted in the backend. Over 1000 add instructions, and branch instructions were removed. Most of the cases were similar to the tests I have added in this patch.

gentle ping.

This looks okay to me (pending the one inline comment), assuming @majnemer is fine with too.

Btw, looks like instcombine already canonicalizes towards equality, for instance:

define i1 @f(i32 %x) {
  %v = icmp ugt i32 %x, 0
  ret i1 %v
}

is instcombined to

define i1 @f(i32 %x) {
  %v = icmp ne i32 %x, 0
  ret i1 %v
}

and this change looks like it is doing more of the same.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3336	I think you can just use `dyn_cast` here. For a well formed CFG, `getTerminator` should always return `nonnull`.

This revision is now accepted and ready to land.May 2 2016, 2:19 PM

Thank you very much for the review Sanjoy.

@majnemer, Is it reasonable to commit this patch?

majnemer added inline comments.May 4 2016, 10:18 AM

lib/Transforms/InstCombine/InstCombineCompares.cpp
3355–3366	The prior two replacements seem like pure goodness: they make conditional branches branch on true or false. These two seem weird to have in InstCombine. Seeing as how you are concerned with pessimization, can we move these to CGP and have them operate on a TargetLowering hook?

majnemer accepted this revision.May 4 2016, 10:20 AM

majnemer added a reviewer: majnemer.

majnemer added inline comments.

lib/Transforms/InstCombine/InstCombineCompares.cpp
3355–3366	On second thought, this is a nice canonicalization regardless because it makes the comparisons easier to analyze. LGTM.

Thank you very much David! I will commit this patch shortly.

Closed by commit rL268521: [InstCombine] Canonicalize icmp instructions based on dominating conditions. (authored by bmakam). · Explain WhyMay 4 2016, 10:40 AM

This revision was automatically updated to reflect the committed changes.

bmakam mentioned this in D19550: [InstCombine] Determine the result of a select based on a dominating condition..May 4 2016, 2:28 PM

Revision Contents

Path

Size

lib/

Transforms/

InstCombine/

InstCombineCompares.cpp

67 lines

test/

Transforms/

InstCombine/

icmp.ll

74 lines

Diff 53296

lib/Transforms/InstCombine/InstCombineCompares.cpp

Show First 20 Lines • Show All 108 Lines • ▼ Show 20 Lines	if (VectorType *VTy = dyn_cast<VectorType>(In1->getType())) {
return false;		return false;
}		}

return HasSubOverflow(cast<ConstantInt>(Result),		return HasSubOverflow(cast<ConstantInt>(Result),
cast<ConstantInt>(In1), cast<ConstantInt>(In2),		cast<ConstantInt>(In1), cast<ConstantInt>(In2),
IsSigned);		IsSigned);
}		}

/// isSignBitCheck - Given an exploded icmp instruction, return true if the		/// isSignBitCheck - Given an exploded icmp instruction, return true if the
		sanjoyUnsubmitted Done Reply Inline Actions I'd use an architecture neutral name here, if possible. sanjoy: I'd use an architecture neutral name here, if possible.
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions Does 'isBranchOnSignBitCheck' sound good? bmakam: Does 'isBranchOnSignBitCheck' sound good?
		sanjoyUnsubmitted Done Reply Inline Actions We no longer repeat function names in comments. sanjoy: We no longer repeat function names in comments.
/// comparison only checks the sign bit. If it only checks the sign bit, set		/// comparison only checks the sign bit. If it only checks the sign bit, set
		sanjoyUnsubmitted Not Done Reply Inline Actions This isn't an exploded icmp right? sanjoy: This isn't an exploded icmp right?
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions Right, I have corrected the comment. bmakam: Right, I have corrected the comment.
/// TrueIfSigned if the result of the comparison is true when the input value is		/// TrueIfSigned if the result of the comparison is true when the input value is
/// signed.		/// signed.
static bool isSignBitCheck(ICmpInst::Predicate pred, ConstantInt *RHS,		static bool isSignBitCheck(ICmpInst::Predicate pred, ConstantInt *RHS,
bool &TrueIfSigned) {		bool &TrueIfSigned) {
switch (pred) {		switch (pred) {
case ICmpInst::ICMP_SLT: // True if LHS s< 0		case ICmpInst::ICMP_SLT: // True if LHS s< 0
TrueIfSigned = true;		TrueIfSigned = true;
return RHS->isZero();		return RHS->isZero();
▲ Show 20 Lines • Show All 3,110 Lines • ▼ Show 20 Lines	case ICmpInst::ICMP_UGE:
return new ICmpInst(ICmpInst::ICMP_UGT, Op0,		return new ICmpInst(ICmpInst::ICMP_UGT, Op0,
Builder->getInt(CI->getValue()-1));		Builder->getInt(CI->getValue()-1));
case ICmpInst::ICMP_SGE:		case ICmpInst::ICMP_SGE:
assert(!CI->isMinValue(true)); // A >=s MIN -> TRUE		assert(!CI->isMinValue(true)); // A >=s MIN -> TRUE
return new ICmpInst(ICmpInst::ICMP_SGT, Op0,		return new ICmpInst(ICmpInst::ICMP_SGT, Op0,
Builder->getInt(CI->getValue()-1));		Builder->getInt(CI->getValue()-1));
}		}


		// If the immediately dominating branch condition is a sign bit check,
		// turn it into the appropriate icmp eq/ne Op0 0 instruction.
		if (!I.isEquality()) {
		BasicBlock *Parent = I.getParent();
		DomTreeNode *Node = DT ? DT->getNode(Parent) : nullptr;
		DomTreeNode *IDomNode = Node ? Node->getIDom() : nullptr;
		BasicBlock *IDom = IDomNode ? IDomNode->getBlock() : nullptr;
		auto *BI =
		IDom ? dyn_cast_or_null<BranchInst>(IDom->getTerminator()) : nullptr;
		ICmpInst::Predicate Pred;
		BasicBlock TrueBB, FalseBB;
		ConstantInt *CI2;
		bool TrueIfSigned = false;
		if (BI &&
		match(BI, m_Br(m_ICmp(Pred, m_Specific(Op0), m_ConstantInt(CI2)),
		TrueBB, FalseBB)) &&
		isSignBitCheck(Pred, CI2, TrueIfSigned)) {
		if ((TrueIfSigned && Parent == FalseBB) \|\|
		sanjoyUnsubmitted Done Reply Inline Actions If you're going to restrict yourself to this case, I don't think you should depend on `DT`. Instead you can just check that the only predecessor of `Parent` is `TrueBB` or `FalseBB` (depending on `TrueIfSigned`) and be done with it, since the only value of `IDom` you'll succeed with here is `Parent->getSinglePredecessor()`. sanjoy: If you're going to restrict yourself to this case, I don't think you should depend on `DT`.
		(!TrueIfSigned && Parent == TrueBB)) {
		// Also ensure that the edge dominates the compare instruction
		// to guarantee that the dominating condition holds in 'Parent'.
		BasicBlockEdge Edge(IDom, Parent);
		if (Edge.isSingleEdge() && DT->dominates(Edge, Parent)) {
		if (CI->isZero()) {
		switch (I.getPredicate()) {
		default:
		break;
		case ICmpInst::ICMP_SLT:
		case ICmpInst::ICMP_ULT:
		return replaceInstUsesWith(I, Builder->getFalse());
		case ICmpInst::ICMP_SGT:
		case ICmpInst::ICMP_UGT:
		return new ICmpInst(ICmpInst::ICMP_NE, Op0, CI);
		case ICmpInst::ICMP_SLE:
		case ICmpInst::ICMP_ULE:
		return new ICmpInst(ICmpInst::ICMP_EQ, Op0, CI);
		}
		}
		if (CI->isOne()) {
		switch (I.getPredicate()) {
		default:
		break;
		case ICmpInst::ICMP_SGE:
		case ICmpInst::ICMP_UGE:
		return new ICmpInst(ICmpInst::ICMP_NE, Op0,
		Builder->getInt(CI->getValue() - 1));
		case ICmpInst::ICMP_SLT:
		case ICmpInst::ICMP_ULT:
		return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
		Builder->getInt(CI->getValue() - 1));
		}
		}
		if (CI->isMinusOne()) {
		switch (I.getPredicate()) {
		default:
		break;
		case ICmpInst::ICMP_SLE:
		case ICmpInst::ICMP_ULE:
		return replaceInstUsesWith(I, Builder->getFalse());
		}
		}
		}
		}
		}
		}

if (I.isEquality()) {		if (I.isEquality()) {
ConstantInt *CI2;		ConstantInt *CI2;
if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(A))) \|\|		if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(A))) \|\|
match(Op0, m_LShr(m_ConstantInt(CI2), m_Value(A)))) {		match(Op0, m_LShr(m_ConstantInt(CI2), m_Value(A)))) {
// (icmp eq/ne (ashr/lshr const2, A), const1)		// (icmp eq/ne (ashr/lshr const2, A), const1)
if (Instruction *Inst = FoldICmpCstShrCst(I, Op0, A, CI, CI2))		if (Instruction *Inst = FoldICmpCstShrCst(I, Op0, A, CI, CI2))
return Inst;		return Inst;
}		}
if (match(Op0, m_Shl(m_ConstantInt(CI2), m_Value(A)))) {		if (match(Op0, m_Shl(m_ConstantInt(CI2), m_Value(A)))) {
// (icmp eq/ne (shl const2, A), const1)		// (icmp eq/ne (shl const2, A), const1)
if (Instruction *Inst = FoldICmpCstShlCst(I, Op0, A, CI, CI2))		if (Instruction *Inst = FoldICmpCstShlCst(I, Op0, A, CI, CI2))
return Inst;		return Inst;
}		}
}		}

// If this comparison is a normal comparison, it demands all		// If this comparison is a normal comparison, it demands all
// bits, if it is a sign bit comparison, it only demands the sign bit.		// bits, if it is a sign bit comparison, it only demands the sign bit.
bool UnusedBit;		bool UnusedBit;
isSignBit = isSignBitCheck(I.getPredicate(), CI, UnusedBit);		isSignBit = isSignBitCheck(I.getPredicate(), CI, UnusedBit);
}		}

// See if we can fold the comparison based on range information we can get		// See if we can fold the comparison based on range information we can get
// by checking whether bits are known to be zero or one in the input.		// by checking whether bits are known to be zero or one in the input.
if (BitWidth != 0) {		if (BitWidth != 0) {
APInt Op0KnownZero(BitWidth, 0), Op0KnownOne(BitWidth, 0);		APInt Op0KnownZero(BitWidth, 0), Op0KnownOne(BitWidth, 0);
		sanjoyUnsubmitted Done Reply Inline Actions I think you can just use `dyn_cast` here. For a well formed CFG, `getTerminator` should always return `nonnull`. sanjoy: I think you can just use `dyn_cast` here. For a well formed CFG, `getTerminator` should always…
APInt Op1KnownZero(BitWidth, 0), Op1KnownOne(BitWidth, 0);		APInt Op1KnownZero(BitWidth, 0), Op1KnownOne(BitWidth, 0);

if (SimplifyDemandedBits(I.getOperandUse(0),		if (SimplifyDemandedBits(I.getOperandUse(0),
DemandedBitsLHSMask(I, BitWidth, isSignBit),		DemandedBitsLHSMask(I, BitWidth, isSignBit),
Op0KnownZero, Op0KnownOne, 0))		Op0KnownZero, Op0KnownOne, 0))
		sanjoyUnsubmitted Done Reply Inline Actions I think you also need to check that `TrueBB` and `FalseBB` are not equal. sanjoy: I think you also need to check that `TrueBB` and `FalseBB` are not equal.
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions I think we won't match this pattern if TrueBB == FalseBB because instcombine will already have transformed such IR into br i1 undef, label %dest, label %dest. I will add an assert to make sure they aren't equal. bmakam: I think we won't match this pattern if TrueBB == FalseBB because instcombine will already have…
		sanjoyUnsubmitted Not Done Reply Inline Actions My bias is that it is better to have these transforms be correct in isolation, and not rely on ordering across a large pass like -instcombine. But if you're sure that the "branch on undef" optimization is guaranteed to kick in before this optimization (i.e. that borders around "obvious"), then feel free to add an assert instead of checking and bailing out. sanjoy: My bias is that it is better to have these transforms be correct in isolation, and not rely on…
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions Since instcombine visits the instruction in topological order, we will always visit the dominating branch before visting the icmp instruction and so it is guranteed that the branch condition is undef when the condition is irrelevant. bmakam: Since instcombine visits the instruction in topological order, we will always visit the…
		majnemerUnsubmitted Done Reply Inline Actions This is only true for the very first iteration of InstCombine. InstCombine adds the users of all instructions that it modifies to a worklist, this worklist is not toplogically sorted. majnemer: This is only true for the very first iteration of InstCombine. InstCombine adds the users of…
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions This is only true for the very first iteration of InstCombine. InstCombine adds the users of all instructions that it modifies to a worklist, this worklist is not toplogically sorted. Since it isn't very obvious, to be on the safer side I have changed the assert back into a check. bmakam: >This is only true for the very first iteration of InstCombine. InstCombine adds the users of…
return &I;		return &I;
		sanjoyUnsubmitted Not Done Reply Inline Actions Add a comment here on why that is true. sanjoy: Add a comment here on why that is true.
if (SimplifyDemandedBits(I.getOperandUse(1),		if (SimplifyDemandedBits(I.getOperandUse(1),
APInt::getAllOnesValue(BitWidth), Op1KnownZero,		APInt::getAllOnesValue(BitWidth), Op1KnownZero,
Op1KnownOne, 0))		Op1KnownOne, 0))
		sanjoyUnsubmitted Not Done Reply Inline Actions I think what you want is `makeAllowedICmpRegion(getInversePredicate(Pred), CI2)`. In this case (where `Other` is a singleton set) what you have is fine, but generally (i.e. if the RHS was not a constant integer, but a range) it is not. Actually, it might be clearer to introduce a new helper: `ConstantRange::makeExactICmpRegion` that takes an `APInt` as RHS, and returns a set that is equal to both the "allowed" and "satisfying" icmp range. sanjoy: I think what you want is `makeAllowedICmpRegion(getInversePredicate(Pred), CI2)`. In this case…
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions I am not sure if I follow this correctly, could you please give an example?. If the new helper takes an APInt as RHS isn't it a singleton set in this case? Taking a general example if Pred is ult and Other is i8 [2, 5) makeAllowedICmpRegion would return [0, 4) and make SatisfyingICmpRegion would return [0, 2). Isn't a set that is equal to both same as the "satisfying" icmp range i.e [0, 2) in this example? bmakam: I am not sure if I follow this correctly, could you please give an example?. If the new helper…
		sanjoyUnsubmitted Not Done Reply Inline Actions Yeah, I wasn't as clear as I should have been. The "problem" here is that the logic is structured in a way that will not generalize to RHS being a non-singleton range. For instance, if we have if (lhs u< "[2, 5)") { ... } else { // This is what you're trying to optimize if (lhs u< 3) print("hello"); } then currently you will infer the range of `lhs` on the else branch to be `[0, 4).inverse()` == `[4, 0)`, and fold `lhs u< 3` to `false`. But `lhs u< 3` can be true also, if the RHS (the value we know is in `[2, 5)`) was `2` and `lhs` was `2`. In other words, in the above IR you know that in the "then" case `lhs` cannot be unsigned-greater than 4, but that does not mean that in the else case `lhs` has to be unsigned greater than 4. Now I think this won't be an observable bug if RHS is always a singleton set, but I'd like to have some measures in place that make it clear what's going on. There are two ways we can do that: For a singleton RHS, I think (but please verify) `makeAllowedICmpRegion` and `makeSatisfyingICmpRegion` will return the same exact set, which contains _exactly_ the value that the predicate is true for. If this assumption is true, my original suggestion was to add a `makeExactICmpRegion` method as: makeExactICmpRegion(...) { assert(makeAllowedICmpRegion(...) == makeSatisfyingICmpRegion(...)); return makeAllowedICmpRegion(...); } and to use that here instead of `makeAllowedICmpRegion`. It would take an `APInt` as RHS, and if someone comes along and tries to generalize this code to work on non-singleton RHSs, they'll immediately know that it isn't a 100% straightforward. The second possibility is to generalize this logic to work with a non-singleton RHS range. To do that, we can start by allowing the RHS to be a load instruction with `!range` metadata, and eventually move over to even more sophisticated analyses. sanjoy: Yeah, I wasn't as clear as I should have been. The "problem" here is that the logic is…
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions For a singleton RHS, I think (but please verify) makeAllowedICmpRegion and makeSatisfyingICmpRegion will return the same exact set, which contains _exactly_ the value that the predicate is true for. I think this assumption is not always true, see my previous example. then currently you will infer the range of lhs on the else branch to be [0, 4).inverse() == [4, 0) I see your point here, I have already modified this to 'makeAllowedICmpRegion(getInversePredicate(Pred), CI2)' based on your earlier suggestion. The second possibility is to generalize this logic to work with a non-singleton RHS range. I have no idea if non-singleton RHS ranges occur enough to be worthwhile to generalize. Do you mind if I defer this suggestion? bmakam: > For a singleton RHS, I think (but please verify) makeAllowedICmpRegion and…
		sanjoyUnsubmitted Not Done Reply Inline Actions I think this assumption is not always true, see my previous example. Are you talking about the "Taking a general example if Pred is ult and Other is i8 [2, 5) makeAllowedICmpRegion would return [0, 4) and make SatisfyingICmpRegion would return [0, 2). Isn't a set that is equal to both same as the "satisfying" icmp range i.e [0, 2) in this example?" bit? In that case isn't the RHS (i.e. `Other`) `[2, 5)`, specifically not a singleton set? If you're talking about another example, then I must have missed it. I have already modified this to 'makeAllowedICmpRegion(getInversePredicate(Pred), CI2)' based on your earlier suggestion. That's not the only place where the current code will not generalize to a non-singleton RHS. For instance, the `Intersection.isSingleElement()` check won't generalize either -- you cannot transform if (a u< "[2, 5)") { // AllowedRegion = [0, 4) if (a >=u "[3, 5)") { // AllowedRegion = [3, 0) print(); } else abort(); } to if (a u< "[2, 5)") if (a == 3) print(); else abort(); since for `"[2, 5)"` == `4` and `"[3, 5)"` == `4` and `a` == `3`, the first program would have called `abort()` but the new program will call `print()`. On the other hand, for the `Intersection.isEmptySet()` you do need to call `AllowedICmpRegion` on both the conditions. There is a general pattern here -- `makeAllowedICmpRegion` only lets you "rule out" certain values (e.g. if `a >=u "[3, 5)"` then `a` definitely cannot be in {`0`, `1`, `2`}, but it can be anything else), but it does not guarantee that the comparison will fold one way or the other. IOW, it is the "only if" part of the constraint (the comparison succeeds only if the LHS in the returned set), but not the "if" part (just because LHS is in the returned set does not mean the comparison will succeed). I have no idea if non-singleton RHS ranges occur enough to be worthwhile to generalize. Do you mind if I defer this suggestion? Sure, but in that case we need to have the logic phrased in a way that makes it obvious that the current logic is not fully general (i.e. with a `makeExactICmpRegion`). I don't think non-singletom RHS ranges are too rare: if (a u< 100) if (b u< a) if (b >u 500) In the inner condition `b u< a`, you know the RHS is in `[0, 100)`, and you can use that information to fold the innermost condition to `false`. However, it is fine to not handle these cases as a starting point. sanjoy: > I think this assumption is not always true, see my previous example. Are you talking about…
return &I;		return &I;

		sanjoyUnsubmitted Not Done Reply Inline Actions This is confusingly named -- by "RHS", I'd normally understand `CI` or `CI2`; but what you mean here is the dominated LHS. How about calling it just that: `DominatedLHSRange`? sanjoy: This is confusingly named -- by "RHS", I'd normally understand `CI` or `CI2`; but what you mean…
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions How about using CR and DominatingCR corresponding to CI and CI2? bmakam: How about using CR and DominatingCR corresponding to CI and CI2?
		sanjoyUnsubmitted Not Done Reply Inline Actions SGTM sanjoy: SGTM
// Given the known and unknown bits, compute a range that the LHS could be		// Given the known and unknown bits, compute a range that the LHS could be
// in. Compute the Min, Max and RHS values based on the known bits. For the		// in. Compute the Min, Max and RHS values based on the known bits. For the
// EQ and NE we use unsigned values.		// EQ and NE we use unsigned values.
APInt Op0Min(BitWidth, 0), Op0Max(BitWidth, 0);		APInt Op0Min(BitWidth, 0), Op0Max(BitWidth, 0);
APInt Op1Min(BitWidth, 0), Op1Max(BitWidth, 0);		APInt Op1Min(BitWidth, 0), Op1Max(BitWidth, 0);
		sanjoyUnsubmitted Not Done Reply Inline Actions Why do you care about `isSignBit`? sanjoy: Why do you care about `isSignBit`?
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions If it is a SignBit, canonicalizing could transform for example an icmp slt 0 into icmp eq/ne 0 which we want to avoid in situations where the icmp is only used as input to a branch because it will pessimize the codegen by generating a compare and branch on zero instruction (cbz) instead of a test and branch(tbz). tbz has better branch displacement than a cbz. If the icmp is an input to anything other than branches such as a select we would be better of canonicalizing it to eq/ne. Added a helper function isZBranchCheck to restrict it further for this case and added testcases. bmakam: If it is a SignBit, canonicalizing could transform for example an icmp slt 0 into icmp eq/ne 0…
if (I.isSigned()) {		if (I.isSigned()) {
ComputeSignedMinMaxValuesFromKnownBits(Op0KnownZero, Op0KnownOne,		ComputeSignedMinMaxValuesFromKnownBits(Op0KnownZero, Op0KnownOne,
Op0Min, Op0Max);		Op0Min, Op0Max);
		bmakamAuthorUnsubmitted Not Done Reply Inline Actions if Difference is empty then the inner condition is always true. bmakam: if Difference is empty then the inner condition is always true.
ComputeSignedMinMaxValuesFromKnownBits(Op1KnownZero, Op1KnownOne,		ComputeSignedMinMaxValuesFromKnownBits(Op1KnownZero, Op1KnownOne,
		sanjoyUnsubmitted Done Reply Inline Actions Add a comment on why you need `isBranchOnSignBitCheck` here. sanjoy: Add a comment on why you need `isBranchOnSignBitCheck` here.
Op1Min, Op1Max);		Op1Min, Op1Max);
		sanjoyUnsubmitted Not Done Reply Inline Actions These are better written as: if (auto AI = Intersection.getSingleElement()) return new ICmpInst(..., Builder->getInt(AI)); etc. sanjoy: These are better written as: ``` if (auto *AI = Intersection.getSingleElement()) return new…
} else {		} else {
ComputeUnsignedMinMaxValuesFromKnownBits(Op0KnownZero, Op0KnownOne,		ComputeUnsignedMinMaxValuesFromKnownBits(Op0KnownZero, Op0KnownOne,
Op0Min, Op0Max);		Op0Min, Op0Max);
ComputeUnsignedMinMaxValuesFromKnownBits(Op1KnownZero, Op1KnownOne,		ComputeUnsignedMinMaxValuesFromKnownBits(Op1KnownZero, Op1KnownOne,
Op1Min, Op1Max);		Op1Min, Op1Max);
}		}

// If Min and Max are known to be the same, then SimplifyDemandedBits		// If Min and Max are known to be the same, then SimplifyDemandedBits
// figured out that the LHS is a constant. Just constant fold this now so		// figured out that the LHS is a constant. Just constant fold this now so
		majnemerUnsubmitted Not Done Reply Inline Actions The prior two replacements seem like pure goodness: they make conditional branches branch on true or false. These two seem weird to have in InstCombine. Seeing as how you are concerned with pessimization, can we move these to CGP and have them operate on a TargetLowering hook? majnemer: The prior two replacements seem like pure goodness: they make conditional branches branch on…
		majnemerUnsubmitted Not Done Reply Inline Actions On second thought, this is a nice canonicalization regardless because it makes the comparisons easier to analyze. LGTM. majnemer: On second thought, this is a nice canonicalization regardless because it makes the comparisons…
// that code below can assume that Min != Max.		// that code below can assume that Min != Max.
if (!isa<Constant>(Op0) && Op0Min == Op0Max)		if (!isa<Constant>(Op0) && Op0Min == Op0Max)
return new ICmpInst(I.getPredicate(),		return new ICmpInst(I.getPredicate(),
ConstantInt::get(Op0->getType(), Op0Min), Op1);		ConstantInt::get(Op0->getType(), Op0Min), Op1);
if (!isa<Constant>(Op1) && Op1Min == Op1Max)		if (!isa<Constant>(Op1) && Op1Min == Op1Max)
return new ICmpInst(I.getPredicate(), Op0,		return new ICmpInst(I.getPredicate(), Op0,
ConstantInt::get(Op1->getType(), Op1Min));		ConstantInt::get(Op1->getType(), Op1Min));

▲ Show 20 Lines • Show All 1,302 Lines • Show Last 20 Lines

test/Transforms/InstCombine/icmp.ll

	Show First 20 Lines • Show All 1,950 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[CMP:%.*]] = icmp uge i32 [[ADDX]], [[ADDY]]			; CHECK-NEXT: [[CMP:%.*]] = icmp uge i32 [[ADDX]], [[ADDY]]
	; CHECK-NEXT: ret i1 [[CMP]]			; CHECK-NEXT: ret i1 [[CMP]]
	;			;
	%addx = add i32 %x, 2147483647			%addx = add i32 %x, 2147483647
	%addy = add i32 %y, 2147483647			%addy = add i32 %y, 2147483647
	%cmp = icmp uge i32 %addx, %addy			%cmp = icmp uge i32 %addx, %addy
	ret i1 %cmp			ret i1 %cmp
	}			}

				; CHECK-LABEL: @idom_sign_bit_check_edge_dominates
				define void @idom_sign_bit_check_edge_dominates(i64 %a) {
				entry:
				%cmp = icmp slt i64 %a, 0
				br i1 %cmp, label %land.lhs.true, label %lor.rhs

				land.lhs.true: ; preds = %entry
				br label %lor.end

				; CHECK-LABEL: lor.rhs:
				; CHECK-NOT: icmp sgt i64 %a, 0
				; CHECK: icmp eq i64 %a, 0
				lor.rhs: ; preds = %entry
				%cmp2 = icmp sgt i64 %a, 0
				br i1 %cmp2, label %land.rhs, label %lor.end

				land.rhs: ; preds = %lor.rhs
				br label %lor.end

				lor.end: ; preds = %land.rhs, %lor.rhs, %land.lhs.true
				ret void
				}

				; CHECK-LABEL: @idom_sign_bit_check_edge_not_dominates
				define void @idom_sign_bit_check_edge_not_dominates(i64 %a) {
				entry:
				%cmp = icmp slt i64 %a, 0
				br i1 %cmp, label %land.lhs.true, label %lor.rhs

				land.lhs.true: ; preds = %entry
				br i1 undef, label %lor.end, label %lor.rhs

				; CHECK-LABEL: lor.rhs:
				; CHECK: icmp sgt i64 %a, 0
				; CHECK-NOT: icmp eq i64 %a, 0
				lor.rhs: ; preds = %land.lhs.true, %entry
				%cmp2 = icmp sgt i64 %a, 0
				br i1 %cmp2, label %land.rhs, label %lor.end

				land.rhs: ; preds = %lor.rhs
				br label %lor.end

				lor.end: ; preds = %land.rhs, %lor.rhs, %land.lhs.true
				ret void
				}

				; CHECK-LABEL: @idom_sign_bit_check_edge_dominates_select
				define void @idom_sign_bit_check_edge_dominates_select(i64 %a, i64 %b) {
				entry:
				%cmp = icmp slt i64 %a, 0
				br i1 %cmp, label %land.lhs.true, label %lor.rhs

				land.lhs.true: ; preds = %entry
				br label %lor.end

				; CHECK-LABEL: lor.rhs:
				sanjoyUnsubmitted Done Reply Inline Actions Can you add a dominated check with, say, `10` or something just to demonstrate that we're general here? sanjoy: Can you add a dominated check with, say, `10` or something just to demonstrate that we're…
				; CHECK-NOT: [[B:%.*]] = icmp sgt i64 %a, 0
				; CHECK: [[C:%.*]] = icmp eq i64 %a, %b
				; CHECK-NOT: [[D:%.*]] = select i1 [[B]], i64 %a, i64 0
				; CHECK-NOT: icmp ne i64 [[D]], %b
				; CHECK-NEXT: br i1 [[C]], label %lor.end, label %land.rhs
				lor.rhs: ; preds = %entry
				%cmp2 = icmp sgt i64 %a, 0
				%select = select i1 %cmp2, i64 %a, i64 0
				%cmp3 = icmp ne i64 %select, %b
				br i1 %cmp3, label %land.rhs, label %lor.end

				land.rhs: ; preds = %lor.rhs
				br label %lor.end

				lor.end: ; preds = %land.rhs, %lor.rhs, %land.lhs.true
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Canonicalize icmp instructions based on dominating conditions.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 53296

lib/Transforms/InstCombine/InstCombineCompares.cpp

test/Transforms/InstCombine/icmp.ll

[InstCombine] Canonicalize icmp instructions based on dominating conditions.
ClosedPublic