This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Introduce fold for icmp pred (and X, (sh signbit, Y)), 0.
AbandonedPublic

Authored by huihuiz on Jun 3 2019, 10:23 AM.

Details

Summary

Fold:

(X & (signbit l>> Y)) ==/!= 0 -> (X << Y) s>=/s< 0
(X & (signbit << Y)) ==/!= 0 -> (X l>> Y) s>=/s< 0
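A minimal IR sketch of the first fold (hypothetical function names; the patch's own test files are the authoritative reference):

define i1 @src_lshr_signbit(i32 %x, i32 %y) {
  %mask = lshr i32 -2147483648, %y     ; signbit l>> Y
  %and = and i32 %x, %mask
  %cmp = icmp ne i32 %and, 0
  ret i1 %cmp
}

; after the fold this is equivalent to:

define i1 @tgt_lshr_signbit(i32 %x, i32 %y) {
  %shl = shl i32 %x, %y
  %cmp = icmp slt i32 %shl, 0          ; (X << Y) s< 0
  ret i1 %cmp
}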

Diff Detail

Repository
rL LLVM

Event Timeline

huihuiz created this revision.Jun 3 2019, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2019, 10:23 AM
huihuiz added a comment.EditedJun 3 2019, 10:40 AM

For the Thumb target, this optimization allows the generation of more compact instructions.
Run: clang -mcpu=cortex-m0 -target armv6m-none-eabi icmp-shl-and.ll -O2 -S -o t.s
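The exact contents of icmp-shl-and.ll are not reproduced in this archive; a plausible reduced input, consistent with the assembly below, might look like this (purely hypothetical):

define i32 @test(i32 %n, i32 %x, i32 %a, i32 %b) {
entry:
  %sub = sub i32 %n, 1
  %mask = lshr i32 -2147483648, %sub   ; signbit l>> (n - 1)
  %and = and i32 %mask, %x
  %cmp = icmp eq i32 %and, 0
  %sel = select i1 %cmp, i32 %b, i32 %a
  ret i32 %sel
}

With the fold, the compare becomes (x << (n - 1)) s< 0 and the generated Thumb code is: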

@ %bb.0:                                @ %entry
        subs    r0, r0, #1
        lsls    r1, r0
        cmp     r1, #0
        blt     .LBB0_2
@ %bb.1:                                @ %entry
        mov     r2, r3
.LBB0_2:                                @ %entry
        mov     r0, r2
        bx      lr

Without the fold, more instructions are generated because of the signmask shifting:

@ %bb.0:                                @ %entry
        .save   {r4, lr}
        push    {r4, lr}
        subs    r0, r0, #1
        movs    r4, #1
        lsls    r4, r4, #31
        lsrs    r4, r0
        tst     r4, r1
        beq     .LBB0_2
@ %bb.1:                                @ %entry
        mov     r3, r2
.LBB0_2:                                @ %entry
        mov     r0, r3
        pop     {r4, pc}

The ARM and Thumb2 targets allow a flexible second operand (here, a test-bit instruction with a shifted operand), so this optimization does not affect the performance of the generated instructions.
Run: clang -mcpu=cortex-a53 -target armv8-none-musleabi icmp-shl-and.ll -O2 -S -o t.s

With this optimization:

@ %bb.0:                                @ %entry
        sub     r0, r0, #1
        lsl     r0, r1, r0
        cmp     r0, #0
        movge   r2, r3
        mov     r0, r2
        bx      lr

Without this optimization:

@ %bb.0:                                @ %entry
        sub     r12, r0, #1
        mov     r0, #-2147483648
        tst     r1, r0, lsr r12
        moveq   r2, r3
        mov     r0, r2
        bx      lr
lebedev.ri requested changes to this revision.Jun 3 2019, 10:54 AM
lebedev.ri added a subscriber: lebedev.ri.

This looks like a missing backend-level transform, either a generic one in DAGCombiner, or in ARMISelLowering.cpp.

This fix is not the right thing to do because even if you disable this fold,
you can still receive the 'bad' IR you are trying to avoid here,
and will still end up generating bad assembly.

This revision now requires changes to proceed.Jun 3 2019, 10:54 AM

Though this transform is also bad for X86: https://godbolt.org/z/KFM3gQ
That is, when the C2 << Y isn't hoisted out of the loop, of course.

So we're missing an undo fold: https://rise4fun.com/Alive/w25
Not sure if it should be guarded by a TTI hook; I would expect it to be always beneficial.
(That doesn't mean the original fold is never beneficial, though.)
I'll try to take a look.

On the instcombine side, one thing worth noting which isn't called out in the commit message is the interaction with other instcombine patterns. In the testcase, note that the final IR actually doesn't contain any mask; instead, it checks icmp slt i32 [[SHL]], 0. Huihui, please update the commit message to make this clear.

It's possible we should also implement the related pattern to transform (x & (signbit >> y)) != 0 to (x << y) < 0, sure.

In terms of whether it's universally profitable, I'm not sure... I guess if somehow "icmp ne X, 0" is free, but "icmp slt X, 0" isn't, it could be an issue, but I don't think that applies to any architecture I can think of.

I'm about to post the DAGCombiner undo-fold, hold on...

On the instcombine side, one thing worth noting which isn't called out in the commit message is the interaction with other instcombine patterns. In the testcase, note that the final IR actually doesn't contain any mask; instead, it checks icmp slt i32 [[SHL]], 0. Huihui, please update the commit message to make this clear.

It's possible we should also implement the related pattern to transform (x & (signbit >> y)) != 0 to (x << y) < 0, sure.

Yes, now that would be a good patch, +see inline comment.

In terms of whether it's universally profitable, I'm not sure... I guess if somehow "icmp ne X, 0" is free, but "icmp slt X, 0" isn't, it could be an issue, but I don't think that applies to any architecture I can think of.

I think there may or may not be a confusion here. We are in the middle-end here. Other than TTI,
we don't really care about whatever a backend/target may find troubling/unprofitable.
We only care about producing the simplest IR, the form most suited for further transforms.
That new IR may, or may not, be optimal for any particular target.
If the IR is not optimal for a backend, then an opposite transform should be present in that backend.

lib/Transforms/InstCombine/InstCombineCompares.cpp
1606–1611

There should also be a sibling fold with swapped shift directions

lebedev.ri added inline comments.Jun 3 2019, 3:26 PM
test/Transforms/InstCombine/icmp-shl-and.ll
12–13 ↗(On Diff #202749)

Hmm, this should already be folding: https://godbolt.org/z/77mvnv
I guess the order of folds is wrong.

And posted: D62871

As for the instcombine side,
I would recommend a new differential
with actual folds, not this blacklisting.

The other approach could be changing the order of folding: move foldICmpBinOpEqualityWithConstant to the very beginning of foldICmpInstWithConstant.
foldICmpBinOpEqualityWithConstant has rules to replace (and X, (1 << size(X)-1) != 0) with x s< 0.
Let me know if this approach is preferable.

The other approach could be changing the order of folding: move foldICmpBinOpEqualityWithConstant to the very beginning of foldICmpInstWithConstant.
foldICmpBinOpEqualityWithConstant has rules to replace (and X, (1 << size(X)-1) != 0) with x s< 0.
Let me know if this approach is preferable.

You want (1+1*2)*2 = 6 folds: https://rise4fun.com/Alive/Y8Ct

huihuiz updated this revision to Diff 203024.Jun 4 2019, 2:28 PM
huihuiz retitled this revision from [InstCombine] Allow ((X << Y) & SignMask) != 0 to be optimized as (X << Y) s< 0. to [InstCombine] Change order of ICmp fold..
huihuiz edited the summary of this revision. (Show Details)

Yes, changing the order would allow these folds:

(X & (signbit >> Y)) != 0  ->  (X << Y) s< 0
(X & (signbit >> Y)) == 0  ->  (X << Y) s>= 0
((X << Y) & signbit) != 0  ->  (X << Y) s< 0
((X << Y) & signbit) == 0  ->  (X << Y) s>= 0
lebedev.ri added inline comments.Jun 4 2019, 3:11 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1762–1778

Eww, this looks too much like backend pattern matching :)
Here you want something more like

// (V0 & (signbit l>> V1)) ==/!= 0 -> (V0 << V1) >=/< 0
// (V0 & (signbit << V1)) ==/!= 0 -> (V0 l>> V1) >=/< 0
Value *V0, *V1, *Shift, *Zero;
ICmpInst::Predicate Pred;
if (match(&Cmp,
          m_ICmp(Pred,
                 m_OneUse(m_c_And(
                     m_CombineAnd(
                         m_CombineAnd(m_Shift(m_SignMask(), m_Value(V1)),
                                      m_Value(Shift)),
                         m_CombineOr(m_Shl(m_Value(), m_Value()),
                                     m_LShr(m_Value(), m_Value()))),
                     m_Value(V0))),
                 m_CombineAnd(m_Zero(), m_Value(Zero)))) &&
    Cmp.isEquality(Pred)) {
  Value *NewShift = cast<Instruction>(Shift)->getOpcode() == Instruction::LShr
                        ? Builder.CreateShl(V0, V1)
                        : Builder.CreateLShr(V0, V1);
  ICmpInst::Predicate NewPred =
      Pred == CmpInst::ICMP_EQ ? CmpInst::ICMP_SGE : CmpInst::ICMP_SLT;
  return new ICmpInst(NewPred, NewShift, Zero);
}
lebedev.ri added inline comments.Jun 4 2019, 3:14 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
2664–2666

I'm not looking forward to seeing the fallout of this move.
I will be extremely surprised if, while fixing the target problem,
this doesn't expose numerous other fold-order issues.

Can you instead simply follow the TODO and refactor the single interesting fold
out of foldICmpBinOpEqualityWithConstant() into foldICmpAndConstant(), I guess?

huihuiz updated this revision to Diff 203284.Jun 5 2019, 10:26 PM
huihuiz retitled this revision from [InstCombine] Change order of ICmp fold. to [InstCombine] Fix fold order for icmp pred (and (sh X, Y), C), 0..
huihuiz edited the summary of this revision. (Show Details)

The test cases in icmp-shift-and-signbit.ll show that the updated fold order can generate better IR.

huihuiz marked 2 inline comments as done.Jun 5 2019, 10:46 PM

Nice, getting closer.
Could you please split this up:

  1. A patch that adds your original motivational testcase that shows that the fold order is wrong.
  2. The move of the // Replace (and X, (1 << size(X)-1) != 0) with x s< 0 fold
  3. A patch with just new test/Transforms/InstCombine/icmp-shift-and-signbit.ll
  4. The fold itself, showing the changes to the check lines
lib/Transforms/InstCombine/InstCombineCompares.cpp
1646

Here the codegen is irrelevant.
We do this because it results in simpler IR.
I'm not sure that the new comment adds anything useful.

1660

C2->negate().isPowerOf2()

2791–2792

Uhm, where did the check that we are comparing with 0 go?

test/Transforms/InstCombine/icmp-shift-and-signbit.ll
2

Please

  1. Move this to a new differential
  2. In that same patch, re-add your initial motivational pattern, that shows that fold reordering did something
  3. Use llvm/utils/update_test_checks.py to generate check lines
  4. Rebase *this* diff ontop of that new patch, so this diff shows how the check lines change
13

select is not relevant for this pattern, drop it

68

You also want a few extra tests:

  1. A trivial vector test with <i32 -2147483648, i32 -2147483648> and <i32 0, i32 0>
  2. 3 vector tests with undefs:
    • <i32 -2147483648, i32 undef, i32 -2147483648> and <i32 0, i32 0, i32 0>
    • <i32 -2147483648, i32 -2147483648, i32 -2147483648> and <i32 0, i32 undef, i32 0>
    • <i32 -2147483648, i32 undef, i32 -2147483648> and <i32 0, i32 undef, i32 0>
  3. Tests to verify single-use constraints:
    • a test with an extra use of %shr (should still get folded, unlike the others)
    • a test with an extra use of %and
    • a test with extra uses of both %shr and %and

      For how to introduce extra uses, see e.g. llvm/test/Transforms/InstCombine/unfold-masked-merge-with-const-mask-scalar.ll
test/Transforms/InstCombine/pr17827.ll
66

These don't look like improvements to me.
Looks like that reordering exposes yet another missing fold.

huihuiz marked 5 inline comments as done.Jun 7 2019, 2:29 PM

Thank you so much for all the review feedback, really appreciate it! :)

lib/Transforms/InstCombine/InstCombineCompares.cpp
1660

We should not call C2->negate().
If the negation of C2 is not a power of 2, negate() would still have replaced C2 with its negation, because it modifies the value in place.
C2 should not be modified.

2791–2792

What happened was: C can be 0, the signbit, or some other number.

  1. If C is 0, we are fine.
  2. If C is the signbit, consider the test X & signbit == signbit (sketched below). The folds

X & -C == -C -> X u> ~C
X & -C != -C -> X u<= ~C

and the fold

for i32: x u> 2147483647 -> x s< 0 -> true if the sign bit is set

are scheduled before the fold that replaces (and X, (1 << size(X)-1) != 0) with x s< 0.

  3. If C is some other number, SimplifyICmpInst will do its job.
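For example, a sketch of the C == signbit case for i32:

define i1 @and_signbit_eq_signbit(i32 %x) {
  %and = and i32 %x, -2147483648
  %cmp = icmp eq i32 %and, -2147483648
  ret i1 %cmp
}

; already simplifies, via the folds above, to:

define i1 @and_signbit_eq_signbit_folded(i32 %x) {
  %cmp = icmp slt i32 %x, 0
  ret i1 %cmp
}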
test/Transforms/InstCombine/pr17827.ll
66

In D63026
I am moving the fold ((X & ~7) == 0) --> X < 8 ahead.

If X is (BinOp Y, C3), we should allow other rules to fold C3 with C2,
e.g. ((X >> C3) & C2) != C1 -> (X & (C2 << C3)) != (C1 << C3).
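For instance, a hand-written i32 sketch with C3 = 4, C2 = 7, C1 = 2 (illustrative only):

define i1 @shr_and_cmp(i32 %x) {
  %shr = lshr i32 %x, 4
  %and = and i32 %shr, 7
  %cmp = icmp ne i32 %and, 2
  ret i1 %cmp
}

; can become:

define i1 @shr_and_cmp_folded(i32 %x) {
  %and = and i32 %x, 112               ; 7 << 4
  %cmp = icmp ne i32 %and, 32          ; 2 << 4
  ret i1 %cmp
}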

huihuiz updated this revision to Diff 203627.Jun 7 2019, 2:40 PM
huihuiz retitled this revision from [InstCombine] Fix fold order for icmp pred (and (sh X, Y), C), 0. to [InstCombine] Introduce fold for icmp pred (and X, (sh signbit, Y)), 0..
huihuiz edited the summary of this revision. (Show Details)

D62818 is now split into D63025, D63026, D63028, and D62818.

More signum/sgn patterns:
https://godbolt.org/z/tE00f4

Hey @xbolva00, I don't see much difference between the codegen of x86-clang and x86-gcc.
Let's focus on the missing folds we are trying to resolve here:

(X & (signbit l>> Y)) ==/!= 0 -> (X << Y) s>=/s< 0
(X & (signbit << Y)) ==/!= 0 -> (X l>> Y) s>=/s< 0

and the fold-order issue among

((X << Y) & signbit) ==/!= 0 -> (X << Y) s>=/s< 0;
((X << Y) & ~C) ==/!= 0 -> (X << Y) u</u>= C+1, where C+1 is a power of 2;
and
((X << Y) & C) == 0 -> (X & (C >> Y)) == 0 (see the sketch below).
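A rough IR sketch of that ordering problem, using the signbit as C (hypothetical function names):

define i1 @shl_and_signbit_eq(i32 %x, i32 %y) {
  %shl = shl i32 %x, %y
  %and = and i32 %shl, -2147483648
  %cmp = icmp eq i32 %and, 0
  ret i1 %cmp
}

; the third fold may first turn this into:

define i1 @shl_and_signbit_eq_masked(i32 %x, i32 %y) {
  %mask = lshr i32 -2147483648, %y
  %and = and i32 %x, %mask
  %cmp = icmp eq i32 %and, 0
  ret i1 %cmp
}

; whereas the desired canonical form is simply:

define i1 @shl_and_signbit_eq_canonical(i32 %x, i32 %y) {
  %shl = shl i32 %x, %y
  %cmp = icmp sgt i32 %shl, -1         ; i.e. (X << Y) s>= 0
  ret i1 %cmp
}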

Oh, I thought I had commented on these reviews; apparently not :(
I still see random changes to test coverage (new tests being added) in non-NFC patches.
Let me rephrase: can you put all the test updates and new tests into *ONE* review, so that the rest of the patches do not add new tests or change existing ones?

huihuiz updated this revision to Diff 204436.Jun 12 2019, 11:52 PM

The original test cases are added in D63025. Hopefully that provides good coverage :)
D63026 fixes the fold order issue.
This differential introduces the new fold for icmp pred (and X, (sh signbit, Y)), 0.

Is this the only remaining patch?
I don't think I should review my own code; perhaps @spatel can take a look?

lib/Transforms/InstCombine/InstCombineCompares.cpp
1796

I'm not sure why I have added m_OneUse() here; it should not be here.

spatel added inline comments.Jun 26 2019, 9:23 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1792–1795

m_LogicalShift() ?

test/Transforms/InstCombine/signbit-shl-and-icmpeq-zero.ll
180 ↗(On Diff #204436)

I didn't step through the transforms, but it seems wrong to call this a 'negative test'. This patch must have fired and allowed further simplification?

huihuiz updated this revision to Diff 206784.Jun 26 2019, 11:30 PM
huihuiz marked 3 inline comments as done.

I simplified the pattern-matching code to make it more readable.

huihuiz added inline comments.Jun 26 2019, 11:33 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
1796

I agree that m_OneUse() should not be in the pattern matching. V0 might be a constant value, which can have more than one use outside of the current function.

Actually I added it, not you; sorry about that.

There is a regression, see test file: test/Transforms/InstCombine/onehot_merge.ll

define i1 @foo1_and(i32 %k, i32 %c1, i32 %c2) {
bb:
  %tmp = shl i32 1, %c1
  %tmp4 = lshr i32 -2147483648, %c2
  %tmp1 = and i32 %tmp, %k
  %tmp2 = icmp eq i32 %tmp1, 0
  %tmp5 = and i32 %tmp4, %k
  %tmp6 = icmp eq i32 %tmp5, 0
  %or = or i1 %tmp2, %tmp6
  ret i1 %or
}

It failed to fold
(iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2), where K1 and K2 are 'one-hot' (only one bit is on).
Here K1 is one and K2 is the signbit.

I am still thinking about how to get past this regression.
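For reference, the expected merged form for @foo1_and would be roughly the following (hand-written sketch, not actual compiler output):

define i1 @foo1_and_expected(i32 %k, i32 %c1, i32 %c2) {
bb:
  %t0 = shl i32 1, %c1
  %t1 = lshr i32 -2147483648, %c2
  %mask = or i32 %t0, %t1
  %t2 = and i32 %k, %mask
  %or = icmp ne i32 %t2, %mask
  ret i1 %or
}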

test/Transforms/InstCombine/signbit-shl-and-icmpeq-zero.ll
180 ↗(On Diff #204436)

Yes, X being a constant is a positive case. The fold happened and allowed further simplification.

lebedev.ri added inline comments.Jun 27 2019, 1:33 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1790–1792 ↗(On Diff #206784)

m_Value(V0) will always match; it's best to swap them.

llvm/test/Transforms/InstCombine/onehot_merge.ll
18 ↗(On Diff #206784)

Can you please regenerate the original test?

18–23 ↗(On Diff #206784)

I'm not sure what's on the LHS of the diff, but ignoring the instruction count, this looks like an improvement to me.

lebedev.ri added inline comments.Jun 27 2019, 2:49 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1785–1786 ↗(On Diff #206784)
// (V0 & (signbit l>> V1)) ==/!= 0 -> (V0 << V1) s>=/s< 0
// (V0 & (signbit << V1)) ==/!= 0 -> (V0 l>> V1) s>=/s< 0
llvm/test/Transforms/InstCombine/onehot_merge.ll
18 ↗(On Diff #206784)

Ah, you also want to str-replace %tmp with %t; it likely confuses the update script.

spatel added inline comments.Jun 27 2019, 7:25 AM
llvm/test/Transforms/InstCombine/onehot_merge.ll
18 ↗(On Diff #206784)
huihuiz updated this revision to Diff 206890.Jun 27 2019, 11:04 AM
huihuiz marked 6 inline comments as done.
huihuiz edited the summary of this revision. (Show Details)

Rebased patch, and addressed review comments.

llvm/test/Transforms/InstCombine/onehot_merge.ll
18–23 ↗(On Diff #206784)

Actually we missed the fold for

((k & (1 l<< C1)) == 0) || ((k & (signbit l>> C2)) == 0)
-->
(k & ((1 l<< C1) | (signbit l>> C2))) != ((1 l<< C1) | (signbit l>> C2))
lebedev.ri added inline comments.Jun 27 2019, 11:21 AM
llvm/test/Transforms/InstCombine/onehot_merge.ll
18–23 ↗(On Diff #206784)

Thanks for the analysis!
@spatel does this fall into the nowadays reasoning that we shouldn't be doing too many folds into bit-math
here in instcombine? I'm almost tempted to say that this isn't a regression,
but that the original fold that now no longer happens should be removed instead.

huihuiz updated this revision to Diff 206958.Jun 27 2019, 4:18 PM

For onehot_merge.ll: mathematically speaking,

(signbit l>> C)

is equivalent to

(one l<< (bitwidth - C - 1))
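For i32, for example (a sketch assuming %c u< 32):

define i1 @one_hot_equiv(i32 %c) {
  %m1 = lshr i32 -2147483648, %c       ; signbit l>> C
  %amt = sub i32 31, %c                ; bitwidth - C - 1
  %m2 = shl i32 1, %amt                ; one l<< (bitwidth - C - 1)
  %same = icmp eq i32 %m1, %m2         ; always true when %c u< 32
  ret i1 %same
}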

In D63903, I update the test input so that we are still checking the fold for 'or' of ICmps and 'and' of ICmps.

spatel added inline comments.Jun 28 2019, 10:58 AM
llvm/test/Transforms/InstCombine/onehot_merge.ll
18–23 ↗(On Diff #206784)

If we say that the longer IR sequence is more canonical, then we'd want to add a transform to create that longer sequence starting from the shorter sequence. Are we willing to do that to improve analysis in IR?

As a practical matter, we probably also want to look at asm output for the alternatives on a few targets to see how much backend logic is required to do/undo this.

spatel added inline comments.Jun 28 2019, 11:05 AM
llvm/test/Transforms/InstCombine/onehot_merge.ll
18–23 ↗(On Diff #206784)

Sorry - I haven't followed this patch and its friends closely; scrolling back through the comments, I think the backend questions are covered by D62871.

This isn't specific to the sign bit; the more general pattern is https://rise4fun.com/Alive/2zpl
I'm apparently working on it...

lebedev.ri added inline comments.Jul 3 2019, 1:11 PM
llvm/test/Transforms/InstCombine/onehot_merge.ll
115–122 ↗(On Diff #206958)

It looks like, to support this pattern, InstCombiner::foldAndOrOfICmpsOfAndWithPow2() will need to be generalized.

huihuiz marked 2 inline comments as done.Jul 3 2019, 10:26 PM
huihuiz added inline comments.
llvm/test/Transforms/InstCombine/onehot_merge.ll
115–122 ↗(On Diff #206958)

I am looking into this, hold on.

huihuiz updated this revision to Diff 208252.Jul 5 2019, 8:55 PM

Generalized InstCombiner::foldAndOrOfICmpsOfAndWithPow2() in D64275.

lebedev.ri requested changes to this revision.Aug 1 2019, 3:09 PM
This revision now requires changes to proceed.Aug 1 2019, 3:09 PM
lebedev.ri resigned from this revision.Jan 12 2023, 4:43 PM

This review seems to be stuck/dead, consider abandoning if no longer relevant.

This revision now requires review to proceed.Jan 12 2023, 4:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 4:43 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
huihuiz abandoned this revision.Jan 13 2023, 9:23 AM