This is an archive of the discontinued LLVM Phabricator instance.


Differential D152568

[InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`
ClosedPublic

Authored by goldstein.w.n on Jun 9 2023, 11:02 AM.


Details

Reviewers

nikic
RKSimon

Commits

rG91cdffcb2f9e: [InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`

Summary

If Mask and Amt are not constants and binop1 and binop2 are
the same we can transform to:
(binop (lshift (binop X, Y), Amt), Mask)

If binop is add, lshift must be shl.
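
The same-opcode rewrite above can be sanity-checked by brute force on a narrow integer type. The sketch below is not code from the patch: `Op`, `Sh`, `kBits`, and `sameBinopFoldHolds` are illustrative names, and a 6-bit width is assumed only to keep the search small.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (not from the patch): 6-bit unsigned integers stored in
// uint32_t, with every result truncated back to kBits.
constexpr unsigned kBits = 6;
constexpr uint32_t kAll = (1u << kBits) - 1;

enum class Op { And, Or, Xor, Add };
enum class Sh { Shl, LShr };

static uint32_t applyOp(Op O, uint32_t A, uint32_t B) {
  switch (O) {
  case Op::And: return A & B;
  case Op::Or:  return A | B;
  case Op::Xor: return A ^ B;
  case Op::Add: return (A + B) & kAll;
  }
  return 0;
}

static uint32_t applySh(Sh S, uint32_t V, unsigned Amt) {
  assert(Amt < kBits && "out-of-range shift amounts are not modeled");
  return S == Sh::Shl ? (V << Amt) & kAll : V >> Amt;
}

// Returns true if, for every X, Y, Mask and every in-range Amt,
//   (op (op (sh X, Amt), Mask), (sh Y, Amt)) == (op (sh (op X, Y), Amt), Mask)
bool sameBinopFoldHolds(Op O, Sh S) {
  for (unsigned Amt = 0; Amt < kBits; ++Amt)
    for (uint32_t X = 0; X <= kAll; ++X)
      for (uint32_t Y = 0; Y <= kAll; ++Y)
        for (uint32_t Mask = 0; Mask <= kAll; ++Mask) {
          uint32_t Before = applyOp(O, applyOp(O, applySh(S, X, Amt), Mask),
                                    applySh(S, Y, Amt));
          uint32_t After = applyOp(O, applySh(S, applyOp(O, X, Y), Amt), Mask);
          if (Before != After)
            return false;
        }
  return true;
}
```

Under this model the check passes for and/or/xor with either shift and for add with shl, and it fails for add with lshr (X = Y = 1, Amt = 1 already differs), which matches the restriction stated above.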

If Mask and Amt are constants C and C1 respectively,
we can transform to:
(lshift1 (binop1 (binop2 X, (inv_lshift1 C, C1)), Y), C1)

Saving an instruction IFF:
lshift1 is same opcode as lshift2
Either bitwise1 and/or bitwise2 is and.
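
As a concrete instance of the constant case, take binop2 = xor, binop1 = or, lshift = shl, C1 = 4 and C = 0xF0 on 8-bit values. The functions below are an illustrative sketch (the names `before`, `after`, and `constantFoldAgrees` are made up here, not taken from the patch): the original shape needs two shifts plus two binops, the rewritten shape needs one shift plus two binops.

```cpp
#include <cstdint>

// Original shape: (or (xor (shl X, 4), 0xF0), (shl Y, 4)), four operations.
uint8_t before(uint8_t X, uint8_t Y) {
  uint8_t ShX = static_cast<uint8_t>(X << 4);
  uint8_t ShY = static_cast<uint8_t>(Y << 4);
  return static_cast<uint8_t>((ShX ^ 0xF0) | ShY);
}

// Rewritten shape: (shl (or (xor X, 0x0F), Y), 4), three operations.
// 0x0F is the mask 0xF0 pushed through the inverse shift (lshr by 4).
uint8_t after(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>(((X ^ 0x0F) | Y) << 4);
}

// Compares both shapes on every pair of 8-bit inputs.
bool constantFoldAgrees() {
  for (unsigned I = 0; I < 256; ++I)
    for (unsigned J = 0; J < 256; ++J) {
      uint8_t X = static_cast<uint8_t>(I);
      uint8_t Y = static_cast<uint8_t>(J);
      if (before(X, Y) != after(X, Y))
        return false;
    }
  return true;
}
```

The Alive2 links below remain the authoritative proofs; this is only a quick way to see one instance of the fold in plain arithmetic.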

Proofs(1/2): https://alive2.llvm.org/ce/z/BjN-m_
Proofs(2/2): https://alive2.llvm.org/ce/z/bZn5QB

This is to help fix the regression caused in D151807

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

goldstein.w.n created this revision. Jun 9 2023, 11:02 AM

Herald added a project: Restricted Project. Jun 9 2023, 11:02 AM

Herald added subscribers: StephenFan, hiraditya.

goldstein.w.n requested review of this revision. Jun 9 2023, 11:02 AM

Herald added a subscriber: llvm-commits.

goldstein.w.n added a parent revision: D152567: [InstCombine] Add tests for (binop (binop (lshift X,Amt),Mask),(lshift Y,Amt)); NFC. Jun 9 2023, 11:02 AM

goldstein.w.n mentioned this in D151807: [InstCombine] Revisit user of newly one-use instructions.

Harbormaster completed remote builds in B237817: Diff 530024. Jun 9 2023, 12:48 PM

nikic added inline comments. Jun 9 2023, 1:12 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2096	This is probably not safe, it looks like `m_LogicalShift` also supports constant expressions.
2108	So I guess this works, but it's mostly by accident and doesn't capture the core requirement. We're moving a bitwise op before the shift, which for xor/or requires that the shifted out bits are zero (this also extends to add/sub): https://alive2.llvm.org/ce/z/ieAz2T This requirement doesn't exist for and, which is why this check happens to work. I would prefer checking that the constant round-trips the shift rather than checking opcodes here. It's okay to enforce this for `and` as well, as demanded bits simplification will zero out the low bits.
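
The round-trip condition suggested here can be illustrated with a small exhaustive check on 8-bit values. This is a sketch with hypothetical helper names (`maskRoundTrips`, `hoistHolds`), restricted to `shl` for brevity; it is not the code that landed. It compares `(X << A) op M` against `(X op (M >> A)) << A` and relates the result to whether `M` survives `(M >> A) << A`.

```cpp
#include <cstdint>

enum class BitOp { And, Or, Xor };

static uint8_t applyBitOp(BitOp Op, uint8_t A, uint8_t B) {
  switch (Op) {
  case BitOp::And: return A & B;
  case BitOp::Or:  return A | B;
  case BitOp::Xor: return A ^ B;
  }
  return 0;
}

// True if shifting M right and then left by A gives M back, i.e. no set bit
// of M is shifted out.
bool maskRoundTrips(uint8_t M, unsigned A) {
  return static_cast<uint8_t>((M >> A) << A) == M;
}

// True if (X << A) op M == (X op (M >> A)) << A for every 8-bit X.
bool hoistHolds(BitOp Op, uint8_t M, unsigned A) {
  for (unsigned I = 0; I < 256; ++I) {
    uint8_t X = static_cast<uint8_t>(I);
    uint8_t Before = applyBitOp(Op, static_cast<uint8_t>(X << A), M);
    uint8_t Inner = applyBitOp(Op, X, static_cast<uint8_t>(M >> A));
    uint8_t After = static_cast<uint8_t>(Inner << A);
    if (Before != After)
      return false;
  }
  return true;
}
```

With these definitions, `hoistHolds(BitOp::Or, 0xF8, 4)` is false because bit 3 of the mask is shifted out, while `hoistHolds(BitOp::And, 0xF8, 4)` is true since the low bits of `X << 4` are already zero.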

Dramatically increase cases covered

goldstein.w.n retitled this revision from [InstCombine] Transform `(bitwise1 (bitwise2 (lshift1 X, C), C1), (lshift2 Y, C))` to [InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`. Jun 10 2023, 1:08 AM

goldstein.w.n edited the summary of this revision.

goldstein.w.n marked an inline comment as done. Jun 10 2023, 1:11 AM

goldstein.w.n added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2108	Okay, updated for all the cases I could think of. I didn't extend to `sub` just because 1) it has some special cases and 2) `sub X, Const` always canonicalizes to `add X, -Const` so covering add is enough. Current workings are:
	- if both ops are the same -> do generic transform and just re-associate the shift (if `add`, then only if `shl`)
	- if both are bitwise and one is `and` -> just do transform
	- if both are bitwise with no `and` -> check mask -> do transform
	- if outer binop is `and` -> just do transform
	- else -> fail

Harbormaster completed remote builds in B237936: Diff 530182. Jun 10 2023, 1:46 AM

It looks like there are some more tests that need to be updated.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
771	Here and elsewhere: Keep variable scope minimal.
775	ConstantExpr
785	it's
786	irrelevant
795	reassociate Though this is really more "distribute" than reassociate.
812	Does this case trigger for anything other than non-splat vectors? If not, please remove it. (Generally, don't add non-splat vector support if it requires any additional complexity.)
820	Isn't this also fine for add/shl, as in the case above? https://alive2.llvm.org/ce/z/ztthn5

In D152568#4412078, @nikic wrote:

It looks like there are some more tests that need to be updated.

Fixed.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
812	Why is that? I'd argue this is very minimal complexity and think it is an issue that a significant number of our combines only work for splats. Generally we mark combines that fail because of lacking splat support not as "*_fail", but "*_todo" which seems to imply we *want* to handle them. Leaving out these 4 lines of code (that also provide an early exit before more expensive checks) seems equivalent to choosing to leave a simple addressable TODO.

Fix nits and handle add better

Harbormaster completed remote builds in B238047: Diff 530324. Jun 11 2023, 1:28 PM

Rebase

Harbormaster completed remote builds in B238058: Diff 530338. Jun 11 2023, 2:26 PM

nikic added inline comments. Jun 11 2023, 2:29 PM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
808	Handling add+shl should be safe here as well: https://alive2.llvm.org/ce/z/dWf_6D I think you can extract a helper to check `Instruction::isBitwiseLogicOp(BinOpc) || ShOpc == Instruction::Shl` and use that everywhere that `isBitWiseLogicOp()` is used. (I wanted to suggest just excluding the add+lshr combination upfront so we don't have to worry about it everywhere else, but it is valid for the `and` case above. If you're okay with losing that, we can just check this in IsValidBinOpc and not worry about it after that anymore.)
812	It's not a lot of code, but it does increase the verification space of the transform non-trivially, for a transform that already has some combinatorial explosion going on. Anyway, I'm okay with keeping this, but I would suggest thinking about whether non-trivial non-splat vector handling is really a good use of implementation, review and long-term maintenance time, for the optimization benefit it provides.
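
To make the add + lshr caveat concrete, the following sketch checks two 8-bit shapes by brute force. The function names are hypothetical and the code only illustrates the arithmetic; it is not the patch's logic. With `and` as the outer op the result is masked by `Y lshr A`, so whatever `add` produces in the high bits is discarded; with `or` as the outer op those bits stay visible and the rewrite can change the result.

```cpp
#include <cstdint>

// ((X lshr A) + M) & (Y lshr A)  versus  ((X + (M shl A)) & Y) lshr A
bool andOfAddLshrHolds(uint8_t M, unsigned A) {
  for (unsigned I = 0; I < 256; ++I)
    for (unsigned J = 0; J < 256; ++J) {
      uint8_t X = static_cast<uint8_t>(I);
      uint8_t Y = static_cast<uint8_t>(J);
      uint8_t Sum = static_cast<uint8_t>((X >> A) + M);
      uint8_t Before = static_cast<uint8_t>(Sum & (Y >> A));
      uint8_t Hoisted = static_cast<uint8_t>(X + static_cast<uint8_t>(M << A));
      uint8_t After = static_cast<uint8_t>((Hoisted & Y) >> A);
      if (Before != After)
        return false;
    }
  return true;
}

// ((X lshr A) + M) | (Y lshr A)  versus  ((X + (M shl A)) | Y) lshr A
bool orOfAddLshrHolds(uint8_t M, unsigned A) {
  for (unsigned I = 0; I < 256; ++I)
    for (unsigned J = 0; J < 256; ++J) {
      uint8_t X = static_cast<uint8_t>(I);
      uint8_t Y = static_cast<uint8_t>(J);
      uint8_t Sum = static_cast<uint8_t>((X >> A) + M);
      uint8_t Before = static_cast<uint8_t>(Sum | (Y >> A));
      uint8_t Hoisted = static_cast<uint8_t>(X + static_cast<uint8_t>(M << A));
      uint8_t After = static_cast<uint8_t>((Hoisted | Y) >> A);
      if (Before != After)
        return false;
    }
  return true;
}
```

For example, `orOfAddLshrHolds(0x01, 1)` is false: with X = 0xFF and Y = 0 the original computes 0x7F + 1 = 0x80, while the rewritten form wraps 0xFF + 2 to 0x01 before shifting and yields 0.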

goldstein.w.n added inline comments. Jun 11 2023, 11:27 PM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
808	> Handling add+shl should be safe here as well: https://alive2.llvm.org/ce/z/dWf_6D
	> I think you can extract a helper to check `Instruction::isBitwiseLogicOp(BinOpc) || ShOpc == Instruction::Shl` and use that everywhere that `isBitWiseLogicOp()` is used.
	Good catch (when I was playing around with combinations, I was using lshr so missed most of the add + shl cases).
	> (I wanted to suggest just excluding the add+lshr combination upfront so we don't have to worry about it everywhere else, but it is valid for the `and` case above. If you're okay with losing that, we can just check this in IsValidBinOpc and not worry about it after that anymore.)
	I added `IsCompletelyDistributable` and early out if it's not met after this check so the following two checks can ignore it.
812	> It's not a lot of code, but it does increase the verification space of the transform non-trivially, for a transform that already has some combinatorial explosion going on.
	> Anyway, I'm okay with keeping this, but I would suggest thinking about whether non-trivial non-splat vector handling is really a good use of implementation, review and long-term maintenance time, for the optimization benefit it provides.
	I guess this (`add` + `lshr`) highlights a difference in philosophy between us (that has come up on other reviews). My general feeling is the more that can be covered, the better. It's a lot more painful to have to rewrite code because the compiler isn't getting your case, and updating the compiler doesn't reap rewards for most until at least a year or so down the line. More cases does imply code complexity, but I generally think a clean 10k LOC is clearer than 1k messy LOC, i.e. as long as coding standards are kept high and the niche cases don't degrade the quality of the file, they are worth it. Maybe I'm trivializing some aspect of it however.

Add helper for detecting easily distributable ops + ruleout non-distributable ops earlier

Harbormaster completed remote builds in B238103: Diff 530394. Jun 12 2023, 12:08 AM

It looks like binop-and-shifts.ll reverted to the baseline checks.

llvm/lib/Transforms/InstCombine/InstCombineInternal.h
453–467	I don't think listing the preconditions for the transform is particularly useful in the API docs, especially if they are this convoluted. Alternatively, maybe move the comment into the source file.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
811–818
819–833	Or alternatively extract all the checks into a function returning bool, but drop the unnecessary `// Pass` please.

nikic added inline comments. Jun 12 2023, 9:35 AM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
812	My general philosophy here is that transforms should be "principled", and not based around long lists of special cases. For pragmatic reasons, I am willing to accept unprincipled transforms for cases that have been encountered in real code, but I prefer not to have unprincipled generalizations of transforms "just because we can". For example, if you want to implement a fold for `eq`, you should always also handle `ne`, otherwise you're just handling one special case of a more general fold. If that same fold also works for unsigned predicates, it would be strongly encouraged to handle that as well. But if it's possible to generalize that fold to signed predicates, but this requires introducing special cases for known sign bit, known power of two, known bits and signed min, then ... that's not a principled extension, it's just a bag of random special cases. Adding those special cases may be fine if there is reason to believe that they will actually be useful. But we should not add those special cases just because it leaves the transform "incomplete" in a rather pedantic sense. At least that's my 2c. Of course, the point where we go from a principled extension to a bag of special cases is subjective. The particular case that started this discussion isn't really problematic (it's really less of a special-case extension and more a precisely expressed pre-condition), and deserves less attention than this discussion ended up giving it ;)

nikic added inline comments. Jun 12 2023, 10:31 AM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
811–818	Ignore this bit, it's not going to work because it's followed by `else if` ... in that case, see the note below about just extracting these checks into a bool-returning function, so that you can use early returns.

In D152568#4413983, @nikic wrote:

It looks like binop-and-shifts.ll reverted to the baseline checks.

Fixed.

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
812	> My general philosophy here is that transforms should be "principled", and not based around long lists of special cases. For pragmatic reasons, I am willing to accept unprincipled transforms for cases that have been encountered in real code, but I prefer not to have unprincipled generalizations of transforms "just because we can". For example, if you want to implement a fold for `eq`, you should always also handle `ne`, otherwise you're just handling one special case of a more general fold. If that same fold also works for unsigned predicates, it would be strongly encouraged to handle that as well. But if it's possible to generalize that fold to signed predicates, but this requires introducing special cases for known sign bit, known power of two, known bits and signed min, then ... that's not a principled extension, it's just a bag of random special cases. Adding those special cases may be fine if there is reason to believe that they will actually be useful. But we should not add those special cases just because it leaves the transform "incomplete" in a rather pedantic sense. At least that's my 2c.
	Ah, so I guess I often end up on the pedantic end of wanting "complete" transformations.
	> Of course, the point where we go from a principled extension to a bag of special cases is subjective. The particular case that started this discussion isn't really problematic (it's really less of a special-case extension and more a precisely expressed pre-condition), and deserves less attention than this discussion ended up giving it ;)
	Sorry :( but thank you for sharing your 2c :)
	On the note of potentially more useful patches, any chance you can let me know if: D149423 and D152226 are on your radar / todolist (wherever they may be ordered on that list). Just want to know if I should be looking for new reviewers for them or if you intend to review when you have the time (again, whenever that may be).

Move checks for when transform is valid to helper.

LGTM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
772	and -> any?
llvm/test/Transforms/InstCombine/or-shifted-masks.ll
133	This highlights another possible extension: We can have an extra binop on both sides, not just on one.
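
One possible reading of this extension, sketched with hypothetical names on 8-bit values: both operands of the outer op carry their own masked shift, and the shift is still hoisted past everything. The example uses `and` masks under an `or` with `shl`; it only illustrates that the arithmetic identity holds and does not reflect any follow-up implementation.

```cpp
#include <cstdint>

// ((X shl A) & M1) | ((Y shl A) & M2)
//   versus
// ((X & (M1 lshr A)) | (Y & (M2 lshr A))) shl A
bool bothSidesMaskedHolds(uint8_t M1, uint8_t M2, unsigned A) {
  for (unsigned I = 0; I < 256; ++I)
    for (unsigned J = 0; J < 256; ++J) {
      uint8_t X = static_cast<uint8_t>(I);
      uint8_t Y = static_cast<uint8_t>(J);
      uint8_t ShX = static_cast<uint8_t>(X << A);
      uint8_t ShY = static_cast<uint8_t>(Y << A);
      uint8_t Before = static_cast<uint8_t>((ShX & M1) | (ShY & M2));
      uint8_t Inner = static_cast<uint8_t>((X & (M1 >> A)) | (Y & (M2 >> A)));
      uint8_t After = static_cast<uint8_t>(Inner << A);
      if (Before != After)
        return false;
    }
  return true;
}
```

Because both inner ops are `and`, no condition on the masks is needed in this particular shape; other opcode combinations would presumably need the same mask checks discussed earlier in the review.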

This revision is now accepted and ready to land. Jun 12 2023, 12:07 PM

Harbormaster completed remote builds in B238260: Diff 530603. Jun 12 2023, 1:45 PM

any -> and

Harbormaster completed remote builds in B238320: Diff 530686. Jun 12 2023, 5:09 PM

Move comments

Harbormaster completed remote builds in B238364: Diff 530738. Jun 12 2023, 7:38 PM

nikic added inline comments. Jun 13 2023, 6:42 AM

llvm/test/Transforms/InstCombine/or-shifted-masks.ll
133	Tested this patch together with D151807, and it looks like handling this case would be needed to get the patterns in one instcombine run instead of instcombine,early-cse,instcombine.

goldstein.w.n added inline comments. Jun 13 2023, 10:32 AM

llvm/test/Transforms/InstCombine/or-shifted-masks.ll
133	kk, working on a follow up patch.

goldstein.w.n added inline comments. Jun 13 2023, 3:30 PM

llvm/test/Transforms/InstCombine/or-shifted-masks.ll
133	I'm going to push this one first as is and split the multi-binop version into a new patch. Handling generic chains necessitates an entire rewrite plus a lot more complexity, so it may end up not being worth it if it's more modularly done across multiple passes.

Closed by commit rG91cdffcb2f9e: [InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))` (authored by goldstein.w.n). Jun 13 2023, 6:08 PM

This revision was automatically updated to reflect the committed changes.

goldstein.w.n added a commit: rG91cdffcb2f9e: [InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`.

Revision Contents

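Path                                                        Size
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp       3 lines
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp     9 lines
llvm/lib/Transforms/InstCombine/InstCombineInternal.h       6 lines
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp    136 lines
llvm/test/Transforms/InstCombine/and-xor-or.ll              33 lines
llvm/test/Transforms/InstCombine/binop-and-shifts.ll        145 lines
llvm/test/Transforms/InstCombine/or-shifted-masks.ll        39 lines
llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll       69 lines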

Diff 531130

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp

[...]
  if (Instruction *R = factorizeMathWithShlOps(I, Builder))
    return R;

  if (Instruction *X = foldAddWithConstant(I))
    return X;

  if (Instruction *X = foldNoWrapAdd(I, Builder))
    return X;

+ if (Instruction *R = foldBinOpShiftWithShift(I))
+   return R;

  Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);
  Type *Ty = I.getType();
  if (Ty->isIntOrIntVectorTy(1))
    return BinaryOperator::CreateXor(LHS, RHS);

  // X + X --> X << 1
  if (LHS == RHS) {
    auto *Shl = BinaryOperator::CreateShl(LHS, ConstantInt::get(Ty, 1));
[...]

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

[...] static Instruction *canonicalizeLogicFirst(BinaryOperator &I,
  Value *Op0 = I.getOperand(0);
  Value *Op1 = I.getOperand(1);
  Value *X;
  const APInt *C, *C2;

  if (!(match(Op0, m_OneUse(m_Add(m_Value(X), m_APInt(C2)))) &&
        match(Op1, m_APInt(C))))
    return nullptr;

Inline comment (nikic, Done):
This is probably not safe, it looks like `m_LogicalShift` also supports constant expressions.

  unsigned Width = Ty->getScalarSizeInBits();
  unsigned LastOneMath = Width - C2->countr_zero();

  switch (OpC) {
  case Instruction::And:
    if (C->countl_one() < LastOneMath)
      return nullptr;
    break;
  case Instruction::Xor:
  case Instruction::Or:
    if (C->countl_zero() < LastOneMath)
      return nullptr;

Inline comment (nikic, Not Done):
So I guess this works, but it's mostly by accident and doesn't capture the core requirement. We're moving a bitwise op before the shift, which for xor/or requires that the shifted out bits are zero (this also extends to add/sub): https://alive2.llvm.org/ce/z/ieAz2T This requirement doesn't exist for and, which is why this check happens to work. I would prefer checking that the constant round-trips the shift rather than checking opcodes here. It's okay to enforce this for `and` as well, as demanded bits simplification will zero out the low bits.

Inline comment (goldstein.w.n, Author, Done):
Okay, updated for all the cases I could think of. I didn't extend to `sub` just because 1) it has some special cases and 2) `sub X, Const` always canonicalizes to `add X, -Const` so covering add is enough. Current workings are:
- if both ops are the same -> do generic transform and just re-associate the shift (if `add`, then only if `shl`)
- if both are bitwise and one is `and` -> just do transform
- if both are bitwise with no `and` -> check mask -> do transform
- if outer binop is `and` -> just do transform
- else -> fail

    break;
  default:
    llvm_unreachable("Unexpected BinaryOp!");
  }

  auto *Add = cast<BinaryOperator>(Op0);
  Value *NewBinOp = Builder.CreateBinOp(OpC, X, ConstantInt::get(Ty, *C));
  return BinaryOperator::CreateWithCopiedFlags(Instruction::Add, NewBinOp,
[...] Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {

  // (A|B)&(A|C) -> A|(B&C) etc
  if (Value *V = foldUsingDistributiveLaws(I))
    return replaceInstUsesWith(I, V);

  if (Value *V = SimplifyBSwap(I, Builder))
    return replaceInstUsesWith(I, V);

+ if (Instruction *R = foldBinOpShiftWithShift(I))
+   return R;

  Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

  Value *X, *Y;
  if (match(Op0, m_OneUse(m_LogicalShift(m_One(), m_Value(X)))) &&
      match(Op1, m_One())) {
    // (1 << X) & 1 --> zext(X == 0)
    // (1 >> X) & 1 --> zext(X == 0)
    Value *IsZero = Builder.CreateICmpEQ(X, ConstantInt::get(Ty, 0));
[...] if (Instruction *BitOp = matchBSwapOrBitReverse(I, /*MatchBSwaps*/ true,
    return BitOp;

  if (Instruction *Funnel = matchFunnelShift(I, *this))
    return Funnel;

  if (Instruction *Concat = matchOrConcat(I, Builder))
    return replaceInstUsesWith(I, Concat);

+ if (Instruction *R = foldBinOpShiftWithShift(I))
+   return R;

  Value *X, *Y;
  const APInt *CV;
  if (match(&I, m_c_Or(m_OneUse(m_Xor(m_Value(X), m_APInt(CV))), m_Value(Y))) &&
      !CV->isAllOnes() && MaskedValueIsZero(Y, *CV, 0, &I)) {
    // (X ^ C) | Y -> (X | Y) ^ C iff Y & C == 0
    // The check for a 'not' op is for efficiency (if Y is known zero --> ~X).
    Value *Or = Builder.CreateOr(X, Y);
    return BinaryOperator::CreateXor(Or, ConstantInt::get(Ty, *CV));
[...] if (SimplifyDemandedInstructionBits(I))
    return &I;

  if (Value *V = SimplifyBSwap(I, Builder))
    return replaceInstUsesWith(I, V);

  if (Instruction *R = foldNot(I))
    return R;

+ if (Instruction *R = foldBinOpShiftWithShift(I))
+   return R;

  // Fold (X & M) ^ (Y & ~M) -> (X & M) | (Y & ~M)
  // This it a special case in haveNoCommonBitsSet, but the computeKnownBits
  // calls in there are unnecessary as SimplifyDemandedInstructionBits should
  // have already taken care of those cases.
  Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
  Value *M;
  if (match(&I, m_c_Xor(m_c_And(m_Not(m_Value(M)), m_Value()),
                        m_c_And(m_Deferred(M), m_Value()))))
[...]

llvm/lib/Transforms/InstCombine/InstCombineInternal.h

[...]
public:
  /// X % (C0 * C1)
  Value *SimplifyAddWithRemainder(BinaryOperator &I);

  // Binary Op helper for select operations where the expression can be
  // efficiently reorganized.
  Value *SimplifySelectsFeedingBinaryOp(BinaryOperator &I, Value *LHS,
                                        Value *RHS);

+ // (Binop1 (Binop2 (logic_shift X, C), C1), (logic_shift Y, C))
+ // -> (logic_shift (Binop1 (Binop2 X, inv_logic_shift(C1, C)), Y), C)
+ // (Binop1 (Binop2 (logic_shift X, Amt), Mask), (logic_shift Y, Amt))
+ // -> (BinOp (logic_shift (BinOp X, Y)), Mask)
+ Instruction *foldBinOpShiftWithShift(BinaryOperator &I);

  /// This tries to simplify binary operations by factorizing out common terms
  /// (e. g. "(A*B)+(A*C)" -> "A*(B+C)").
  Value *tryFactorizationFolds(BinaryOperator &I);

  /// Match a select chain which produces one of three values based on whether
  /// the LHS is less than, equal to, or greater than RHS respectively.
  /// Return true if we matched a three way compare idiom. The LHS, RHS, Less,
  /// Equal and Greater values are saved in the matching process and returned to
  /// the caller.

Inline comment (nikic, Not Done), with a suggested edit:
    Value *RHS);
    // (Binop1 (Binop2 (logic_shift X, C), C1), (logic_shift Y, C))
  - // IFF
  - // 1) the logic_shifts match
  - // 2) either both binops are binops and one is `and` or
  - // BinOp1 is `and`
  - // (logic_shift (inv_logic_shift C1, C), C) == C1 or
  - //
    // -> (logic_shift (Binop1 (Binop2 X, inv_logic_shift(C1, C)), Y), C)
    // (Binop1 (Binop2 (logic_shift X, Amt), Mask), (logic_shift Y, Amt))
  - // IFF
  - // 1) the logic_shifts match
  - // 2) BinOp1 == BinOp2 (if BinOp == `add`, then also requires `shl`).
  - //
    // -> (BinOp (logic_shift (BinOp X, Y)), Mask)
    Instruction *foldBinOpShiftWithShift(BinaryOperator &I);
I don't think listing the preconditions for the transform is particularly useful in the API docs, especially if they are this convoluted.
Alternatively, maybe move the comment into the source file.

  bool matchThreeWayIntCompare(SelectInst *SI, Value *&LHS, Value *&RHS,
                               ConstantInt *&Less, ConstantInt *&Equal,
                               ConstantInt *&Greater);

  /// Attempts to replace V with a simpler value based on the demanded
  /// bits.
  Value *SimplifyDemandedUseBits(Value *V, APInt DemandedMask, KnownBits &Known,
                                 unsigned Depth, Instruction *CxtI);
[...]

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

[...] if (TopLevelOpcode == Instruction::Add && InnerOpcode == Instruction::Mul) {
      // nuw can be propagated with any constant or nuw value.
      cast<Instruction>(RetVal)->setHasNoUnsignedWrap(HasNUW);
    }
  return RetVal;
}

+ // (Binop1 (Binop2 (logic_shift X, C), C1), (logic_shift Y, C))
+ // IFF
+ // 1) the logic_shifts match
+ // 2) either both binops are binops and one is `and` or
+ // BinOp1 is `and`
+ // (logic_shift (inv_logic_shift C1, C), C) == C1 or
+ // -> (logic_shift (Binop1 (Binop2 X, inv_logic_shift(C1, C)), Y), C)
+ // (Binop1 (Binop2 (logic_shift X, Amt), Mask), (logic_shift Y, Amt))
+ // IFF
+ // 1) the logic_shifts match
+ // 2) BinOp1 == BinOp2 (if BinOp == `add`, then also requires `shl`).
+ // -> (BinOp (logic_shift (BinOp X, Y)), Mask)
+ Instruction *InstCombinerImpl::foldBinOpShiftWithShift(BinaryOperator &I) {
+   auto IsValidBinOpc = [](unsigned Opc) {
+     switch (Opc) {
+     default:
+       return false;
+     case Instruction::And:
+     case Instruction::Or:
+     case Instruction::Xor:
+     case Instruction::Add:
+       // Skip Sub as we only match constant masks which will canonicalize to use
+       // add.
+       return true;
+     }
+   };
+
+   // Check if we can distribute binop arbitrarily. `add` + `lshr` has extra
+   // constraints.
+   auto IsCompletelyDistributable = [](unsigned BinOpc1, unsigned BinOpc2,
+                                       unsigned ShOpc) {
+     return (BinOpc1 != Instruction::Add && BinOpc2 != Instruction::Add) ||
+            ShOpc == Instruction::Shl;
+   };
+
+   auto GetInvShift = [](unsigned ShOpc) {

Inline comment (nikic, Done), with a suggested edit:
    // LHS and RHS need same shift opcode
  - ShOpc = IY->getOpcode();
  + unsigned ShOpc = IY->getOpcode();
    if (ShOpc != IX->getOpcode())
Here and elsewhere: Keep variable scope minimal.

+     return ShOpc == Instruction::LShr ? Instruction::Shl : Instruction::LShr;

Inline comment (nikic, Not Done):
and -> any?

+   };

+   auto CanDistributeBinops = [&](unsigned BinOpc1, unsigned BinOpc2,

Inline comment (nikic, Done):
ConstantExpr

+                                  unsigned ShOpc, Constant *CMask,
+                                  Constant *CShift) {
+     // If the BinOp1 is `and` we don't need to check the mask.
+     if (BinOpc1 == Instruction::And)
+       return true;
+
+     // For all other possible transfers we need complete distributable
+     // binop/shift (anything but `add` + `lshr`).
+     if (!IsCompletelyDistributable(BinOpc1, BinOpc2, ShOpc))
+       return false;

Inline comment (nikic, Done):
it's

Inline comment (nikic, Done):
irrelevant

+     // If BinOp2 is `and`, any mask works (this only really helps for non-splat
+     // vecs, otherwise the mask will be simplified and the following check will
+     // handle it).
+     if (BinOpc2 == Instruction::And)
+       return true;
+
+     // Otherwise, need mask that meets the below requirement.
+     // (logic_shift (inv_logic_shift Mask, ShAmt), ShAmt) == Mask
+     return ConstantExpr::get(

Inline comment (nikic, Done):
reassociate
Though this is really more "distribute" than reassociate.

+                ShOpc, ConstantExpr::get(GetInvShift(ShOpc), CMask, CShift),
+                CShift) == CMask;
+   };

+   auto MatchBinOp = [&](unsigned ShOpnum) -> Instruction * {
+     Constant *CMask, *CShift;
+     Value *X, *Y, *ShiftedX, *Mask, *Shift;
+     if (!match(I.getOperand(ShOpnum),
+                m_OneUse(m_LogicalShift(m_Value(Y), m_Value(Shift)))))
+       return nullptr;
+     if (!match(I.getOperand(1 - ShOpnum),
+                m_BinOp(m_Value(ShiftedX), m_Value(Mask))))
+       return nullptr;

Inline comment (nikic, Not Done):
Handling add+shl should be safe here as well: https://alive2.llvm.org/ce/z/dWf_6D
I think you can extract a helper to check `Instruction::isBitwiseLogicOp(BinOpc) || ShOpc == Instruction::Shl` and use that everywhere that `isBitWiseLogicOp()` is used.
(I wanted to suggest just excluding the add+lshr combination upfront so we don't have to worry about it everywhere else, but it is valid for the `and` case above. If you're okay with losing that, we can just check this in IsValidBinOpc and not worry about it after that anymore.)

Inline comment (goldstein.w.n, Author, Done):
> Handling add+shl should be safe here as well: https://alive2.llvm.org/ce/z/dWf_6D
> I think you can extract a helper to check `Instruction::isBitwiseLogicOp(BinOpc) || ShOpc == Instruction::Shl` and use that everywhere that `isBitWiseLogicOp()` is used.
Good catch (when I was playing around with combinations, I was using lshr so missed most of the add + shl cases).
> (I wanted to suggest just excluding the add+lshr combination upfront so we don't have to worry about it everywhere else, but it is valid for the `and` case above. If you're okay with losing that, we can just check this in IsValidBinOpc and not worry about it after that anymore.)
I added `IsCompletelyDistributable` and early out if it's not met after this check so the following two checks can ignore it.

+     if (!match(ShiftedX,
+                m_OneUse(m_LogicalShift(m_Value(X), m_Specific(Shift)))))
+       return nullptr;

nikicUnsubmitted

Not Done

Does this case trigger for anything other than non-splat vectors? If not, please remove it. (Generally, don't add non-splat vector support it it requires any additional complexity.)

nikic: Does this case trigger for anything other than non-splat vectors? If not, please remove it.

goldstein.w.nAuthorUnsubmitted

Done

Why is that? I'd argue this is very minimal complexity and think it is an issue that a significant number of our combines only work for splats. Generally we mark combines that fail because of lacking splat support not as "*_fail", but "*_todo" which seems to imply we *want* to handle them. Leaving out these 4-lines of code (that also provide an early exit before more expensive checks) seems equivelent to choosing to leave a simple addressable TODO.

nikic

Not Done

It's not a lot of code, but it does increase the verification space of the transform non-trivially, for a transform that already has some combinatorial explosion going on.

Anyway, I'm okay with keeping this, but I would suggest thinking about whether non-trivial non-splat vector handling is really a good use of implementation, review and long-term maintenance time, for the optimization benefit it provides.

goldstein.w.n (Author)

Done

> It's not a lot of code, but it does increase the verification space of the transform non-trivially, for a transform that already has some combinatorial explosion going on.
>
> Anyway, I'm okay with keeping this, but I would suggest thinking about whether non-trivial non-splat vector handling is really a good use of implementation, review and long-term maintenance time, for the optimization benefit it provides.

I guess this (add + lshr) highlights a difference in philosophy between us (one that has come up on other reviews). My general feeling is the more that can be covered, the better. It's a lot more painful to have to rewrite code because the compiler isn't getting your case, and updating the compiler doesn't reap rewards for most until at least a year or so down the line. More cases do imply code complexity, but I generally think a clean 10k LOC is clearer than a messy 1k LOC, i.e., as long as coding standards are kept high and the niche cases don't degrade the quality of the file, they are worth it.
Maybe I'm trivializing some aspect of it, however.

nikic

Not Done

My general philosophy here is that transforms should be "principled", and not based around long lists of special cases. For pragmatic reasons, I am willing to accept unprincipled transforms for cases that have been encountered in real code, but I prefer not to have unprincipled generalizations of transforms "just because we can".

For example, if you want to implement a fold for eq, you should always also handle ne, otherwise you're just handling one special case of a more general fold. If that same fold also works for unsigned predicates, it would be strongly encouraged to handle that as well. But if it's possible to generalize that fold to signed predicates, but this requires introducing special cases for known sign bit, known power of two, known bits and signed min, then ... that's not a principled extension, it's just a bag of random special cases. Adding those special cases may be fine if there is reason to believe that they will actually be useful. But we should not add those special cases just because it leaves the transform "incomplete" in a rather pedantic sense. At least that's my 2c.

Of course, the point where we go from a principled extension to a bag of special cases is subjective. The particular case that started this discussion isn't really problematic (it's really less of a special-case extension and more a precisely expressed pre-condition), and deserves less attention than this discussion ended up giving it ;)

goldstein.w.n (Author)

Done

> My general philosophy here is that transforms should be "principled", and not based around long lists of special cases. For pragmatic reasons, I am willing to accept unprincipled transforms for cases that have been encountered in real code, but I prefer not to have unprincipled generalizations of transforms "just because we can".
>
> For example, if you want to implement a fold for eq, you should always also handle ne, otherwise you're just handling one special case of a more general fold. If that same fold also works for unsigned predicates, it would be strongly encouraged to handle that as well. But if it's possible to generalize that fold to signed predicates, but this requires introducing special cases for known sign bit, known power of two, known bits and signed min, then ... that's not a principled extension, it's just a bag of random special cases. Adding those special cases may be fine if there is reason to believe that they will actually be useful. But we should not add those special cases just because it leaves the transform "incomplete" in a rather pedantic sense. At least that's my 2c.

Ah, so I guess I often end up on the pedantic end of wanting "complete" transformations.

> Of course, the point where we go from a principled extension to a bag of special cases is subjective. The particular case that started this discussion isn't really problematic (it's really less of a special-case extension and more a precisely expressed pre-condition), and deserves less attention than this discussion ended up giving it ;)

Sorry :(
but thank you for sharing your 2c :)

On the note of potentially more useful patches, any chance you can let me know if:
D149423
and
D152226

are on your radar / todolist (wherever they may be ordered on that list).
Just want to know if I should be looking for new reviewers for them or
if you intend to review when you have the time (again, whenever that
may be).

// Make sure we are matching instruction shifts and not ConstantExpr

auto *IY = dyn_cast<Instruction>(I.getOperand(ShOpnum));

auto *IX = dyn_cast<Instruction>(ShiftedX);

if (!IY || !IX)

return nullptr;

nikic

Not Done

ShOpc == Instruction::LShr ? Instruction::Shl : Instruction::LShr;

- if (I.getOpcode() == Instruction::And) {

- // Pass

- }

- // For all other possible transfers we need complete distributable

- // binop/shift (anything but `add` + `lshr`).

- else if (!IsCompletelyDistributable(I.getOpcode(), BinOpc, ShOpc)) {

+ // Top-level and can always be transformed. For all other cases we need

+ // completely distributable binop/shift (anything but `add` + `lshr`).

+ if (I.getOpcode() != Instruction::And &&

+ !IsCompletelyDistributable(I.getOpcode(), BinOpc, ShOpc))

return nullptr;

- }

// If BinOp2 is `and`, and mask works (this only really helps for non-splat

nikic

Done

Ignore this bit; it's not going to work because it's followed by `else if` ... In that case, see the note below about just extracting these checks into a bool-returning function, so that you can use early returns.

// LHS and RHS need same shift opcode

nikic

Done

Isn't this also fine for add/shl, as in the case above? https://alive2.llvm.org/ce/z/ztthn5

unsigned ShOpc = IY->getOpcode();

if (ShOpc != IX->getOpcode())

return nullptr;

// Make sure binop is real instruction and not ConstantExpr

auto *BO2 = dyn_cast<Instruction>(I.getOperand(1 - ShOpnum));

if (!BO2)

return nullptr;

unsigned BinOpc = BO2->getOpcode();

// Make sure we have valid binops.

if (!IsValidBinOpc(I.getOpcode()) || !IsValidBinOpc(BinOpc))

return nullptr;

nikic

Done

return nullptr;

}

- // If BinOp2 is `and`, and mask works (this only really helps for non-splat

- // vecs, otherwise the mask will be simplified and the following check will

- // handle it).

- else if (BinOpc == Instruction::And) {

- // Pass

- }

- // Otherwise, need mask that meets the below requirement.

+ // Unless BinOp2 is `and`, make sure no bits are shifted out of the mask.

+ // The `and` case only really helps for non-splat vecs, otherwise the mask

+ // will be simplified and the following check will handle it).

// (logic_shift (inv_logic_shift Mask, ShAmt), ShAmt) == Mask

- else if (ConstantExpr::get(ShOpc,

- ConstantExpr::get(InvShOpc, CMask, CShift),

- CShift) == CMask) {

- // Pass

- } else {

+ if (BinOpc != Instruction::And &&

+ ConstantExpr::get(ShOpc,

+ ConstantExpr::get(InvShOpc, CMask, CShift),

+ CShift) != CMask)

return nullptr;

- }

Constant *NewCMask = ConstantExpr::get(InvShOpc, CMask, CShift);

Or alternatively, extract all the checks into a function returning bool, but no unnecessary `// Pass`, please.
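The mask condition quoted in the suggested diff, `(logic_shift (inv_logic_shift Mask, ShAmt), ShAmt) == Mask`, can be illustrated on plain 8-bit integers. This is only a sketch: `maskSurvivesRoundTrip` is a hypothetical helper, not a function from the patch, and it models scalar i8 constants rather than LLVM `Constant` values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical i8 model of the check
//   (logic_shift (inv_logic_shift Mask, ShAmt), ShAmt) == Mask
// IsShl selects the original shift; the inverse shift is the other one.
bool maskSurvivesRoundTrip(bool IsShl, uint8_t Mask, unsigned Amt) {
  assert(Amt < 8 && "shift amount must be in range for i8");
  // Apply the inverse shift first ...
  const uint8_t Inv = IsShl ? static_cast<uint8_t>(Mask >> Amt)
                            : static_cast<uint8_t>(Mask << Amt);
  // ... then the original shift, and see whether any mask bit was lost.
  const uint8_t Back = IsShl ? static_cast<uint8_t>(Inv << Amt)
                             : static_cast<uint8_t>(Inv >> Amt);
  return Back == Mask;
}
```

For example, with shl by 1 the mask 20 round-trips (20 >> 1 is 10, and 10 << 1 is 20), whereas 119 does not, because its low bit is shifted out.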

// If BinOp1 == BinOp2 and it's bitwise or shl with add, then just

// distribute to drop the shift irrelevant of constants.

if (BinOpc == I.getOpcode() &&

IsCompletelyDistributable(I.getOpcode(), BinOpc, ShOpc)) {

Value *NewBinOp2 = Builder.CreateBinOp(I.getOpcode(), X, Y);

Value *NewBinOp1 = Builder.CreateBinOp(

static_cast<Instruction::BinaryOps>(ShOpc), NewBinOp2, Shift);

return BinaryOperator::Create(I.getOpcode(), NewBinOp1, Mask);

}

// Otherwise we can only distribute by constant shifting the mask, so

// ensure we have constants.

if (!match(Shift, m_ImmConstant(CShift)))

return nullptr;

if (!match(Mask, m_ImmConstant(CMask)))

return nullptr;

// Check if we can distribute the binops.

if (!CanDistributeBinops(I.getOpcode(), BinOpc, ShOpc, CMask, CShift))

return nullptr;

Constant *NewCMask = ConstantExpr::get(GetInvShift(ShOpc), CMask, CShift);

Value *NewBinOp2 = Builder.CreateBinOp(

static_cast<Instruction::BinaryOps>(BinOpc), X, NewCMask);

Value *NewBinOp1 = Builder.CreateBinOp(I.getOpcode(), Y, NewBinOp2);

return BinaryOperator::Create(static_cast<Instruction::BinaryOps>(ShOpc),

NewBinOp1, CShift);

};

if (Instruction *R = MatchBinOp(0))

return R;

return MatchBinOp(1);

}
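As a sanity check of the constant path at the end of the lambda above (new mask = inverse shift of the old mask, then `binop2`, `binop1`, and a single shift), the following standalone sketch compares the original and rewritten expressions for every pair of 8-bit inputs. It is an assumption-laden model: it covers only bitwise binops on scalar i8, all names (`BinOp`, `ShiftOp`, `constantRewriteHolds`) are hypothetical, and it does not reproduce CanDistributeBinops, whose body is not shown in this excerpt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the opcodes; only bitwise binops are modelled.
enum class BinOp { And, Or, Xor };
enum class ShiftOp { Shl, LShr };

static uint8_t applyBin(BinOp Op, uint8_t A, uint8_t B) {
  switch (Op) {
  case BinOp::And: return static_cast<uint8_t>(A & B);
  case BinOp::Or:  return static_cast<uint8_t>(A | B);
  case BinOp::Xor: return static_cast<uint8_t>(A ^ B);
  }
  return 0;
}

static uint8_t applyShift(ShiftOp Op, uint8_t V, unsigned Amt) {
  assert(Amt < 8 && "shift amount must be in range for i8");
  return Op == ShiftOp::Shl ? static_cast<uint8_t>(V << Amt)
                            : static_cast<uint8_t>(V >> Amt);
}

static ShiftOp inverseShift(ShiftOp Op) {
  return Op == ShiftOp::Shl ? ShiftOp::LShr : ShiftOp::Shl;
}

// Before: Op1(shift(Y, Amt), Op2(shift(X, Amt), Mask))
// After:  shift(Op1(Y, Op2(X, inv_shift(Mask, Amt))), Amt)
// Returns true if both agree for all i8 X and Y.
bool constantRewriteHolds(BinOp Op1, BinOp Op2, ShiftOp Sh, uint8_t Mask,
                          unsigned Amt) {
  const uint8_t NewMask = applyShift(inverseShift(Sh), Mask, Amt);
  for (unsigned XI = 0; XI < 256; ++XI)
    for (unsigned YI = 0; YI < 256; ++YI) {
      const uint8_t X = static_cast<uint8_t>(XI);
      const uint8_t Y = static_cast<uint8_t>(YI);
      const uint8_t Before = applyBin(
          Op1, applyShift(Sh, Y, Amt), applyBin(Op2, applyShift(Sh, X, Amt), Mask));
      const uint8_t After =
          applyShift(Sh, applyBin(Op1, Y, applyBin(Op2, X, NewMask)), Amt);
      if (Before != After)
        return false;
    }
  return true;
}
```

In this model the rewrite holds for xor-of-and with shl by 1 and mask 20 (new mask 10), and for and-of-or with lshr by 5 and mask 198 (new mask 192, i.e. -64 as i8), where the top-level and hides the mask bits that do not round-trip. It fails for xor-of-or with shl by 1 and mask 19, since bit 0 of the mask cannot be produced by a left shift.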

Value *InstCombinerImpl::tryFactorizationFolds(BinaryOperator &I) {

Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);

BinaryOperator *Op0 = dyn_cast<BinaryOperator>(LHS);

BinaryOperator *Op1 = dyn_cast<BinaryOperator>(RHS);

Instruction::BinaryOps TopLevelOpcode = I.getOpcode();

Value *A, *B, *C, *D;

Instruction::BinaryOps LHSOpcode, RHSOpcode;

[3,455 lines not shown]

llvm/test/Transforms/InstCombine/and-xor-or.ll

[350 lines not shown]
; PR37098 - https://bugs.llvm.org/show_bug.cgi?id=37098		; PR37098 - https://bugs.llvm.org/show_bug.cgi?id=37098
; Reassociate bitwise logic to eliminate a shift.		; Reassociate bitwise logic to eliminate a shift.
; There are 4 commuted * 3 shift ops * 3 logic ops = 36 potential variations of this fold.		; There are 4 commuted * 3 shift ops * 3 logic ops = 36 potential variations of this fold.
; Mix the commutation options to provide coverage using less tests.		; Mix the commutation options to provide coverage using less tests.

define i8 @and_shl(i8 %x, i8 %y, i8 %z, i8 %shamt) {		define i8 @and_shl(i8 %x, i8 %y, i8 %z, i8 %shamt) {
; CHECK-LABEL: define {{[^@]+}}@and_shl		; CHECK-LABEL: define {{[^@]+}}@and_shl
; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {		; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {
; CHECK-NEXT: [[SX:%.*]] = shl i8 [[X]], [[SHAMT]]		; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[X]], [[Y]]
; CHECK-NEXT: [[SY:%.*]] = shl i8 [[Y]], [[SHAMT]]		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], [[SHAMT]]
; CHECK-NEXT: [[A:%.*]] = and i8 [[SX]], [[Z]]		; CHECK-NEXT: [[R:%.*]] = and i8 [[TMP2]], [[Z]]
; CHECK-NEXT: [[R:%.*]] = and i8 [[SY]], [[A]]
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%sx = shl i8 %x, %shamt		%sx = shl i8 %x, %shamt
%sy = shl i8 %y, %shamt		%sy = shl i8 %y, %shamt
%a = and i8 %sx, %z		%a = and i8 %sx, %z
%r = and i8 %sy, %a		%r = and i8 %sy, %a
ret i8 %r		ret i8 %r
}		}

define i8 @or_shl(i8 %x, i8 %y, i8 %z, i8 %shamt) {		define i8 @or_shl(i8 %x, i8 %y, i8 %z, i8 %shamt) {
; CHECK-LABEL: define {{[^@]+}}@or_shl		; CHECK-LABEL: define {{[^@]+}}@or_shl
; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {		; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {
; CHECK-NEXT: [[SX:%.*]] = shl i8 [[X]], [[SHAMT]]		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[X]], [[Y]]
; CHECK-NEXT: [[SY:%.*]] = shl i8 [[Y]], [[SHAMT]]		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], [[SHAMT]]
; CHECK-NEXT: [[A:%.*]] = or i8 [[SX]], [[Z]]		; CHECK-NEXT: [[R:%.*]] = or i8 [[TMP2]], [[Z]]
; CHECK-NEXT: [[R:%.*]] = or i8 [[A]], [[SY]]
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%sx = shl i8 %x, %shamt		%sx = shl i8 %x, %shamt
%sy = shl i8 %y, %shamt		%sy = shl i8 %y, %shamt
%a = or i8 %sx, %z		%a = or i8 %sx, %z
%r = or i8 %a, %sy		%r = or i8 %a, %sy
ret i8 %r		ret i8 %r
}		}
[32 lines not shown]
;
%a = and i8 %z, %sx		%a = and i8 %z, %sx
%r = and i8 %sy, %a		%r = and i8 %sy, %a
ret i8 %r		ret i8 %r
}		}

define i8 @or_lshr(i8 %x, i8 %y, i8 %z, i8 %shamt) {		define i8 @or_lshr(i8 %x, i8 %y, i8 %z, i8 %shamt) {
; CHECK-LABEL: define {{[^@]+}}@or_lshr		; CHECK-LABEL: define {{[^@]+}}@or_lshr
; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {		; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {
; CHECK-NEXT: [[SX:%.*]] = lshr i8 [[X]], [[SHAMT]]		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[X]], [[Y]]
; CHECK-NEXT: [[SY:%.*]] = lshr i8 [[Y]], [[SHAMT]]		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], [[SHAMT]]
; CHECK-NEXT: [[A:%.*]] = or i8 [[SX]], [[Z]]		; CHECK-NEXT: [[R:%.*]] = or i8 [[TMP2]], [[Z]]
; CHECK-NEXT: [[R:%.*]] = or i8 [[SY]], [[A]]
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%sx = lshr i8 %x, %shamt		%sx = lshr i8 %x, %shamt
%sy = lshr i8 %y, %shamt		%sy = lshr i8 %y, %shamt
%a = or i8 %sx, %z		%a = or i8 %sx, %z
%r = or i8 %sy, %a		%r = or i8 %sy, %a
ret i8 %r		ret i8 %r
}		}

define i8 @xor_lshr(i8 %x, i8 %y, i8 %z, i8 %shamt) {		define i8 @xor_lshr(i8 %x, i8 %y, i8 %z, i8 %shamt) {
; CHECK-LABEL: define {{[^@]+}}@xor_lshr		; CHECK-LABEL: define {{[^@]+}}@xor_lshr
; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {		; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {
; CHECK-NEXT: [[SX:%.*]] = lshr i8 [[X]], [[SHAMT]]		; CHECK-NEXT: [[TMP1:%.*]] = xor i8 [[X]], [[Y]]
; CHECK-NEXT: [[SY:%.*]] = lshr i8 [[Y]], [[SHAMT]]		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], [[SHAMT]]
; CHECK-NEXT: [[A:%.*]] = xor i8 [[SX]], [[Z]]		; CHECK-NEXT: [[R:%.*]] = xor i8 [[TMP2]], [[Z]]
; CHECK-NEXT: [[R:%.*]] = xor i8 [[A]], [[SY]]
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%sx = lshr i8 %x, %shamt		%sx = lshr i8 %x, %shamt
%sy = lshr i8 %y, %shamt		%sy = lshr i8 %y, %shamt
%a = xor i8 %sx, %z		%a = xor i8 %sx, %z
%r = xor i8 %a, %sy		%r = xor i8 %a, %sy
ret i8 %r		ret i8 %r
}		}
[105 lines not shown]
}		}

; Negative test - multi-use		; Negative test - multi-use

define i8 @xor_lshr_multiuse(i8 %x, i8 %y, i8 %z, i8 %shamt) {		define i8 @xor_lshr_multiuse(i8 %x, i8 %y, i8 %z, i8 %shamt) {
; CHECK-LABEL: define {{[^@]+}}@xor_lshr_multiuse		; CHECK-LABEL: define {{[^@]+}}@xor_lshr_multiuse
; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {		; CHECK-SAME: (i8 [[X:%.*]], i8 [[Y:%.*]], i8 [[Z:%.*]], i8 [[SHAMT:%.*]]) {
; CHECK-NEXT: [[SX:%.*]] = lshr i8 [[X]], [[SHAMT]]		; CHECK-NEXT: [[SX:%.*]] = lshr i8 [[X]], [[SHAMT]]
; CHECK-NEXT: [[SY:%.*]] = lshr i8 [[Y]], [[SHAMT]]
; CHECK-NEXT: [[A:%.*]] = xor i8 [[SX]], [[Z]]		; CHECK-NEXT: [[A:%.*]] = xor i8 [[SX]], [[Z]]
; CHECK-NEXT: [[R:%.*]] = xor i8 [[A]], [[SY]]		; CHECK-NEXT: [[TMP1:%.*]] = xor i8 [[X]], [[Y]]
		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], [[SHAMT]]
		; CHECK-NEXT: [[R:%.*]] = xor i8 [[TMP2]], [[Z]]
; CHECK-NEXT: [[R2:%.*]] = sdiv i8 [[A]], [[R]]		; CHECK-NEXT: [[R2:%.*]] = sdiv i8 [[A]], [[R]]
; CHECK-NEXT: ret i8 [[R2]]		; CHECK-NEXT: ret i8 [[R2]]
;		;
%sx = lshr i8 %x, %shamt		%sx = lshr i8 %x, %shamt
%sy = lshr i8 %y, %shamt		%sy = lshr i8 %y, %shamt
%a = xor i8 %sx, %z		%a = xor i8 %sx, %z
%r = xor i8 %a, %sy		%r = xor i8 %a, %sy
%r2 = sdiv i8 %a, %r		%r2 = sdiv i8 %a, %r
[4,183 lines not shown]

llvm/test/Transforms/InstCombine/binop-and-shifts.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=instcombine -S | FileCheck %s		; RUN: opt < %s -passes=instcombine -S | FileCheck %s

define i8 @shl_and_and(i8 %x, i8 %y) {		define i8 @shl_and_and(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_and_and(		; CHECK-LABEL: @shl_and_and(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], 4		; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], 4		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], 4
; CHECK-NEXT: [[BW2:%.*]] = and i8 [[SHIFT2]], 80		; CHECK-NEXT: [[BW1:%.*]] = and i8 [[TMP2]], 80
; CHECK-NEXT: [[BW1:%.*]] = and i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 4		%shift1 = shl i8 %x, 4
%shift2 = shl i8 %y, 4		%shift2 = shl i8 %y, 4
%bw2 = and i8 %shift2, 88		%bw2 = and i8 %shift2, 88
%bw1 = and i8 %shift1, %bw2		%bw1 = and i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[10 lines not shown]
;
%shift2 = shl i8 %y, 5		%shift2 = shl i8 %y, 5
%bw2 = and i8 %shift2, 88		%bw2 = and i8 %shift2, 88
%bw1 = and i8 %shift1, %bw2		%bw1 = and i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @shl_add_add(i8 %x, i8 %y) {		define i8 @shl_add_add(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_add_add(		; CHECK-LABEL: @shl_add_add(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], 2		; CHECK-NEXT: [[TMP1:%.*]] = add i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], 2		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], 2
; CHECK-NEXT: [[BW2:%.*]] = add i8 [[SHIFT2]], 48		; CHECK-NEXT: [[BW1:%.*]] = add i8 [[TMP2]], 48
; CHECK-NEXT: [[BW1:%.*]] = add i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 2		%shift1 = shl i8 %x, 2
%shift2 = shl i8 %y, 2		%shift2 = shl i8 %y, 2
%bw2 = add i8 %shift2, 48		%bw2 = add i8 %shift2, 48
%bw1 = add i8 %shift1, %bw2		%bw1 = add i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[25 lines not shown]
;
%shift2 = shl i8 4, %y		%shift2 = shl i8 4, %y
%bw2 = and i8 %shift2, 88		%bw2 = and i8 %shift2, 88
%bw1 = and i8 %shift1, %bw2		%bw1 = and i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @lshr_and_or(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @lshr_and_or(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @lshr_and_or(		; CHECK-LABEL: @lshr_and_or(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 4, i8 5>		; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 -64, i8 96>
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr <2 x i8> [[Y:%.*]], <i8 4, i8 5>		; CHECK-NEXT: [[TMP2:%.*]] = or <2 x i8> [[TMP1]], [[Y:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = and <2 x i8> [[SHIFT1]], <i8 44, i8 99>		; CHECK-NEXT: [[BW1:%.*]] = lshr <2 x i8> [[TMP2]], <i8 4, i8 5>
; CHECK-NEXT: [[BW1:%.*]] = or <2 x i8> [[SHIFT2]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = lshr <2 x i8> %x, <i8 4, i8 5>		%shift1 = lshr <2 x i8> %x, <i8 4, i8 5>
%shift2 = lshr <2 x i8> %y, <i8 4, i8 5>		%shift2 = lshr <2 x i8> %y, <i8 4, i8 5>
%bw2 = and <2 x i8> %shift1, <i8 44, i8 99>		%bw2 = and <2 x i8> %shift1, <i8 44, i8 99>
%bw1 = or <2 x i8> %shift2, %bw2		%bw1 = or <2 x i8> %shift2, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}
[10 lines not shown]
;
%shift2 = lshr <2 x i8> %y, <i8 5, i8 4>		%shift2 = lshr <2 x i8> %y, <i8 5, i8 4>
%bw2 = and <2 x i8> %shift2, <i8 44, i8 99>		%bw2 = and <2 x i8> %shift2, <i8 44, i8 99>
%bw1 = or <2 x i8> %shift1, %bw2		%bw1 = or <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define i8 @shl_and_xor(i8 %x, i8 %y) {		define i8 @shl_and_xor(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_and_xor(		; CHECK-LABEL: @shl_and_xor(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], 1		; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[X:%.*]], 10
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], 1		; CHECK-NEXT: [[TMP2:%.*]] = xor i8 [[TMP1]], [[Y:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = and i8 [[SHIFT1]], 20		; CHECK-NEXT: [[BW1:%.*]] = shl i8 [[TMP2]], 1
; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[SHIFT2]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 1		%shift1 = shl i8 %x, 1
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = and i8 %shift1, 20		%bw2 = and i8 %shift1, 20
%bw1 = xor i8 %shift2, %bw2		%bw1 = xor i8 %shift2, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @shl_and_add(i8 %x, i8 %y) {		define i8 @shl_and_add(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_and_add(		; CHECK-LABEL: @shl_and_add(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], 1		; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[Y:%.*]], 59
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], 1		; CHECK-NEXT: [[TMP2:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = and i8 [[SHIFT2]], 118		; CHECK-NEXT: [[BW1:%.*]] = shl i8 [[TMP2]], 1
; CHECK-NEXT: [[BW1:%.*]] = add i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 1		%shift1 = shl i8 %x, 1
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = and i8 %shift2, 119		%bw2 = and i8 %shift2, 119
%bw1 = add i8 %shift1, %bw2		%bw1 = add i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[10 lines not shown]
;
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = xor i8 %shift2, 119		%bw2 = xor i8 %shift2, 119
%bw1 = add i8 %shift1, %bw2		%bw1 = add i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @lshr_or_and(i8 %x, i8 %y) {		define i8 @lshr_or_and(i8 %x, i8 %y) {
; CHECK-LABEL: @lshr_or_and(		; CHECK-LABEL: @lshr_or_and(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr i8 [[X:%.*]], 5		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[X:%.*]], -64
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr i8 [[Y:%.*]], 5		; CHECK-NEXT: [[TMP2:%.*]] = and i8 [[TMP1]], [[Y:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = or i8 [[SHIFT1]], 6		; CHECK-NEXT: [[BW1:%.*]] = lshr i8 [[TMP2]], 5
; CHECK-NEXT: [[BW1:%.*]] = and i8 [[BW2]], [[SHIFT2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = lshr i8 %x, 5		%shift1 = lshr i8 %x, 5
%shift2 = lshr i8 %y, 5		%shift2 = lshr i8 %y, 5
%bw2 = or i8 %shift1, 198		%bw2 = or i8 %shift1, 198
%bw1 = and i8 %bw2, %shift2		%bw1 = and i8 %bw2, %shift2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @lshr_or_or_fail(i8 %x, i8 %y) {		define i8 @lshr_or_or_fail(i8 %x, i8 %y) {
; CHECK-LABEL: @lshr_or_or_fail(		; CHECK-LABEL: @lshr_or_or_fail(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr i8 [[X:%.*]], 5		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr i8 [[Y:%.*]], 5		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], 5
; CHECK-NEXT: [[BW2:%.*]] = or i8 [[SHIFT2]], -58		; CHECK-NEXT: [[BW1:%.*]] = or i8 [[TMP2]], -58
; CHECK-NEXT: [[BW1:%.*]] = or i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = lshr i8 %x, 5		%shift1 = lshr i8 %x, 5
%shift2 = lshr i8 %y, 5		%shift2 = lshr i8 %y, 5
%bw2 = or i8 %shift2, 198		%bw2 = or i8 %shift2, 198
%bw1 = or i8 %shift1, %bw2		%bw1 = or i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @shl_xor_and(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @shl_xor_and(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @shl_xor_and(		; CHECK-LABEL: @shl_xor_and(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl <2 x i8> [[X:%.*]], <i8 2, i8 undef>		; CHECK-NEXT: [[TMP1:%.*]] = xor <2 x i8> [[Y:%.*]], <i8 11, i8 poison>
; CHECK-NEXT: [[SHIFT2:%.*]] = shl <2 x i8> [[Y:%.*]], <i8 2, i8 undef>		; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = xor <2 x i8> [[SHIFT2]], <i8 44, i8 undef>		; CHECK-NEXT: [[BW1:%.*]] = shl <2 x i8> [[TMP2]], <i8 2, i8 undef>
; CHECK-NEXT: [[BW1:%.*]] = and <2 x i8> [[BW2]], [[SHIFT1]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = shl <2 x i8> %x, <i8 2, i8 undef>		%shift1 = shl <2 x i8> %x, <i8 2, i8 undef>
%shift2 = shl <2 x i8> %y, <i8 2, i8 undef>		%shift2 = shl <2 x i8> %y, <i8 2, i8 undef>
%bw2 = xor <2 x i8> %shift2, <i8 44, i8 undef>		%bw2 = xor <2 x i8> %shift2, <i8 44, i8 undef>
%bw1 = and <2 x i8> %bw2, %shift1		%bw1 = and <2 x i8> %bw2, %shift1
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}
[10 lines not shown]
;
%shift2 = shl <2 x i8> %y, <i8 undef, i8 2>		%shift2 = shl <2 x i8> %y, <i8 undef, i8 2>
%bw2 = xor <2 x i8> %shift2, <i8 44, i8 undef>		%bw2 = xor <2 x i8> %shift2, <i8 44, i8 undef>
%bw1 = and <2 x i8> %shift1, %bw2		%bw1 = and <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define i8 @lshr_or_or_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {		define i8 @lshr_or_or_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {
; CHECK-LABEL: @lshr_or_or_no_const(		; CHECK-LABEL: @lshr_or_or_no_const(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr i8 [[X:%.*]], [[SH:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr i8 [[Y:%.*]], [[SH]]		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], [[SH:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = or i8 [[SHIFT2]], [[MASK:%.*]]		; CHECK-NEXT: [[BW1:%.*]] = or i8 [[TMP2]], [[MASK:%.*]]
; CHECK-NEXT: [[BW1:%.*]] = or i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = lshr i8 %x, %sh		%shift1 = lshr i8 %x, %sh
%shift2 = lshr i8 %y, %sh		%shift2 = lshr i8 %y, %sh
%bw2 = or i8 %shift2, %mask		%bw2 = or i8 %shift2, %mask
%bw1 = or i8 %shift1, %bw2		%bw1 = or i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[10 lines not shown]
;
%shift2 = lshr i8 %y, %sh		%shift2 = lshr i8 %y, %sh
%bw2 = or i8 %shift2, %mask		%bw2 = or i8 %shift2, %mask
%bw1 = or i8 %shift1, %bw2		%bw1 = or i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @shl_xor_xor_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {		define i8 @shl_xor_xor_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {
; CHECK-LABEL: @shl_xor_xor_no_const(		; CHECK-LABEL: @shl_xor_xor_no_const(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], [[SH:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = xor i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], [[SH]]		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], [[SH:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = xor i8 [[SHIFT2]], [[MASK:%.*]]		; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[TMP2]], [[MASK:%.*]]
; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, %sh		%shift1 = shl i8 %x, %sh
%shift2 = shl i8 %y, %sh		%shift2 = shl i8 %y, %sh
%bw2 = xor i8 %shift2, %mask		%bw2 = xor i8 %shift2, %mask
%bw1 = xor i8 %shift1, %bw2		%bw1 = xor i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[10 lines not shown]
;
%shift2 = shl i8 %y, %sh		%shift2 = shl i8 %y, %sh
%bw2 = xor i8 %shift2, %mask		%bw2 = xor i8 %shift2, %mask
%bw1 = and i8 %shift1, %bw2		%bw1 = and i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @shl_and_and_no_const(<2 x i8> %x, <2 x i8> %y, <2 x i8> %sh, <2 x i8> %mask) {		define <2 x i8> @shl_and_and_no_const(<2 x i8> %x, <2 x i8> %y, <2 x i8> %sh, <2 x i8> %mask) {
; CHECK-LABEL: @shl_and_and_no_const(		; CHECK-LABEL: @shl_and_and_no_const(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl <2 x i8> [[X:%.*]], [[SH:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl <2 x i8> [[Y:%.*]], [[SH]]		; CHECK-NEXT: [[TMP2:%.*]] = shl <2 x i8> [[TMP1]], [[SH:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = and <2 x i8> [[SHIFT2]], [[MASK:%.*]]		; CHECK-NEXT: [[BW1:%.*]] = and <2 x i8> [[TMP2]], [[MASK:%.*]]
; CHECK-NEXT: [[BW1:%.*]] = and <2 x i8> [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = shl <2 x i8> %x, %sh		%shift1 = shl <2 x i8> %x, %sh
%shift2 = shl <2 x i8> %y, %sh		%shift2 = shl <2 x i8> %y, %sh
%bw2 = and <2 x i8> %shift2, %mask		%bw2 = and <2 x i8> %shift2, %mask
%bw1 = and <2 x i8> %shift1, %bw2		%bw1 = and <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define i8 @shl_add_add_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {		define i8 @shl_add_add_no_const(i8 %x, i8 %y, i8 %sh, i8 %mask) {
; CHECK-LABEL: @shl_add_add_no_const(		; CHECK-LABEL: @shl_add_add_no_const(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], [[SH:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = add i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], [[SH]]		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], [[SH:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = add i8 [[SHIFT2]], [[MASK:%.*]]		; CHECK-NEXT: [[BW1:%.*]] = add i8 [[TMP2]], [[MASK:%.*]]
; CHECK-NEXT: [[BW1:%.*]] = add i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, %sh		%shift1 = shl i8 %x, %sh
%shift2 = shl i8 %y, %sh		%shift2 = shl i8 %y, %sh
%bw2 = add i8 %shift2, %mask		%bw2 = add i8 %shift2, %mask
%bw1 = add i8 %shift1, %bw2		%bw1 = add i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
[10 lines not shown]
;
%shift2 = lshr i8 %y, %sh		%shift2 = lshr i8 %y, %sh
%bw2 = add i8 %shift2, %mask		%bw2 = add i8 %shift2, %mask
%bw1 = add i8 %shift1, %bw2		%bw1 = add i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @lshr_add_and(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @lshr_add_and(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @lshr_add_and(		; CHECK-LABEL: @lshr_add_and(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 3, i8 4>		; CHECK-NEXT: [[TMP1:%.*]] = add <2 x i8> [[Y:%.*]], <i8 -8, i8 16>
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr <2 x i8> [[Y:%.*]], <i8 3, i8 4>		; CHECK-NEXT: [[TMP2:%.*]] = and <2 x i8> [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = add <2 x i8> [[SHIFT2]], <i8 -1, i8 1>		; CHECK-NEXT: [[BW1:%.*]] = lshr <2 x i8> [[TMP2]], <i8 3, i8 4>
; CHECK-NEXT: [[BW1:%.*]] = and <2 x i8> [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = lshr <2 x i8> %x, <i8 3, i8 4>		%shift1 = lshr <2 x i8> %x, <i8 3, i8 4>
%shift2 = lshr <2 x i8> %y, <i8 3, i8 4>		%shift2 = lshr <2 x i8> %y, <i8 3, i8 4>
%bw2 = add <2 x i8> %shift2, <i8 255, i8 1>		%bw2 = add <2 x i8> %shift2, <i8 255, i8 1>
%bw1 = and <2 x i8> %shift1, %bw2		%bw1 = and <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}
[10 lines not shown]
;
%shift2 = lshr <2 x i8> %y, <i8 undef, i8 3>		%shift2 = lshr <2 x i8> %y, <i8 undef, i8 3>
%bw2 = add <2 x i8> %shift2, <i8 255, i8 1>		%bw2 = add <2 x i8> %shift2, <i8 255, i8 1>
%bw1 = and <2 x i8> %shift1, %bw2		%bw1 = and <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define <2 x i8> @shl_or_or_good_mask(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @shl_or_or_good_mask(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @shl_or_or_good_mask(		; CHECK-LABEL: @shl_or_or_good_mask(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl <2 x i8> [[X:%.*]], <i8 1, i8 1>		; CHECK-NEXT: [[TMP1:%.*]] = or <2 x i8> [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl <2 x i8> [[Y:%.*]], <i8 1, i8 1>		; CHECK-NEXT: [[TMP2:%.*]] = shl <2 x i8> [[TMP1]], <i8 1, i8 1>
; CHECK-NEXT: [[BW2:%.*]] = or <2 x i8> [[SHIFT2]], <i8 18, i8 24>		; CHECK-NEXT: [[BW1:%.*]] = or <2 x i8> [[TMP2]], <i8 18, i8 24>
; CHECK-NEXT: [[BW1:%.*]] = or <2 x i8> [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = shl <2 x i8> %x, <i8 1, i8 1>		%shift1 = shl <2 x i8> %x, <i8 1, i8 1>
%shift2 = shl <2 x i8> %y, <i8 1, i8 1>		%shift2 = shl <2 x i8> %y, <i8 1, i8 1>
%bw2 = or <2 x i8> %shift2, <i8 18, i8 24>		%bw2 = or <2 x i8> %shift2, <i8 18, i8 24>
%bw1 = or <2 x i8> %shift1, %bw2		%bw1 = or <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define <2 x i8> @shl_or_or_fail_bad_mask(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @shl_or_or_fail_bad_mask(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @shl_or_or_fail_bad_mask(		; CHECK-LABEL: @shl_or_or_fail_bad_mask(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl <2 x i8> [[X:%.*]], <i8 1, i8 1>		; CHECK-NEXT: [[TMP1:%.*]] = or <2 x i8> [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = shl <2 x i8> [[Y:%.*]], <i8 1, i8 1>		; CHECK-NEXT: [[TMP2:%.*]] = shl <2 x i8> [[TMP1]], <i8 1, i8 1>
; CHECK-NEXT: [[BW2:%.*]] = or <2 x i8> [[SHIFT2]], <i8 19, i8 24>		; CHECK-NEXT: [[BW1:%.*]] = or <2 x i8> [[TMP2]], <i8 19, i8 24>
; CHECK-NEXT: [[BW1:%.*]] = or <2 x i8> [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = shl <2 x i8> %x, <i8 1, i8 1>		%shift1 = shl <2 x i8> %x, <i8 1, i8 1>
%shift2 = shl <2 x i8> %y, <i8 1, i8 1>		%shift2 = shl <2 x i8> %y, <i8 1, i8 1>
%bw2 = or <2 x i8> %shift2, <i8 19, i8 24>		%bw2 = or <2 x i8> %shift2, <i8 19, i8 24>
%bw1 = or <2 x i8> %shift1, %bw2		%bw1 = or <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define i8 @lshr_xor_or_good_mask(i8 %x, i8 %y) {		define i8 @lshr_xor_or_good_mask(i8 %x, i8 %y) {
; CHECK-LABEL: @lshr_xor_or_good_mask(		; CHECK-LABEL: @lshr_xor_or_good_mask(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr i8 [[X:%.*]], 4		; CHECK-NEXT: [[TMP1:%.*]] = or i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr i8 [[Y:%.*]], 4		; CHECK-NEXT: [[TMP2:%.*]] = lshr i8 [[TMP1]], 4
; CHECK-NEXT: [[BW21:%.*]] = or i8 [[SHIFT2]], 48		; CHECK-NEXT: [[BW1:%.*]] = or i8 [[TMP2]], 48
; CHECK-NEXT: [[BW1:%.*]] = or i8 [[SHIFT1]], [[BW21]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = lshr i8 %x, 4		%shift1 = lshr i8 %x, 4
%shift2 = lshr i8 %y, 4		%shift2 = lshr i8 %y, 4
%bw2 = xor i8 %shift2, 48		%bw2 = xor i8 %shift2, 48
%bw1 = or i8 %shift1, %bw2		%bw1 = or i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
Show All 10 Lines	;
%shift2 = lshr i8 %y, 6		%shift2 = lshr i8 %y, 6
%bw2 = xor i8 %shift2, 129		%bw2 = xor i8 %shift2, 129
%bw1 = or i8 %shift1, %bw2		%bw1 = or i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @lshr_or_xor_good_mask(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @lshr_or_xor_good_mask(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @lshr_or_xor_good_mask(		; CHECK-LABEL: @lshr_or_xor_good_mask(
; CHECK-NEXT: [[SHIFT1:%.*]] = lshr <2 x i8> [[X:%.*]], <i8 6, i8 6>		; CHECK-NEXT: [[TMP1:%.*]] = or <2 x i8> [[Y:%.*]], <i8 -64, i8 64>
; CHECK-NEXT: [[SHIFT2:%.*]] = lshr <2 x i8> [[Y:%.*]], <i8 6, i8 6>		; CHECK-NEXT: [[TMP2:%.*]] = xor <2 x i8> [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = or <2 x i8> [[SHIFT2]], <i8 3, i8 1>		; CHECK-NEXT: [[BW1:%.*]] = lshr <2 x i8> [[TMP2]], <i8 6, i8 6>
; CHECK-NEXT: [[BW1:%.*]] = xor <2 x i8> [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = lshr <2 x i8> %x, <i8 6, i8 6>		%shift1 = lshr <2 x i8> %x, <i8 6, i8 6>
%shift2 = lshr <2 x i8> %y, <i8 6, i8 6>		%shift2 = lshr <2 x i8> %y, <i8 6, i8 6>
%bw2 = or <2 x i8> %shift2, <i8 3, i8 1>		%bw2 = or <2 x i8> %shift2, <i8 3, i8 1>
%bw1 = xor <2 x i8> %shift1, %bw2		%bw1 = xor <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}
Show All 10 Lines	;
%shift2 = lshr <2 x i8> %y, <i8 6, i8 6>		%shift2 = lshr <2 x i8> %y, <i8 6, i8 6>
%bw2 = or <2 x i8> %shift2, <i8 7, i8 1>		%bw2 = or <2 x i8> %shift2, <i8 7, i8 1>
%bw1 = xor <2 x i8> %shift1, %bw2		%bw1 = xor <2 x i8> %shift1, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}

define i8 @shl_xor_xor_good_mask(i8 %x, i8 %y) {		define i8 @shl_xor_xor_good_mask(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_xor_xor_good_mask(		; CHECK-LABEL: @shl_xor_xor_good_mask(
; CHECK-NEXT: [[SHIFT21:%.*]] = xor i8 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = xor i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[TMP1:%.*]] = shl i8 [[SHIFT21]], 1		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], 1
; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[TMP1]], 88		; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[TMP2]], 88
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 1		%shift1 = shl i8 %x, 1
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = xor i8 %shift2, 88		%bw2 = xor i8 %shift2, 88
%bw1 = xor i8 %shift1, %bw2		%bw1 = xor i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @shl_xor_xor_bad_mask_distribute(i8 %x, i8 %y) {		define i8 @shl_xor_xor_bad_mask_distribute(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_xor_xor_bad_mask_distribute(		; CHECK-LABEL: @shl_xor_xor_bad_mask_distribute(
; CHECK-NEXT: [[SHIFT21:%.*]] = xor i8 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = xor i8 [[Y:%.*]], [[X:%.*]]
; CHECK-NEXT: [[TMP1:%.*]] = shl i8 [[SHIFT21]], 1		; CHECK-NEXT: [[TMP2:%.*]] = shl i8 [[TMP1]], 1
; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[TMP1]], -68		; CHECK-NEXT: [[BW1:%.*]] = xor i8 [[TMP2]], -68
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 1		%shift1 = shl i8 %x, 1
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = xor i8 %shift2, 188		%bw2 = xor i8 %shift2, 188
%bw1 = xor i8 %shift1, %bw2		%bw1 = xor i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define i8 @shl_add_and(i8 %x, i8 %y) {		define i8 @shl_add_and(i8 %x, i8 %y) {
; CHECK-LABEL: @shl_add_and(		; CHECK-LABEL: @shl_add_and(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl i8 [[X:%.*]], 1		; CHECK-NEXT: [[TMP1:%.*]] = add i8 [[Y:%.*]], 61
; CHECK-NEXT: [[SHIFT2:%.*]] = shl i8 [[Y:%.*]], 1		; CHECK-NEXT: [[TMP2:%.*]] = and i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = add i8 [[SHIFT2]], 123		; CHECK-NEXT: [[BW1:%.*]] = shl i8 [[TMP2]], 1
; CHECK-NEXT: [[BW1:%.*]] = and i8 [[SHIFT1]], [[BW2]]
; CHECK-NEXT: ret i8 [[BW1]]		; CHECK-NEXT: ret i8 [[BW1]]
;		;
%shift1 = shl i8 %x, 1		%shift1 = shl i8 %x, 1
%shift2 = shl i8 %y, 1		%shift2 = shl i8 %y, 1
%bw2 = add i8 %shift2, 123		%bw2 = add i8 %shift2, 123
%bw1 = and i8 %shift1, %bw2		%bw1 = and i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}
Show All 40 Lines	;
%shift2 = lshr i8 %y, 1		%shift2 = lshr i8 %y, 1
%bw2 = add i8 %shift2, 123		%bw2 = add i8 %shift2, 123
%bw1 = xor i8 %shift1, %bw2		%bw1 = xor i8 %shift1, %bw2
ret i8 %bw1		ret i8 %bw1
}		}

define <2 x i8> @lshr_and_add(<2 x i8> %x, <2 x i8> %y) {		define <2 x i8> @lshr_and_add(<2 x i8> %x, <2 x i8> %y) {
; CHECK-LABEL: @lshr_and_add(		; CHECK-LABEL: @lshr_and_add(
; CHECK-NEXT: [[SHIFT1:%.*]] = shl <2 x i8> [[X:%.*]], <i8 4, i8 5>		; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[X:%.*]], <i8 11, i8 3>
; CHECK-NEXT: [[SHIFT2:%.*]] = shl <2 x i8> [[Y:%.*]], <i8 4, i8 5>		; CHECK-NEXT: [[TMP2:%.*]] = add <2 x i8> [[TMP1]], [[Y:%.*]]
; CHECK-NEXT: [[BW2:%.*]] = and <2 x i8> [[SHIFT1]], <i8 -67, i8 123>		; CHECK-NEXT: [[BW1:%.*]] = shl <2 x i8> [[TMP2]], <i8 4, i8 5>
; CHECK-NEXT: [[BW1:%.*]] = add <2 x i8> [[SHIFT2]], [[BW2]]
; CHECK-NEXT: ret <2 x i8> [[BW1]]		; CHECK-NEXT: ret <2 x i8> [[BW1]]
;		;
%shift1 = shl <2 x i8> %x, <i8 4, i8 5>		%shift1 = shl <2 x i8> %x, <i8 4, i8 5>
%shift2 = shl <2 x i8> %y, <i8 4, i8 5>		%shift2 = shl <2 x i8> %y, <i8 4, i8 5>
%bw2 = and <2 x i8> %shift1, <i8 189, i8 123>		%bw2 = and <2 x i8> %shift1, <i8 189, i8 123>
%bw1 = add <2 x i8> %shift2, %bw2		%bw1 = add <2 x i8> %shift2, %bw2
ret <2 x i8> %bw1		ret <2 x i8> %bw1
}		}
Show All 30 Lines
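
The constant-mask folds exercised by the tests above can be sanity-checked by brute force over all i8 inputs. The sketch below is only an illustration in Python, not part of the patch: it models one scalar lane of `shl_add_and` and lane 0 of `lshr_or_xor_good_mask` with 8-bit wraparound arithmetic, and the helper names (`shl`, `lshr`, `add`, `equivalent`) are made up for this example. The Alive2 links in the summary remain the authoritative proofs.

```python
from itertools import product

BITS = 8
MASK = (1 << BITS) - 1  # model i8 arithmetic by truncating to 8 bits


def shl(v, amt):
    return (v << amt) & MASK


def lshr(v, amt):
    return (v & MASK) >> amt


def add(a, b):
    return (a + b) & MASK


def shl_add_and_before(x, y):
    # %bw1 = and (shl %x, 1), (add (shl %y, 1), 123)
    return shl(x, 1) & add(shl(y, 1), 123)


def shl_add_and_after(x, y):
    # shl (and (add %y, 61), %x), 1   where 61 == 123 >> 1
    return shl(add(y, 61) & x, 1)


def lshr_or_xor_before(x, y):
    # %bw1 = xor (lshr %x, 6), (or (lshr %y, 6), 3)
    return lshr(x, 6) ^ (lshr(y, 6) | 3)


def lshr_or_xor_after(x, y):
    # lshr (xor (or %y, 0xC0), %x), 6   where 0xC0 == 3 << 6 (printed as -64)
    return lshr((y | 0xC0) ^ x, 6)


def equivalent(before, after):
    # Exhaustive check over every pair of 8-bit inputs.
    return all(before(x, y) == after(x, y)
               for x, y in product(range(1 << BITS), repeat=2))
```

Calling `equivalent(shl_add_and_before, shl_add_and_after)` compares the two forms on all 65,536 input pairs; the same call works for the `lshr_or_xor` pair.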

llvm/test/Transforms/InstCombine/or-shifted-masks.ll

Show First 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	;
%i6 = or i32 %i2, %i4		%i6 = or i32 %i2, %i4
%i7 = or i32 %i3, %i5		%i7 = or i32 %i3, %i5
%i8 = or i32 %i7, %i6		%i8 = or i32 %i7, %i6
ret i32 %i8		ret i32 %i8
}		}

define i32 @multiuse2(i32 %x) {		define i32 @multiuse2(i32 %x) {
; CHECK-LABEL: @multiuse2(		; CHECK-LABEL: @multiuse2(
; CHECK-NEXT: [[I:%.*]] = shl i32 [[X:%.*]], 1		; CHECK-NEXT: [[TMP1:%.*]] = shl i32 [[X:%.*]], 8
; CHECK-NEXT: [[I2:%.*]] = and i32 [[I]], 12		; CHECK-NEXT: [[I10:%.*]] = and i32 [[TMP1]], 32256
; CHECK-NEXT: [[I6:%.*]] = shl i32 [[X]], 8		; CHECK-NEXT: [[TMP2:%.*]] = shl i32 [[X]], 1
; CHECK-NEXT: [[I7:%.*]] = and i32 [[I6]], 24576		; CHECK-NEXT: [[I12:%.*]] = and i32 [[TMP2]], 252
; CHECK-NEXT: [[I14:%.*]] = shl i32 [[X]], 8
; CHECK-NEXT: [[I9:%.*]] = and i32 [[I14]], 7680
; CHECK-NEXT: [[I10:%.*]] = or i32 [[I7]], [[I9]]
; CHECK-NEXT: [[I85:%.*]] = shl i32 [[X]], 1
; CHECK-NEXT: [[I11:%.*]] = and i32 [[I85]], 240
; CHECK-NEXT: [[I12:%.*]] = or i32 [[I2]], [[I11]]
; CHECK-NEXT: [[I13:%.*]] = or i32 [[I10]], [[I12]]		; CHECK-NEXT: [[I13:%.*]] = or i32 [[I10]], [[I12]]
; CHECK-NEXT: ret i32 [[I13]]		; CHECK-NEXT: ret i32 [[I13]]
;		;
%i = and i32 %x, 6		%i = and i32 %x, 6
%i1 = shl nuw nsw i32 %i, 8		%i1 = shl nuw nsw i32 %i, 8
%i2 = shl nuw nsw i32 %i, 1		%i2 = shl nuw nsw i32 %i, 1
%i3 = and i32 %x, 24		%i3 = and i32 %x, 24
%i4 = shl nuw nsw i32 %i3, 8		%i4 = shl nuw nsw i32 %i3, 8
%i5 = shl nuw nsw i32 %i3, 1		%i5 = shl nuw nsw i32 %i3, 1
%i6 = and i32 %x, 96		%i6 = and i32 %x, 96
%i7 = shl nuw nsw i32 %i6, 8		%i7 = shl nuw nsw i32 %i6, 8
%i8 = shl nuw nsw i32 %i6, 1		%i8 = shl nuw nsw i32 %i6, 1
%i9 = or i32 %i1, %i4		%i9 = or i32 %i1, %i4
%i10 = or i32 %i7, %i9		%i10 = or i32 %i7, %i9
%i11 = or i32 %i8, %i5		%i11 = or i32 %i8, %i5
%i12 = or i32 %i2, %i11		%i12 = or i32 %i2, %i11
%i13 = or i32 %i10, %i12		%i13 = or i32 %i10, %i12
ret i32 %i13		ret i32 %i13
}		}

define i32 @multiuse3(i32 %x) {		define i32 @multiuse3(i32 %x) {
; CHECK-LABEL: @multiuse3(		; CHECK-LABEL: @multiuse3(
; CHECK-NEXT: [[I:%.*]] = and i32 [[X:%.*]], 96		; CHECK-NEXT: [[TMP1:%.*]] = shl i32 [[X:%.*]], 6
; CHECK-NEXT: [[I1:%.*]] = shl nuw nsw i32 [[I]], 6		; CHECK-NEXT: [[I5:%.*]] = and i32 [[TMP1]], 8064
; CHECK-NEXT: [[I2:%.*]] = lshr exact i32 [[I]], 1		; CHECK-NEXT: [[TMP2:%.*]] = lshr i32 [[X]], 1
; CHECK-NEXT: [[I3:%.*]] = shl i32 [[X]], 6		; CHECK-NEXT: [[I8:%.*]] = and i32 [[TMP2]], 63
; CHECK-NEXT: [[I4:%.*]] = and i32 [[I3]], 1920
; CHECK-NEXT: [[I5:%.*]] = or i32 [[I1]], [[I4]]
; CHECK-NEXT: [[I6:%.*]] = lshr i32 [[X]], 1
; CHECK-NEXT: [[I7:%.*]] = and i32 [[I6]], 15
; CHECK-NEXT: [[I8:%.*]] = or i32 [[I2]], [[I7]]
; CHECK-NEXT: [[I9:%.*]] = or i32 [[I8]], [[I5]]		; CHECK-NEXT: [[I9:%.*]] = or i32 [[I8]], [[I5]]
; CHECK-NEXT: ret i32 [[I9]]		; CHECK-NEXT: ret i32 [[I9]]
;		;
%i = and i32 %x, 96		%i = and i32 %x, 96
%i1 = shl nuw nsw i32 %i, 6		%i1 = shl nuw nsw i32 %i, 6
%i2 = lshr exact i32 %i, 1		%i2 = lshr exact i32 %i, 1
%i3 = shl i32 %x, 6		%i3 = shl i32 %x, 6
%i4 = and i32 %i3, 1920		%i4 = and i32 %i3, 1920
%i5 = or i32 %i1, %i4		%i5 = or i32 %i1, %i4
%i6 = lshr i32 %x, 1		%i6 = lshr i32 %x, 1
%i7 = and i32 %i6, 15		%i7 = and i32 %i6, 15
%i8 = or i32 %i2, %i7		%i8 = or i32 %i2, %i7
%i9 = or i32 %i8, %i5		%i9 = or i32 %i8, %i5
ret i32 %i9		ret i32 %i9
}		}
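
To see why the new CHECK lines for `multiuse2` and `multiuse3` are so much shorter, it can help to replay the source IR and the folded form side by side. The following Python sketch does that with 32-bit masking; it is an illustrative model only, and the function names are invented for this example. The merged masks follow from the constants in the tests: `(6 | 24 | 96) << 8 == 32256`, `(6 | 24 | 96) << 1 == 252`, `(96 << 6) | 1920 == 8064`, and `(96 >> 1) | 15 == 63`.

```python
M32 = 0xFFFFFFFF  # model i32 by truncating to 32 bits


def multiuse2_source(x):
    # Mirrors the IR body of @multiuse2 instruction by instruction.
    i = x & 6
    i1 = (i << 8) & M32
    i2 = (i << 1) & M32
    i3 = x & 24
    i4 = (i3 << 8) & M32
    i5 = (i3 << 1) & M32
    i6 = x & 96
    i7 = (i6 << 8) & M32
    i8 = (i6 << 1) & M32
    i9 = i1 | i4
    i10 = i7 | i9
    i11 = i8 | i5
    i12 = i2 | i11
    return i10 | i12


def multiuse2_folded(x):
    # Shape of the new CHECK lines: one shift and one mask per shift amount.
    return (((x << 8) & M32) & 32256) | (((x << 1) & M32) & 252)


def multiuse3_source(x):
    # Mirrors the IR body of @multiuse3.
    i = x & 96
    i1 = (i << 6) & M32
    i2 = i >> 1
    i3 = (x << 6) & M32
    i4 = i3 & 1920
    i5 = i1 | i4
    i6 = x >> 1
    i7 = i6 & 15
    i8 = i2 | i7
    return i8 | i5


def multiuse3_folded(x):
    return (((x << 6) & M32) & 8064) | ((x >> 1) & 63)


# Low values cover every bit the masks can see; high values check that the
# remaining bits of x never leak into the result.
SAMPLES = list(range(1 << 10)) + [M32 - v for v in range(1 << 10)]
```

Comparing `multiuse2_source(x)` with `multiuse2_folded(x)` over `SAMPLES` (and likewise for `multiuse3`) is a quick spot check, not a proof.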

define i32 @multiuse4(i32 %x) local_unnamed_addr {		define i32 @multiuse4(i32 %x) local_unnamed_addr {
; CHECK-LABEL: @multiuse4(		; CHECK-LABEL: @multiuse4(
; CHECK-NEXT: [[I:%.*]] = and i32 [[X:%.*]], 100663296		; CHECK-NEXT: [[I1:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[I1:%.*]] = icmp sgt i32 [[X]], -1
; CHECK-NEXT: br i1 [[I1]], label [[IF:%.*]], label [[ELSE:%.*]]		; CHECK-NEXT: br i1 [[I1]], label [[IF:%.*]], label [[ELSE:%.*]]
; CHECK: if:		; CHECK: if:
; CHECK-NEXT: [[I2:%.*]] = lshr exact i32 [[I]], 22		; CHECK-NEXT: [[I:%.*]] = lshr i32 [[X]], 22
		; CHECK-NEXT: [[I2:%.*]] = and i32 [[I]], 24
; CHECK-NEXT: [[I3:%.*]] = lshr i32 [[X]], 22		; CHECK-NEXT: [[I3:%.*]] = lshr i32 [[X]], 22
; CHECK-NEXT: [[I4:%.*]] = and i32 [[I3]], 480		; CHECK-NEXT: [[I4:%.*]] = and i32 [[I3]], 480
; CHECK-NEXT: [[I5:%.*]] = or i32 [[I4]], [[I2]]		; CHECK-NEXT: [[I5:%.*]] = or i32 [[I4]], [[I2]]
		nikic (inline comment): This highlights another possible extension: we can have an extra binop on both sides, not just on one.
		nikic (inline comment): Tested this patch together with D151807, and it looks like handling this case would be needed to get the patterns in one instcombine run instead of instcombine,early-cse,instcombine.
		goldstein.w.n (author, inline comment): kk, working on a follow-up patch.
		goldstein.w.n (author, inline comment): I'm going to push this one first as is and split the multi-binop version into a new patch. Handling generic chains necessitates an entire rewrite plus a lot more complexity, so it may end up not being worth it if it's more modularly done across multiple passes.
; CHECK-NEXT: br label [[END:%.*]]		; CHECK-NEXT: br label [[END:%.*]]
; CHECK: else:		; CHECK: else:
; CHECK-NEXT: [[I6:%.*]] = lshr exact i32 [[I]], 17		; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 [[X]], 17
; CHECK-NEXT: [[I7:%.*]] = lshr i32 [[X]], 17		; CHECK-NEXT: [[I9:%.*]] = and i32 [[TMP1]], 16128
; CHECK-NEXT: [[I8:%.*]] = and i32 [[I7]], 15360
; CHECK-NEXT: [[I9:%.*]] = or i32 [[I8]], [[I6]]
; CHECK-NEXT: br label [[END]]		; CHECK-NEXT: br label [[END]]
; CHECK: end:		; CHECK: end:
; CHECK-NEXT: [[I10:%.*]] = phi i32 [ [[I5]], [[IF]] ], [ [[I9]], [[ELSE]] ]		; CHECK-NEXT: [[I10:%.*]] = phi i32 [ [[I5]], [[IF]] ], [ [[I9]], [[ELSE]] ]
; CHECK-NEXT: ret i32 [[I10]]		; CHECK-NEXT: ret i32 [[I10]]
;		;
%i = and i32 %x, 100663296		%i = and i32 %x, 100663296
%i1 = icmp sgt i32 %x, -1		%i1 = icmp sgt i32 %x, -1
br i1 %i1, label %if, label %else		br i1 %i1, label %if, label %else
▲ Show 20 Lines • Show All 149 Lines • Show Last 20 Lines
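
One reading of the inline comment on the `if:` block of `@multiuse4` is that the remaining `or` of two masked shifts could be merged further once a binop is allowed on both operands. The Python sketch below only illustrates that arithmetic under this assumption; `if_branch_merged` is a hypothetical target form, not output produced by this patch, and both function names are invented here.

```python
M32 = 0xFFFFFFFF  # model i32 by truncating to 32 bits


def if_branch_current(x):
    # Post-patch CHECK lines in the `if:` block: both `or` operands are
    # `and (lshr x, 22), C`, so each side carries its own binop.
    i2 = (x >> 22) & 24
    i4 = (x >> 22) & 480
    return i4 | i2


def if_branch_merged(x):
    # Hypothetical further fold: a single shift with the merged mask
    # 480 | 24 == 504.
    return (x >> 22) & 504


# Only bits 22..31 of x survive the shift, so these samples cover every
# distinct value of x >> 22.
SAMPLES = [(v << 22) & M32 for v in range(1 << 10)]
```

The `else:` block already shows the single-binop version of this idea after the patch: `lshr exact (and x, 100663296), 17` combined with `and (lshr x, 17), 15360` becomes one `and` with `16128`, which equals `15360 | (100663296 >> 17)`.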

llvm/test/Transforms/PhaseOrdering/SystemZ/sub-xor.ll

	Show First 20 Lines • Show All 54 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[EXITCOND_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_7]], 32			; CHECK-NEXT: [[EXITCOND_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_7]], 32
	; CHECK-NEXT: br i1 [[EXITCOND_NOT_7]], label [[FOR_BODY4_1:%.*]], label [[FOR_BODY4]], !llvm.loop [[LOOP7:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT_7]], label [[FOR_BODY4_1:%.*]], label [[FOR_BODY4]], !llvm.loop [[LOOP7:![0-9]+]]
	; CHECK: for.body4.1:			; CHECK: for.body4.1:
	; CHECK-NEXT: [[INDVARS_IV_1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_1_7:%.*]], [[FOR_BODY4_1]] ], [ 0, [[FOR_BODY4]] ]			; CHECK-NEXT: [[INDVARS_IV_1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_1_7:%.*]], [[FOR_BODY4_1]] ], [ 0, [[FOR_BODY4]] ]
	; CHECK-NEXT: [[SUM_11_1:%.*]] = phi i32 [ [[ADD_1_7:%.*]], [[FOR_BODY4_1]] ], [ [[ADD_7]], [[FOR_BODY4]] ]			; CHECK-NEXT: [[SUM_11_1:%.*]] = phi i32 [ [[ADD_1_7:%.*]], [[FOR_BODY4_1]] ], [ [[ADD_7]], [[FOR_BODY4]] ]
	; CHECK-NEXT: [[IDX_NEG_1:%.*]] = sub nsw i64 0, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1:%.*]] = sub nsw i64 0, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1]]			; CHECK-NEXT: [[ADD_PTR_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, ptr [[ADD_PTR_1]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP8:%.*]] = load i32, ptr [[ADD_PTR_1]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1:%.*]] = shl i32 [[TMP8]], 1
	; CHECK-NEXT: [[ADD_1:%.*]] = add i32 [[MUL_1]], [[SUM_11_1]]
	; CHECK-NEXT: [[IDX_NEG_1_1:%.*]] = xor i64 [[INDVARS_IV_1]], -1			; CHECK-NEXT: [[IDX_NEG_1_1:%.*]] = xor i64 [[INDVARS_IV_1]], -1
	; CHECK-NEXT: [[ADD_PTR_1_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_1]]			; CHECK-NEXT: [[ADD_PTR_1_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_1]]
	; CHECK-NEXT: [[TMP9:%.*]] = load i32, ptr [[ADD_PTR_1_1]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP9:%.*]] = load i32, ptr [[ADD_PTR_1_1]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_1:%.*]] = shl i32 [[TMP9]], 1			; CHECK-NEXT: [[TMP10:%.*]] = add i32 [[TMP8]], [[TMP9]]
	; CHECK-NEXT: [[ADD_1_1:%.*]] = add i32 [[MUL_1_1]], [[ADD_1]]
	; CHECK-NEXT: [[IDX_NEG_1_2:%.*]] = sub nuw nsw i64 -2, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_2:%.*]] = sub nuw nsw i64 -2, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_2]]			; CHECK-NEXT: [[ADD_PTR_1_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_2]]
	; CHECK-NEXT: [[TMP10:%.*]] = load i32, ptr [[ADD_PTR_1_2]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP11:%.*]] = load i32, ptr [[ADD_PTR_1_2]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_2:%.*]] = shl i32 [[TMP10]], 1			; CHECK-NEXT: [[TMP12:%.*]] = add i32 [[TMP10]], [[TMP11]]
	; CHECK-NEXT: [[ADD_1_2:%.*]] = add i32 [[MUL_1_2]], [[ADD_1_1]]
	; CHECK-NEXT: [[IDX_NEG_1_3:%.*]] = sub nuw nsw i64 -3, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_3:%.*]] = sub nuw nsw i64 -3, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_3:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_3]]			; CHECK-NEXT: [[ADD_PTR_1_3:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_3]]
	; CHECK-NEXT: [[TMP11:%.*]] = load i32, ptr [[ADD_PTR_1_3]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP13:%.*]] = load i32, ptr [[ADD_PTR_1_3]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_3:%.*]] = shl i32 [[TMP11]], 1			; CHECK-NEXT: [[TMP14:%.*]] = add i32 [[TMP12]], [[TMP13]]
	; CHECK-NEXT: [[ADD_1_3:%.*]] = add i32 [[MUL_1_3]], [[ADD_1_2]]
	; CHECK-NEXT: [[IDX_NEG_1_4:%.*]] = sub nuw nsw i64 -4, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_4:%.*]] = sub nuw nsw i64 -4, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_4:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_4]]			; CHECK-NEXT: [[ADD_PTR_1_4:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_4]]
	; CHECK-NEXT: [[TMP12:%.*]] = load i32, ptr [[ADD_PTR_1_4]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP15:%.*]] = load i32, ptr [[ADD_PTR_1_4]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_4:%.*]] = shl i32 [[TMP12]], 1			; CHECK-NEXT: [[TMP16:%.*]] = add i32 [[TMP14]], [[TMP15]]
	; CHECK-NEXT: [[ADD_1_4:%.*]] = add i32 [[MUL_1_4]], [[ADD_1_3]]
	; CHECK-NEXT: [[IDX_NEG_1_5:%.*]] = sub nuw nsw i64 -5, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_5:%.*]] = sub nuw nsw i64 -5, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_5:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_5]]			; CHECK-NEXT: [[ADD_PTR_1_5:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_5]]
	; CHECK-NEXT: [[TMP13:%.*]] = load i32, ptr [[ADD_PTR_1_5]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP17:%.*]] = load i32, ptr [[ADD_PTR_1_5]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_5:%.*]] = shl i32 [[TMP13]], 1			; CHECK-NEXT: [[TMP18:%.*]] = add i32 [[TMP16]], [[TMP17]]
	; CHECK-NEXT: [[ADD_1_5:%.*]] = add i32 [[MUL_1_5]], [[ADD_1_4]]
	; CHECK-NEXT: [[IDX_NEG_1_6:%.*]] = sub nuw nsw i64 -6, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_6:%.*]] = sub nuw nsw i64 -6, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_6:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_6]]			; CHECK-NEXT: [[ADD_PTR_1_6:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_6]]
	; CHECK-NEXT: [[TMP14:%.*]] = load i32, ptr [[ADD_PTR_1_6]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP19:%.*]] = load i32, ptr [[ADD_PTR_1_6]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_6:%.*]] = shl i32 [[TMP14]], 1			; CHECK-NEXT: [[TMP20:%.*]] = add i32 [[TMP18]], [[TMP19]]
	; CHECK-NEXT: [[ADD_1_6:%.*]] = add i32 [[MUL_1_6]], [[ADD_1_5]]
	; CHECK-NEXT: [[IDX_NEG_1_7:%.*]] = sub nuw nsw i64 -7, [[INDVARS_IV_1]]			; CHECK-NEXT: [[IDX_NEG_1_7:%.*]] = sub nuw nsw i64 -7, [[INDVARS_IV_1]]
	; CHECK-NEXT: [[ADD_PTR_1_7:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_7]]			; CHECK-NEXT: [[ADD_PTR_1_7:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_1_7]]
	; CHECK-NEXT: [[TMP15:%.*]] = load i32, ptr [[ADD_PTR_1_7]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP21:%.*]] = load i32, ptr [[ADD_PTR_1_7]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_1_7:%.*]] = shl i32 [[TMP15]], 1			; CHECK-NEXT: [[TMP22:%.*]] = add i32 [[TMP20]], [[TMP21]]
	; CHECK-NEXT: [[ADD_1_7]] = add i32 [[MUL_1_7]], [[ADD_1_6]]			; CHECK-NEXT: [[TMP23:%.*]] = shl i32 [[TMP22]], 1
				; CHECK-NEXT: [[ADD_1_7]] = add i32 [[TMP23]], [[SUM_11_1]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_1_7]] = add nuw nsw i64 [[INDVARS_IV_1]], 8			; CHECK-NEXT: [[INDVARS_IV_NEXT_1_7]] = add nuw nsw i64 [[INDVARS_IV_1]], 8
	; CHECK-NEXT: [[EXITCOND_1_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_1_7]], 32			; CHECK-NEXT: [[EXITCOND_1_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_1_7]], 32
	; CHECK-NEXT: br i1 [[EXITCOND_1_NOT_7]], label [[FOR_BODY4_2:%.*]], label [[FOR_BODY4_1]], !llvm.loop [[LOOP7]]			; CHECK-NEXT: br i1 [[EXITCOND_1_NOT_7]], label [[FOR_BODY4_2:%.*]], label [[FOR_BODY4_1]], !llvm.loop [[LOOP7]]
	; CHECK: for.body4.2:			; CHECK: for.body4.2:
	; CHECK-NEXT: [[INDVARS_IV_2:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_2_7:%.*]], [[FOR_BODY4_2]] ], [ 0, [[FOR_BODY4_1]] ]			; CHECK-NEXT: [[INDVARS_IV_2:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_2_7:%.*]], [[FOR_BODY4_2]] ], [ 0, [[FOR_BODY4_1]] ]
	; CHECK-NEXT: [[SUM_11_2:%.*]] = phi i32 [ [[ADD_2_7:%.*]], [[FOR_BODY4_2]] ], [ [[ADD_1_7]], [[FOR_BODY4_1]] ]			; CHECK-NEXT: [[SUM_11_2:%.*]] = phi i32 [ [[ADD_2_7:%.*]], [[FOR_BODY4_2]] ], [ [[ADD_1_7]], [[FOR_BODY4_1]] ]
	; CHECK-NEXT: [[IDX_NEG_2:%.*]] = sub nsw i64 0, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2:%.*]] = sub nsw i64 0, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2]]			; CHECK-NEXT: [[ADD_PTR_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2]]
	; CHECK-NEXT: [[TMP16:%.*]] = load i32, ptr [[ADD_PTR_2]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP24:%.*]] = load i32, ptr [[ADD_PTR_2]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2:%.*]] = mul i32 [[TMP16]], 3			; CHECK-NEXT: [[MUL_2:%.*]] = mul i32 [[TMP24]], 3
	; CHECK-NEXT: [[ADD_2:%.*]] = add i32 [[MUL_2]], [[SUM_11_2]]			; CHECK-NEXT: [[ADD_2:%.*]] = add i32 [[MUL_2]], [[SUM_11_2]]
	; CHECK-NEXT: [[IDX_NEG_2_1:%.*]] = xor i64 [[INDVARS_IV_2]], -1			; CHECK-NEXT: [[IDX_NEG_2_1:%.*]] = xor i64 [[INDVARS_IV_2]], -1
	; CHECK-NEXT: [[ADD_PTR_2_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_1]]			; CHECK-NEXT: [[ADD_PTR_2_1:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_1]]
	; CHECK-NEXT: [[TMP17:%.*]] = load i32, ptr [[ADD_PTR_2_1]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP25:%.*]] = load i32, ptr [[ADD_PTR_2_1]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_1:%.*]] = mul i32 [[TMP17]], 3			; CHECK-NEXT: [[MUL_2_1:%.*]] = mul i32 [[TMP25]], 3
	; CHECK-NEXT: [[ADD_2_1:%.*]] = add i32 [[MUL_2_1]], [[ADD_2]]			; CHECK-NEXT: [[ADD_2_1:%.*]] = add i32 [[MUL_2_1]], [[ADD_2]]
	; CHECK-NEXT: [[IDX_NEG_2_2:%.*]] = sub nuw nsw i64 -2, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_2:%.*]] = sub nuw nsw i64 -2, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_2]]			; CHECK-NEXT: [[ADD_PTR_2_2:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_2]]
	; CHECK-NEXT: [[TMP18:%.*]] = load i32, ptr [[ADD_PTR_2_2]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP26:%.*]] = load i32, ptr [[ADD_PTR_2_2]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_2:%.*]] = mul i32 [[TMP18]], 3			; CHECK-NEXT: [[MUL_2_2:%.*]] = mul i32 [[TMP26]], 3
	; CHECK-NEXT: [[ADD_2_2:%.*]] = add i32 [[MUL_2_2]], [[ADD_2_1]]			; CHECK-NEXT: [[ADD_2_2:%.*]] = add i32 [[MUL_2_2]], [[ADD_2_1]]
	; CHECK-NEXT: [[IDX_NEG_2_3:%.*]] = sub nuw nsw i64 -3, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_3:%.*]] = sub nuw nsw i64 -3, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_3:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_3]]			; CHECK-NEXT: [[ADD_PTR_2_3:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_3]]
	; CHECK-NEXT: [[TMP19:%.*]] = load i32, ptr [[ADD_PTR_2_3]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP27:%.*]] = load i32, ptr [[ADD_PTR_2_3]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_3:%.*]] = mul i32 [[TMP19]], 3			; CHECK-NEXT: [[MUL_2_3:%.*]] = mul i32 [[TMP27]], 3
	; CHECK-NEXT: [[ADD_2_3:%.*]] = add i32 [[MUL_2_3]], [[ADD_2_2]]			; CHECK-NEXT: [[ADD_2_3:%.*]] = add i32 [[MUL_2_3]], [[ADD_2_2]]
	; CHECK-NEXT: [[IDX_NEG_2_4:%.*]] = sub nuw nsw i64 -4, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_4:%.*]] = sub nuw nsw i64 -4, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_4:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_4]]			; CHECK-NEXT: [[ADD_PTR_2_4:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_4]]
	; CHECK-NEXT: [[TMP20:%.*]] = load i32, ptr [[ADD_PTR_2_4]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP28:%.*]] = load i32, ptr [[ADD_PTR_2_4]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_4:%.*]] = mul i32 [[TMP20]], 3			; CHECK-NEXT: [[MUL_2_4:%.*]] = mul i32 [[TMP28]], 3
	; CHECK-NEXT: [[ADD_2_4:%.*]] = add i32 [[MUL_2_4]], [[ADD_2_3]]			; CHECK-NEXT: [[ADD_2_4:%.*]] = add i32 [[MUL_2_4]], [[ADD_2_3]]
	; CHECK-NEXT: [[IDX_NEG_2_5:%.*]] = sub nuw nsw i64 -5, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_5:%.*]] = sub nuw nsw i64 -5, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_5:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_5]]			; CHECK-NEXT: [[ADD_PTR_2_5:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_5]]
	; CHECK-NEXT: [[TMP21:%.*]] = load i32, ptr [[ADD_PTR_2_5]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP29:%.*]] = load i32, ptr [[ADD_PTR_2_5]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_5:%.*]] = mul i32 [[TMP21]], 3			; CHECK-NEXT: [[MUL_2_5:%.*]] = mul i32 [[TMP29]], 3
	; CHECK-NEXT: [[ADD_2_5:%.*]] = add i32 [[MUL_2_5]], [[ADD_2_4]]			; CHECK-NEXT: [[ADD_2_5:%.*]] = add i32 [[MUL_2_5]], [[ADD_2_4]]
	; CHECK-NEXT: [[IDX_NEG_2_6:%.*]] = sub nuw nsw i64 -6, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_6:%.*]] = sub nuw nsw i64 -6, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_6:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_6]]			; CHECK-NEXT: [[ADD_PTR_2_6:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_6]]
	; CHECK-NEXT: [[TMP22:%.*]] = load i32, ptr [[ADD_PTR_2_6]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP30:%.*]] = load i32, ptr [[ADD_PTR_2_6]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_6:%.*]] = mul i32 [[TMP22]], 3			; CHECK-NEXT: [[MUL_2_6:%.*]] = mul i32 [[TMP30]], 3
	; CHECK-NEXT: [[ADD_2_6:%.*]] = add i32 [[MUL_2_6]], [[ADD_2_5]]			; CHECK-NEXT: [[ADD_2_6:%.*]] = add i32 [[MUL_2_6]], [[ADD_2_5]]
	; CHECK-NEXT: [[IDX_NEG_2_7:%.*]] = sub nuw nsw i64 -7, [[INDVARS_IV_2]]			; CHECK-NEXT: [[IDX_NEG_2_7:%.*]] = sub nuw nsw i64 -7, [[INDVARS_IV_2]]
	; CHECK-NEXT: [[ADD_PTR_2_7:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_7]]			; CHECK-NEXT: [[ADD_PTR_2_7:%.*]] = getelementptr inbounds i32, ptr getelementptr inbounds ([100 x i32], ptr @ARR, i64 0, i64 99), i64 [[IDX_NEG_2_7]]
	; CHECK-NEXT: [[TMP23:%.*]] = load i32, ptr [[ADD_PTR_2_7]], align 4, !tbaa [[TBAA3]]			; CHECK-NEXT: [[TMP31:%.*]] = load i32, ptr [[ADD_PTR_2_7]], align 4, !tbaa [[TBAA3]]
	; CHECK-NEXT: [[MUL_2_7:%.*]] = mul i32 [[TMP23]], 3			; CHECK-NEXT: [[MUL_2_7:%.*]] = mul i32 [[TMP31]], 3
	; CHECK-NEXT: [[ADD_2_7]] = add i32 [[MUL_2_7]], [[ADD_2_6]]			; CHECK-NEXT: [[ADD_2_7]] = add i32 [[MUL_2_7]], [[ADD_2_6]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_2_7]] = add nuw nsw i64 [[INDVARS_IV_2]], 8			; CHECK-NEXT: [[INDVARS_IV_NEXT_2_7]] = add nuw nsw i64 [[INDVARS_IV_2]], 8
	; CHECK-NEXT: [[EXITCOND_2_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_2_7]], 32			; CHECK-NEXT: [[EXITCOND_2_NOT_7:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_2_7]], 32
	; CHECK-NEXT: br i1 [[EXITCOND_2_NOT_7]], label [[FOR_INC5_2:%.*]], label [[FOR_BODY4_2]], !llvm.loop [[LOOP7]]			; CHECK-NEXT: br i1 [[EXITCOND_2_NOT_7]], label [[FOR_INC5_2:%.*]], label [[FOR_BODY4_2]], !llvm.loop [[LOOP7]]
	; CHECK: for.inc5.2:			; CHECK: for.inc5.2:
	; CHECK-NEXT: ret i32 [[ADD_2_7]]			; CHECK-NEXT: ret i32 [[ADD_2_7]]
	;			;
	entry:			entry:
	▲ Show 20 Lines • Show All 67 Lines • Show Last 20 Lines
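
The change in `for.body4.1` above replaces eight `shl`/`add` pairs with a chain of plain `add`s followed by a single `shl`. A small Python model of the two shapes, using 32-bit wraparound, shows why they agree; this is a sketch for illustration with invented function names, and it assumes the eight loaded values are independent of the accumulator, as in the test.

```python
M32 = 0xFFFFFFFF  # model i32 by truncating to 32 bits


def accumulate_before(acc, loads):
    # Old CHECK lines: each load is doubled, then added to the running sum.
    for v in loads:
        mul = (v << 1) & M32
        acc = (mul + acc) & M32
    return acc


def accumulate_after(acc, loads):
    # New CHECK lines: add the loads first, shift once, then add the
    # incoming sum (SUM_11_1 in the test).
    total = loads[0]
    for v in loads[1:]:
        total = (total + v) & M32
    return (((total << 1) & M32) + acc) & M32
```

Both functions compute `acc + 2 * sum(loads)` modulo 2^32, so wraparound in any intermediate step does not change the result. This relies on `add` with `shl` as the shift, matching the summary's note that `lshift` must be `shl` when the binop is `add`.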


Revision Contents

Diff 531130
