Details

Commits

rGc3b394ffba58: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift…
rL373960: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift…

Summary

When we do ConstantExpr::getZExt(), that "extends" undef to 0,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.
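To make the summary concrete, here is a minimal standalone sketch in plain C++ (not LLVM code) that models a single vector lane for patterns a/b. The names `Lane`, `maskFromExtendedShift`, `maskBefore` and `maskAfter` are hypothetical. As modeling assumptions, `undef` is represented by `std::nullopt`, and a shift by at least the extended width (64) is treated as producing `undef`.

```cpp
#include <cstdint>
#include <optional>

// A lane is either a concrete 32-bit value or undef (std::nullopt).
using Lane = std::optional<uint32_t>;

// Mask for patterns a/b: trunc(~(-1 << zext(SumOfShAmts))), computed in 64 bits.
// A shift amount >= 64 is out of range; this model reports that as undef.
Lane maskFromExtendedShift(uint64_t ExtSumOfShAmts) {
  if (ExtSumOfShAmts >= 64)
    return std::nullopt;
  uint64_t InvertedMask = ~uint64_t{0} << ExtSumOfShAmts;
  return static_cast<uint32_t>(~InvertedMask);
}

// Before the patch: zero-extending an undef shift amount yields 0.
Lane maskBefore(Lane SumOfShAmts) {
  return maskFromExtendedShift(SumOfShAmts.value_or(0));
}

// After the patch: undef is first replaced with the extended bitwidth (64).
Lane maskAfter(Lane SumOfShAmts) {
  return maskFromExtendedShift(SumOfShAmts.value_or(64));
}
```

For a defined sum of 31, both versions give 2147483647. For an undef lane, the old path gives a mask of 0 (forcing all bits of that lane to zero), while the new path keeps the mask lane undef.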

Diff Detail

Repository: rL LLVM

Event Timeline

lebedev.ri created this revision. Sep 30 2019, 12:35 PM

Herald added a subscriber: hiraditya. Sep 30 2019, 12:35 PM

lebedev.ri marked an inline comment as done. Sep 30 2019, 12:36 PM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
116 ↗	(On Diff #222475)	I'm open to suggestions about this function and its name.

I'd be okay with either splitting this up further if wanted, or moving this particular patch
into post-commit review mode; the only worrying thing to me here is the sanitizeUndefsTo() itself.

In D68239#1693270, @lebedev.ri wrote:

> I'd be okay with either splitting this up further if wanted, or moving this particular patch
> into post-commit review mode; the only worrying thing to me here is the sanitizeUndefsTo() itself.

I'm a patch minimalist, so if we can split this up, let's do it.
We should not use "sanitize" in the function name because that term already has a special meaning in the clang/LLVM world.

In D68239#1694814, @spatel wrote:

> In D68239#1693270, @lebedev.ri wrote:
>
> > I'd be okay with either splitting this up further if wanted, or moving this particular patch
> > into post-commit review mode; the only worrying thing to me here is the sanitizeUndefsTo() itself.
>
> I'm a patch minimalist, so if we can split this up, let's do it.

Cool, will post in a sec.

> We should not use "sanitize" in the function name because that term already has a special meaning in the clang/LLVM world.

Any suggestions? replaceUndefsWith?

Splitting into two patches.

lebedev.ri added a child revision: D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask. Oct 4 2019, 9:11 AM

spatel added inline comments. Oct 4 2019, 12:55 PM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp

117–126 ↗

(On Diff #223232)

I suppose it's a matter of C++ familiarity, but this logic is not obvious to me.

I think it's easier to read as something like:

unsigned NumElts = CV->getType()->getVectorNumElements();
for (unsigned i = 0; i != NumElts; ++i) {
  Constant *EltC = CV->getOperand(i);
  NewOps[i] = match(EltC, m_Undef()) ? Replacement : EltC;
}
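For readers without an LLVM checkout, the element-wise replacement suggested above can be sketched in standalone C++. This is only a model under stated assumptions, not the LLVM API: `Lane` and `replaceUndefLanes` are hypothetical names, and an `undef` element is represented by `std::nullopt`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// std::nullopt models an undef vector element (an assumption of this sketch).
using Lane = std::optional<uint32_t>;

// Return a copy of Elts with every undef lane replaced by Replacement;
// defined lanes are copied through unchanged.
std::vector<Lane> replaceUndefLanes(const std::vector<Lane> &Elts,
                                    uint32_t Replacement) {
  std::vector<Lane> NewOps(Elts.size());
  for (std::size_t i = 0, NumElts = Elts.size(); i != NumElts; ++i)
    NewOps[i] = Elts[i].has_value() ? Elts[i] : Lane{Replacement};
  return NewOps;
}
```

Calling `replaceUndefLanes({31, undef, 31}, 64)` in this model yields `{31, 64, 31}`: only the undef lane changes.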

190–192 ↗

(On Diff #223232)

Probably worth explaining this more directly, and what if we make it consistent by replacing with -1?

// An extend of an undef value becomes zero because the
// high bits are never completely unknown. Replace the
// shift amount with -1 to ensure that the value remains
// undef when creating the subsequent shift op.

spatel added inline comments. Oct 4 2019, 12:59 PM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
116 ↗	(On Diff #223232)	Is it possible to get here with a scalar undef? If so, this doesn't replace it. If not, do we want to assert that we have a vector constant?

Addressed review notes.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
116 ↗	(On Diff #223232)	Given that we get the constant by running a custom instsimplify, I wouldn't fully exclude the possibility that we'd end up with a scalar `undef` here. It's not really a correctness problem. Let's not assert, but also replace it.
190–192 ↗	(On Diff #223232)	Good comments are always good :) Not with -1 though.

Upload correct patch this time.

spatel added inline comments. Oct 5 2019, 11:03 AM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
190–192 ↗	(On Diff #223232)	Why is using different variants based on bitwidth better? Or does -1 not work as a poisonous shift amount?

lebedev.ri added inline comments. Oct 5 2019, 11:11 AM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
190–192 ↗	(On Diff #223232)	In this branch, yes, I think `-1` should work. But in the other branch we will negate that shift amount and add the innermost shift bitwidth to it. What I'm trying to say is: it is not so much that the replacement shift amount here must still produce `undef`, but that it must do so when we actually perform the shift. It seemed consistent to me to end up with the same shift-by-bitwidth in both branches. Let me know if that explanation does not make sense.
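The arithmetic behind that explanation can be checked with a small standalone sketch, assuming a 32-bit base type and a 64-bit extended type. The helper names `extendedNumHighBitsToClear` and `isOutOfRangeShift` are hypothetical. The first models `zext(add(neg(ShAmtsDiff), BitWidth))` from the second branch with ordinary wrapping 32-bit arithmetic.

```cpp
#include <cstdint>

constexpr uint32_t BitWidth = 32; // width of the original (non-extended) type

// Model of zext(add(neg(ShAmtsDiff), BitWidth)): the negate and add wrap in
// 32 bits, then the result is widened to 64 bits.
constexpr uint64_t extendedNumHighBitsToClear(uint32_t ShAmtsDiff) {
  return static_cast<uint64_t>(
      static_cast<uint32_t>(0u - ShAmtsDiff + BitWidth));
}

// A shift by at least the extended width (2 * BitWidth = 64) is out of range.
constexpr bool isOutOfRangeShift(uint64_t ShAmt) {
  return ShAmt >= 2 * BitWidth;
}
```

Replacing `undef` with `-BitWidth` gives `-(-32) + 32 = 64`, an out-of-range shift in the 64-bit type, the same amount the first branch uses. Replacing it with `-1` would give `-(-1) + 32 = 33`, an in-range shift that would produce a defined mask lane.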

LGTM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
190–192 ↗	(On Diff #223232)	Ok, I see what you mean. It seems like we're missing some analysis that would handle this sort of thing for us cleanly, but I don't see where/how to implement it, so I won't hold this up.

This revision is now accepted and ready to land. Oct 6 2019, 6:38 AM

In D68239#1696589, @spatel wrote:

LGTM

Thank you for the review!

The second half of the patch is in D68470.
I didn't post the actual patch adding trunc support yet; tests are needed.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
190–192 ↗	(On Diff #223232)	Yeah..

spatel mentioned this in D68470: [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask. Oct 7 2019, 5:14 AM

Closed by commit rL373960: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift… (authored by lebedevri). Oct 7 2019, 10:19 PM

This revision was automatically updated to reflect the committed changes.

Diff 223758

llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp

[111 lines hidden] reassociateShiftAmtsOfTwoSameDirectionShifts(BinaryOperator *Sh0,
   if (Trunc) {
     Builder.Insert(NewShift);
     Ret = CastInst::Create(Instruction::Trunc, NewShift, Sh0->getType());
   }

   return Ret;
 }

+// Try to replace `undef` constants in C with Replacement.
+static Constant *replaceUndefsWith(Constant *C, Constant *Replacement) {
+  if (C && match(C, m_Undef()))
+    return Replacement;
+
+  if (auto *CV = dyn_cast<ConstantVector>(C)) {
+    llvm::SmallVector<Constant *, 32> NewOps(CV->getNumOperands());
+    for (unsigned i = 0, NumElts = NewOps.size(); i != NumElts; ++i) {
+      Constant *EltC = CV->getOperand(i);
+      NewOps[i] = EltC && match(EltC, m_Undef()) ? Replacement : EltC;
+    }
+    return ConstantVector::get(NewOps);
+  }
+
+  // Don't know how to deal with this constant.
+  return C;
+}
+
 // If we have some pattern that leaves only some low bits set, and then performs
 // left-shift of those bits, if none of the bits that are left after the final
 // shift are modified by the mask, we can omit the mask.
 //
 // There are many variants to this pattern:
 //   a)  (x & ((1 << MaskShAmt) - 1)) << ShiftShAmt
 //   b)  (x & (~(-1 << MaskShAmt))) << ShiftShAmt
 //   c)  (x & (-1 >> MaskShAmt)) << ShiftShAmt
[44 lines hidden] if (match(Masked, m_c_And(m_CombineOr(MaskA, MaskB), m_Value(X)))) {
     if (!match(SumOfShAmts, m_SpecificInt_ICMP(ICmpInst::Predicate::ICMP_UGE,
                                                APInt(BitWidth, BitWidth)))) {
       // But for a mask we need to get rid of old masking instruction.
       if (!Masked->hasOneUse())
         return nullptr; // Else we can't perform the fold.
       // The mask must be computed in a type twice as wide to ensure
       // that no bits are lost if the sum-of-shifts is wider than the base type.
       Type *ExtendedTy = Ty->getExtendedType();
+      // An extend of an undef value becomes zero because the high bits are
+      // never completely unknown. Replace the the `undef` shift amounts with
+      // final shift bitwidth to ensure that the value remains undef when
+      // creating the subsequent shift op.
+      SumOfShAmts = replaceUndefsWith(
+          SumOfShAmts,
+          ConstantInt::get(SumOfShAmts->getType()->getScalarType(),
+                           ExtendedTy->getScalarType()->getScalarSizeInBits()));
       auto *ExtendedSumOfShAmts =
           ConstantExpr::getZExt(SumOfShAmts, ExtendedTy);
       // And compute the mask as usual: ~(-1 << (SumOfShAmts))
       auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy);
       auto *ExtendedInvertedMask =
           ConstantExpr::getShl(ExtendedAllOnes, ExtendedSumOfShAmts);
       auto *ExtendedMask = ConstantExpr::getNot(ExtendedInvertedMask);
       NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty);
[19 lines hidden] if (!match(ShAmtsDiff, m_NonNegative())) {
       // For a mask we need to get rid of old masking instruction.
       if (!Masked->hasOneUse())
         return nullptr; // Else we can't perform the fold.
       Type *Ty = X->getType();
       unsigned BitWidth = Ty->getScalarSizeInBits();
       // The mask must be computed in a type twice as wide to ensure
       // that no bits are lost if the sum-of-shifts is wider than the base type.
       Type *ExtendedTy = Ty->getExtendedType();
+      // An extend of an undef value becomes zero because the high bits are
+      // never completely unknown. Replace the the `undef` shift amounts with
+      // negated shift bitwidth to ensure that the value remains undef when
+      // creating the subsequent shift op.
+      ShAmtsDiff = replaceUndefsWith(
+          ShAmtsDiff,
+          ConstantInt::get(ShAmtsDiff->getType()->getScalarType(), -BitWidth));
       auto *ExtendedNumHighBitsToClear = ConstantExpr::getZExt(
           ConstantExpr::getAdd(
               ConstantExpr::getNeg(ShAmtsDiff),
               ConstantInt::get(Ty, BitWidth, /*isSigned=*/false)),
           ExtendedTy);
       // And compute the mask as usual: (-1 l>> (ShAmtsDiff))
       auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy);
       auto *ExtendedMask =
[975 lines hidden]

llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll

[76 lines hidden]
 ; CHECK-NEXT: [[T1:%.*]] = shl <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 undef, i32 1>, [[T0]]
 ; CHECK-NEXT: [[T2:%.*]] = add <8 x i32> [[T1]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT: [[T4:%.*]] = sub <8 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 undef, i32 32>, [[NBITS]]
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T1]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T4]])
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]]
-; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 0, i32 2147483647>
+; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 undef, i32 2147483647>
 ; CHECK-NEXT: ret <8 x i32> [[T5]]
 ;
   %t0 = add <8 x i32> %nbits, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   %t1 = shl <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 undef, i32 1>, %t0
   %t2 = add <8 x i32> %t1, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   %t3 = and <8 x i32> %t2, %x
   %t4 = sub <8 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 undef, i32 32>, %nbits
   call void @use8xi32(<8 x i32> %t0)
[64 lines hidden]

llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll

[76 lines hidden]
 ; CHECK-NEXT: [[T1:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, [[T0]]
 ; CHECK-NEXT: [[T2:%.*]] = xor <8 x i32> [[T1]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT: [[T4:%.*]] = sub <8 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 undef, i32 32>, [[NBITS]]
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T1]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T4]])
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]]
-; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 0, i32 2147483647>
+; CHECK-NEXT: [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 undef, i32 2147483647>
 ; CHECK-NEXT: ret <8 x i32> [[T5]]
 ;
   %t0 = add <8 x i32> %nbits, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   %t1 = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, %t0
   %t2 = xor <8 x i32> %t1, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   %t3 = and <8 x i32> %t2, %x
   %t4 = sub <8 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 undef, i32 32>, %nbits
   call void @use8xi32(<8 x i32> %t0)
[64 lines hidden]

llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll

[56 lines hidden]

 define <8 x i32> @t1_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t1_vec_splat_undef(
 ; CHECK-NEXT: [[T0:%.*]] = lshr <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, [[NBITS:%.*]]
 ; CHECK-NEXT: [[T2:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T2]]
-; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 -1, i32 2147483647>
+; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 undef, i32 2147483647>
 ; CHECK-NEXT: ret <8 x i32> [[T3]]
 ;
   %t0 = lshr <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, %nbits
   %t1 = and <8 x i32> %t0, %x
   %t2 = add <8 x i32> %nbits, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   call void @use8xi32(<8 x i32> %t0)
   call void @use8xi32(<8 x i32> %t2)
   %t3 = shl <8 x i32> %t1, %t2 ; shift is smaller than mask
[44 lines hidden]

llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll

[66 lines hidden]
 ; CHECK-LABEL: @t2_vec_splat_undef(
 ; CHECK-NEXT: [[T0:%.*]] = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, [[NBITS:%.*]]
 ; CHECK-NEXT: [[T1:%.*]] = lshr <8 x i32> [[T0]], [[NBITS]]
 ; CHECK-NEXT: [[T3:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T1]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T3]])
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T3]]
-; CHECK-NEXT: [[T4:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 -1, i32 2147483647>
+; CHECK-NEXT: [[T4:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 undef, i32 2147483647>
 ; CHECK-NEXT: ret <8 x i32> [[T4]]
 ;
   %t0 = shl <8 x i32> <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>, %nbits
   %t1 = lshr <8 x i32> %t0, %nbits
   %t2 = and <8 x i32> %t1, %x
   %t3 = add <8 x i32> %nbits, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   call void @use8xi32(<8 x i32> %t0)
   call void @use8xi32(<8 x i32> %t1)
[54 lines hidden]

llvm/trunk/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll

[56 lines hidden]

 define <8 x i32> @t1_vec_splat_undef(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-LABEL: @t1_vec_splat_undef(
 ; CHECK-NEXT: [[T0:%.*]] = shl <8 x i32> [[X:%.*]], [[NBITS:%.*]]
 ; CHECK-NEXT: [[T2:%.*]] = add <8 x i32> [[NBITS]], <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T0]])
 ; CHECK-NEXT: call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <8 x i32> [[X]], [[T2]]
-; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 -1, i32 2147483647>
+; CHECK-NEXT: [[T3:%.*]] = and <8 x i32> [[TMP1]], <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647, i32 undef, i32 2147483647>
 ; CHECK-NEXT: ret <8 x i32> [[T3]]
 ;
   %t0 = shl <8 x i32> %x, %nbits
   %t1 = lshr <8 x i32> %t0, %nbits
   %t2 = add <8 x i32> %nbits, <i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 -1, i32 undef, i32 -1>
   call void @use8xi32(<8 x i32> %t0)
   call void @use8xi32(<8 x i32> %t2)
   %t3 = shl <8 x i32> %t1, %t2 ; shift is smaller than mask
[44 lines hidden]

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts
ClosedPublic
