Details

Reviewers

spatel
craig.topper
mareko
bogner
rampitec
nhaehnle
arsenm

Commits

rG84c11aed10eb: [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y)
rG4cdc59ecf2c0: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
rL334818: [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y)
rL334371: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)

Summary

We already do this fold when the shift amount is a splat constant, but not when it is an arbitrary (non-constant) value.
Also, the cases with undef vector elements are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
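
As a sanity check on the fold in the title, here is a minimal sketch that compares both forms exhaustively for 8-bit values. It is not part of the patch; the helper names (`shl_then_lshr`, `mask_low_bits`, `fold_holds_for_i8`) are made up for illustration, and `uint8_t` is only assumed to be a fair stand-in for an `i8` with an in-range shift amount.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: (x << y) >> y, computed at 8 bits.
// y must be < 8; larger shift amounts are not meaningful here.
inline uint8_t shl_then_lshr(uint8_t x, unsigned y) {
  // The cast truncates back to 8 bits, dropping the bits shifted out the top.
  uint8_t shifted = static_cast<uint8_t>(x << y);
  return static_cast<uint8_t>(shifted >> y);
}

// Right-hand side of the fold: x & (-1 >> y), with -1 being 0xFF at 8 bits.
inline uint8_t mask_low_bits(uint8_t x, unsigned y) {
  uint8_t mask = static_cast<uint8_t>(0xFFu >> y);
  return static_cast<uint8_t>(x & mask);
}

// Compare both forms for every 8-bit value and every in-range shift amount.
inline bool fold_holds_for_i8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 8; ++y)
      if (shl_then_lshr(static_cast<uint8_t>(x), y) !=
          mask_low_bits(static_cast<uint8_t>(x), y))
        return false;
  return true;
}
```

Calling `fold_holds_for_i8()` from any test driver should return `true`; the Alive link above remains the authoritative proof for arbitrary bit widths.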

Diff Detail

Repository: rL LLVM

Event Timeline

lebedev.ri created this revision. Jun 9 2018, 3:26 AM

lebedev.ri mentioned this in D47983: [IR][PatternMatch] m_APInt(): allow undef elements. Jun 9 2018, 6:26 AM

Simplify matcher, NFC.

LGTM.

lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744	As with D47981, we could consolidate this, but the constant version doesn't need the one-use check. I think it's fine either way, but let's keep both patches consistent in their structure.

This revision is now accepted and ready to land. Jun 10 2018, 10:03 AM

lebedev.ri added a child revision: D47983: [IR][PatternMatch] m_APInt(): allow undef elements. Jun 10 2018, 10:05 AM

In D47980#1127542, @spatel wrote:

LGTM.

Thank you for the review!

lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744	> As with D47981, we could consolidate this, but the constant version doesn't need the one-use check.

I haven't really thought about that. Sounds like i should add multi-use tests with constants to the tests.

spatel added inline comments. Jun 10 2018, 10:09 AM

lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744	Oops - misplaced this inline comment. It was meant for the block where ShlAmt == ShAmt.

lebedev.ri added inline comments. Jun 10 2018, 12:30 PM

lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744	> As with D47981, we could consolidate this, but the constant version doesn't need the one-use check.

Hmm, this is actually broken. That is only true if either the shifts are the same constant, or `shl` is `nuw`; otherwise it produces one extra instruction.

lebedev.ri added inline comments. Jun 10 2018, 1:08 PM

lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744	I will add tests, but there is one test in `test/Transforms/InstCombine/shift.ll` that breaks if i 'fix' this, so maybe this is intentional..
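
For reference, the constant-amount rewrite under discussion is the one in the code comment `(X << C1) >>u C2 --> (X >>u (C2 - C1)) & (-1 >> C2)`. The sketch below only checks that identity at 8 bits for `C1 <= C2`; it says nothing about instruction counts. The function names are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (x << c1) >>u c2, computed at 8 bits. Requires c1, c2 < 8.
inline uint8_t two_shifts(uint8_t x, unsigned c1, unsigned c2) {
  uint8_t shifted = static_cast<uint8_t>(x << c1); // truncate to 8 bits
  return static_cast<uint8_t>(shifted >> c2);
}

// Rewritten form: (x >>u (c2 - c1)) & (-1 >>u c2). Requires c1 <= c2 < 8.
inline uint8_t shift_and_mask(uint8_t x, unsigned c1, unsigned c2) {
  uint8_t mask = static_cast<uint8_t>(0xFFu >> c2);
  return static_cast<uint8_t>((x >> (c2 - c1)) & mask);
}

// Exhaustive comparison over all 8-bit values and all pairs with c1 <= c2.
inline bool constant_fold_holds_for_i8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c2 = 0; c2 < 8; ++c2)
      for (unsigned c1 = 0; c1 <= c2; ++c1)
        if (two_shifts(static_cast<uint8_t>(x), c1, c2) !=
            shift_and_mask(static_cast<uint8_t>(x), c1, c2))
          return false;
  return true;
}
```

One reading of the "one extra instruction" remark, stated here as an assumption: when the `shl` has other users it cannot be deleted, so the rewritten form keeps the `shl` and adds both an `lshr` and an `and`.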

Closed by commit rL334371: [InstCombine] Fold (x << y) >> y -> x & (-1 >> y) (authored by lebedevri). Jun 10 2018, 1:14 PM

This revision was automatically updated to reflect the committed changes.

Diffusion mentioned this in rL334370: [NFC][InstCombine] Revisit tests for D47980 / D47981 once more.

Diffusion mentioned this in rL334373: Revert rL334371 / D47980: "[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)". Jun 10 2018, 1:36 PM

Ok, that didn't go as planned :)
Did not notice the AMDGPU test change (yay tests), so the bots broke.
Did not evaluate the effect of that change on the AMDGPU codegen yet, will do tomorrow.

Herald added a subscriber: nhaehnle. Jun 10 2018, 2:16 PM

lebedev.ri reopened this revision. Jun 10 2018, 2:17 PM

This revision is now accepted and ready to land. Jun 10 2018, 2:17 PM

lebedev.ri mentioned this in D48005: [NFC][AMDGPU] Add tests for all the various IR patterns equivalent to extracting low bits. Jun 11 2018, 2:49 AM

nhaehnle added inline comments. Jun 11 2018, 2:59 AM

test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
898–899	Indeed, the pattern matching in the backend does not recognize this and ends up generating worse code.

nhaehnle requested changes to this revision. Jun 11 2018, 2:59 AM

This revision now requires changes to proceed. Jun 11 2018, 2:59 AM

This comment has been deleted.

test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
898–899	Yep, see D48005 for the backend tests.

Diffusion mentioned this in rL334398: [NFC][AMDGPU] Add tests for all the various IR patterns equivalent to… Jun 11 2018, 3:25 AM

lebedev.ri mentioned this in D48007: [AMDGPU] Recognize x & (-1 >> (32 - y)) pattern. Jun 11 2018, 3:59 AM

lebedev.ri added a parent revision: D48007: [AMDGPU] Recognize x & (-1 >> (32 - y)) pattern.

@nhaehnle AMDGPU patches (D48007, D48010, D48012) are ready to land, please accept this differential :)

Herald added a subscriber: wdng. Jun 11 2018, 8:05 AM

In general for AMDGPU two shifts are better. Any shift immediate can be folded right into the shift instruction while a rather big mask produced by this change would require either extra 4 bytes in the encoding or even worse a move and a register.

What's the rationale for the folding?

In addition as tests suggest we would expect the pattern to be folded into a bfe instruction but D48005 shows it is at best "bfm" (with an extra register to hold a mask) and "and". I.e. it basically shows a regression for our target. There probably would be no concern if the sequence is converted to a bfe as expected.

In D47980#1128464, @rampitec wrote:

In general for AMDGPU two shifts are better. Any shift immediate can be folded right into the shift instruction while a rather big mask produced by this change would require either extra 4 bytes in the encoding or even worse a move and a register.

What's the rationale for the folding?

In addition as tests suggest we would expect the pattern to be folded into a bfe instruction but D48005 shows it is at best "bfm" (with an extra register to hold a mask) and "and". I.e. it basically shows a regression for our target. There probably would be no concern if the sequence is converted to a bfe as expected.

You did note that i have posted D48007, D48010, D48012 to fix the AMDGPU backend, right?
Any other concerns on top of that?

Ah! I see the reviews to fold it into a bfe. I have no concern then, LGTM.

In D47980#1128473, @rampitec wrote:

Ah! I see the reviews to fold it into a bfe. I have no concern then, LGTM.

Thank you for the reassurance!
Though i'm still waiting for @nhaehnle to re-accept this.

nhaehnle accepted this revision. Jun 15 2018, 2:42 AM

This revision is now accepted and ready to land. Jun 15 2018, 2:42 AM

Thanks!
Do note that there do not seem to be any folds for when the offset is not 0.
E.g. (x >> start) & (-1 >> (width - y)) might be foldable to (UBFE $x, $start, $width)
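
To make the relationship between the masking form and a bit-field extract concrete, here is a hedged sketch. `ubfe_model` is a hypothetical software model of an unsigned bit-field extract, not the AMDGPU intrinsic's definition, and it is only meant for `1 <= width` and `offset + width <= 32`. The mask is written as `-1 >> (32 - width)`, following the D48007 title; that spelling is an assumption about what the comment above intends.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model: take `width` bits of `src` starting at bit
// `offset`. Assumes 1 <= width and offset + width <= 32.
inline uint32_t ubfe_model(uint32_t src, unsigned offset, unsigned width) {
  uint64_t mask = (uint64_t{1} << width) - 1; // 64-bit so width == 32 is safe
  return static_cast<uint32_t>((src >> offset) & mask);
}

// Offset-0 masking form produced by the fold: src & (-1 >> (32 - width)).
inline uint32_t mask_form(uint32_t src, unsigned width) {
  return src & (UINT32_MAX >> (32 - width));
}

// Non-zero offset variant: shift down first, then apply the same mask.
inline uint32_t shift_mask_form(uint32_t src, unsigned offset, unsigned width) {
  return (src >> offset) & (UINT32_MAX >> (32 - width));
}

// Check both forms against the model on a few sample inputs.
inline bool forms_agree() {
  const uint32_t samples[] = {0u, 1u, 0xDEADBEEFu, 0x80000000u, UINT32_MAX};
  for (uint32_t src : samples)
    for (unsigned width = 1; width <= 32; ++width) {
      if (mask_form(src, width) != ubfe_model(src, 0, width))
        return false;
      for (unsigned offset = 0; offset + width <= 32; ++offset)
        if (shift_mask_form(src, offset, width) != ubfe_model(src, offset, width))
          return false;
    }
  return true;
}
```

Under this model the offset-0 case is exactly the pattern the recommitted fold produces, and the shifted variant is the shape that would need a separate backend match.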

Closed by commit rL334818: [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y) (authored by lebedevri). Jun 15 2018, 3:01 AM

This revision was automatically updated to reflect the committed changes.

Diffusion mentioned this in rL334815: [AMDGPU] Recognize x & (-1 >> (32 - y)) pattern.

lebedev.ri added inline comments. Jun 17 2018, 11:07 AM

llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp
741–744 ↗	(On Diff #151480)	BTW it's interesting to note that all these masks are not fine-grained, isn't it? Alive says https://rise4fun.com/Alive/Yes (lol) Though in practice, from what i have seen from the tests, somehow the mask seems to be adjusted later.

lebedev.ri mentioned this in D48768: [X86] When have BMI2, prefer shifts to clear low/high bits, rather than variable mask. Jun 29 2018, 5:15 AM

Diffusion mentioned this in rL336585: [X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts. Jul 9 2018, 12:11 PM

Diff 150662

lib/Transforms/InstCombine/InstCombineShifts.cpp

(732 unchanged lines hidden; enclosing scope: if (match(Op0, m_Shl(m_Value(X), m_APInt(ShOp1)))) {)
       if (ShlAmt < ShAmt) {
         Constant *ShiftDiff = ConstantInt::get(Ty, ShAmt - ShlAmt);
         if (cast<BinaryOperator>(Op0)->hasNoUnsignedWrap()) {
           // (X <<nuw C1) >>u C2 --> X >>u (C2 - C1)
           auto *NewLShr = BinaryOperator::CreateLShr(X, ShiftDiff);
           NewLShr->setIsExact(I.isExact());
           return NewLShr;
         }
         // (X << C1) >>u C2 --> (X >>u (C2 - C1)) & (-1 >> C2)
         Value *NewLShr = Builder.CreateLShr(X, ShiftDiff, "", I.isExact());
         APInt Mask(APInt::getLowBitsSet(BitWidth, BitWidth - ShAmt));
         return BinaryOperator::CreateAnd(NewLShr, ConstantInt::get(Ty, Mask));
       }
       if (ShlAmt > ShAmt) {
         Constant *ShiftDiff = ConstantInt::get(Ty, ShlAmt - ShAmt);
         if (cast<BinaryOperator>(Op0)->hasNoUnsignedWrap()) {
           // (X <<nuw C1) >>u C2 --> X <<nuw (C1 - C2)
           auto *NewShl = BinaryOperator::CreateShl(X, ShiftDiff);
           NewShl->setHasNoUnsignedWrap(true);
           return NewShl;
(53 unchanged lines hidden; enclosing scope: if (match(Op1, m_APInt(ShAmtAPInt))) {)

     // If the shifted-out value is known-zero, then this is an exact shift.
     if (!I.isExact() &&
         MaskedValueIsZero(Op0, APInt::getLowBitsSet(BitWidth, ShAmt), 0, &I)) {
       I.setIsExact();
       return &I;
     }
   }

+  // Transform (x << y) >> y to x & (-1 >> y)
+  Value *X;
+  if (match(Op0, m_OneUse(m_Shl(m_Value(X), m_Specific(Op1))))) {
+    Constant *AllOnes = ConstantInt::getAllOnesValue(Ty);
+    Value *Mask = Builder.CreateLShr(AllOnes, Op1);
+    return BinaryOperator::CreateAnd(Mask, X);
+  }
+
   return nullptr;
 }

 Instruction *InstCombiner::visitAShr(BinaryOperator &I) {
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
   if (Value *V =
           SimplifyAShrInst(Op0, Op1, I.isExact(), SQ.getWithInstruction(&I)))
     return replaceInstUsesWith(I, V);
(75 unchanged lines hidden)

test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll

(889 unchanged lines hidden)
 ; CHECK-NEXT: %bfe = call i32 @llvm.amdgcn.ubfe.i32(i32 %src, i32 1, i32 %width)
 define i32 @ubfe_offset_33(i32 %src, i32 %width) {
   %bfe = call i32 @llvm.amdgcn.ubfe.i32(i32 %src, i32 33, i32 %width)
   ret i32 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_0(
 ; CHECK-NEXT: %1 = sub i32 32, %width
-; CHECK-NEXT: %2 = shl i32 %src, %1
-; CHECK-NEXT: %bfe = lshr i32 %2, %1
+; CHECK-NEXT: %2 = lshr i32 -1, %1
+; CHECK-NEXT: %bfe = and i32 %2, %src
 ; CHECK-NEXT: ret i32 %bfe
 define i32 @ubfe_offset_0(i32 %src, i32 %width) {
   %bfe = call i32 @llvm.amdgcn.ubfe.i32(i32 %src, i32 0, i32 %width)
   ret i32 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_32(
 ; CHECK-NEXT: %1 = sub i32 32, %width
-; CHECK-NEXT: %2 = shl i32 %src, %1
-; CHECK-NEXT: %bfe = lshr i32 %2, %1
+; CHECK-NEXT: %2 = lshr i32 -1, %1
+; CHECK-NEXT: %bfe = and i32 %2, %src
 ; CHECK-NEXT: ret i32 %bfe
 define i32 @ubfe_offset_32(i32 %src, i32 %width) {
   %bfe = call i32 @llvm.amdgcn.ubfe.i32(i32 %src, i32 32, i32 %width)
   ret i32 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_31(
 ; CHECK-NEXT: %1 = sub i32 32, %width
-; CHECK-NEXT: %2 = shl i32 %src, %1
-; CHECK-NEXT: %bfe = lshr i32 %2, %1
+; CHECK-NEXT: %2 = lshr i32 -1, %1
+; CHECK-NEXT: %bfe = and i32 %2, %src
 ; CHECK-NEXT: ret i32 %bfe
 define i32 @ubfe_offset_31(i32 %src, i32 %width) {
   %bfe = call i32 @llvm.amdgcn.ubfe.i32(i32 %src, i32 32, i32 %width)
   ret i32 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_0_width_0(
 ; CHECK-NEXT: ret i32 0
(69 unchanged lines hidden)
 define i64 @ubfe_offset_33_width_4_i64(i64 %src) {
   %bfe = call i64 @llvm.amdgcn.ubfe.i64(i64 %src, i32 33, i32 4)
   ret i64 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_0_i64(
 ; CHECK-NEXT: %1 = sub i32 64, %width
 ; CHECK-NEXT: %2 = zext i32 %1 to i64
-; CHECK-NEXT: %3 = shl i64 %src, %2
-; CHECK-NEXT: %bfe = lshr i64 %3, %2
+; CHECK-NEXT: %3 = lshr i64 -1, %2
+; CHECK-NEXT: %bfe = and i64 %3, %src
 ; CHECK-NEXT: ret i64 %bfe
 define i64 @ubfe_offset_0_i64(i64 %src, i32 %width) {
   %bfe = call i64 @llvm.amdgcn.ubfe.i64(i64 %src, i32 0, i32 %width)
   ret i64 %bfe
 }

 ; CHECK-LABEL: @ubfe_offset_32_width_32_i64(
 ; CHECK-NEXT: %bfe = lshr i64 %src, 32
(711 unchanged lines hidden)

test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -instcombine -S | FileCheck %s

 ; https://bugs.llvm.org/show_bug.cgi?id=37603

 ; Pattern:
 ;   x << y >> y
 ; Should be transformed into:
 ;   x & (-1 >> y)

 ; ============================================================================ ;
 ; Basic positive tests
 ; ============================================================================ ;

 define i32 @positive_samevar(i32 %x, i32 %y) {
 ; CHECK-LABEL: @positive_samevar(
-; CHECK-NEXT: [[TMP0:%.*]] = shl i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = lshr i32 [[TMP0]], [[Y]]
+; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 -1, [[Y:%.*]]
+; CHECK-NEXT: [[RET:%.*]] = and i32 [[TMP1]], [[X:%.*]]
 ; CHECK-NEXT: ret i32 [[RET]]
 ;
   %tmp0 = shl i32 %x, %y
   %ret = lshr i32 %tmp0, %y
   ret i32 %ret
 }

 define i32 @positive_sameconst(i32 %x) {
(92 unchanged lines hidden)
 }

 ; ============================================================================ ;
 ; Vector
 ; ============================================================================ ;

 define <2 x i32> @positive_samevar_vec(<2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @positive_samevar_vec(
-; CHECK-NEXT: [[TMP0:%.*]] = shl <2 x i32> [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = lshr <2 x i32> [[TMP0]], [[Y]]
+; CHECK-NEXT: [[TMP1:%.*]] = lshr <2 x i32> <i32 -1, i32 -1>, [[Y:%.*]]
+; CHECK-NEXT: [[RET:%.*]] = and <2 x i32> [[TMP1]], [[X:%.*]]
 ; CHECK-NEXT: ret <2 x i32> [[RET]]
 ;
   %tmp0 = shl <2 x i32> %x, %y
   %ret = lshr <2 x i32> %tmp0, %y
   ret <2 x i32> %ret
 }

 ; ============================================================================ ;
(29 unchanged lines hidden)
 ;
   %tmp0 = shl <3 x i32> %x, <i32 5, i32 5, i32 5>
   %ret = lshr <3 x i32> %tmp0, <i32 5, i32 undef, i32 5>
   ret <3 x i32> %ret
 }

 define <3 x i32> @positive_sameconst_vec_undef2(<3 x i32> %x) {
 ; CHECK-LABEL: @positive_sameconst_vec_undef2(
-; CHECK-NEXT: [[TMP0:%.*]] = shl <3 x i32> [[X:%.*]], <i32 5, i32 undef, i32 5>
-; CHECK-NEXT: [[RET:%.*]] = lshr <3 x i32> [[TMP0]], <i32 5, i32 undef, i32 5>
+; CHECK-NEXT: [[RET:%.*]] = and <3 x i32> [[X:%.*]], <i32 134217727, i32 undef, i32 134217727>
 ; CHECK-NEXT: ret <3 x i32> [[RET]]
 ;
   %tmp0 = shl <3 x i32> %x, <i32 5, i32 undef, i32 5>
   %ret = lshr <3 x i32> %tmp0, <i32 5, i32 undef, i32 5>
   ret <3 x i32> %ret
 }

 define <2 x i32> @positive_biggerShl_vec(<2 x i32> %x) {
(216 unchanged lines hidden)

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
Closed, Public
