This is an archive of the discontinued LLVM Phabricator instance.

Differential D71064

[InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`)
Closed · Public

Authored by lebedev.ri on Dec 5 2019, 6:21 AM.

Details

Reviewers

spatel
efriedma
t.p.northover
hfinkel

Commits

rG796fa662f128: [InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to…

Summary

D68408 proposes to greatly improve our negation-sinking abilities.
But the current canonicalization produces `sub A, zext(B)`,
which we would then consider non-canonical and try to sink that negation,
undoing the existing canonicalization.
So unless we explicitly stop producing the previous canonical form,
we will have two conflicting folds and will end up looping endlessly.

This inverts canonicalization, and adds back the obvious fold
that we'd miss:

sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y) https://rise4fun.com/Alive/xx4
sext(bool) + C -> bool ? C - 1 : C https://rise4fun.com/Alive/fBl

It is obvious that @ossfuzz_9880(), @lshr_out_of_range(), and @ashr_out_of_range() (oss-fuzz 4871)
are no longer folded as much, though it isn't obvious whether those missed folds are important.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. Dec 5 2019, 6:21 AM

Herald added a subscriber: hiraditya. Dec 5 2019, 6:22 AM

The select transform is 1 that I thought about but never got around to implementing. We should have at least 1 minimal test for that pattern, but I don't see that in the diffs?

Did you confirm that codegen is equal or better for these cases (apart from the fuzzer tests - I agree that those are not important)? I think we have DAGCombiner reversals for this transform, but some targets that seem like they would benefit have not enabled the TLI hook.

In D71064#1770937, @spatel wrote:

The select transform is 1 that I thought about but never got around to implementing. We should have at least 1 minimal test for that pattern, but I don't see that in the diffs?

See llvm-project/llvm/test/Transforms/InstCombine/add.ll.

llvm/test/Transforms/InstCombine/add.ll
Lines 4–44 (On Diff #232330): @spatel i believe this is the test coverage for select transform

In D71064#1770937, @spatel wrote:

Did you confirm that codegen is equal or better for these cases?
I think we have DAGCombiner reversals for this transform, but some targets that seem like they would benefit have not enabled the TLI hook.

I didn't yet. So, as far as i can tell, these are all the interesting cases:
(i'm looking at @t0_new_canon vs @t1_old_canon since that is the only change in this patch)
https://godbolt.org/z/vbb25Q - no regression for x86
https://godbolt.org/z/Htg7m5 - aarch64 also looks ok?
https://godbolt.org/z/xmP6JV - arm good?
https://godbolt.org/z/vNrNJ8 - thumb good?

So i'd say everything is already covered by backend undo folds?
Let me know if i'm missing the point here?

(apart from the fuzzer tests - I agree that those are not important)

Good to know.

In D71064#1770969, @lebedev.ri wrote:

I didn't yet. So, as far as i can tell, these are all the interesting cases:
(i'm looking at @t0_new_canon vs @t1_old_canon since that is the only change in this patch)
https://godbolt.org/z/vbb25Q - no regression for x86
https://godbolt.org/z/Htg7m5 - aarch64 also looks ok?
https://godbolt.org/z/xmP6JV - arm good?
https://godbolt.org/z/vNrNJ8 - thumb good?

So i'd say everything is already covered by backend undo folds?
Let me know if i'm missing the point here?

Scalar looks same all-around. Vector shows some potential diffs:
https://godbolt.org/z/y3E-mb

If I'm seeing it correctly, we always do better on the typical case where the bool vector is produced by a compare, but we might do worse if we don't have that cmp and don't have AssertSext knowledge.

In D71064#1771030, @spatel wrote:

Scalar looks same all-around. Vector shows some potential diffs:
https://godbolt.org/z/y3E-mb

If I'm seeing it correctly, we always do better on the typical case where the bool vector is produced by a compare, but we might do worse if we don't have that cmp and don't have AssertSext knowledge.

So the comment is that the undo fold needs to be adjusted first, to fire for non-cmp i1 vectors on aarch64 and powerpc64le?

In D71064#1771063, @lebedev.ri wrote:

So the comment is that the undo fold needs to be adjusted first, to fire for non-cmp i1 vectors on aarch64 and powerpc64le?

I don't think that's necessary as a preliminary step because we're improving what should be the common patterns (the examples that include the cmp).
LGTM.

This revision is now accepted and ready to land. Dec 5 2019, 9:30 AM

In D71064#1771082, @spatel wrote:

I don't think that's necessary as a preliminary step because we're improving what should be the common patterns (the examples that include the cmp).
LGTM.

Ah, okay, good point :)
Thank you for the review.

Closed by commit rG796fa662f128: [InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to… (authored by lebedev.ri). Dec 5 2019, 10:24 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

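Path	Size
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp	18 lines
llvm/test/Transforms/InstCombine/apint-shift.ll	7 lines
llvm/test/Transforms/InstCombine/logical-select.ll	4 lines
llvm/test/Transforms/InstCombine/select.ll	8 lines
llvm/test/Transforms/InstCombine/shift.ll	19 lines
llvm/test/Transforms/InstCombine/zext-bool-add-sub.ll	20 lines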

Diff 232386

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp

[884 lines not shown]	Instruction *InstCombiner::foldAddWithConstant(BinaryOperator &Add) {
if (match(Op0, m_OneUse(m_Sub(m_Value(X), m_Value(Y)))) &&		if (match(Op0, m_OneUse(m_Sub(m_Value(X), m_Value(Y)))) &&
match(Op1, m_AllOnes()))		match(Op1, m_AllOnes()))
return BinaryOperator::CreateAdd(Builder.CreateNot(Y), X);		return BinaryOperator::CreateAdd(Builder.CreateNot(Y), X);

// zext(bool) + C -> bool ? C + 1 : C		// zext(bool) + C -> bool ? C + 1 : C
if (match(Op0, m_ZExt(m_Value(X))) &&		if (match(Op0, m_ZExt(m_Value(X))) &&
X->getType()->getScalarSizeInBits() == 1)		X->getType()->getScalarSizeInBits() == 1)
return SelectInst::Create(X, AddOne(Op1C), Op1);		return SelectInst::Create(X, AddOne(Op1C), Op1);
		// sext(bool) + C -> bool ? C - 1 : C
		if (match(Op0, m_SExt(m_Value(X))) &&
		X->getType()->getScalarSizeInBits() == 1)
		return SelectInst::Create(X, SubOne(Op1C), Op1);

// ~X + C --> (C-1) - X		// ~X + C --> (C-1) - X
if (match(Op0, m_Not(m_Value(X))))		if (match(Op0, m_Not(m_Value(X))))
return BinaryOperator::CreateSub(SubOne(Op1C), X);		return BinaryOperator::CreateSub(SubOne(Op1C), X);

const APInt *C;		const APInt *C;
if (!match(Op1, m_APInt(C)))		if (!match(Op1, m_APInt(C)))
return nullptr;		return nullptr;
[382 lines not shown]	if (match(LHS, m_Neg(m_Value(A)))) {
// -A + -B --> -(A + B)		// -A + -B --> -(A + B)
if (match(RHS, m_Neg(m_Value(B))))		if (match(RHS, m_Neg(m_Value(B))))
return BinaryOperator::CreateNeg(Builder.CreateAdd(A, B));		return BinaryOperator::CreateNeg(Builder.CreateAdd(A, B));

// -A + B --> B - A		// -A + B --> B - A
return BinaryOperator::CreateSub(RHS, A);		return BinaryOperator::CreateSub(RHS, A);
}		}

// Canonicalize sext to zext for better value tracking potential.
// add A, sext(B) --> sub A, zext(B)
if (match(&I, m_c_Add(m_Value(A), m_OneUse(m_SExt(m_Value(B))))) &&
B->getType()->isIntOrIntVectorTy(1))
return BinaryOperator::CreateSub(A, Builder.CreateZExt(B, Ty));

// A + -B --> A - B		// A + -B --> A - B
if (match(RHS, m_Neg(m_Value(B))))		if (match(RHS, m_Neg(m_Value(B))))
return BinaryOperator::CreateSub(LHS, B);		return BinaryOperator::CreateSub(LHS, B);

if (Value *V = checkForNegativeOperand(I, Builder))		if (Value *V = checkForNegativeOperand(I, Builder))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

// (A + 1) + ~B --> A - B		// (A + 1) + ~B --> A - B
[613 lines not shown]	if (Op1->hasOneUse()) {
// 'nuw' is dropped in favor of the canonical form.		// 'nuw' is dropped in favor of the canonical form.
if (match(Op1, m_SExt(m_Value(Y))) &&		if (match(Op1, m_SExt(m_Value(Y))) &&
Y->getType()->getScalarSizeInBits() == 1) {		Y->getType()->getScalarSizeInBits() == 1) {
Value *Zext = Builder.CreateZExt(Y, I.getType());		Value *Zext = Builder.CreateZExt(Y, I.getType());
BinaryOperator *Add = BinaryOperator::CreateAdd(Op0, Zext);		BinaryOperator *Add = BinaryOperator::CreateAdd(Op0, Zext);
Add->setHasNoSignedWrap(I.hasNoSignedWrap());		Add->setHasNoSignedWrap(I.hasNoSignedWrap());
return Add;		return Add;
}		}
		// sub [nsw] X, zext(bool Y) -> add [nsw] X, sext(bool Y)
		// 'nuw' is dropped in favor of the canonical form.
		if (match(Op1, m_ZExt(m_Value(Y))) && Y->getType()->isIntOrIntVectorTy(1)) {
		Value *Sext = Builder.CreateSExt(Y, I.getType());
		BinaryOperator *Add = BinaryOperator::CreateAdd(Op0, Sext);
		Add->setHasNoSignedWrap(I.hasNoSignedWrap());
		return Add;
		}

// X - A*-B -> X + A*B		// X - A*-B -> X + A*B
// X - -A*B -> X + A*B		// X - -A*B -> X + A*B
Value *A, *B;		Value *A, *B;
if (match(Op1, m_c_Mul(m_Value(A), m_Neg(m_Value(B)))))		if (match(Op1, m_c_Mul(m_Value(A), m_Neg(m_Value(B)))))
return BinaryOperator::CreateAdd(Op0, Builder.CreateMul(A, B));		return BinaryOperator::CreateAdd(Op0, Builder.CreateMul(A, B));

// X - A*C -> X + A*-C		// X - A*C -> X + A*-C
[273 lines not shown]

llvm/test/Transforms/InstCombine/apint-shift.ll

	[527 lines not shown]
	}			}

	; OSS-Fuzz #9880			; OSS-Fuzz #9880
	; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9880			; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9880
	define i177 @ossfuzz_9880(i177 %X) {			define i177 @ossfuzz_9880(i177 %X) {
	; CHECK-LABEL: @ossfuzz_9880(			; CHECK-LABEL: @ossfuzz_9880(
	; CHECK-NEXT: [[A:%.*]] = alloca i177, align 8			; CHECK-NEXT: [[A:%.*]] = alloca i177, align 8
	; CHECK-NEXT: [[L1:%.*]] = load i177, i177* [[A]], align 8			; CHECK-NEXT: [[L1:%.*]] = load i177, i177* [[A]], align 8
	; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i177 [[L1]], 0			; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i177 [[L1]], -1
	; CHECK-NEXT: [[B1:%.*]] = zext i1 [[TMP1]] to i177			; CHECK-NEXT: [[TMP2:%.*]] = sext i1 [[TMP1]] to i177
				; CHECK-NEXT: [[B14:%.*]] = add i177 [[L1]], [[TMP2]]
				; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i177 [[B14]], -1
				; CHECK-NEXT: [[B1:%.*]] = zext i1 [[TMP3]] to i177
	; CHECK-NEXT: ret i177 [[B1]]			; CHECK-NEXT: ret i177 [[B1]]
	;			;
	%A = alloca i177			%A = alloca i177
	%L1 = load i177, i177* %A			%L1 = load i177, i177* %A
	%B = or i177 0, -1			%B = or i177 0, -1
	%B5 = udiv i177 %L1, %B			%B5 = udiv i177 %L1, %B
	%B4 = add i177 %B5, %B			%B4 = add i177 %B5, %B
	%B2 = add i177 %B, %B4			%B2 = add i177 %B, %B4
	%B6 = mul i177 %B5, %B2			%B6 = mul i177 %B5, %B2
	%B20 = shl i177 %L1, %B6			%B20 = shl i177 %L1, %B6
	%B14 = sub i177 %B20, %B5			%B14 = sub i177 %B20, %B5
	%B1 = udiv i177 %B14, %B6			%B1 = udiv i177 %B14, %B6
	ret i177 %B1			ret i177 %B1
	}			}

llvm/test/Transforms/InstCombine/logical-select.ll

	[509 lines not shown]
	}			}

	; Allow the transform even if the mask values have multiple uses because			; Allow the transform even if the mask values have multiple uses because
	; there's still a net reduction of instructions from removing the and/and/or.			; there's still a net reduction of instructions from removing the and/and/or.

	define <4 x i32> @vec_sel_xor_multi_use(<4 x i32> %a, <4 x i32> %b, <4 x i1> %c) {			define <4 x i32> @vec_sel_xor_multi_use(<4 x i32> %a, <4 x i32> %b, <4 x i1> %c) {
	; CHECK-LABEL: @vec_sel_xor_multi_use(			; CHECK-LABEL: @vec_sel_xor_multi_use(
	; CHECK-NEXT: [[TMP1:%.*]] = xor <4 x i1> [[C:%.*]], <i1 true, i1 false, i1 false, i1 false>			; CHECK-NEXT: [[TMP1:%.*]] = xor <4 x i1> [[C:%.*]], <i1 true, i1 false, i1 false, i1 false>
				; CHECK-NEXT: [[MASK_FLIP1:%.*]] = sext <4 x i1> [[TMP1]] to <4 x i32>
	; CHECK-NEXT: [[TMP2:%.*]] = xor <4 x i1> [[C]], <i1 false, i1 true, i1 true, i1 true>			; CHECK-NEXT: [[TMP2:%.*]] = xor <4 x i1> [[C]], <i1 false, i1 true, i1 true, i1 true>
	; CHECK-NEXT: [[TMP3:%.*]] = select <4 x i1> [[TMP2]], <4 x i32> [[A:%.*]], <4 x i32> [[B:%.*]]			; CHECK-NEXT: [[TMP3:%.*]] = select <4 x i1> [[TMP2]], <4 x i32> [[A:%.*]], <4 x i32> [[B:%.*]]
	; CHECK-NEXT: [[TMP4:%.*]] = zext <4 x i1> [[TMP1]] to <4 x i32>			; CHECK-NEXT: [[ADD:%.*]] = add <4 x i32> [[TMP3]], [[MASK_FLIP1]]
	; CHECK-NEXT: [[ADD:%.*]] = sub <4 x i32> [[TMP3]], [[TMP4]]
	; CHECK-NEXT: ret <4 x i32> [[ADD]]			; CHECK-NEXT: ret <4 x i32> [[ADD]]
	;			;
	%mask = sext <4 x i1> %c to <4 x i32>			%mask = sext <4 x i1> %c to <4 x i32>
	%mask_flip1 = xor <4 x i32> %mask, <i32 -1, i32 0, i32 0, i32 0>			%mask_flip1 = xor <4 x i32> %mask, <i32 -1, i32 0, i32 0, i32 0>
	%not_mask_flip1 = xor <4 x i32> %mask, <i32 0, i32 -1, i32 -1, i32 -1>			%not_mask_flip1 = xor <4 x i32> %mask, <i32 0, i32 -1, i32 -1, i32 -1>
	%and1 = and <4 x i32> %not_mask_flip1, %a			%and1 = and <4 x i32> %not_mask_flip1, %a
	%and2 = and <4 x i32> %mask_flip1, %b			%and2 = and <4 x i32> %mask_flip1, %b
	%or = or <4 x i32> %and1, %and2			%or = or <4 x i32> %and1, %and2
	[108 lines not shown]

llvm/test/Transforms/InstCombine/select.ll

[688 lines not shown]	;
%s = select i1 %cond, i32 %y, i32 %z		%s = select i1 %cond, i32 %y, i32 %z
%r = and i32 %x, %s		%r = and i32 %x, %s
ret i32 %r		ret i32 %r
}		}

define i32 @test42(i32 %x, i32 %y) {		define i32 @test42(i32 %x, i32 %y) {
; CHECK-LABEL: @test42(		; CHECK-LABEL: @test42(
; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[X:%.*]], 0		; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[X:%.*]], 0
; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[COND]] to i32		; CHECK-NEXT: [[B:%.*]] = sext i1 [[COND]] to i32
; CHECK-NEXT: [[C:%.*]] = sub i32 [[Y:%.*]], [[TMP1]]		; CHECK-NEXT: [[C:%.*]] = add i32 [[B]], [[Y:%.*]]
; CHECK-NEXT: ret i32 [[C]]		; CHECK-NEXT: ret i32 [[C]]
;		;
%b = add i32 %y, -1		%b = add i32 %y, -1
%cond = icmp eq i32 %x, 0		%cond = icmp eq i32 %x, 0
%c = select i1 %cond, i32 %b, i32 %y		%c = select i1 %cond, i32 %b, i32 %y
ret i32 %c		ret i32 %c
}		}

define <2 x i32> @test42vec(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @test42vec(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @test42vec(		; CHECK-LABEL: @test42vec(
; CHECK-NEXT: [[COND:%.*]] = icmp eq <2 x i32> [[X:%.*]], zeroinitializer		; CHECK-NEXT: [[COND:%.*]] = icmp eq <2 x i32> [[X:%.*]], zeroinitializer
; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i1> [[COND]] to <2 x i32>		; CHECK-NEXT: [[B:%.*]] = sext <2 x i1> [[COND]] to <2 x i32>
; CHECK-NEXT: [[C:%.*]] = sub <2 x i32> [[Y:%.*]], [[TMP1]]		; CHECK-NEXT: [[C:%.*]] = add <2 x i32> [[B]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[C]]		; CHECK-NEXT: ret <2 x i32> [[C]]
;		;
%b = add <2 x i32> %y, <i32 -1, i32 -1>		%b = add <2 x i32> %y, <i32 -1, i32 -1>
%cond = icmp eq <2 x i32> %x, zeroinitializer		%cond = icmp eq <2 x i32> %x, zeroinitializer
%c = select <2 x i1> %cond, <2 x i32> %b, <2 x i32> %y		%c = select <2 x i1> %cond, <2 x i32> %b, <2 x i32> %y
ret <2 x i32> %c		ret <2 x i32> %c
}		}

[848 lines not shown]

llvm/test/Transforms/InstCombine/shift.ll

[1,622 lines not shown]	;
%3 = ashr i32 %2, 1		%3 = ashr i32 %2, 1
ret i32 %3		ret i32 %3
}		}

; OSS Fuzz #4871		; OSS Fuzz #4871
; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4871		; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4871
define i177 @lshr_out_of_range(i177 %Y, i177** %A2) {		define i177 @lshr_out_of_range(i177 %Y, i177** %A2) {
; CHECK-LABEL: @lshr_out_of_range(		; CHECK-LABEL: @lshr_out_of_range(
; CHECK-NEXT: store i177** [[A2:%.*]], i177*** undef, align 8		; CHECK-NEXT: [[TMP1:%.*]] = icmp ne i177 [[Y:%.*]], -1
		; CHECK-NEXT: [[B4:%.*]] = sext i1 [[TMP1]] to i177
		; CHECK-NEXT: [[C8:%.*]] = icmp ult i177 [[B4]], [[Y]]
		; CHECK-NEXT: [[TMP2:%.*]] = sext i1 [[C8]] to i64
		; CHECK-NEXT: [[G18:%.*]] = getelementptr i177*, i177** [[A2:%.*]], i64 [[TMP2]]
		; CHECK-NEXT: store i177** [[G18]], i177*** undef, align 8
; CHECK-NEXT: ret i177 0		; CHECK-NEXT: ret i177 0
;		;
%B5 = udiv i177 %Y, -1		%B5 = udiv i177 %Y, -1
%B4 = add i177 %B5, -1		%B4 = add i177 %B5, -1
%B2 = add i177 %B4, -1		%B2 = add i177 %B4, -1
%B6 = mul i177 %B5, %B2		%B6 = mul i177 %B5, %B2
%B3 = add i177 %B2, %B2		%B3 = add i177 %B2, %B2
%B10 = sub i177 %B5, %B3		%B10 = sub i177 %B5, %B3
%B12 = lshr i177 %Y, %B6		%B12 = lshr i177 %Y, %B6
%C8 = icmp ugt i177 %B12, %B4		%C8 = icmp ugt i177 %B12, %B4
%G18 = getelementptr i177*, i177** %A2, i1 %C8		%G18 = getelementptr i177*, i177** %A2, i1 %C8
store i177** %G18, i177*** undef		store i177** %G18, i177*** undef
%B1 = udiv i177 %B10, %B6		%B1 = udiv i177 %B10, %B6
ret i177 %B1		ret i177 %B1
}		}

; OSS Fuzz #5032		; OSS Fuzz #5032
; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5032		; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5032
define void @ashr_out_of_range(i177* %A) {		define void @ashr_out_of_range(i177* %A) {
; CHECK-LABEL: @ashr_out_of_range(		; CHECK-LABEL: @ashr_out_of_range(
		; CHECK-NEXT: [[L:%.*]] = load i177, i177* [[A:%.*]], align 4
		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i177 [[L]], -1
		; CHECK-NEXT: [[B2:%.*]] = select i1 [[TMP1]], i64 -1, i64 -2
		; CHECK-NEXT: [[G11:%.*]] = getelementptr i177, i177* [[A]], i64 [[B2]]
		; CHECK-NEXT: [[L7:%.*]] = load i177, i177* [[G11]], align 4
		; CHECK-NEXT: [[B36:%.*]] = select i1 [[TMP1]], i177 0, i177 [[L7]]
		; CHECK-NEXT: [[C17:%.*]] = icmp sgt i177 [[B36]], [[L7]]
		; CHECK-NEXT: [[TMP2:%.*]] = sext i1 [[C17]] to i64
		; CHECK-NEXT: [[G62:%.*]] = getelementptr i177, i177* [[G11]], i64 [[TMP2]]
		; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i177 [[L7]], -1
		; CHECK-NEXT: [[B28:%.*]] = select i1 [[TMP3]], i177 0, i177 [[L7]]
		; CHECK-NEXT: store i177 [[B28]], i177* [[G62]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
%L = load i177, i177* %A		%L = load i177, i177* %A
%B5 = udiv i177 %L, -1		%B5 = udiv i177 %L, -1
%B4 = add i177 %B5, -1		%B4 = add i177 %B5, -1
%B2 = add i177 %B4, -1		%B2 = add i177 %B4, -1
%G11 = getelementptr i177, i177* %A, i177 %B2		%G11 = getelementptr i177, i177* %A, i177 %B2
%L7 = load i177, i177* %G11		%L7 = load i177, i177* %G11
[10 lines not shown]

llvm/test/Transforms/InstCombine/zext-bool-add-sub.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -instcombine -S | FileCheck %s		; RUN: opt < %s -instcombine -S | FileCheck %s

; rdar://11748024		; rdar://11748024

define i32 @a(i1 zeroext %x, i1 zeroext %y) {		define i32 @a(i1 zeroext %x, i1 zeroext %y) {
; CHECK-LABEL: @a(		; CHECK-LABEL: @a(
		; CHECK-NEXT: [[CONV3_NEG:%.*]] = sext i1 [[Y:%.*]] to i32
; CHECK-NEXT: [[SUB:%.*]] = select i1 [[X:%.*]], i32 2, i32 1		; CHECK-NEXT: [[SUB:%.*]] = select i1 [[X:%.*]], i32 2, i32 1
; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[Y:%.*]] to i32		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[SUB]], [[CONV3_NEG]]
; CHECK-NEXT: [[ADD:%.*]] = sub nsw i32 [[SUB]], [[TMP1]]
; CHECK-NEXT: ret i32 [[ADD]]		; CHECK-NEXT: ret i32 [[ADD]]
;		;
%conv = zext i1 %x to i32		%conv = zext i1 %x to i32
%conv3 = zext i1 %y to i32		%conv3 = zext i1 %y to i32
%conv3.neg = sub i32 0, %conv3		%conv3.neg = sub i32 0, %conv3
%sub = add i32 %conv, 1		%sub = add i32 %conv, 1
%add = add i32 %sub, %conv3.neg		%add = add i32 %sub, %conv3.neg
ret i32 %add		ret i32 %add
[293 lines not shown]
;		;
%sext = sext i1 %y to i8		%sext = sext i1 %y to i8
%sub = sub nuw i8 %x, %sext		%sub = sub nuw i8 %x, %sext
ret i8 %sub		ret i8 %sub
}		}

define i32 @sextbool_add(i1 %c, i32 %x) {		define i32 @sextbool_add(i1 %c, i32 %x) {
; CHECK-LABEL: @sextbool_add(		; CHECK-LABEL: @sextbool_add(
; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[C:%.*]] to i32		; CHECK-NEXT: [[B:%.*]] = sext i1 [[C:%.*]] to i32
; CHECK-NEXT: [[S:%.*]] = sub i32 [[X:%.*]], [[TMP1]]		; CHECK-NEXT: [[S:%.*]] = add i32 [[B]], [[X:%.*]]
; CHECK-NEXT: ret i32 [[S]]		; CHECK-NEXT: ret i32 [[S]]
;		;
%b = sext i1 %c to i32		%b = sext i1 %c to i32
%s = add i32 %b, %x		%s = add i32 %b, %x
ret i32 %s		ret i32 %s
}		}

define i32 @sextbool_add_commute(i1 %c, i32 %px) {		define i32 @sextbool_add_commute(i1 %c, i32 %px) {
; CHECK-LABEL: @sextbool_add_commute(		; CHECK-LABEL: @sextbool_add_commute(
; CHECK-NEXT: [[X:%.*]] = urem i32 [[PX:%.*]], 42		; CHECK-NEXT: [[X:%.*]] = urem i32 [[PX:%.*]], 42
; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[C:%.*]] to i32		; CHECK-NEXT: [[B:%.*]] = sext i1 [[C:%.*]] to i32
; CHECK-NEXT: [[S:%.*]] = sub nsw i32 [[X]], [[TMP1]]		; CHECK-NEXT: [[S:%.*]] = add nsw i32 [[X]], [[B]]
; CHECK-NEXT: ret i32 [[S]]		; CHECK-NEXT: ret i32 [[S]]
;		;
%x = urem i32 %px, 42 ; thwart complexity-based canonicalization		%x = urem i32 %px, 42 ; thwart complexity-based canonicalization
%b = sext i1 %c to i32		%b = sext i1 %c to i32
%s = add i32 %x, %b		%s = add i32 %x, %b
ret i32 %s		ret i32 %s
}		}

[11 lines not shown]	;
%b = sext i1 %c to i32		%b = sext i1 %c to i32
call void @use32(i32 %b)		call void @use32(i32 %b)
%s = add i32 %b, %x		%s = add i32 %b, %x
ret i32 %s		ret i32 %s
}		}

define <4 x i32> @sextbool_add_vector(<4 x i1> %c, <4 x i32> %x) {		define <4 x i32> @sextbool_add_vector(<4 x i1> %c, <4 x i32> %x) {
; CHECK-LABEL: @sextbool_add_vector(		; CHECK-LABEL: @sextbool_add_vector(
; CHECK-NEXT: [[TMP1:%.*]] = zext <4 x i1> [[C:%.*]] to <4 x i32>		; CHECK-NEXT: [[B:%.*]] = sext <4 x i1> [[C:%.*]] to <4 x i32>
; CHECK-NEXT: [[S:%.*]] = sub <4 x i32> [[X:%.*]], [[TMP1]]		; CHECK-NEXT: [[S:%.*]] = add <4 x i32> [[B]], [[X:%.*]]
; CHECK-NEXT: ret <4 x i32> [[S]]		; CHECK-NEXT: ret <4 x i32> [[S]]
;		;
%b = sext <4 x i1> %c to <4 x i32>		%b = sext <4 x i1> %c to <4 x i32>
%s = add <4 x i32> %x, %b		%s = add <4 x i32> %x, %b
ret <4 x i32> %s		ret <4 x i32> %s
}		}

define i32 @zextbool_sub(i1 %c, i32 %x) {		define i32 @zextbool_sub(i1 %c, i32 %x) {
[17 lines not shown]	;
%b = zext i1 %c to i32		%b = zext i1 %c to i32
call void @use32(i32 %b)		call void @use32(i32 %b)
%s = sub i32 %x, %b		%s = sub i32 %x, %b
ret i32 %s		ret i32 %s
}		}

define <4 x i32> @zextbool_sub_vector(<4 x i1> %c, <4 x i32> %x) {		define <4 x i32> @zextbool_sub_vector(<4 x i1> %c, <4 x i32> %x) {
; CHECK-LABEL: @zextbool_sub_vector(		; CHECK-LABEL: @zextbool_sub_vector(
; CHECK-NEXT: [[B:%.*]] = zext <4 x i1> [[C:%.*]] to <4 x i32>		; CHECK-NEXT: [[TMP1:%.*]] = sext <4 x i1> [[C:%.*]] to <4 x i32>
; CHECK-NEXT: [[S:%.*]] = sub <4 x i32> [[X:%.*]], [[B]]		; CHECK-NEXT: [[S:%.*]] = add <4 x i32> [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret <4 x i32> [[S]]		; CHECK-NEXT: ret <4 x i32> [[S]]
;		;
%b = zext <4 x i1> %c to <4 x i32>		%b = zext <4 x i1> %c to <4 x i32>
%s = sub <4 x i32> %x, %b		%s = sub <4 x i32> %x, %b
ret <4 x i32> %s		ret <4 x i32> %s
}		}