This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Add support for saturating add/sub
ClosedPublic

Authored by nikic on Nov 14 2018, 8:53 AM.

Details

Reviewers

spatel
RKSimon
craig.topper

Summary

This adds support for saturating add/sub intrinsics to InstCombine. In particular, the following folds are supported:

sat(X1 + X2) -> add nuw/nsw where known no overflow
sat(X1 - X2) -> sub nuw/nsw where known no overflow
sat(X1 uadd X2) -> MAX where known overflow
sat(X1 usub X2) -> 0 where known overflow
sat(X ssub C) -> sat(X sadd -C) where C != MIN
sat(sat(X + C1) + C2) -> sat(X + (C1 + C2)) where legal
sat(sat(X - C1) - C2) -> sat(X - (C1 + C2)) where legal
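
As a rough illustration of the first four folds, the sketch below models the i8 intrinsics in plain C++ and checks the folds exhaustively. It is not LLVM code: the helper names (uaddSat, usubSat, saddSat, knownOverflowFoldsHold) are made up for this example, and "known overflow" is approximated by sign-bit knowledge only, which is the kind of fact the known-bits analysis can provide.

```cpp
#include <algorithm>
#include <cstdint>

// Reference models of the i8 saturating intrinsics, computed in a wider type.
// These are illustrative helpers, not LLVM APIs.
inline uint8_t uaddSat(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(std::min(int(A) + int(B), 255));
}
inline uint8_t usubSat(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(std::max(int(A) - int(B), 0));
}
inline int8_t saddSat(int8_t A, int8_t B) {
  return static_cast<int8_t>(std::min(std::max(int(A) + int(B), -128), 127));
}

// Checks, over all pairs of i8 bit patterns, the folds that follow from
// sign-bit knowledge alone:
//  - both sign bits clear: uadd.sat never overflows, so it is a plain add;
//  - both sign bits set:   uadd.sat always overflows, so it is MAX (255);
//  - LHS clear, RHS set:   usub.sat always overflows, so it is 0;
//  - sign bits differ:     sadd.sat never overflows, so it is a plain add.
bool knownOverflowFoldsHold() {
  for (int A = 0; A <= 255; ++A)
    for (int B = 0; B <= 255; ++B) {
      uint8_t UA = static_cast<uint8_t>(A), UB = static_cast<uint8_t>(B);
      bool ANeg = A >= 128, BNeg = B >= 128;
      if (!ANeg && !BNeg && uaddSat(UA, UB) != static_cast<uint8_t>(UA + UB))
        return false;
      if (ANeg && BNeg && uaddSat(UA, UB) != 255)
        return false;
      if (!ANeg && BNeg && usubSat(UA, UB) != 0)
        return false;
      // Reinterpret the bit patterns as signed values.
      int SA = ANeg ? A - 256 : A, SB = BNeg ? B - 256 : B;
      if (ANeg != BNeg &&
          saddSat(static_cast<int8_t>(SA), static_cast<int8_t>(SB)) != SA + SB)
        return false;
    }
  return true;
}
```

Calling knownOverflowFoldsHold() should return true; the real transforms use the full overflow analysis rather than only the sign bit.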

To properly support this, I've also changed the computeOverflowForUnsignedSub() implementation to return AlwaysOverflows results, for parity with the computeOverflowForUnsignedAdd() implementation.
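
The sign-bit reasoning behind the unsigned-sub change can be sketched as follows. This is a simplified model under stated assumptions: KnownSign is a hypothetical stand-in for the sign information held in KnownBits, and classifyUnsignedSub is an illustrative function, not the LLVM implementation.

```cpp
#include <cstdint>
#include <optional>

enum class Overflow { MayOverflow, AlwaysOverflows, NeverOverflows };

// Hypothetical stand-in for the sign information in KnownBits:
// true = sign bit known set, false = known clear, nullopt = unknown.
using KnownSign = std::optional<bool>;

Overflow classifyUnsignedSub(KnownSign LHSNeg, KnownSign RHSNeg) {
  if (LHSNeg && RHSNeg) {
    // LHS >= 2^(n-1) > RHS: the unsigned subtraction cannot wrap.
    if (*LHSNeg && !*RHSNeg)
      return Overflow::NeverOverflows;
    // LHS < 2^(n-1) <= RHS: the unsigned subtraction always wraps.
    if (!*LHSNeg && *RHSNeg)
      return Overflow::AlwaysOverflows;
  }
  return Overflow::MayOverflow;
}

// Exhaustively compare the classification with real i8 arithmetic.
bool classificationIsSound() {
  for (int L = 0; L <= 255; ++L)
    for (int R = 0; R <= 255; ++R) {
      bool Wraps = L < R; // unsigned L - R wraps exactly when L < R
      Overflow O = classifyUnsignedSub(L >= 128, R >= 128);
      if (O == Overflow::NeverOverflows && Wraps)
        return false;
      if (O == Overflow::AlwaysOverflows && !Wraps)
        return false;
    }
  return true;
}
```

When either sign is unknown, or both signs agree, the model falls back to MayOverflow, mirroring the conservative default in the patch.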

A possible future improvement for the vector case is to support sat(sat(X + C1) + C2) -> sat(X + (C1 + C2)) style foldings also in the case where C1 and C2 are not constant splats.
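
To illustrate why the non-splat case should be foldable in principle, here is a small lane-wise model for unsigned <2 x i8> vectors. It only demonstrates the arithmetic identity; V2i8, uaddSatV and nonSplatChainFolds are names invented for this sketch and say nothing about how InstCombine would match such constants.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using V2i8 = std::array<uint8_t, 2>; // illustrative model of <2 x i8>

inline uint8_t uaddSat(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(std::min(int(A) + int(B), 255));
}

// Lane-wise saturating add, as the vector intrinsic is defined per element.
inline V2i8 uaddSatV(V2i8 A, V2i8 B) {
  return {uaddSat(A[0], B[0]), uaddSat(A[1], B[1])};
}

// True if sat(sat(X + C1) + C2) == sat(X + sat(C1 + C2)) for every X,
// where C1 and C2 may hold different constants in each lane.
bool nonSplatChainFolds(V2i8 C1, V2i8 C2) {
  V2i8 Sum = uaddSatV(C1, C2);
  for (int X0 = 0; X0 <= 255; ++X0)
    for (int X1 = 0; X1 <= 255; ++X1) {
      V2i8 X = {static_cast<uint8_t>(X0), static_cast<uint8_t>(X1)};
      if (uaddSatV(uaddSatV(X, C1), C2) != uaddSatV(X, Sum))
        return false;
    }
  return true;
}
```

For the constants used in the non-splat test, <10, 20> followed by <30, 40>, the combined constant would be <40, 60>.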

Diff Detail

Event Timeline

nikic created this revision. Nov 14 2018, 8:53 AM

Herald added a subscriber: llvm-commits. Nov 14 2018, 8:53 AM

nikic added a parent revision: D54532: [InstructionSimplify] Add support for saturating add/sub. Nov 14 2018, 8:53 AM

RKSimon added reviewers: spatel, RKSimon, craig.topper. Nov 14 2018, 8:55 AM

Fix formatting, split up test cases, present tests as diff over baseline.

spatel added inline comments. Nov 26 2018, 2:14 PM

lib/Transforms/InstCombine/InstCombineCalls.cpp
2061–2068	This was already duplicated 3x, so I added a helper: rL347604. I might have missed it, but I don't think the tests exercise this possibility? Also, the baseline tests are not checked into trunk yet?
2085–2087	Should we add this (and the later usub_sat fold to zero) to InstSimplify? That would make all of these cases more symmetric: if (no_ov) { create regular math op }. At that point, it might not be worth using this switch-within-a-switch. Just handle each intrinsic in its own case statement.

spatel added inline comments. Nov 26 2018, 2:23 PM

lib/Transforms/InstCombine/InstCombineCalls.cpp
2046–2054	Can we reuse this code for the new intrinsics?

nikic marked 4 inline comments as done. Nov 26 2018, 2:42 PM

nikic added inline comments.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2046–2054	This is generally possible. The reason why I did not do this is that for the always overflow case, we'd end up inserting an unused instruction (https://github.com/llvm-mirror/llvm/blob/7c7c0a2cd982d34de3e49dc0ef7a459e1262f306/lib/Transforms/InstCombine/InstCombineCompares.cpp#L3805). If creating unnecessary instructions as an intermediate state is fine, I can switch this to use OptimizeOverflowCheck.
2061–2068	You're right, a test case for this is missing. I didn't commit the baseline tests yet, in case there are comments.
2085–2087	Adding this to InstSimplify would mean that we have to run the overflow calculation (which requires known bits) twice, once in InstSimplify for the always overflow case and once in InstCombine for the never overflow case. Not sure if that's worthwhile?
2109	One case this code currently does not handle is something like ssub.sat(sadd.sat(%foo, -10), 10), i.e. mixes of signed sub and add where the sign ends up being the same if we account for the add/sub. I think this case could be elegantly handled by canonicalizing ssub.sat(%foo, C) to sadd.sat(%foo, -C) (assuming C != MIN), similar to the canonicalization that also happens for normal subs. What do you think about this?

spatel added inline comments. Nov 26 2018, 3:19 PM

lib/Transforms/InstCombine/InstCombineCalls.cpp
2046–2054	No, let's not do that then. Creating a dead instruction could mean that instcombine has to iterate the entire function again just to remove it.
2061–2068	Feel free to add/commit tests as you'd like with NFC patches. We can always adjust them if they don't provide coverage for the code that gets added later.
2085–2087	Good point. There's no clear answer for where we draw the line. Computing known bits is known to be expensive, but adding to instsimplify potentially gets the code optimized faster (and improves other passes). It comes down to how often we expect to see these intrinsics in real code vs. how often we would expect to succeed at these transforms. My guess is those answers are "rare" and "rarer", but I have no data. Let's just keep this as-is then.
2109	Yes, if we can convert ssub to sadd, let's do that. The more these are handled like the regular math ops, the better.
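
A quick way to see why the ssub-to-sadd canonicalization needs the C != MIN restriction is to model it on i8. The sketch below is illustrative only; the helpers (clampI8, saddSat, ssubSat, negateI8, canonicalizationHolds) are hypothetical names and not part of LLVM.

```cpp
#include <algorithm>
#include <cstdint>

inline int clampI8(int V) { return std::min(std::max(V, -128), 127); }
inline int8_t saddSat(int8_t A, int8_t B) {
  return static_cast<int8_t>(clampI8(int(A) + int(B)));
}
inline int8_t ssubSat(int8_t A, int8_t B) {
  return static_cast<int8_t>(clampI8(int(A) - int(B)));
}

// i8 negation with two's-complement wraparound: negating MIN yields MIN.
inline int8_t negateI8(int8_t C) {
  return C == INT8_MIN ? INT8_MIN : static_cast<int8_t>(-C);
}

// True if ssub.sat(X, C) == sadd.sat(X, -C) for every i8 X.
bool canonicalizationHolds(int8_t C) {
  for (int X = -128; X <= 127; ++X)
    if (ssubSat(static_cast<int8_t>(X), C) !=
        saddSat(static_cast<int8_t>(X), negateI8(C)))
      return false;
  return true;
}

// The rewrite is valid for every constant except MIN, where -C wraps.
bool holdsForAllButMin() {
  for (int C = -127; C <= 127; ++C)
    if (!canonicalizationHolds(static_cast<int8_t>(C)))
      return false;
  return !canonicalizationHolds(INT8_MIN);
}
```

For C = MIN and X = 0, the subtraction saturates to 127 while the rewritten addition saturates to -128, so the two forms disagree.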

nikic mentioned this in rL347700: [InstCombine] Add tests for saturating add/sub; NFC. Nov 27 2018, 11:55 AM

Use operand swap helper, canonicalize ssub to sadd if possible, add tests for operand swapping and canonicalization, rebase over committed baseline tests.

I think everything's correct here, so LGTM. But this is really several independent changes in 1 patch. Please split this up when committing, so we can reduce the risk of bugs.
It could be split like this or some other sequence if you see a better way:

Canonicalize constants to Op1
NeverOverflows / AlwaysOverflows
ssub.sat(X, C) -> sadd.sat(X, -C)
sat(sat(X + Val2) + Val) -> sat(X + (Val+Val2))
sat(sat(X - Val2) - Val) -> sat(X - (Val+Val2))
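
The legality condition for merging two signed constants (same sign, and the sum must not overflow) can be checked exhaustively on i8. This is a sketch under assumptions: combineSignedConstants is a hypothetical helper that imitates the sign and overflow checks from the patch using plain integers instead of APInt.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

inline int8_t saddSat(int8_t A, int8_t B) {
  return static_cast<int8_t>(std::min(std::max(int(A) + int(B), -128), 127));
}

// Returns the merged constant for sat(sat(X + Val2) + Val), or nullopt when
// the fold is rejected: different signs, or Val + Val2 does not fit in i8.
std::optional<int8_t> combineSignedConstants(int8_t Val, int8_t Val2) {
  if ((Val >= 0) != (Val2 >= 0))
    return std::nullopt; // cannot fold saturated additions of different sign
  int Sum = int(Val) + int(Val2);
  if (Sum < -128 || Sum > 127)
    return std::nullopt; // the two adds may exceed MAX without saturating
  return static_cast<int8_t>(Sum);
}

// Every accepted pair of constants must give the same result for all X.
bool everyAcceptedFoldIsSound() {
  for (int V = -128; V <= 127; ++V)
    for (int V2 = -128; V2 <= 127; ++V2) {
      auto NewVal = combineSignedConstants(static_cast<int8_t>(V),
                                           static_cast<int8_t>(V2));
      if (!NewVal)
        continue;
      for (int X = -128; X <= 127; ++X) {
        int8_t Inner = saddSat(static_cast<int8_t>(X), static_cast<int8_t>(V2));
        if (saddSat(Inner, static_cast<int8_t>(V)) !=
            saddSat(static_cast<int8_t>(X), *NewVal))
          return false;
      }
    }
  return true;
}
```

Two rejected cases show why the conditions matter: with X = -128, sat(sat(X + 100) + 100) is 72 even though 100 + 100 does not fit in i8, and with X = 100, sat(sat(X + 100) + -100) is 27 rather than 100.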

This revision is now accepted and ready to land. Nov 27 2018, 2:45 PM

Split up and committed as rL347769, rL347770, rL347771, rL347772, rL347773.

Thanks for the reviews!

nikic closed this revision. Nov 28 2018, 8:44 AM

nikic mentioned this in D55011: [InstCombine] Support ssub.sat canonicalization for non-splats. Nov 28 2018, 10:04 AM

Revision Contents

Path                                                 Size
lib/Analysis/ValueTracking.cpp                       17 lines
lib/Transforms/InstCombine/InstCombineCalls.cpp      84 lines
test/Transforms/InstCombine/saturating-add-sub.ll    116 lines

Diff 175165

lib/Analysis/ValueTracking.cpp

@@ ... @@ OverflowResult llvm::computeOverflowForUnsignedAdd(
   KnownBits LHSKnown = computeKnownBits(LHS, DL, /*Depth=*/0, AC, CxtI, DT,
                                         nullptr, UseInstrInfo);
   if (LHSKnown.isNonNegative() || LHSKnown.isNegative()) {
     KnownBits RHSKnown = computeKnownBits(RHS, DL, /*Depth=*/0, AC, CxtI, DT,
                                           nullptr, UseInstrInfo);

     if (LHSKnown.isNegative() && RHSKnown.isNegative()) {
       // The sign bit is set in both cases: this MUST overflow.
-      // Create a simple add instruction, and insert it into the struct.
       return OverflowResult::AlwaysOverflows;
     }

     if (LHSKnown.isNonNegative() && RHSKnown.isNonNegative()) {
       // The sign bit is clear in both cases: this CANNOT overflow.
-      // Create a simple add instruction, and insert it into the struct.
       return OverflowResult::NeverOverflows;
     }
   }

   return OverflowResult::MayOverflow;
 }

 /// Return true if we can prove that adding the two values of the
@@ ... @@
 }

 OverflowResult llvm::computeOverflowForUnsignedSub(const Value *LHS,
                                                    const Value *RHS,
                                                    const DataLayout &DL,
                                                    AssumptionCache *AC,
                                                    const Instruction *CxtI,
                                                    const DominatorTree *DT) {
-  // If the LHS is negative and the RHS is non-negative, no unsigned wrap.
   KnownBits LHSKnown = computeKnownBits(LHS, DL, /*Depth=*/0, AC, CxtI, DT);
-  KnownBits RHSKnown = computeKnownBits(RHS, DL, /*Depth=*/0, AC, CxtI, DT);
-  if (LHSKnown.isNegative() && RHSKnown.isNonNegative())
-    return OverflowResult::NeverOverflows;
+  if (LHSKnown.isNonNegative() || LHSKnown.isNegative()) {
+    KnownBits RHSKnown = computeKnownBits(RHS, DL, /*Depth=*/0, AC, CxtI, DT);
+
+    // If the LHS is negative and the RHS is non-negative, no unsigned wrap.
+    if (LHSKnown.isNegative() && RHSKnown.isNonNegative())
+      return OverflowResult::NeverOverflows;
+
+    // If the LHS is non-negative and the RHS negative, we always wrap.
+    if (LHSKnown.isNonNegative() && RHSKnown.isNegative())
+      return OverflowResult::AlwaysOverflows;
+  }

   return OverflowResult::MayOverflow;
 }

 OverflowResult llvm::computeOverflowForSignedSub(const Value *LHS,
                                                  const Value *RHS,
                                                  const DataLayout &DL,
                                                  AssumptionCache *AC,
                                                  const Instruction *CxtI,
@@ ... @@

lib/Transforms/InstCombine/InstCombineCalls.cpp

@@ ... @@ if (isa<Constant>(II->getArgOperand(0)) &&
       II->setArgOperand(0, II->getArgOperand(1));
       II->setArgOperand(1, LHS);
       return II;
     }
     LLVM_FALLTHROUGH;

   case Intrinsic::usub_with_overflow:
   case Intrinsic::ssub_with_overflow: {
     OverflowCheckFlavor OCF =
         IntrinsicIDToOverflowCheckFlavor(II->getIntrinsicID());
     assert(OCF != OCF_INVALID && "unexpected!");

     Value *OperationResult = nullptr;
     Constant *OverflowResult = nullptr;
     if (OptimizeOverflowCheck(OCF, II->getArgOperand(0), II->getArgOperand(1),
                               *II, OperationResult, OverflowResult))
       return CreateOverflowTuple(II, OperationResult, OverflowResult);

     break;
   }

+  case Intrinsic::uadd_sat:
+  case Intrinsic::sadd_sat:
+    if (isa<Constant>(II->getArgOperand(0)) &&
+        !isa<Constant>(II->getArgOperand(1))) {
+      // Canonicalize constants into the RHS.
+      Value *LHS = II->getArgOperand(0);
+      II->setArgOperand(0, II->getArgOperand(1));
+      II->setArgOperand(1, LHS);
+      return II;
+    }
+    LLVM_FALLTHROUGH;
+  case Intrinsic::usub_sat:
+  case Intrinsic::ssub_sat: {
+    Value *Arg0 = II->getArgOperand(0);
+    Value *Arg1 = II->getArgOperand(1);
+    Intrinsic::ID IID = II->getIntrinsicID();
+
+    // Make use of known overflow information.
+    OverflowResult OR;
+    switch (IID) {
+    default:
+      llvm_unreachable("Unexpected intrinsic!");
+    case Intrinsic::uadd_sat:
+      OR = computeOverflowForUnsignedAdd(Arg0, Arg1, II);
+      if (OR == OverflowResult::NeverOverflows)
+        return BinaryOperator::CreateNUWAdd(Arg0, Arg1);
+      if (OR == OverflowResult::AlwaysOverflows)
+        return replaceInstUsesWith(*II,
+                                   ConstantInt::getAllOnesValue(II->getType()));
+      break;
+    case Intrinsic::usub_sat:
+      OR = computeOverflowForUnsignedSub(Arg0, Arg1, II);
+      if (OR == OverflowResult::NeverOverflows)
+        return BinaryOperator::CreateNUWSub(Arg0, Arg1);
+      if (OR == OverflowResult::AlwaysOverflows)
+        return replaceInstUsesWith(*II,
+                                   ConstantInt::getNullValue(II->getType()));
+      break;
+    case Intrinsic::sadd_sat:
+      if (willNotOverflowSignedAdd(Arg0, Arg1, *II))
+        return BinaryOperator::CreateNSWAdd(Arg0, Arg1);
+      break;
+    case Intrinsic::ssub_sat:
+      if (willNotOverflowSignedSub(Arg0, Arg1, *II))
+        return BinaryOperator::CreateNSWSub(Arg0, Arg1);
+      break;
+    }
+
+    // sat(sat(X + Val2) + Val) -> sat(X + (Val+Val2))
+    // sat(sat(X - Val2) - Val) -> sat(X - (Val+Val2))
+    // if Val and Val2 have the same sign
+    if (auto *Other = dyn_cast<IntrinsicInst>(Arg0)) {
+      Value *X;
+      const APInt *Val, *Val2;
+      APInt NewVal;
+      bool IsUnsigned =
+          IID == Intrinsic::uadd_sat || IID == Intrinsic::usub_sat;
+      if (Other->getIntrinsicID() == II->getIntrinsicID() &&
+          match(Arg1, m_APInt(Val)) &&
+          match(Other->getArgOperand(0), m_Value(X)) &&
+          match(Other->getArgOperand(1), m_APInt(Val2))) {
+        if (IsUnsigned)
+          NewVal = Val->uadd_sat(*Val2);
+        else if (Val->isNonNegative() == Val2->isNonNegative()) {
+          bool Overflow;
+          NewVal = Val->sadd_ov(*Val2, Overflow);
+          if (Overflow) {
+            // Both adds together may add more than SignedMaxValue
+            // without saturating the final result.
+            break;
+          }
+        } else {
+          // Cannot fold saturated addition with different signs.
+          break;
+        }
+
+        return replaceInstUsesWith(
+            *II, Builder.CreateBinaryIntrinsic(
+                     IID, X, ConstantInt::get(II->getType(), NewVal)));
+      }
+    }
+    break;
+  }
+
   case Intrinsic::minnum:
   case Intrinsic::maxnum:
   case Intrinsic::minimum:
   case Intrinsic::maximum: {
     Value *Arg0 = II->getArgOperand(0);
     Value *Arg1 = II->getArgOperand(1);
     // Canonicalize constants to the RHS.
     if (isa<ConstantFP>(Arg0) && !isa<ConstantFP>(Arg1)) {
@@ ... @@

test/Transforms/InstCombine/saturating-add-sub.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -instcombine -S | FileCheck %s

 ;
 ; Saturating addition.
 ;

 declare i8 @llvm.uadd.sat.i8(i8, i8)
 declare i8 @llvm.sadd.sat.i8(i8, i8)
 declare <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8>, <2 x i8>)
 declare <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8>, <2 x i8>)

 ; Can combine uadds with constant operands.
 define i8 @test_scalar_uadd_combine(i8 %a) {
 ; CHECK-LABEL: @test_scalar_uadd_combine(
-; CHECK-NEXT: [[X1:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A:%.*]], i8 10)
-; CHECK-NEXT: [[X2:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[X1]], i8 20)
-; CHECK-NEXT: ret i8 [[X2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A:%.*]], i8 30)
+; CHECK-NEXT: ret i8 [[TMP1]]
 ;
   %x1 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 10)
   %x2 = call i8 @llvm.uadd.sat.i8(i8 %x1, i8 20)
   ret i8 %x2
 }

 define <2 x i8> @test_vector_uadd_combine(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_uadd_combine(
-; CHECK-NEXT: [[X1:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 10>)
-; CHECK-NEXT: [[X2:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[X1]], <2 x i8> <i8 20, i8 20>)
-; CHECK-NEXT: ret <2 x i8> [[X2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 30, i8 30>)
+; CHECK-NEXT: ret <2 x i8> [[TMP1]]
 ;
   %x1 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)
   %x2 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 20, i8 20>)
   ret <2 x i8> %x2
 }

 ; This could simplify, but currently doesn't.
 define <2 x i8> @test_vector_uadd_combine_non_splat(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_uadd_combine_non_splat(
 ; CHECK-NEXT: [[X1:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 20>)
 ; CHECK-NEXT: [[X2:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[X1]], <2 x i8> <i8 30, i8 40>)
 ; CHECK-NEXT: ret <2 x i8> [[X2]]
 ;
   %x1 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 20>)
   %x2 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 30, i8 40>)
   ret <2 x i8> %x2
 }

 ; Can combine uadds even if they overflow.
 define i8 @test_scalar_uadd_overflow(i8 %a) {
 ; CHECK-LABEL: @test_scalar_uadd_overflow(
-; CHECK-NEXT: [[Y1:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A:%.*]], i8 100)
-; CHECK-NEXT: [[Y2:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[Y1]], i8 -56)
-; CHECK-NEXT: ret i8 [[Y2]]
+; CHECK-NEXT: ret i8 -1
 ;
   %y1 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 100)
   %y2 = call i8 @llvm.uadd.sat.i8(i8 %y1, i8 200)
   ret i8 %y2
 }

 define <2 x i8> @test_vector_uadd_overflow(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_uadd_overflow(
-; CHECK-NEXT: [[Y1:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 100, i8 100>)
-; CHECK-NEXT: [[Y2:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[Y1]], <2 x i8> <i8 -56, i8 -56>)
-; CHECK-NEXT: ret <2 x i8> [[Y2]]
+; CHECK-NEXT: ret <2 x i8> <i8 -1, i8 -1>
 ;
   %y1 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 100, i8 100>)
   %y2 = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %y1, <2 x i8> <i8 200, i8 200>)
   ret <2 x i8> %y2
 }

 ; Can combine sadds if sign matches.
 define i8 @test_scalar_sadd_both_positive(i8 %a) {
 ; CHECK-LABEL: @test_scalar_sadd_both_positive(
-; CHECK-NEXT: [[Z1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A:%.*]], i8 10)
-; CHECK-NEXT: [[Z2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[Z1]], i8 20)
-; CHECK-NEXT: ret i8 [[Z2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A:%.*]], i8 30)
+; CHECK-NEXT: ret i8 [[TMP1]]
 ;
   %z1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 10)
   %z2 = call i8 @llvm.sadd.sat.i8(i8 %z1, i8 20)
   ret i8 %z2
 }

 define <2 x i8> @test_vector_sadd_both_positive(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_sadd_both_positive(
-; CHECK-NEXT: [[Z1:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 10>)
-; CHECK-NEXT: [[Z2:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[Z1]], <2 x i8> <i8 20, i8 20>)
-; CHECK-NEXT: ret <2 x i8> [[Z2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 30, i8 30>)
+; CHECK-NEXT: ret <2 x i8> [[TMP1]]
 ;
   %z1 = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)
   %z2 = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %z1, <2 x i8> <i8 20, i8 20>)
   ret <2 x i8> %z2
 }

 define i8 @test_scalar_sadd_both_negative(i8 %a) {
 ; CHECK-LABEL: @test_scalar_sadd_both_negative(
-; CHECK-NEXT: [[U1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A:%.*]], i8 -10)
-; CHECK-NEXT: [[U2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[U1]], i8 -20)
-; CHECK-NEXT: ret i8 [[U2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A:%.*]], i8 -30)
+; CHECK-NEXT: ret i8 [[TMP1]]
 ;
   %u1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 -10)
   %u2 = call i8 @llvm.sadd.sat.i8(i8 %u1, i8 -20)
   ret i8 %u2
 }

 define <2 x i8> @test_vector_sadd_both_negative(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_sadd_both_negative(
-; CHECK-NEXT: [[U1:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 -10, i8 -10>)
-; CHECK-NEXT: [[U2:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[U1]], <2 x i8> <i8 -20, i8 -20>)
-; CHECK-NEXT: ret <2 x i8> [[U2]]
+; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 -30, i8 -30>)
+; CHECK-NEXT: ret <2 x i8> [[TMP1]]
 ;
   %u1 = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 -10, i8 -10>)
   %u2 = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %u1, <2 x i8> <i8 -20, i8 -20>)
   ret <2 x i8> %u2
 }

 ; Can't combine sadds if constants have different sign.
 define i8 @test_scalar_sadd_different_sign(i8 %a) {
@@ ... @@ ;
   %w1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 100)
   %w2 = call i8 @llvm.sadd.sat.i8(i8 %w1, i8 100)
   ret i8 %w2
 }

 ; neg uadd neg always overflows.
 define i8 @test_scalar_uadd_neg_neg(i8 %a) {
 ; CHECK-LABEL: @test_scalar_uadd_neg_neg(
-; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128
-; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A_NEG]], i8 -10)
-; CHECK-NEXT: ret i8 [[R]]
+; CHECK-NEXT: ret i8 -1
 ;
   %a_neg = or i8 %a, -128
   %r = call i8 @llvm.uadd.sat.i8(i8 %a_neg, i8 -10)
   ret i8 %r
 }

 define <2 x i8> @test_vector_uadd_neg_neg(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_uadd_neg_neg(
-; CHECK-NEXT: [[A_NEG:%.*]] = or <2 x i8> [[A:%.*]], <i8 -128, i8 -128>
-; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A_NEG]], <2 x i8> <i8 -10, i8 -20>)
-; CHECK-NEXT: ret <2 x i8> [[R]]
+; CHECK-NEXT: ret <2 x i8> <i8 -1, i8 -1>
 ;
   %a_neg = or <2 x i8> %a, <i8 -128, i8 -128>
   %r = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 -10, i8 -20>)
   ret <2 x i8> %r
 }

 ; nneg uadd nneg never overflows.
 define i8 @test_scalar_uadd_nneg_nneg(i8 %a) {
 ; CHECK-LABEL: @test_scalar_uadd_nneg_nneg(
 ; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127
-; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A_NNEG]], i8 10)
+; CHECK-NEXT: [[R:%.*]] = add nuw i8 [[A_NNEG]], 10
 ; CHECK-NEXT: ret i8 [[R]]
 ;
   %a_nneg = and i8 %a, 127
   %r = call i8 @llvm.uadd.sat.i8(i8 %a_nneg, i8 10)
   ret i8 %r
 }

 define <2 x i8> @test_vector_uadd_nneg_nneg(<2 x i8> %a) {
 ; CHECK-LABEL: @test_vector_uadd_nneg_nneg(
 ; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>
-; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> [[A_NNEG]], <2 x i8> <i8 10, i8 20>)
+; CHECK-NEXT: [[R:%.*]] = add nuw <2 x i8> [[A_NNEG]], <i8 10, i8 20>
 ; CHECK-NEXT: ret <2 x i8> [[R]]
 ;
   %a_nneg = and <2 x i8> %a, <i8 127, i8 127>
   %r = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 10, i8 20>)
   ret <2 x i8> %r
 }

 ; neg uadd nneg might overflow.
@@ ... @@ ;
   %r = call <2 x i8> @llvm.uadd.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 10, i8 20>)
   ret <2 x i8> %r
 }

; neg sadd nneg never overflows.		; neg sadd nneg never overflows.
define i8 @test_scalar_sadd_neg_nneg(i8 %a) {		define i8 @test_scalar_sadd_neg_nneg(i8 %a) {
; CHECK-LABEL: @test_scalar_sadd_neg_nneg(		; CHECK-LABEL: @test_scalar_sadd_neg_nneg(
; CHECK-NEXT: [[A_NEG:%.]] = or i8 [[A:%.]], -128		; CHECK-NEXT: [[A_NEG:%.]] = or i8 [[A:%.]], -128
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A_NEG]], i8 10)		; CHECK-NEXT: [[R:%.*]] = add nsw i8 [[A_NEG]], 10
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%a_neg = or i8 %a, -128		%a_neg = or i8 %a, -128
%r = call i8 @llvm.sadd.sat.i8(i8 %a_neg, i8 10)		%r = call i8 @llvm.sadd.sat.i8(i8 %a_neg, i8 10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_sadd_neg_nneg(<2 x i8> %a) {		define <2 x i8> @test_vector_sadd_neg_nneg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_sadd_neg_nneg(		; CHECK-LABEL: @test_vector_sadd_neg_nneg(
; CHECK-NEXT: [[A_NEG:%.]] = or <2 x i8> [[A:%.]], <i8 -128, i8 -128>		; CHECK-NEXT: [[A_NEG:%.]] = or <2 x i8> [[A:%.]], <i8 -128, i8 -128>
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A_NEG]], <2 x i8> <i8 10, i8 20>)		; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i8> [[A_NEG]], <i8 10, i8 20>
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>		%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>
%r = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 10, i8 20>)		%r = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 10, i8 20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}

; nneg sadd neg never overflows.		; nneg sadd neg never overflows.
define i8 @test_scalar_sadd_nneg_neg(i8 %a) {		define i8 @test_scalar_sadd_nneg_neg(i8 %a) {
; CHECK-LABEL: @test_scalar_sadd_nneg_neg(		; CHECK-LABEL: @test_scalar_sadd_nneg_neg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127		; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A_NNEG]], i8 -10)		; CHECK-NEXT: [[R:%.*]] = add nsw i8 [[A_NNEG]], -10
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%a_nneg = and i8 %a, 127		%a_nneg = and i8 %a, 127
%r = call i8 @llvm.sadd.sat.i8(i8 %a_nneg, i8 -10)		%r = call i8 @llvm.sadd.sat.i8(i8 %a_nneg, i8 -10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_sadd_nneg_neg(<2 x i8> %a) {		define <2 x i8> @test_vector_sadd_nneg_neg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_sadd_nneg_neg(		; CHECK-LABEL: @test_vector_sadd_nneg_neg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>		; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> [[A_NNEG]], <2 x i8> <i8 -10, i8 -20>)		; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i8> [[A_NNEG]], <i8 -10, i8 -20>
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_nneg = and <2 x i8> %a, <i8 127, i8 127>		%a_nneg = and <2 x i8> %a, <i8 127, i8 127>
%r = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 -10, i8 -20>)		%r = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 -10, i8 -20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}
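
The "never overflows" cases above rest on a simple sign argument: a negative value plus a non-negative value always fits in the signed range, so the saturating add can become a plain `add nsw`. The following is a minimal brute-force sketch of that claim for i8, not part of the patch; `sadd_sat_i8` and `sadd_opposite_signs_never_saturates` are hypothetical helper names that only model the intrinsic's semantics.

```cpp
#include <cstdint>

// Illustrative reference model of llvm.sadd.sat.i8 (not LLVM code):
// add in a wider type, then clamp to [-128, 127].
static int8_t sadd_sat_i8(int8_t a, int8_t b) {
  int wide = int(a) + int(b);
  if (wide > INT8_MAX) return INT8_MAX;
  if (wide < INT8_MIN) return INT8_MIN;
  return static_cast<int8_t>(wide);
}

// Returns true if, for every negative a and every non-negative b,
// the saturating add equals the exact (unclamped) sum.
bool sadd_opposite_signs_never_saturates() {
  for (int a = INT8_MIN; a < 0; ++a)
    for (int b = 0; b <= INT8_MAX; ++b)
      if (int(sadd_sat_i8(int8_t(a), int8_t(b))) != a + b)
        return false;
  return true;
}
```

Because addition is commutative, the same check covers both the neg/nneg and the nneg/neg operand orders.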

; neg sadd neg might overflow.		; neg sadd neg might overflow.
[26 lines not shown]
declare i8 @llvm.usub.sat.i8(i8, i8)		declare i8 @llvm.usub.sat.i8(i8, i8)
declare i8 @llvm.ssub.sat.i8(i8, i8)		declare i8 @llvm.ssub.sat.i8(i8, i8)
declare <2 x i8> @llvm.usub.sat.v2i8(<2 x i8>, <2 x i8>)		declare <2 x i8> @llvm.usub.sat.v2i8(<2 x i8>, <2 x i8>)
declare <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8>, <2 x i8>)		declare <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8>, <2 x i8>)

; Can combine usubs with constant operands.		; Can combine usubs with constant operands.
define i8 @test_scalar_usub_combine(i8 %a) {		define i8 @test_scalar_usub_combine(i8 %a) {
; CHECK-LABEL: @test_scalar_usub_combine(		; CHECK-LABEL: @test_scalar_usub_combine(
; CHECK-NEXT: [[X1:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A:%.*]], i8 10)		; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A:%.*]], i8 30)
; CHECK-NEXT: [[X2:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[X1]], i8 20)		; CHECK-NEXT: ret i8 [[TMP1]]
; CHECK-NEXT: ret i8 [[X2]]
;		;
%x1 = call i8 @llvm.usub.sat.i8(i8 %a, i8 10)		%x1 = call i8 @llvm.usub.sat.i8(i8 %a, i8 10)
%x2 = call i8 @llvm.usub.sat.i8(i8 %x1, i8 20)		%x2 = call i8 @llvm.usub.sat.i8(i8 %x1, i8 20)
ret i8 %x2		ret i8 %x2
}		}

define <2 x i8> @test_vector_usub_combine(<2 x i8> %a) {		define <2 x i8> @test_vector_usub_combine(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_usub_combine(		; CHECK-LABEL: @test_vector_usub_combine(
; CHECK-NEXT: [[X1:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 10>)		; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 30, i8 30>)
; CHECK-NEXT: [[X2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[X1]], <2 x i8> <i8 20, i8 20>)		; CHECK-NEXT: ret <2 x i8> [[TMP1]]
; CHECK-NEXT: ret <2 x i8> [[X2]]
;		;
%x1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)		%x1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)
%x2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 20, i8 20>)		%x2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 20, i8 20>)
ret <2 x i8> %x2		ret <2 x i8> %x2
}		}

; This could simplify, but currently doesn't.		; This could simplify, but currently doesn't.
define <2 x i8> @test_vector_usub_combine_non_splat(<2 x i8> %a) {		define <2 x i8> @test_vector_usub_combine_non_splat(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_usub_combine_non_splat(		; CHECK-LABEL: @test_vector_usub_combine_non_splat(
; CHECK-NEXT: [[X1:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 20>)		; CHECK-NEXT: [[X1:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 20>)
; CHECK-NEXT: [[X2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[X1]], <2 x i8> <i8 30, i8 40>)		; CHECK-NEXT: [[X2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[X1]], <2 x i8> <i8 30, i8 40>)
; CHECK-NEXT: ret <2 x i8> [[X2]]		; CHECK-NEXT: ret <2 x i8> [[X2]]
;		;
%x1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 20>)		%x1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 20>)
%x2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 30, i8 40>)		%x2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %x1, <2 x i8> <i8 30, i8 40>)
ret <2 x i8> %x2		ret <2 x i8> %x2
}		}

; Can combine usubs even if they overflow.		; Can combine usubs even if they overflow.
define i8 @test_scalar_usub_overflow(i8 %a) {		define i8 @test_scalar_usub_overflow(i8 %a) {
; CHECK-LABEL: @test_scalar_usub_overflow(		; CHECK-LABEL: @test_scalar_usub_overflow(
; CHECK-NEXT: [[Y1:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A:%.*]], i8 100)		; CHECK-NEXT: ret i8 0
; CHECK-NEXT: [[Y2:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[Y1]], i8 -56)
; CHECK-NEXT: ret i8 [[Y2]]
;		;
%y1 = call i8 @llvm.usub.sat.i8(i8 %a, i8 100)		%y1 = call i8 @llvm.usub.sat.i8(i8 %a, i8 100)
%y2 = call i8 @llvm.usub.sat.i8(i8 %y1, i8 200)		%y2 = call i8 @llvm.usub.sat.i8(i8 %y1, i8 200)
ret i8 %y2		ret i8 %y2
}		}

define <2 x i8> @test_vector_usub_overflow(<2 x i8> %a) {		define <2 x i8> @test_vector_usub_overflow(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_usub_overflow(		; CHECK-LABEL: @test_vector_usub_overflow(
; CHECK-NEXT: [[Y1:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 100, i8 100>)		; CHECK-NEXT: ret <2 x i8> zeroinitializer
; CHECK-NEXT: [[Y2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[Y1]], <2 x i8> <i8 -56, i8 -56>)
; CHECK-NEXT: ret <2 x i8> [[Y2]]
;		;
%y1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 100, i8 100>)		%y1 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 100, i8 100>)
%y2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %y1, <2 x i8> <i8 200, i8 200>)		%y2 = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %y1, <2 x i8> <i8 200, i8 200>)
ret <2 x i8> %y2		ret <2 x i8> %y2
}		}
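
The usub combine tests above rely on two facts: nested unsigned saturating subtractions of constants behave like one subtraction of the summed constant, and once that sum no longer fits in i8 the result is 0 for every input. As a rough sanity check (a sketch under the assumption that `usub_sat_i8` faithfully models `llvm.usub.sat.i8`; both helper names are hypothetical), this can be verified exhaustively for i8:

```cpp
#include <cstdint>

// Illustrative reference model of llvm.usub.sat.i8 (not LLVM code):
// subtract, clamping at 0 instead of wrapping.
static uint8_t usub_sat_i8(uint8_t a, uint8_t b) {
  return a > b ? static_cast<uint8_t>(a - b) : 0;
}

// Returns true if usub.sat(usub.sat(a, c1), c2) matches the combined form
// for all i8 inputs: usub.sat(a, c1 + c2) when the sum fits in 8 bits,
// and the constant 0 when the sum exceeds 255.
bool nested_usub_matches_combined() {
  for (int a = 0; a <= UINT8_MAX; ++a)
    for (int c1 = 0; c1 <= UINT8_MAX; ++c1)
      for (int c2 = 0; c2 <= UINT8_MAX; ++c2) {
        uint8_t nested =
            usub_sat_i8(usub_sat_i8(uint8_t(a), uint8_t(c1)), uint8_t(c2));
        int sum = c1 + c2;
        uint8_t combined =
            sum > UINT8_MAX ? 0 : usub_sat_i8(uint8_t(a), uint8_t(sum));
        if (nested != combined)
          return false;
      }
  return true;
}
```

With the constants from the tests, 10 and 20 combine to 30, while 100 and 200 sum to 300, which exceeds 255 and so yields the `ret i8 0` seen in the overflow case.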

; Can combine ssubs if sign matches.		; Can combine ssubs if sign matches.
define i8 @test_scalar_ssub_both_positive(i8 %a) {		define i8 @test_scalar_ssub_both_positive(i8 %a) {
; CHECK-LABEL: @test_scalar_ssub_both_positive(		; CHECK-LABEL: @test_scalar_ssub_both_positive(
; CHECK-NEXT: [[Z1:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A:%.*]], i8 10)		; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A:%.*]], i8 30)
; CHECK-NEXT: [[Z2:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[Z1]], i8 20)		; CHECK-NEXT: ret i8 [[TMP1]]
; CHECK-NEXT: ret i8 [[Z2]]
;		;
%z1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 10)		%z1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 10)
%z2 = call i8 @llvm.ssub.sat.i8(i8 %z1, i8 20)		%z2 = call i8 @llvm.ssub.sat.i8(i8 %z1, i8 20)
ret i8 %z2		ret i8 %z2
}		}

define <2 x i8> @test_vector_ssub_both_positive(<2 x i8> %a) {		define <2 x i8> @test_vector_ssub_both_positive(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_ssub_both_positive(		; CHECK-LABEL: @test_vector_ssub_both_positive(
; CHECK-NEXT: [[Z1:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 10, i8 10>)		; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 30, i8 30>)
; CHECK-NEXT: [[Z2:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[Z1]], <2 x i8> <i8 20, i8 20>)		; CHECK-NEXT: ret <2 x i8> [[TMP1]]
; CHECK-NEXT: ret <2 x i8> [[Z2]]
;		;
%z1 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)		%z1 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 10, i8 10>)
%z2 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %z1, <2 x i8> <i8 20, i8 20>)		%z2 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %z1, <2 x i8> <i8 20, i8 20>)
ret <2 x i8> %z2		ret <2 x i8> %z2
}		}

define i8 @test_scalar_ssub_both_negative(i8 %a) {		define i8 @test_scalar_ssub_both_negative(i8 %a) {
; CHECK-LABEL: @test_scalar_ssub_both_negative(		; CHECK-LABEL: @test_scalar_ssub_both_negative(
; CHECK-NEXT: [[U1:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A:%.*]], i8 -10)		; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A:%.*]], i8 -30)
; CHECK-NEXT: [[U2:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[U1]], i8 -20)		; CHECK-NEXT: ret i8 [[TMP1]]
; CHECK-NEXT: ret i8 [[U2]]
;		;
%u1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 -10)		%u1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 -10)
%u2 = call i8 @llvm.ssub.sat.i8(i8 %u1, i8 -20)		%u2 = call i8 @llvm.ssub.sat.i8(i8 %u1, i8 -20)
ret i8 %u2		ret i8 %u2
}		}

define <2 x i8> @test_vector_ssub_both_negative(<2 x i8> %a) {		define <2 x i8> @test_vector_ssub_both_negative(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_ssub_both_negative(		; CHECK-LABEL: @test_vector_ssub_both_negative(
; CHECK-NEXT: [[U1:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 -10, i8 -10>)		; CHECK-NEXT: [[TMP1:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 -30, i8 -30>)
; CHECK-NEXT: [[U2:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[U1]], <2 x i8> <i8 -20, i8 -20>)		; CHECK-NEXT: ret <2 x i8> [[TMP1]]
; CHECK-NEXT: ret <2 x i8> [[U2]]
;		;
%u1 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 -10, i8 -10>)		%u1 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a, <2 x i8> <i8 -10, i8 -10>)
%u2 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %u1, <2 x i8> <i8 -20, i8 -20>)		%u2 = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %u1, <2 x i8> <i8 -20, i8 -20>)
ret <2 x i8> %u2		ret <2 x i8> %u2
}		}

; Can't combine ssubs if constants have different sign.		; Can't combine ssubs if constants have different sign.
define i8 @test_scalar_ssub_different_sign(i8 %a) {		define i8 @test_scalar_ssub_different_sign(i8 %a) {
[17 lines not shown]	;
%w1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 100)		%w1 = call i8 @llvm.ssub.sat.i8(i8 %a, i8 100)
%w2 = call i8 @llvm.ssub.sat.i8(i8 %w1, i8 100)		%w2 = call i8 @llvm.ssub.sat.i8(i8 %w1, i8 100)
ret i8 %w2		ret i8 %w2
}		}
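
For the signed case the combine is only legal under extra conditions: the two constants need the same sign and their sum must itself fit in i8. The sketch below brute-forces that condition for i8 and evaluates two inputs where merging would change the result. It is only an illustration, not the patch's logic; `ssub_sat_i8`, `nested_ssub`, and the different-sign constants 10 and -10 are assumptions chosen for the example.

```cpp
#include <cstdint>

// Illustrative reference model of llvm.ssub.sat.i8 (not LLVM code):
// subtract in a wider type, then clamp to [-128, 127].
static int8_t ssub_sat_i8(int8_t a, int8_t b) {
  int wide = int(a) - int(b);
  if (wide > INT8_MAX) return INT8_MAX;
  if (wide < INT8_MIN) return INT8_MIN;
  return static_cast<int8_t>(wide);
}

// ssub.sat(ssub.sat(a, c1), c2), i.e. the uncombined form.
static int8_t nested_ssub(int8_t a, int8_t c1, int8_t c2) {
  return ssub_sat_i8(ssub_sat_i8(a, c1), c2);
}

// Returns true if the nested form equals ssub.sat(a, c1 + c2) whenever
// c1 and c2 have the same sign and c1 + c2 is representable in i8.
bool same_sign_ssub_matches_combined() {
  for (int a = INT8_MIN; a <= INT8_MAX; ++a)
    for (int c1 = INT8_MIN; c1 <= INT8_MAX; ++c1)
      for (int c2 = INT8_MIN; c2 <= INT8_MAX; ++c2) {
        int sum = c1 + c2;
        bool same_sign = (c1 < 0) == (c2 < 0);
        if (!same_sign || sum < INT8_MIN || sum > INT8_MAX)
          continue; // outside the condition being checked
        if (nested_ssub(int8_t(a), int8_t(c1), int8_t(c2)) !=
            ssub_sat_i8(int8_t(a), int8_t(sum)))
          return false;
      }
  return true;
}
```

Two inputs show why the restrictions matter. With a = -128 and the hypothetical constants 10 and -10, the nested form gives -118 (the first subtraction clamps at -128), whereas subtracting the combined constant 0 gives -128. With a = 127 and constants 100 and 100, as in the last test above, the nested form gives -73, which no single saturating subtraction of an i8 constant reproduces for all inputs.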

; nneg usub neg always overflows.		; nneg usub neg always overflows.
define i8 @test_scalar_usub_nneg_neg(i8 %a) {		define i8 @test_scalar_usub_nneg_neg(i8 %a) {
; CHECK-LABEL: @test_scalar_usub_nneg_neg(		; CHECK-LABEL: @test_scalar_usub_nneg_neg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127		; CHECK-NEXT: ret i8 0
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A_NNEG]], i8 -10)
; CHECK-NEXT: ret i8 [[R]]
;		;
%a_nneg = and i8 %a, 127		%a_nneg = and i8 %a, 127
%r = call i8 @llvm.usub.sat.i8(i8 %a_nneg, i8 -10)		%r = call i8 @llvm.usub.sat.i8(i8 %a_nneg, i8 -10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_usub_nneg_neg(<2 x i8> %a) {		define <2 x i8> @test_vector_usub_nneg_neg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_usub_nneg_neg(		; CHECK-LABEL: @test_vector_usub_nneg_neg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>		; CHECK-NEXT: ret <2 x i8> zeroinitializer
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A_NNEG]], <2 x i8> <i8 -10, i8 -20>)
; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_nneg = and <2 x i8> %a, <i8 127, i8 127>		%a_nneg = and <2 x i8> %a, <i8 127, i8 127>
%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 -10, i8 -20>)		%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 -10, i8 -20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}

; neg usub nneg never overflows.		; neg usub nneg never overflows.
define i8 @test_scalar_usub_neg_nneg(i8 %a) {		define i8 @test_scalar_usub_neg_nneg(i8 %a) {
; CHECK-LABEL: @test_scalar_usub_neg_nneg(		; CHECK-LABEL: @test_scalar_usub_neg_nneg(
; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128		; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A_NEG]], i8 10)		; CHECK-NEXT: [[R:%.*]] = add i8 [[A_NEG]], -10
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%a_neg = or i8 %a, -128		%a_neg = or i8 %a, -128
%r = call i8 @llvm.usub.sat.i8(i8 %a_neg, i8 10)		%r = call i8 @llvm.usub.sat.i8(i8 %a_neg, i8 10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_usub_neg_nneg(<2 x i8> %a) {		define <2 x i8> @test_vector_usub_neg_nneg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_usub_neg_nneg(		; CHECK-LABEL: @test_vector_usub_neg_nneg(
; CHECK-NEXT: [[A_NEG:%.*]] = or <2 x i8> [[A:%.*]], <i8 -128, i8 -128>		; CHECK-NEXT: [[A_NEG:%.*]] = or <2 x i8> [[A:%.*]], <i8 -128, i8 -128>
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A_NEG]], <2 x i8> <i8 10, i8 20>)		; CHECK-NEXT: [[R:%.*]] = add <2 x i8> [[A_NEG]], <i8 -10, i8 -20>
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>		%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>
%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 10, i8 20>)		%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 10, i8 20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}
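
The two usub sign cases above follow from reading the sign bit as an unsigned magnitude: a value with the sign bit clear is at most 127, and a value with it set is at least 128. A minimal exhaustive check of both directions for i8 might look like this (the helper names are hypothetical and `usub_sat_i8` is only a model of the intrinsic):

```cpp
#include <cstdint>

// Illustrative reference model of llvm.usub.sat.i8 (not LLVM code).
static uint8_t usub_sat_i8(uint8_t a, uint8_t b) {
  return a > b ? static_cast<uint8_t>(a - b) : 0;
}

// Sign bit clear minus sign bit set: a <= 127 < 128 <= b,
// so the subtraction always underflows and the result is 0.
bool usub_nneg_neg_is_always_zero() {
  for (int a = 0; a <= 127; ++a)
    for (int b = 128; b <= 255; ++b)
      if (usub_sat_i8(uint8_t(a), uint8_t(b)) != 0)
        return false;
  return true;
}

// Sign bit set minus sign bit clear: a >= 128 > 127 >= b,
// so no clamping happens and the result is the exact difference.
bool usub_neg_nneg_is_exact() {
  for (int a = 128; a <= 255; ++a)
    for (int b = 0; b <= 127; ++b)
      if (int(usub_sat_i8(uint8_t(a), uint8_t(b))) != a - b)
        return false;
  return true;
}
```

The first property corresponds to the fold to `ret i8 0`, the second to replacing the intrinsic with ordinary arithmetic.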

; nneg usub nneg may overflow.		; nneg usub nneg may overflow.
[18 lines not shown]	;
%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 10, i8 20>)		%r = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 10, i8 20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}

; neg ssub neg never overflows.		; neg ssub neg never overflows.
define i8 @test_scalar_ssub_neg_neg(i8 %a) {		define i8 @test_scalar_ssub_neg_neg(i8 %a) {
; CHECK-LABEL: @test_scalar_ssub_neg_neg(		; CHECK-LABEL: @test_scalar_ssub_neg_neg(
; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128		; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A_NEG]], i8 -10)		; CHECK-NEXT: [[R:%.*]] = add nsw i8 [[A_NEG]], 10
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%a_neg = or i8 %a, -128		%a_neg = or i8 %a, -128
%r = call i8 @llvm.ssub.sat.i8(i8 %a_neg, i8 -10)		%r = call i8 @llvm.ssub.sat.i8(i8 %a_neg, i8 -10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_ssub_neg_neg(<2 x i8> %a) {		define <2 x i8> @test_vector_ssub_neg_neg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_ssub_neg_neg(		; CHECK-LABEL: @test_vector_ssub_neg_neg(
; CHECK-NEXT: [[A_NEG:%.*]] = or <2 x i8> [[A:%.*]], <i8 -128, i8 -128>		; CHECK-NEXT: [[A_NEG:%.*]] = or <2 x i8> [[A:%.*]], <i8 -128, i8 -128>
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A_NEG]], <2 x i8> <i8 -10, i8 -20>)		; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i8> [[A_NEG]], <i8 10, i8 20>
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>		%a_neg = or <2 x i8> %a, <i8 -128, i8 -128>
%r = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 -10, i8 -20>)		%r = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a_neg, <2 x i8> <i8 -10, i8 -20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}

; nneg ssub nneg never overflows.		; nneg ssub nneg never overflows.
define i8 @test_scalar_ssub_nneg_nneg(i8 %a) {		define i8 @test_scalar_ssub_nneg_nneg(i8 %a) {
; CHECK-LABEL: @test_scalar_ssub_nneg_nneg(		; CHECK-LABEL: @test_scalar_ssub_nneg_nneg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127		; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A:%.*]], 127
; CHECK-NEXT: [[R:%.*]] = call i8 @llvm.ssub.sat.i8(i8 [[A_NNEG]], i8 10)		; CHECK-NEXT: [[R:%.*]] = add nsw i8 [[A_NNEG]], -10
; CHECK-NEXT: ret i8 [[R]]		; CHECK-NEXT: ret i8 [[R]]
;		;
%a_nneg = and i8 %a, 127		%a_nneg = and i8 %a, 127
%r = call i8 @llvm.ssub.sat.i8(i8 %a_nneg, i8 10)		%r = call i8 @llvm.ssub.sat.i8(i8 %a_nneg, i8 10)
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @test_vector_ssub_nneg_nneg(<2 x i8> %a) {		define <2 x i8> @test_vector_ssub_nneg_nneg(<2 x i8> %a) {
; CHECK-LABEL: @test_vector_ssub_nneg_nneg(		; CHECK-LABEL: @test_vector_ssub_nneg_nneg(
; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>		; CHECK-NEXT: [[A_NNEG:%.*]] = and <2 x i8> [[A:%.*]], <i8 127, i8 127>
; CHECK-NEXT: [[R:%.*]] = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> [[A_NNEG]], <2 x i8> <i8 10, i8 20>)		; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i8> [[A_NNEG]], <i8 -10, i8 -20>
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a_nneg = and <2 x i8> %a, <i8 127, i8 127>		%a_nneg = and <2 x i8> %a, <i8 127, i8 127>
%r = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 10, i8 20>)		%r = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %a_nneg, <2 x i8> <i8 10, i8 20>)
ret <2 x i8> %r		ret <2 x i8> %r
}		}

; neg ssub nneg may overflow.		; neg ssub nneg may overflow.
[21 lines not shown]