Details

Reviewers

leonardchan
craig.topper
RKSimon
spatel

Summary

We'd like to make use of the new saturating add intrinsics in Rust (in part because some obvious patterns don't optimize when implemented via with.overflow intrinsics). However, right now saturating adds seem to be completely opaque to the optimizer.

This patch implements some basic optimization support for saturating add intrinsics, namely:

Constant folding sat(C1 + C2) -> C3
sat(X + 0) -> X
sat(X + undef) -> X
sat(X uadd MAX) -> MAX
sat(sat(X + C1) + C2) -> sat(X + C3) where legal
sat(X1 + X2) -> add nuw/nsw where possible
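
As a rough illustration of what the folds listed above rely on, the following sketch models the two intrinsics on i8 using plain `int` arithmetic. It is not LLVM code: `uadd_sat_i8`, `sadd_sat_i8` and `summary_folds_hold` are hypothetical names chosen for this example, and the i8 width is an assumption made so the check can be exhaustive.

```cpp
// Illustrative model only; these helpers are not part of LLVM.

// Unsigned saturating add on i8 values (0..255): clamp instead of wrapping.
constexpr int uadd_sat_i8(int a, int b) {
  const int sum = a + b;
  return sum > 255 ? 255 : sum;
}

// Signed saturating add on i8 values (-128..127).
constexpr int sadd_sat_i8(int a, int b) {
  const int sum = a + b;
  return sum > 127 ? 127 : (sum < -128 ? -128 : sum);
}

// Checks sat(X + 0) -> X and sat(X uadd MAX) -> MAX for every i8 value of X.
constexpr bool summary_folds_hold() {
  for (int x = 0; x <= 255; ++x)
    if (uadd_sat_i8(x, 0) != x || uadd_sat_i8(x, 255) != 255)
      return false;
  for (int x = -128; x <= 127; ++x)
    if (sadd_sat_i8(x, 0) != x)
      return false;
  return true;
}
```

Constant folding then amounts to evaluating these functions on two constants: in this model `uadd_sat_i8(250, 100)` clamps to 255 and `sadd_sat_i8(-120, -10)` clamps to -128.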

Diff Detail

Event Timeline

nikic created this revision. Nov 7 2018, 4:18 PM

Herald added subscribers: llvm-commits, kristina, dexonsmith, JDevlieghere. Nov 7 2018, 4:18 PM

What about subtraction?

LGTM, but you should probably wait for LGTM from @craig.topper since he's likely more familiar with InstCombine than I am.

This revision is now accepted and ready to land. Nov 7 2018, 9:42 PM

leonardchan added subscribers: bjope, ebevhan. Nov 7 2018, 9:42 PM

I'm no expert on how things are divided between InstructionSimplify and InstCombine, but shouldn't the simple folds be in InstructionSimplify?
For the record, we were planning to upstream something like this in simplifyBinaryIntrinsic in InstructionSimplify.cpp:

case Intrinsic::sadd_sat:
  // X + 0 -> X
  if (match(Op1, m_Zero()))
    return Op0;

  // 0 + X -> X
  if (match(Op0, m_Zero()))
    return Op1;

  // X + undef -> undef
  // undef + X -> undef
  if (match(Op1, m_Undef()) || match(Op0, m_Undef()))
    return UndefValue::get(ReturnType);

  break;
case Intrinsic::ssub_sat:
  // X - 0 -> X
  if (match(Op1, m_Zero()))
    return Op0;

  // X - undef -> undef
  // undef - X -> undef
  if (match(Op1, m_Undef()) || match(Op0, m_Undef()))
    return UndefValue::get(ReturnType);

  break;

The above also folds expressions involving undef.

nit: In our downstream repo we have currently implemented folds for sadd_sat and ssub_sat. So even if add/sub are done in different patches, it would be really nice to have a patch for ssub_sat and usub_sat ready and to land both patches at the same time. That way we can replace things in one merge, without the need to support the old folds of ssub_sat and the new folds for sadd_sat during a transition period.

lib/Support/APInt.cpp
1953 ↗	(On Diff #173074)	Coding style: skip all braces here.
1966 ↗	(On Diff #173074)	Coding style: skip all braces here.
lib/Transforms/InstCombine/InstCombineCalls.cpp
2027	full stop/period missing
2045	full stop/period missing
2077	full stop/period missing
2081	full stop/period missing

Please can you add APInt unit tests

This revision now requires changes to proceed. Nov 8 2018, 1:20 AM

Regarding

// X + undef -> undef
// undef + X -> undef
if (match(Op1, m_Undef()) || match(Op0, m_Undef()))
  return UndefValue::get(ReturnType);

I was initially planning to include these simplifications, but ultimately was not certain regarding their legality. In particular, if we have uadd.sat(MaxValue, Y), then the result is fully determined to be MaxValue, regardless of the value of Y. If we have something like sadd.sat(SignedMinValue, Y) then the result is known to be negative. In either case the intrinsic cannot have the full range of results of the result type, regardless of the value of Y. As such, I think folding operations on undef to undef would not be legal in this case.

It should be possible to fold uadd.sat(X, undef) to MaxValue. Not sure how useful that is though.
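
The range argument above can be checked exhaustively on a small model. The sketch below assumes i8 operands represented as plain `int`; all function names are hypothetical and nothing here is LLVM API. It confirms that one fixed operand can already pin down or restrict the result, which is why folding to undef would be too permissive.

```cpp
// Illustrative model only; these helpers are not part of LLVM.
constexpr int clamp_to(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}
constexpr int uadd_sat_i8(int a, int b) { return clamp_to(a + b, 0, 255); }
constexpr int sadd_sat_i8(int a, int b) { return clamp_to(a + b, -128, 127); }

// True if uadd.sat(MaxValue, Y) == MaxValue for every i8 value of Y.
constexpr bool uadd_max_is_always_max() {
  for (int y = 0; y <= 255; ++y)
    if (uadd_sat_i8(255, y) != 255)
      return false;
  return true;
}

// True if sadd.sat(SignedMinValue, Y) is negative for every i8 value of Y.
constexpr bool sadd_min_is_always_negative() {
  for (int y = -128; y <= 127; ++y)
    if (sadd_sat_i8(-128, y) >= 0)
      return false;
  return true;
}
```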

Ok, I did not really think about it that way, but you definitely got a point there.
But for example, sadd.sat(undef, undef) can still be folded to undef, right?
Anyway, I'm not sure how important folds involving undef are here at all. I just noticed that undef was handled for sadd_with_overflow etc, and figured we needed it for "completeness".

Address style comments, move trivial transforms to InstructionSimplify, add APInt unit test, move code for overflow information based checks outside the condition for a constant operand (duh).

nikic marked 5 inline comments as done. Nov 8 2018, 2:30 AM

nikic added inline comments.

lib/Support/APInt.cpp
1953 ↗	(On Diff #173074)	I've left the braces on the overflow branch, otherwise there's a dangling else compiler warning.

bjope added inline comments. Nov 8 2018, 2:32 AM

lib/Support/APInt.cpp
1953 ↗	(On Diff #173074)	Yes, this is ok (not sure why I wrote "skip all", that was confusing).

rkruppe added a subscriber: rkruppe. Nov 8 2018, 3:28 AM

RKSimon added inline comments. Nov 8 2018, 5:42 AM

lib/Support/APInt.cpp
1953 ↗	(On Diff #173074)	It might be cleaner as: if (!Overflow) return Res; return isNegative() ? APInt::getSignedMinValue(Res.getBitWidth()) : APInt::getSignedMaxValue(Res.getBitWidth());
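
The suggested control flow can be sketched outside of APInt as follows. This is a simplified model, assuming operands that already fit in `bits` (1 to 32) and are stored in a 64-bit integer so that the raw sum cannot wrap. `sadd_sat_model` is a hypothetical name, not the APInt method.

```cpp
#include <cstdint>

// Illustrative model only; not the APInt implementation.
// Precondition (assumed): 1 <= bits <= 32 and both operands fit in `bits`.
constexpr std::int64_t sadd_sat_model(std::int64_t lhs, std::int64_t rhs,
                                      unsigned bits) {
  const std::int64_t smax = (std::int64_t{1} << (bits - 1)) - 1;
  const std::int64_t smin = -smax - 1;
  const std::int64_t res = lhs + rhs; // Cannot wrap for bits <= 32.
  const bool overflow = res < smin || res > smax;
  // Early return on the common path, then pick the bound by the sign of LHS.
  if (!overflow)
    return res;
  return lhs < 0 ? smin : smax;
}
```

Signed overflow only happens when both operands have the same sign, so the sign of the left-hand side alone decides which bound to return.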

craig.topper added inline comments. Nov 8 2018, 10:27 AM

lib/Transforms/InstCombine/InstCombineCalls.cpp
2065	saturing->saturating

Improve control-flow in APInt implementation, fix typo in comment.

dexonsmith removed a subscriber: dexonsmith. Nov 8 2018, 12:33 PM

nikic mentioned this in D54274: Constant folding and instcombine for saturating subs. Nov 8 2018, 2:01 PM

nikic added a child revision: D54274: Constant folding and instcombine for saturating subs.

I've added an implementation for saturating subs in D54274. Please tell me if I should combine the changes into this one instead.

In D54237#1291405, @nikic wrote:
Regarding
// X + undef -> undef
// undef + X -> undef
if (match(Op1, m_Undef()) || match(Op0, m_Undef()))
  return UndefValue::get(ReturnType);
I was initially planning to include these simplifications, but ultimately was not certain regarding their legality. In particular, if we have uadd.sat(MaxValue, Y), then the result is fully determined to be MaxValue, regardless of the value of Y. If we have something like sadd.sat(SignedMinValue, Y) then the result is known to be negative. In either case the intrinsic cannot have the full range of results of the result type, regardless of the value of Y. As such, I think folding operations on undef to undef would not be legal in this case.

It should be possible to fold uadd.sat(X, undef) to MaxValue. Not sure how useful that is though.

You can also assume that undef is 0 and fold X + undef -> X.

Add simplify for X + undef -> X.

Please can you pull out the APInt changes + unit tests from this and D54274 into their own patch - I'll give that a quick check but I think that part is ready. I'll leave the IR changes to everyone else.

nikic mentioned this in D54332: [APInt] Add methods for saturated add and sub. Nov 9 2018, 9:59 AM

Is it perhaps "better" to fold sadd_sat(X, undef) -> 0 and uadd_sat(X, undef) -> MaxValue if we want to get rid of undef here? That way we get rid of the X operand as well.

Folding sadd_sat(X, undef) -> 0 would not be valid, because if X = SignedMinValue, there is no choice of undef for which the result would be zero. The largest value that can be reached is -1.
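
A quick way to see this is to enumerate every choice of undef for a given X. The sketch below assumes i8 values modeled as plain `int`; `sadd_sat_i8` and `max_reachable` are hypothetical helper names used only for this illustration.

```cpp
// Illustrative model only; these helpers are not part of LLVM.
constexpr int sadd_sat_i8(int a, int b) {
  const int sum = a + b;
  return sum > 127 ? 127 : (sum < -128 ? -128 : sum);
}

// Largest value sadd.sat(x, u) can take over all i8 choices of u.
constexpr int max_reachable(int x) {
  int best = -128;
  for (int u = -128; u <= 127; ++u) {
    const int r = sadd_sat_i8(x, u);
    if (r > best)
      best = r;
  }
  return best;
}
```

In this model `max_reachable(-128)` evaluates to -1, so zero is never a possible result when X is the signed minimum.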

I think this is a list of the possibilities that we have:

For signed add I think there is only one choice:

sadd_sat(X, undef) -> X   // for undef = 0
sadd_sat(undef, X) -> X   // for undef = 0

For signed sub we have two variants:

ssub_sat(X, undef) -> X   // for undef = 0
ssub_sat(undef, X)        // don't simplify
// or
ssub_sat(X, undef) -> 0   // for undef = X
ssub_sat(undef, X) -> 0   // for undef = X

For unsigned add we also have two:

uadd_sat(X, undef) -> X   // for undef = 0
uadd_sat(undef, X) -> X   // for undef = 0
// or
uadd_sat(X, undef) -> MAX // for undef = MAX
uadd_sat(undef, X) -> MAX // for undef = MAX

For unsigned sub also two:

usub_sat(X, undef) -> X   // for undef = 0
usub_sat(undef, X)        // don't simplify
// or
usub_sat(X, undef) -> 0   // for undef = X
usub_sat(undef, X) -> 0   // for undef = X

Given these possibilities, I would suggest using the following folds. They don't result in the maximal number of constant results, but they have consistent results/assumptions for the signed and unsigned cases:

sadd_sat(X, undef) -> X   // for undef = 0
sadd_sat(undef, X) -> X   // for undef = 0
uadd_sat(X, undef) -> X   // for undef = 0
uadd_sat(undef, X) -> X   // for undef = 0
ssub_sat(X, undef) -> 0   // for undef = X
ssub_sat(undef, X) -> 0   // for undef = X
usub_sat(X, undef) -> 0   // for undef = X
usub_sat(undef, X) -> 0   // for undef = X
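
Each of the suggested folds is justified by a concrete choice for undef, and those witnesses can be checked exhaustively. The following is a minimal sketch, assuming i8 values modeled as plain `int`; the helper names are hypothetical and not LLVM API.

```cpp
// Illustrative model only; these helpers are not part of LLVM.
constexpr int clamp_to(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}
constexpr int sadd_sat_i8(int a, int b) { return clamp_to(a + b, -128, 127); }
constexpr int ssub_sat_i8(int a, int b) { return clamp_to(a - b, -128, 127); }
constexpr int uadd_sat_i8(int a, int b) { return clamp_to(a + b, 0, 255); }
constexpr int usub_sat_i8(int a, int b) { return clamp_to(a - b, 0, 255); }

// Checks the witnesses for every i8 value of X:
// add with undef = 0 yields X, sub with undef = X yields 0.
constexpr bool suggested_folds_have_witnesses() {
  for (int x = -128; x <= 127; ++x) {
    if (sadd_sat_i8(x, 0) != x || sadd_sat_i8(0, x) != x)
      return false;
    if (ssub_sat_i8(x, x) != 0)
      return false;
  }
  for (int x = 0; x <= 255; ++x) {
    if (uadd_sat_i8(x, 0) != x || uadd_sat_i8(0, x) != x)
      return false;
    if (usub_sat_i8(x, x) != 0)
      return false;
  }
  return true;
}
```

Because the subtraction witness is undef = X, the same check covers both operand orders for ssub_sat and usub_sat.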

Rebase on separate APInt patch.

In D54237#1294571, @nikic wrote:

Is it perhaps "better" to fold sadd_sat(X, undef) -> 0 And uadd_sat(X, undef) -> MaxValue if we want to get rid of undef here? That way we get rid of the X operand as well.

Folding sadd_sat(X, undef) -> 0 would not be valid, because if X = SignedMinValue, there is no choice of undef for which the result would be zero. The largest value that can be reached is -1.

Ok. Got it! Sorry for the confusion.

Btw, thanks for adding these new folds of saturating add/sub.

Please can you add vector constant folding tests + support?

Isn't it better to land this one and https://reviews.llvm.org/D54274 first since they basically are ready?
I think they are ok and reviewed (I think the only thing blocking was the @RKSimon request to extract APInt stuff and that is done).

If we need to support vectors at the same time, then I think you should merge the add/sub patches into one big pile (otherwise it just complicates review and patch handling for @nikic, since the "subtraction" patch basically rewrites lots of the things added in the "addition" patch).

LGTM - happy for vector support to be added as a follow up patch

This revision is now accepted and ready to land. Nov 14 2018, 6:50 AM

In D54237#1291365, @bjope wrote:

I'm no expert on how things are divided between InstructionSimplify and InstCombine, but shouldn't the simple folds be in InstructionSimplify?

The rule is that if you don't create any new variable values (i.e., you are returning an existing value or constant), then it should go in InstSimplify. Note that InstSimplify is used as an analysis by other passes like GVN, so putting transforms in there improves things beyond InstCombine.

I see that this patch already has 2 approvals, so I won't block it, but we really should have the regression tests in the corresponding test directory (and using the appropriate pass in the RUN line). So the constant folding hunk of this patch should really be its own commit with tests in test/Analysis/ConstantFolding/, and the InstSimplify hunk of this patch should be its own commit with tests in test/Transforms/InstSimplify. Please make those changes as a follow-up if this is committed initially as a single patch.

I've now split this up in separate patches, separated the tests for constant folding, instruction simplify and instcombine and added tests for vector operations. The patches are now (with the add/sub cases combined):

https://reviews.llvm.org/D54332 - APInt support
https://reviews.llvm.org/D54531 - Constant folding support
https://reviews.llvm.org/D54532 - Instruction simplify support
https://reviews.llvm.org/D54534 - InstCombine support

I did not change any of the code, only the tests. Vectors for the most part work out of the box. The only thing that does not work with vectors is the sat(sat(X + C1) + C2) -> sat(X + (C1 + C2)) simplification where C1 and C2 are not constant splats. It would be possible to support this, but I'm not sure it's worth the extra complexity. From looking at the surrounding code, it seems like nothing else tries to handle the non-splat case either.
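
For the scalar signed case, the legality condition used for the sat(sat(X + C1) + C2) fold in this revision (same sign, and C1 + C2 does not overflow) can be illustrated with an exhaustive check over X. This is a sketch under the assumption of i8 values modeled as plain `int`; `can_merge_signed` and `merged_add_matches` are hypothetical names, not functions from the patch.

```cpp
// Illustrative model only; these helpers are not part of LLVM.
constexpr int sadd_sat_i8(int a, int b) {
  const int sum = a + b;
  return sum > 127 ? 127 : (sum < -128 ? -128 : sum);
}

// Models the condition: both constants have the same sign and C1 + C2 fits in i8.
constexpr bool can_merge_signed(int c1, int c2) {
  const bool same_sign = (c1 >= 0) == (c2 >= 0);
  const int sum = c1 + c2;
  return same_sign && sum >= -128 && sum <= 127;
}

// Checks sat(sat(x + c1) + c2) == sat(x + c3) for every i8 value of x.
constexpr bool merged_add_matches(int c1, int c2, int c3) {
  for (int x = -128; x <= 127; ++x)
    if (sadd_sat_i8(sadd_sat_i8(x, c1), c2) != sadd_sat_i8(x, c3))
      return false;
  return true;
}
```

With constants 10 and -20, x = 127 gives 107 for the nested form but 117 for the merged one. With 100 and 100, clamping the merged constant to 127 gives -1 for x = -128 while the nested form gives 72, so in both cases the fold has to be skipped.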

spatel mentioned this in rL347324: [APInt] Add methods for saturated add and sub. Nov 20 2018, 8:50 AM

Abandoning in favor of the more granular patches linked in the last comment.

Diff 173549

lib/Analysis/ConstantFolding.cpp

@@ bool llvm::canConstantFoldCallTo(ImmutableCallSite CS, const Function *F) {
   case Intrinsic::round:
   case Intrinsic::masked_load:
   case Intrinsic::sadd_with_overflow:
   case Intrinsic::uadd_with_overflow:
   case Intrinsic::ssub_with_overflow:
   case Intrinsic::usub_with_overflow:
   case Intrinsic::smul_with_overflow:
   case Intrinsic::umul_with_overflow:
+  case Intrinsic::sadd_sat:
+  case Intrinsic::uadd_sat:
   case Intrinsic::convert_from_fp16:
   case Intrinsic::convert_to_fp16:
   case Intrinsic::bitreverse:
   case Intrinsic::x86_sse_cvtss2si:
   case Intrinsic::x86_sse_cvtss2si64:
   case Intrinsic::x86_sse_cvttss2si:
   case Intrinsic::x86_sse_cvttss2si64:
   case Intrinsic::x86_sse2_cvtsd2si:
@@ if (auto *Op1 = dyn_cast<ConstantInt>(Operands[0])) {
       break;
     }
     Constant *Ops[] = {
       ConstantInt::get(Ty->getContext(), Res),
       ConstantInt::get(Type::getInt1Ty(Ty->getContext()), Overflow)
     };
     return ConstantStruct::get(cast<StructType>(Ty), Ops);
   }
+  case Intrinsic::uadd_sat:
+    return ConstantInt::get(Ty, Op1->getValue().uadd_sat(Op2->getValue()));
+  case Intrinsic::sadd_sat:
+    return ConstantInt::get(Ty, Op1->getValue().sadd_sat(Op2->getValue()));
   case Intrinsic::cttz:
     if (Op2->isOne() && Op1->isZero()) // cttz(0, 1) is undef.
       return UndefValue::get(Ty);
     return ConstantInt::get(Ty, Op1->getValue().countTrailingZeros());
   case Intrinsic::ctlz:
     if (Op2->isOne() && Op1->isZero()) // ctlz(0, 1) is undef.
       return UndefValue::get(Ty);
     return ConstantInt::get(Ty, Op1->getValue().countLeadingZeros());

lib/Analysis/InstructionSimplify.cpp

@@ case Intrinsic::smul_with_overflow:
     // X * 0 -> { 0, false }
     if (match(Op0, m_Zero()) || match(Op1, m_Zero()))
       return Constant::getNullValue(ReturnType);
     // undef * X -> { 0, false }
     // X * undef -> { 0, false }
     if (match(Op0, m_Undef()) || match(Op1, m_Undef()))
       return Constant::getNullValue(ReturnType);
     break;
+  case Intrinsic::uadd_sat:
+    // sat(X + MAX) -> MAX
+    if (match(Op1, m_AllOnes()))
+      return Constant::getAllOnesValue(ReturnType);
+    // sat(MAX + X) -> MAX
+    if (match(Op0, m_AllOnes()))
+      return Constant::getAllOnesValue(ReturnType);
+    LLVM_FALLTHROUGH;
+  case Intrinsic::sadd_sat:
+    // X + 0 -> X, X + undef -> X
+    if (match(Op1, m_Zero()) || match(Op1, m_Undef()))
+      return Op0;
+    // 0 + X -> X, undef + X -> X
+    if (match(Op0, m_Zero()) || match(Op0, m_Undef()))
+      return Op1;
+    break;
   case Intrinsic::load_relative:
     if (auto *C0 = dyn_cast<Constant>(Op0))
       if (auto *C1 = dyn_cast<Constant>(Op1))
         return SimplifyRelativeLoad(C0, C1, Q.DL);
     break;
   case Intrinsic::powi:
     if (auto *Power = dyn_cast<ConstantInt>(Op1)) {
       // powi(x, 0) -> 1.0

lib/Transforms/InstCombine/InstCombineCalls.cpp

@@ case Intrinsic::ssub_with_overflow: {
     Constant *OverflowResult = nullptr;
     if (OptimizeOverflowCheck(OCF, II->getArgOperand(0), II->getArgOperand(1),
                               *II, OperationResult, OverflowResult))
       return CreateOverflowTuple(II, OperationResult, OverflowResult);

     break;
   }

+  case Intrinsic::uadd_sat:
+  case Intrinsic::sadd_sat: {
+    Value *Arg0 = II->getArgOperand(0);
+    Value *Arg1 = II->getArgOperand(1);
+    if (isa<Constant>(Arg0) && !isa<Constant>(Arg1)) {
+      // Canonicalize constants into the RHS.
+      II->setArgOperand(0, Arg1);
+      II->setArgOperand(1, Arg0);
+      return II;
+    }
+
+    // Make use of known overflow information.
+    bool IsUnsigned = II->getIntrinsicID() == Intrinsic::uadd_sat;
+    if (IsUnsigned) {
+      OverflowResult OR = computeOverflowForUnsignedAdd(Arg0, Arg1, II);
+      if (OR == OverflowResult::NeverOverflows)
+        return replaceInstUsesWith(*II, Builder.CreateNUWAdd(Arg0, Arg1));
+
+      if (OR == OverflowResult::AlwaysOverflows)
+        return replaceInstUsesWith(*II,
+                                   ConstantInt::getAllOnesValue(II->getType()));
+    } else {
+      if (willNotOverflowSignedAdd(Arg0, Arg1, *II))
+        return replaceInstUsesWith(*II, Builder.CreateNSWAdd(Arg0, Arg1));
+    }
+
+    // sat(sat(X + Val2) + Val) -> sat(X + (Val+Val2))
+    // if Val and Val2 have the same sign
+    if (auto *Other = dyn_cast<IntrinsicInst>(Arg0)) {
+      Value *X;
+      const APInt *Val, *Val2;
+      APInt NewVal;
+      if (Other->getIntrinsicID() == II->getIntrinsicID() &&
+          match(Arg1, m_APInt(Val)) &&
+          match(Other->getArgOperand(0), m_Value(X)) &&
+          match(Other->getArgOperand(1), m_APInt(Val2))) {
+        if (IsUnsigned)
+          NewVal = Val->uadd_sat(*Val2);
+        else if (Val->isNonNegative() == Val2->isNonNegative()) {
+          bool Overflow;
+          NewVal = Val->sadd_ov(*Val2, Overflow);
+          if (Overflow) {
+            // Both adds together may add more than SignedMaxValue
+            // without saturating the final result.
+            break;
+          }
+        } else {
+          // Cannot fold saturated addition with different signs.
+          break;
+        }
+
+        return replaceInstUsesWith(*II, Builder.CreateBinaryIntrinsic(
+                                            II->getIntrinsicID(), X,
+                                            ConstantInt::get(II->getType(), NewVal)));
+      }
+    }
+    break;
+  }
+
   case Intrinsic::minnum:
   case Intrinsic::maxnum:
   case Intrinsic::minimum:
   case Intrinsic::maximum: {
     Value *Arg0 = II->getArgOperand(0);
     Value *Arg1 = II->getArgOperand(1);
     // Canonicalize constants to the RHS.
     if (isa<ConstantFP>(Arg0) && !isa<ConstantFP>(Arg1)) {
       II->setArgOperand(0, Arg1);

test/Transforms/InstCombine/saturating-add.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -instcombine -S | FileCheck %s

				declare void @dummy(i8)
				declare i8 @llvm.uadd.sat.i8(i8, i8)
				declare i8 @llvm.sadd.sat.i8(i8, i8)

				; Simple constant folding
				define void @test1() {
				; CHECK-LABEL: @test1(
				; CHECK-NEXT: call void @dummy(i8 30)
				; CHECK-NEXT: call void @dummy(i8 -1)
				; CHECK-NEXT: call void @dummy(i8 -10)
				; CHECK-NEXT: call void @dummy(i8 127)
				; CHECK-NEXT: call void @dummy(i8 -128)
				; CHECK-NEXT: ret void
				;
				%x1 = call i8 @llvm.uadd.sat.i8(i8 10, i8 20)
				call void @dummy(i8 %x1)
				%x2 = call i8 @llvm.uadd.sat.i8(i8 250, i8 100)
				call void @dummy(i8 %x2)

				%y1 = call i8 @llvm.sadd.sat.i8(i8 10, i8 -20)
				call void @dummy(i8 %y1)
				%y2 = call i8 @llvm.sadd.sat.i8(i8 120, i8 10)
				call void @dummy(i8 %y2)
				%y3 = call i8 @llvm.sadd.sat.i8(i8 -120, i8 -10)
				call void @dummy(i8 %y3)

				ret void
				}

				; Zero, max and undef folding
				define void @test2(i8 %a) {
				; CHECK-LABEL: @test2(
				; CHECK-NEXT: call void @dummy(i8 [[A:%.*]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: call void @dummy(i8 -1)
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: [[Y3:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A]], i8 -1)
				; CHECK-NEXT: call void @dummy(i8 [[Y3]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: call void @dummy(i8 [[A]])
				; CHECK-NEXT: ret void
				;
				%x1 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 0)
				call void @dummy(i8 %x1)
				%x2 = call i8 @llvm.uadd.sat.i8(i8 0, i8 %a)
				call void @dummy(i8 %x2)
				%x3 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 255)
				call void @dummy(i8 %x3)
				%x4 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 undef)
				call void @dummy(i8 %x4)
				%x5 = call i8 @llvm.uadd.sat.i8(i8 undef, i8 %a)
				call void @dummy(i8 %x5)

				%y1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 0)
				call void @dummy(i8 %y1)
				%y2 = call i8 @llvm.sadd.sat.i8(i8 0, i8 %a)
				call void @dummy(i8 %y2)
				%y3 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 255)
				call void @dummy(i8 %y3)
				%y4 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 undef)
				call void @dummy(i8 %y4)
				%y5 = call i8 @llvm.sadd.sat.i8(i8 undef, i8 %a)
				call void @dummy(i8 %y5)

				ret void
				}

				; Folding of two saturating adds
				define void @test3(i8 %a) {
				; CHECK-LABEL: @test3(
				; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[A:%.*]], i8 30)
				; CHECK-NEXT: call void @dummy(i8 [[TMP1]])
				; CHECK-NEXT: call void @dummy(i8 -1)
				; CHECK-NEXT: [[TMP2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A]], i8 30)
				; CHECK-NEXT: call void @dummy(i8 [[TMP2]])
				; CHECK-NEXT: [[TMP3:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A]], i8 -30)
				; CHECK-NEXT: call void @dummy(i8 [[TMP3]])
				; CHECK-NEXT: [[V1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A]], i8 10)
				; CHECK-NEXT: [[V2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[V1]], i8 -20)
				; CHECK-NEXT: call void @dummy(i8 [[V2]])
				; CHECK-NEXT: [[W1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A]], i8 100)
				; CHECK-NEXT: [[W2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[W1]], i8 100)
				; CHECK-NEXT: call void @dummy(i8 [[W2]])
				; CHECK-NEXT: ret void
				;
				%x1 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 10)
				%x2 = call i8 @llvm.uadd.sat.i8(i8 %x1, i8 20)
				call void @dummy(i8 %x2)

				%y1 = call i8 @llvm.uadd.sat.i8(i8 %a, i8 100)
				%y2 = call i8 @llvm.uadd.sat.i8(i8 %y1, i8 200)
				call void @dummy(i8 %y2)

				%z1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 10)
				%z2 = call i8 @llvm.sadd.sat.i8(i8 %z1, i8 20)
				call void @dummy(i8 %z2)

				%u1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 -10)
				%u2 = call i8 @llvm.sadd.sat.i8(i8 %u1, i8 -20)
				call void @dummy(i8 %u2)

				%v1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 10)
				%v2 = call i8 @llvm.sadd.sat.i8(i8 %v1, i8 -20)
				call void @dummy(i8 %v2)

				%w1 = call i8 @llvm.sadd.sat.i8(i8 %a, i8 100)
				%w2 = call i8 @llvm.sadd.sat.i8(i8 %w1, i8 100)
				call void @dummy(i8 %w2)

				ret void
				}

				; Use of known overflow/no-overflow information
				define void @test4(i8 %a) {
				; CHECK-LABEL: @test4(
				; CHECK-NEXT: [[A_NEG:%.*]] = or i8 [[A:%.*]], -128
				; CHECK-NEXT: [[A_NNEG:%.*]] = and i8 [[A]], 127
				; CHECK-NEXT: call void @dummy(i8 -1)
				; CHECK-NEXT: [[TMP1:%.*]] = add nuw i8 [[A_NNEG]], 10
				; CHECK-NEXT: call void @dummy(i8 [[TMP1]])
				; CHECK-NEXT: [[Y1:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A_NEG]], i8 -10)
				; CHECK-NEXT: call void @dummy(i8 [[Y1]])
				; CHECK-NEXT: [[Y2:%.*]] = call i8 @llvm.sadd.sat.i8(i8 [[A_NNEG]], i8 10)
				; CHECK-NEXT: call void @dummy(i8 [[Y2]])
				; CHECK-NEXT: [[TMP2:%.*]] = add nsw i8 [[A_NEG]], 10
				; CHECK-NEXT: call void @dummy(i8 [[TMP2]])
				; CHECK-NEXT: [[TMP3:%.*]] = add nsw i8 [[A_NNEG]], -10
				; CHECK-NEXT: call void @dummy(i8 [[TMP3]])
				; CHECK-NEXT: ret void
				;
				%a_neg = or i8 %a, -128
				%a_nneg = and i8 %a, 127

				%x1 = call i8 @llvm.uadd.sat.i8(i8 %a_neg, i8 -10)
				call void @dummy(i8 %x1)

				%x2 = call i8 @llvm.uadd.sat.i8(i8 %a_nneg, i8 10)
				call void @dummy(i8 %x2)

				%y1 = call i8 @llvm.sadd.sat.i8(i8 %a_neg, i8 -10)
				call void @dummy(i8 %y1)

				%y2 = call i8 @llvm.sadd.sat.i8(i8 %a_nneg, i8 10)
				call void @dummy(i8 %y2)

				%y3 = call i8 @llvm.sadd.sat.i8(i8 %a_neg, i8 10)
				call void @dummy(i8 %y3)

				%y4 = call i8 @llvm.sadd.sat.i8(i8 %a_nneg, i8 -10)
				call void @dummy(i8 %y4)

				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

Constant folding and instcombine for saturating adds
Abandoned · Public
