
Details

Reviewers

nikic
spatel

Commits

rG6b9317f52a66: [InstCombine] Fold zero check followed by decrement to usub.sat

Summary

Fold (a == 0) ? 0 : a - 1 into usub.sat(a, 1)
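The equivalence behind this fold can be checked exhaustively for a small integer width. The following is a minimal C++ sketch and not part of the patch: `decrement_select`, `usub_sat`, and `fold_is_sound` are hypothetical helper names, and `usub_sat` is a hand-written model of the clamping behaviour of `llvm.usub.sat` on i8 (the Alive2 proof linked in the review remains the authoritative check).

```cpp
#include <cassert>
#include <cstdint>

// Select form from the summary: (a == 0) ? 0 : a - 1.
constexpr std::uint8_t decrement_select(std::uint8_t a) {
  return a == 0 ? std::uint8_t{0} : static_cast<std::uint8_t>(a - 1);
}

// Hand-written model of an unsigned saturating subtract on i8:
// the result clamps at zero instead of wrapping around.
constexpr std::uint8_t usub_sat(std::uint8_t a, std::uint8_t b) {
  return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Compare both forms over every possible i8 input.
constexpr bool fold_is_sound() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    auto a = static_cast<std::uint8_t>(v);
    if (decrement_select(a) != usub_sat(a, 1))
      return false;
  }
  return true;
}

static_assert(fold_is_sound(), "select form and usub.sat(a, 1) must agree");
```

Because the check runs in a `static_assert`, a mismatch would show up as a compile error (C++14 or later) rather than at run time.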

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

clubby789 created this revision. Dec 30 2022, 10:44 PM

Herald added a project: Restricted Project. Dec 30 2022, 10:44 PM

Herald added a subscriber: hiraditya.

clubby789 requested review of this revision. Dec 30 2022, 10:44 PM

Herald added a project: Restricted Project. Dec 30 2022, 10:44 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B205262: Diff 485752. Dec 30 2022, 11:31 PM

Proof: https://alive2.llvm.org/ce/z/a2jKgY

Is it possible/sensible to integrate this into canonicalizeSaturatedSubtract()? It seems like this is essentially the same, just for the ult 1 -> eq 0 special case. But maybe that is more awkward than a separate fold.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
881	Not necessary, icmp constants are canonicalized to the right.
887	This is incorrect: If we swap select operands, we also need to invert the predicate. What you want to do is check whether the predicate is `ne` and then invert and swap. I believe this can only happen in multi-use scenarios, e.g.: https://llvm.godbolt.org/z/5Pxo3f8WE
893	It probably makes more sense to use m_SpecificInt here. While this CreateNeg call will not actually create an instruction, it looks like it could.
894	Limited to just the constant case, I don't think we need to handle sub at all, it will always be canonicalized to add.
llvm/test/Transforms/InstCombine/saturating-add-sub.ll
533	Especially if this is implemented as a separate fold, we'd want some negative test cases here, e.g. incorrect constants (vary zero/one), incorrect operands (not %a both times), incorrect icmp predicate. As you are using `m_Zero()` matchers, which allow undef/poison in vectors, we'd want to test that as well.
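The comment on line 887 (swapping the select operands also requires inverting the predicate) can be illustrated with a small model. This is a sketch only: `select_i8`, `original_form`, `swapped_and_inverted`, and `swapped_only` are hypothetical names, and the select is modelled as an ordinary C++ function on 8-bit values rather than through the InstCombine API.

```cpp
#include <cassert>
#include <cstdint>

// select(cond, t, f) modelled as a plain function on i8 values.
constexpr std::uint8_t select_i8(bool cond, std::uint8_t t, std::uint8_t f) {
  return cond ? t : f;
}

// Original pattern: (a == 0) ? 0 : a - 1
constexpr std::uint8_t original_form(std::uint8_t a) {
  return select_i8(a == 0, 0, static_cast<std::uint8_t>(a - 1));
}

// Correct rewrite: arms swapped AND predicate inverted (eq -> ne).
constexpr std::uint8_t swapped_and_inverted(std::uint8_t a) {
  return select_i8(a != 0, static_cast<std::uint8_t>(a - 1), 0);
}

// Incorrect rewrite: arms swapped, predicate left as eq.
constexpr std::uint8_t swapped_only(std::uint8_t a) {
  return select_i8(a == 0, static_cast<std::uint8_t>(a - 1), 0);
}

// True when the correct rewrite matches the original for every i8 input.
constexpr bool inverted_rewrite_is_equivalent() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    auto a = static_cast<std::uint8_t>(v);
    if (original_form(a) != swapped_and_inverted(a))
      return false;
  }
  return true;
}

static_assert(inverted_rewrite_is_equivalent(), "swap + invert preserves meaning");
static_assert(swapped_only(0) != original_form(0), "swap alone changes the result");
```

With the predicate left unchanged, the input 0 selects the wrapped value of `0 - 1` instead of 0, which is exactly the miscompile the reviewer is warning about.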

Inline decrement check into canonicalizeSaturatedSubtract and add some negative test cases

clubby789 marked 4 inline comments as done. Dec 31 2022, 7:18 AM

clubby789 added inline comments.

llvm/test/Transforms/InstCombine/saturating-add-sub.ll
533	Sorry, I'm not quite sure how to test the vector case you mentioned.

Harbormaster completed remote builds in B205278: Diff 485771. Dec 31 2022, 7:57 AM

Fix for vectors and update test

Harbormaster completed remote builds in B205320: Diff 485821. Jan 1 2023, 2:51 PM

clubby789 updated this revision to Diff 485825. Jan 1 2023, 3:48 PM

Harbormaster completed remote builds in B205324: Diff 485825. Jan 1 2023, 4:29 PM

clubby789 updated this revision to Diff 485837. Jan 1 2023, 6:54 PM

Harbormaster completed remote builds in B205335: Diff 485837. Jan 1 2023, 7:35 PM

nikic added inline comments. Jan 2 2023, 7:48 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
815	This is not going to trigger, because everything here is inside `Pred == ICmpInst::ICMP_EQ`. You'd want to check for isEquality() first (which accepts eq and ne), then have a check for ne where you invert the predicate and swap true/false, and then do the check for zero on the true value. I'd suggest adding this test case to make sure the ne pattern is matched:

define i8 @test(i8 %a) {
  %i = icmp ne i8 %a, 0
  call void @use.i1(i1 %i)
  %i1 = add i8 %a, -1
  %i2 = select i1 %i, i8 %i1, i8 0
  ret i8 %i2
}

declare void @use.i1(i1)

825	You can match `-1` using `m_AllOnes()`.
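The `m_AllOnes()` suggestion relies on two's-complement arithmetic: for an i8, the all-ones bit pattern 0xFF is the constant -1, so `add i8 %a, -1` wraps to the same value as subtracting 1. A minimal sketch of that identity follows; the helper names `add_all_ones`, `sub_one`, and `add_all_ones_is_decrement` are hypothetical, and the code models the arithmetic only, not the pattern matcher.

```cpp
#include <cassert>
#include <cstdint>

// i8 -1 is the all-ones bit pattern.
constexpr std::uint8_t kAllOnes = 0xFF;

// a + (-1), computed modulo 2^8 like `add i8 %a, -1`.
constexpr std::uint8_t add_all_ones(std::uint8_t a) {
  return static_cast<std::uint8_t>(a + kAllOnes);
}

// a - 1, computed modulo 2^8 like `sub i8 %a, 1`.
constexpr std::uint8_t sub_one(std::uint8_t a) {
  return static_cast<std::uint8_t>(a - 1);
}

// True when both forms agree for every i8 input.
constexpr bool add_all_ones_is_decrement() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    auto a = static_cast<std::uint8_t>(v);
    if (add_all_ones(a) != sub_one(a))
      return false;
  }
  return true;
}

static_assert(kAllOnes == static_cast<std::uint8_t>(-1), "all-ones is -1");
static_assert(add_all_ones_is_decrement(), "a + (-1) equals a - 1 modulo 2^8");
```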

Fix icmp ne case

Harbormaster completed remote builds in B205370: Diff 485878. Jan 2 2023, 12:35 PM

nikic added inline comments. Jan 4 2023, 1:58 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
809–830	You could take this piece of code (starting at `if (match(TrueVal`) and move it to the top of the function, because this bit of logic is the same for both cases.
816	Should be getInversePredicate(), though it doesn't actually matter here (as you are not checking the Pred afterwards anymore).
825	`m_APInt(Added)` can be replaced with `m_AllOnes()` here. Below you'd create the constant with value `1` then.
llvm/test/Transforms/InstCombine/saturating-add-sub.ll
615	2 would be better, to prevent the select from folding away.

clubby789 updated this revision to Diff 486285. Jan 4 2023, 7:29 AM

Harbormaster completed remote builds in B205699: Diff 486285. Jan 4 2023, 9:01 AM

nikic added inline comments. Jan 6 2023, 2:43 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
817–837	I believe this should be sufficient now. The rest is handled by the shared bit of code.

clubby789 updated this revision to Diff 486920. Jan 6 2023, 10:06 AM

Harbormaster completed remote builds in B206149: Diff 486920. Jan 6 2023, 11:32 AM

LGTM

If you don't have commit access, can you please share the "Name <email>" to use for the commit?

This revision is now accepted and ready to land. Jan 7 2023, 10:02 AM

In D140798#4033587, @nikic wrote:

If you don't have commit access, can you please share the "Name <email>" to use for the commit?

Thanks for all of the help improving this patch 🙂 I'd like to use "Jamie Hill-Daniel <jamie@hill-daniel.co.uk>"

nikic mentioned this in rG8f4795ef1373: [InstCombine] Add tests for saturating subtract by one (NFC). Jan 9 2023, 5:10 AM

Closed by commit rG6b9317f52a66: [InstCombine] Fold zero check followed by decrement to usub.sat (authored by clubby789, committed by nikic). Jan 9 2023, 5:22 AM

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rG6b9317f52a66: [InstCombine] Fold zero check followed by decrement to usub.sat.

Herald added a subscriber: StephenFan. Jan 9 2023, 5:22 AM

Just as a heads-up, this regresses one of Rust's codegen tests: fn check_foo2 in [issue-45222.rs](https://github.com/rust-lang/rust/blob/master/src/test/codegen/issue-45222.rs) is no longer being constant-folded, AFAICT because IndVarSimplifyPass no longer works on the %a = tail call i64 @llvm.usub.sat.i64(i64 %b, i64 1) generated by this patch (instead of %a = add i64 %b, -1 without this patch).

The full module IR is at https://gist.github.com/TimNN/e9a762026083ff265bdccbbcb34be956 (sorry, not minimized). Running it through opt -O3 without this patch constant-folds the loop; with this patch, the loop remains.

I don't know if this is significant enough to warrant reverting the patch (I'm guessing not). Let me know if you have suggestions on how to proceed. (Is this a problem with the transform or does IndVarSimplifyPass need to become smarter?)

edit: updated the gist with a reduced test case.

Before this patch, CVP converts

define i32 @test() {
entry: 
  br label %loop 
   
loop:
  %iv = phi i32 [ 1000, %entry ], [ %iv.next, %loop ]
  %count = phi i32 [ 0, %entry ], [ %count.next, %loop ] 
  %cmp = icmp eq i32 %iv, 0
  %iv.dec = add i32 %iv, -1 
  %iv.next = select i1 %cmp, i32 0, i32 %iv.dec 
  %count.next = add i32 %count, 1 
  br i1 %cmp, label %exit, label %loop 
   
exit:
  ret i32 %count 
}

into

define i32 @test() {
entry:
  br label %loop

loop:                                             ; preds = %loop, %entry
  %iv = phi i32 [ 1000, %entry ], [ %iv.dec, %loop ]
  %count = phi i32 [ 0, %entry ], [ %count.next, %loop ]
  %cmp = icmp eq i32 %iv, 0
  %iv.dec = add i32 %iv, -1
  %iv.next = select i1 %cmp, i32 0, i32 %iv.dec
  %count.next = add i32 %count, 1
  br i1 %cmp, label %exit, label %loop

exit:                                             ; preds = %loop
  ret i32 %count
}

The difference is that %iv.next in the phi node is replaced with %iv.dec, because the select and branch have the same condition.
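The reasoning can be mirrored in plain C++. The sketch below is a hand-written model of the two IR loops above, not output from any tool; `count_with_select` and `count_with_dec` are hypothetical names. The back edge is only taken when `cmp` is false, and in that case the select yields `iv_dec`, so feeding `iv_dec` into the next iteration directly does not change the result.

```cpp
#include <cassert>
#include <cstdint>

// Loop as written: the back edge feeds the phi with the select result.
inline std::uint32_t count_with_select() {
  std::uint32_t iv = 1000, count = 0;
  for (;;) {
    bool cmp = (iv == 0);                      // icmp eq i32 %iv, 0
    std::uint32_t iv_dec = iv - 1;             // add i32 %iv, -1
    std::uint32_t iv_next = cmp ? 0 : iv_dec;  // select i1 %cmp, 0, %iv.dec
    std::uint32_t count_next = count + 1;
    if (cmp)
      return count;                            // br i1 %cmp, label %exit
    iv = iv_next;                              // back edge
    count = count_next;
  }
}

// Loop after CVP: on the back edge cmp is false, so the phi can take
// iv_dec directly instead of the select result.
inline std::uint32_t count_with_dec() {
  std::uint32_t iv = 1000, count = 0;
  for (;;) {
    bool cmp = (iv == 0);
    std::uint32_t iv_dec = iv - 1;
    std::uint32_t count_next = count + 1;
    if (cmp)
      return count;
    iv = iv_dec;                               // back edge uses %iv.dec
    count = count_next;
  }
}
```

Both functions return the same count, and once the induction variable is a plain decrement the trip count is easy for later loop passes to compute.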

After this patch, we instead have:

define i32 @test() {
entry:
  br label %loop

loop:                                             ; preds = %loop, %entry
  %iv = phi i32 [ 1000, %entry ], [ %iv.next, %loop ]
  %count = phi i32 [ 0, %entry ], [ %count.next, %loop ]
  %cmp = icmp eq i32 %iv, 0
  %iv.next = call i32 @llvm.usub.sat.i32(i32 %iv, i32 1)
  %count.next = add i32 %count, 1
  br i1 %cmp, label %exit, label %loop

exit:                                             ; preds = %loop
  ret i32 %count
}

The same optimization is still possible in principle, in that we could convert the usub.sat into a sub in this case.
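A small model shows why the conversion is valid. This is a sketch under stated assumptions: `usub_sat` is a hand-written stand-in for `llvm.usub.sat.i32`, and `count_with_usub_sat` is a hypothetical name for a C++ rendering of the loop above. On the back edge `%cmp` is known to be false, so `iv` is non-zero and the saturating subtract can never clamp.

```cpp
#include <cassert>
#include <cstdint>

// Hand-written model of llvm.usub.sat.i32: clamp at zero instead of wrapping.
inline std::uint32_t usub_sat(std::uint32_t a, std::uint32_t b) {
  return a > b ? a - b : 0u;
}

// Model of the loop after the patch.
inline std::uint32_t count_with_usub_sat() {
  std::uint32_t iv = 1000, count = 0;
  for (;;) {
    bool cmp = (iv == 0);                     // icmp eq i32 %iv, 0
    std::uint32_t iv_next = usub_sat(iv, 1);  // call @llvm.usub.sat.i32
    std::uint32_t count_next = count + 1;
    if (cmp)
      return count;                           // br i1 %cmp, label %exit
    // Only reached when iv != 0, so the saturating form equals a plain sub.
    assert(iv_next == iv - 1);
    iv = iv_next;
    count = count_next;
  }
}
```

The internal `assert` documents the condition known at the use: it never fires, which is the property a pass would need to establish before replacing the intrinsic with an ordinary subtract.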

@TimNN I've put up https://reviews.llvm.org/D141482 to address this.

nikic mentioned this in rG4f772b095525: [LVI][CVP] Make use of condition known at use. Jan 12 2023, 7:42 AM

alex-t mentioned this in D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call. Mar 29 2023, 6:23 AM

This change caused a regression in the AMDGPU backend.
When the fold runs before inlining, it hides opportunities for other InstCombine folds that may have higher precedence.
As a result, we get suboptimal code.

An example is in https://reviews.llvm.org/D147146, which I created to propose a possible fix.

Diff 487393

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp


  /// Transform patterns such as (a > b) ? a - b : 0 into usub.sat(a, b).
  /// There are 8 commuted/swapped variants of this pattern.
  /// TODO: Also support a - UMIN(a,b) patterns.
  static Value *canonicalizeSaturatedSubtract(const ICmpInst *ICI,
                                              const Value *TrueVal,
                                              const Value *FalseVal,
                                              InstCombiner::BuilderTy &Builder) {
    ICmpInst::Predicate Pred = ICI->getPredicate();
-   if (!ICmpInst::isUnsigned(Pred))
-     return nullptr;
+   Value *A = ICI->getOperand(0);
+   Value *B = ICI->getOperand(1);

    // (b > a) ? 0 : a - b -> (b <= a) ? a - b : 0
+   // (a == 0) ? 0 : a - 1 -> (a != 0) ? a - 1 : 0
    if (match(TrueVal, m_Zero())) {
      Pred = ICmpInst::getInversePredicate(Pred);
      std::swap(TrueVal, FalseVal);
    }

nikic (inline comment, Not Done):

This is not going to trigger, because everything here is inside `Pred == ICmpInst::ICMP_EQ`. You'd want to check for isEquality() first (which accepts eq and ne), then have a check for ne where you invert the predicate and swap true/false, and then do the check for zero on the true value.

I'd suggest adding this test case to make sure the ne pattern is matched:

define i8 @test(i8 %a) {
  %i = icmp ne i8 %a, 0
  call void @use.i1(i1 %i)
  %i1 = add i8 %a, -1
  %i2 = select i1 %i, i8 %i1, i8 0
  ret i8 %i2
}

declare void @use.i1(i1)

    if (!match(FalseVal, m_Zero()))
      return nullptr;

-   Value *A = ICI->getOperand(0);
-   Value *B = ICI->getOperand(1);
+   // ugt 0 is canonicalized to ne 0 and requires special handling
+   // (a != 0) ? a + -1 : 0 -> usub.sat(a, 1)
+   if (Pred == ICmpInst::ICMP_NE) {
+     if (match(B, m_Zero()) && match(TrueVal, m_Add(m_Specific(A), m_AllOnes())))
+       return Builder.CreateBinaryIntrinsic(Intrinsic::usub_sat, A,
+                                            ConstantInt::get(A->getType(), 1));
+     return nullptr;
+   }
+
+   if (!ICmpInst::isUnsigned(Pred))
+     return nullptr;

nikic (inline comment, Not Done): Should be getInversePredicate(), though it doesn't actually matter here (as you are not checking the Pred afterwards anymore).

nikic (inline comment, Not Done): You can match `-1` using `m_AllOnes()`.

nikic (inline comment, Not Done): `m_APInt(Added)` can be replaced with `m_AllOnes()` here. Below you'd create the constant with value `1` then.

nikic (inline comment, Not Done): You could take this piece of code (starting at `if (match(TrueVal`) and move it to the top of the function, because this bit of logic is the same for both cases.

nikic (inline comment, Not Done), suggested change:

  std::swap(TrueVal, FalseVal);
  }
- // (a == 0) : 0 : a - 1 -> usub.sat(a, 1)
- if (ICmpInst::isEquality(Pred)) {
- if (!match(B, m_Zero()))
- return nullptr;
- if (match(FalseVal, m_Zero())) {
- if (Pred == ICmpInst::ICMP_NE) {
- std::swap(TrueVal, FalseVal);
- }
- if (!match(TrueVal, m_Zero()))
- return nullptr;
- if (match(FalseVal, m_Add(m_Specific(A), m_AllOnes()))) {
- // add a, -1
+ // ugt 0 is canonicalized to ne 0 and requires special handling.
+ // (a != 0) ? a + -1 : 0 -> usub.sat(a, 1)
+ if (Pred == ICmpInst::ICMP_NE) {
+ if (match(TrueVal, m_Add(m_Specific(A), m_AllOnes())))
  return Builder.CreateBinaryIntrinsic(Intrinsic::usub_sat, A,
  ConstantInt::get(A->getType(), 1));
- }
  return nullptr;
  }
  if (!ICmpInst::isUnsigned(Pred))

I believe this should be sufficient now. The rest is handled by the shared bit of code.

    if (Pred == ICmpInst::ICMP_ULE || Pred == ICmpInst::ICMP_ULT) {
      // (b < a) ? a - b : 0 -> (a > b) ? a - b : 0
      std::swap(A, B);
      Pred = ICmpInst::getSwappedPredicate(Pred);
    }

    assert((Pred == ICmpInst::ICMP_UGE || Pred == ICmpInst::ICMP_UGT) &&
           "Unexpected isUnsigned predicate!");

    // Ensure the sub is of the form:
    // (a > b) ? a - b : 0 -> usub.sat(a, b)
    // (a > b) ? b - a : 0 -> -usub.sat(a, b)
    // Checking for both a-b and a+(-b) as a constant.
    bool IsNegative = false;
    const APInt *C;

... (27 lines not shown) ... static Value *canonicalizeSaturatedAdd(ICmpInst *Cmp, Value *TVal, Value *FVal,

    // Match unsigned saturated add with constant.
    Value *Cmp0 = Cmp->getOperand(0);
    Value *Cmp1 = Cmp->getOperand(1);
    ICmpInst::Predicate Pred = Cmp->getPredicate();
    Value *X;
    const APInt *C, *CmpC;
    if (Pred == ICmpInst::ICMP_ULT &&
        match(TVal, m_Add(m_Value(X), m_APInt(C))) && X == Cmp0 &&
        match(FVal, m_AllOnes()) && match(Cmp1, m_APInt(CmpC)) && *CmpC == ~*C) {
      // (X u< ~C) ? (X + C) : -1 --> uadd.sat(X, C)
      return Builder.CreateBinaryIntrinsic(
          Intrinsic::uadd_sat, X, ConstantInt::get(X->getType(), *C));
    }

    // Match unsigned saturated add of 2 variables with an unnecessary 'not'.
    // There are 8 commuted variants.
    // Canonicalize -1 (saturated result) to true value of the select.
    if (match(FVal, m_AllOnes())) {
      std::swap(TVal, FVal);
      Pred = CmpInst::getInversePredicate(Pred);
    }
    if (!match(TVal, m_AllOnes()))
      return nullptr;

    // Canonicalize predicate to less-than or less-or-equal-than.
    if (Pred == ICmpInst::ICMP_UGT || Pred == ICmpInst::ICMP_UGE) {
      std::swap(Cmp0, Cmp1);
      Pred = CmpInst::getSwappedPredicate(Pred);
    }
    if (Pred != ICmpInst::ICMP_ULT && Pred != ICmpInst::ICMP_ULE)

nikic (inline comment, Done): Not necessary, icmp constants are canonicalized to the right.

nikic (inline comment, Done): This is incorrect: If we swap select operands, we also need to invert the predicate. What you want to do is check whether the predicate is `ne` and then invert and swap. I believe this can only happen in multi-use scenarios, e.g.: https://llvm.godbolt.org/z/5Pxo3f8WE

nikic (inline comment, Done): It probably makes more sense to use m_SpecificInt here. While this CreateNeg call will not actually create an instruction, it looks like it could.

nikic (inline comment, Done): Limited to just the constant case, I don't think we need to handle sub at all, it will always be canonicalized to add.


llvm/test/Transforms/InstCombine/saturating-add-sub.ll

  ;
  %x1 = call i8 @llvm.usub.sat.i8(i8 %a, i8 10)
  %x2 = call i8 @llvm.usub.sat.i8(i8 %x1, i8 20)
  ret i8 %x2
  }

  ; Can simplify zero check followed by decrement
  define i8 @test_simplify_decrement(i8 %a) {
  ; CHECK-LABEL: @test_simplify_decrement(
- ; CHECK-NEXT: [[I:%.*]] = icmp eq i8 [[A:%.*]], 0
- ; CHECK-NEXT: [[I1:%.*]] = add i8 [[A]], -1
- ; CHECK-NEXT: [[I2:%.*]] = select i1 [[I]], i8 0, i8 [[I1]]
+ ; CHECK-NEXT: [[I2:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A:%.*]], i8 1)
  ; CHECK-NEXT: ret i8 [[I2]]
  ;
  %i = icmp eq i8 %a, 0
  %i1 = sub i8 %a, 1
  %i2 = select i1 %i, i8 0, i8 %i1
  ret i8 %i2
  }

nikic (inline comment, Not Done): Especially if this is implemented as a separate fold, we'd want some negative test cases here, e.g. incorrect constants (vary zero/one), incorrect operands (not %a both times), incorrect icmp predicate. As you are using `m_Zero()` matchers, which allow undef/poison in vectors, we'd want to test that as well.

clubby789 (author, inline comment, Not Done): Sorry, I'm not quite sure how to test the vector case you mentioned.

  declare void @use.i1(i1)

  define i8 @test_simplify_decrement_ne(i8 %a) {
  ; CHECK-LABEL: @test_simplify_decrement_ne(
  ; CHECK-NEXT: [[I:%.*]] = icmp ne i8 [[A:%.*]], 0
  ; CHECK-NEXT: call void @use.i1(i1 [[I]])
- ; CHECK-NEXT: [[I1:%.*]] = add i8 [[A]], -1
- ; CHECK-NEXT: [[I2:%.*]] = select i1 [[I]], i8 [[I1]], i8 0
+ ; CHECK-NEXT: [[I2:%.*]] = call i8 @llvm.usub.sat.i8(i8 [[A]], i8 1)
  ; CHECK-NEXT: ret i8 [[I2]]
  ;
  %i = icmp ne i8 %a, 0
  call void @use.i1(i1 %i)
  %i1 = add i8 %a, -1
  %i2 = select i1 %i, i8 %i1, i8 0
  ret i8 %i2
  }

  define <2 x i8> @test_simplify_decrement_vec(<2 x i8> %a) {
  ; CHECK-LABEL: @test_simplify_decrement_vec(
- ; CHECK-NEXT: [[I:%.*]] = icmp eq <2 x i8> [[A:%.*]], zeroinitializer
- ; CHECK-NEXT: [[I1:%.*]] = add <2 x i8> [[A]], <i8 -1, i8 -1>
- ; CHECK-NEXT: [[I2:%.*]] = select <2 x i1> [[I]], <2 x i8> zeroinitializer, <2 x i8> [[I1]]
+ ; CHECK-NEXT: [[I2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 1, i8 1>)
  ; CHECK-NEXT: ret <2 x i8> [[I2]]
  ;
  %i = icmp eq <2 x i8> %a, <i8 0, i8 0>
  %i1 = sub <2 x i8> %a, <i8 1, i8 1>
  %i2 = select <2 x i1> %i, <2 x i8> <i8 0, i8 0>, <2 x i8> %i1
  ret <2 x i8> %i2
  }

  define <2 x i8> @test_simplify_decrement_vec_undef(<2 x i8> %a) {
  ; CHECK-LABEL: @test_simplify_decrement_vec_undef(
- ; CHECK-NEXT: [[I:%.*]] = icmp eq <2 x i8> [[A:%.*]], zeroinitializer
- ; CHECK-NEXT: [[I1:%.*]] = add <2 x i8> [[A]], <i8 -1, i8 -1>
- ; CHECK-NEXT: [[I2:%.*]] = select <2 x i1> [[I]], <2 x i8> <i8 0, i8 undef>, <2 x i8> [[I1]]
+ ; CHECK-NEXT: [[I2:%.*]] = call <2 x i8> @llvm.usub.sat.v2i8(<2 x i8> [[A:%.*]], <2 x i8> <i8 1, i8 1>)
  ; CHECK-NEXT: ret <2 x i8> [[I2]]
  ;
  %i = icmp eq <2 x i8> %a, <i8 0, i8 0>
  %i1 = sub <2 x i8> %a, <i8 1, i8 1>
  %i2 = select <2 x i1> %i, <2 x i8> <i8 0, i8 undef>, <2 x i8> %i1
  ret <2 x i8> %i2
  }

... (34 lines not shown) ...

  ;
  %i2 = select i1 %i, i8 0, i8 %i1
  ret i8 %i2
  }

  define i8 @test_invalid_simplify_select_1(i8 %a) {
  ; CHECK-LABEL: @test_invalid_simplify_select_1(
  ; CHECK-NEXT: [[I:%.*]] = icmp eq i8 [[A:%.*]], 0
  ; CHECK-NEXT: [[I1:%.*]] = add i8 [[A]], -1
  ; CHECK-NEXT: [[I2:%.*]] = select i1 [[I]], i8 1, i8 [[I1]]
  ; CHECK-NEXT: ret i8 [[I2]]
  ;
  %i = icmp eq i8 %a, 0
  %i1 = sub i8 %a, 1
  %i2 = select i1 %i, i8 1, i8 %i1
  ret i8 %i2
  }

nikic (inline comment on the `select ... i8 1` CHECK line, Not Done): 2 would be better, to prevent the select from folding away.

llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll

  ;
  %cmp = icmp ugt i32 %a, 1
  %sub = add i32 %a, -1
  %sel = select i1 %cmp, i32 %sub ,i32 0
  ret i32 %sel
  }

  define i32 @max_sub_ugt_c01(i32 %a) {
  ; CHECK-LABEL: @max_sub_ugt_c01(
- ; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[A:%.*]], 0
- ; CHECK-NEXT: [[SUB:%.*]] = add i32 [[A]], -1
- ; CHECK-NEXT: [[SEL:%.*]] = select i1 [[CMP]], i32 0, i32 [[SUB]]
+ ; CHECK-NEXT: [[SEL:%.*]] = call i32 @llvm.usub.sat.i32(i32 [[A:%.*]], i32 1)
  ; CHECK-NEXT: ret i32 [[SEL]]
  ;
  %cmp = icmp ugt i32 %a, 0
  %sub = add i32 %a, -1
  %sel = select i1 %cmp, i32 %sub ,i32 0
  ret i32 %sel
  }

llvm/test/Transforms/PhaseOrdering/pr44461-br-to-switch-rotate.ll

... (13 lines not shown) ...

  ; CHECK-NEXT: [[COUNT_1:%.*]] = phi i64 [ 0, [[START]] ], [ [[TMP4:%.*]], [[BB3_I_I]] ]
  ; CHECK-NEXT: switch i8 [[ITER1_SROA_9_0]], label [[BB12:%.*]] [
  ; CHECK-NEXT: i8 2, label [[BB3_I_I]]
  ; CHECK-NEXT: i8 0, label [[BB3_I_I]]
  ; CHECK-NEXT: ]
  ; CHECK: bb3.i.i:
  ; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[ITER1_SROA_5_0]], 0
  ; CHECK-NEXT: [[TMP3]] = zext i1 [[TMP2]] to i8
- ; CHECK-NEXT: [[_5_0_I_I_I_I:%.*]] = add i64 [[ITER1_SROA_5_0]], -1
- ; CHECK-NEXT: [[SPEC_SELECT]] = select i1 [[TMP2]], i64 0, i64 [[_5_0_I_I_I_I]]
+ ; CHECK-NEXT: [[SPEC_SELECT]] = tail call i64 @llvm.usub.sat.i64(i64 [[ITER1_SROA_5_0]], i64 1)
  ; CHECK-NEXT: [[TMP4]] = add i64 [[COUNT_1]], [[ITER1_SROA_5_0]]
  ; CHECK-NEXT: br label [[BB10]]
  ; CHECK: bb12:
  ; CHECK-NEXT: ret i64 [[COUNT_1]]
  ;
  start:
  br label %bb10

... (21 lines not shown) ...

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold zero check followed by decrement to usub.sat
ClosedPublic

