This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] use UADDO to optimize saturated unsigned add
ClosedPublic

Authored by spatel on Sep 11 2018, 7:02 AM.

Details

Summary

This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use those flags to eliminate an explicit compare instruction (or another flag-setting instruction such as cmn) that feeds the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant:
https://rise4fun.com/Alive/V1Q

But we're not forming 'uaddo' in those cases via DAG transforms; that happens in CGP (after D8889) without checking target lowering to see if the op is supported. That's why AArch64 only shows diffs for i32/i64. The existing codegen for the CGP-altered i8/i16 tests suggests that AArch64 could do better by marking those types as 'custom', but that's outside the scope of this patch.
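
For illustration, here is a minimal C++ version of the pattern in question (my sketch, not a test from the patch). After IR canonicalization, the compare is against the inverted constant (the "cmp_notval" form), and the goal is to fold that compare into the flag-setting add:

#include <cstdint>

// Unsigned saturated add of a constant. The compare 'x > UINT32_MAX - 42'
// is the canonicalized form (icmp ugt %x, -43): it is true exactly when
// x + 42 would wrap, so we select the saturation value -1 in that case.
uint32_t sat_add_const(uint32_t x) {
  uint32_t sum = x + 42;
  return x > UINT32_MAX - 42 ? UINT32_MAX : sum;
}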

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision. Sep 11 2018, 7:02 AM

That's why AArch64 only shows diffs for i32/i64.

Reading that again, that's likely to confuse you. :) More detailed explanation:

AArch64 shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only see AArch64 diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86, which sees improvements for all sizes because all sizes are 'custom'). But the AArch64 code (like x86) looks better when translated to 'uaddo' in all cases. So someone who works on AArch64 may want to mark i8/i16 as 'custom' for UADDO, so that this patch fires on those tests.
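
To sketch the gating described here (an illustrative fragment in the style of DAGCombiner, assuming N and TLI are in scope as usual; this is not the exact patch code):

// Only form the UADDO node if the target has declared it legal or custom
// for this value type; otherwise leave the compare+select pattern alone.
EVT VT = N->getValueType(0);
if (!TLI.isOperationLegalOrCustom(ISD::UADDO, VT))
  return SDValue();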

Another possibility given the existing behavior: we could remove the legal-or-custom check altogether, on the assumption that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me: if the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. Similar reasoning explains why we don't canonicalize IR to the sibling overflow intrinsic (llvm.uadd.with.overflow) in the first place.

lebedev.ri added inline comments. Sep 12 2018, 9:13 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7312–7323 (On Diff #164859)

Seems like some of this can be a preparatory NFC commit?

7332–7333 (On Diff #164859)

I feel like this needs a comment with an IR example...

spatel updated this revision to Diff 165176. Sep 12 2018, 4:29 PM
spatel marked 2 inline comments as done.

Patch updated:

  1. Removed the NFC formatting changes; committed separately as rL342095.
  2. Added an IR example to the code comment. This transform has a lot of predicates, and there's no direct twin in IR, so I'm not sure this actually makes it better...but let me know. :)

Do we somehow enforce that in %r = select i1 %c, i32 -1, i32 %a, the -1 is in the middle?
If not, we miss at least one case, I think:

$ ~/src/alive/alive.py /tmp/test.txt 
----------------------------------------
Optimization: 1
Precondition: true
  %a = add i32 %x, 42
  %c = icmp ugt i32 %x, -43
  %r = select i1 %c, i32 -1, i32 %a
=>
  %c2 = icmp ult i32 %x, -43
  %r = select i1 %c2, i32 %a, i32 -1

Done: 1
Optimization is correct!

I thought there was one more case with an inverted/negated 42 / -43, but I can't find it right now.
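
One way a matcher could avoid caring about the operand order is to normalize commuted selects up front. A hypothetical fragment (TVal/FVal for the select arms and CC for the setcc predicate are illustrative names, not the patch's actual code):

// Canonicalize so the all-ones constant is the true arm of the select:
// select (setcc ...), %a, -1 --> select (inverted setcc ...), -1, %a
if (isAllOnesConstant(FVal)) {
  std::swap(TVal, FVal);
  CC = ISD::getSetCCInverse(CC, /*isInteger=*/true);
}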

spatel added a subscriber: nlopes. Sep 21 2018, 10:09 AM

Do we somehow enforce that in %r = select i1 %c, i32 -1, i32 %a, the -1 is in the middle?
If not, we miss at least one case, I think:

That's correct. I think there are also 4 swapped variants for the pattern with a variable and a 'not' op (online version of Alive looks dead? cc'ing @nlopes):

Optimization: swap1
Precondition: true
  %noty = xor i32 %y, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %x, %noty
  %r = select i1 %c, i32 -1, i32 %a
=>
  %c2 = icmp ugt i32 %noty, %x
  %r = select i1 %c2, i32 %a, i32 -1

Done: 1
Optimization is correct!

----------------------------------------
Optimization: swap2
Precondition: true
  %noty = xor i32 %y, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %x, %noty
  %r = select i1 %c, i32 -1, i32 %a
=>
  %c2 = icmp ult i32 %x, %noty
  %r = select i1 %c2, i32 %a, i32 -1

Done: 1
Optimization is correct!

----------------------------------------
Optimization: swap3
Precondition: true
  %noty = xor i32 %y, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %x, %noty
  %r = select i1 %c, i32 -1, i32 %a
=>
  %c2 = icmp ult i32 %noty, %x
  %r = select i1 %c2, i32 -1, i32 %a

Done: 1
Optimization is correct!

My plan is to canonicalize all of the patterns in IR. If I'm seeing it correctly, there shouldn't be anything blocking those canonicalizations because we only try to form the uaddo here when the cmp (setcc) has one use (the select). That should let us get by with just the basic matching here in the backend. For example, the pattern with a variable should never reach the backend with a 'not' op because we can always shrink it in IR:

%noty = xor i8 %y, -1
%a = add i8 %x, %y
%c = icmp ugt i8 %x, %noty
%r = select i1 %c, i8 -1, i8 %a

=>

%a = add i8 %x, %y
%c = icmp ugt i8 %x, %a
%r = select i1 %c, i8 -1, i8 %a

lebedev.ri accepted this revision. Sep 21 2018, 10:41 AM

Do we somehow enforce that in %r = select i1 %c, i32 -1, i32 %a, the -1 is in the middle?
If not, we miss at least one case, I think:

That's correct. I think there are also 4 swapped variants for the pattern with a variable and a 'not' op (online version of Alive looks dead? cc'ing @nlopes):
...

My plan is to canonicalize all of the patterns in IR. If I'm seeing it correctly, there shouldn't be anything blocking those canonicalizations because we only try to form the uaddo here when the cmp (setcc) has one use (the select). That should let us get by with just the basic matching here in the backend.

Ok, sounds good. At worst, this can be extended.
I'm not sure why there is no x86 test coverage.
Won't ADC (https://www.felixcloutier.com/x86/ADC.html) work for this purpose?
And per the ADD description (https://www.felixcloutier.com/x86/ADD.html#description), ADD "evaluates the result for both signed and unsigned integer operands and sets the OF and CF flags to indicate a carry (overflow) in the signed or unsigned result, respectively", too.
Some plumbing missing?

I think this looks good, but maybe wait a bit just in case someone else wants to comment.

This revision is now accepted and ready to land. Sep 21 2018, 10:41 AM

Thanks! I'm not following the 'adc' suggestion though. That uses the carry from a previous op as an input to the addition. What would that code sequence look like?

Uhm, after thinking a bit more, I think we'd only care about ADD? I haven't really thought about it, though.
But in the ADC case, I guess the same as with ADD, but prefixed with CLC (https://www.felixcloutier.com/x86/CLC.html), so there's no point in using ADC in the first place.

For completeness' sake, I was *thinking* of something like: https://godbolt.org/z/uoShW6
But it is totally possible that it makes no sense :)

Yes, that's the kind of codegen we're trying for; see the x86 test unsigned_sat_constant_i32_using_cmp_notval().
But I think you have overflow and carry mixed up:
https://stackoverflow.com/questions/19301498/carry-flag-auxiliary-flag-and-overflow-flag-in-assembly
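
To make the distinction concrete, here is a sketch using the GCC/Clang checked-arithmetic builtin (not code from the patch):

#include <cstdint>

// With unsigned operands the builtin reports wraparound, which is what the
// hardware carry flag (CF) tracks; with signed operands it reports what the
// overflow flag (OF) tracks. The saturated unsigned add only needs CF.
bool carries(uint32_t x, uint32_t y) {
  uint32_t s;
  return __builtin_add_overflow(x, y, &s); // unsigned wrap -> CF
}

bool overflows(int32_t x, int32_t y) {
  int32_t s;
  return __builtin_add_overflow(x, y, &s); // signed overflow -> OF
}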

Ignoring all of the missing canonicalizations for the moment, this patch will give us the same output as gcc:
https://godbolt.org/z/dYnXhf

(cmovnc is the same as cmovae)
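
Schematically, for the "cmp_sum" flavor, the codegen we're aiming for looks like this (a sketch; exact registers and operand order may differ):

#include <cstdint>

// The sum of an unsigned add is smaller than an operand exactly when the
// add wrapped, so comparing the sum against x detects the carry.
// Expected shape of the x86 output:
//   addl    $42, %edi      ; flag-setting add: CF=1 iff the sum wrapped
//   movl    $-1, %eax      ; preload the saturation value
//   cmovael %edi, %eax     ; keep the sum if no carry (cmovae == cmovnc)
uint32_t sat_add_42(uint32_t x) {
  uint32_t sum = x + 42;
  return sum < x ? UINT32_MAX : sum;
}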

This revision was automatically updated to reflect the committed changes.