This is an archive of the discontinued LLVM Phabricator instance.

[EarlyCSE] recognize commuted and swapped variants of min/max as equivalent (PR35642)
Closed, Public

Authored by spatel on Dec 12 2017, 2:37 PM.


Details

Reviewers

hfinkel
efriedma
gberry

Commits

rG558a465473bd: [EarlyCSE] recognize swapped variants of abs/nabs as equivalent
rG3c7a35de7fbc: [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent…
rL320653: [EarlyCSE] recognize swapped variants of abs/nabs as equivalent
rL320640: [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent…

Summary

As shown in:
https://bugs.llvm.org/show_bug.cgi?id=35642
...we can have different forms of min/max, so we should recognize those here in EarlyCSE similar to how we already handle binops and compares that can commute.
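To make "different forms of min/max" concrete, here is a small standalone C++ sketch. It is not LLVM code, and the function names are invented for illustration. It spells signed min three ways, matching the compare+select shapes exercised in the tests (canonical, commuted operands, swapped predicate), and checks that they agree on every pair of i8 inputs.

```cpp
#include <cassert>
#include <cstdint>

// Three spellings of signed min. Names are illustrative only.
// select (icmp slt a, b), a, b
static int8_t sminCanonical(int8_t a, int8_t b) { return a < b ? a : b; }
// select (icmp slt b, a), b, a   (operands commuted in both cmp and select)
static int8_t sminCommuted(int8_t a, int8_t b) { return b < a ? b : a; }
// select (icmp sgt a, b), b, a   (predicate and select operands swapped)
static int8_t sminSwapped(int8_t a, int8_t b) { return a > b ? b : a; }

// Returns true if all three forms agree on every pair of i8 inputs.
static bool allSMinFormsAgree() {
  for (int a = -128; a <= 127; ++a) {
    for (int b = -128; b <= 127; ++b) {
      int8_t x = static_cast<int8_t>(a);
      int8_t y = static_cast<int8_t>(b);
      if (sminCanonical(x, y) != sminCommuted(x, y) ||
          sminCanonical(x, y) != sminSwapped(x, y))
        return false;
    }
  }
  return true;
}
```

Since the three forms always produce the same value, a CSE pass can safely treat them as one expression, provided its hash and equality functions see through the operand order and the predicate.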

Event Timeline

spatel created this revision. Dec 12 2017, 2:37 PM

Herald added a subscriber: mcrosier. Dec 12 2017, 2:37 PM

Could you use matchSelectPattern (from ValueTracking) here instead of matching the individual min/max patterns? That will also give us absolute value and the floating-point min/max as well. I think that you can hash_combine the fields of SelectPatternResult, and the logic might be simpler overall too (hopefully).

fhahn added a subscriber: fhahn. Dec 12 2017, 8:25 PM

In D41136#953249, @hfinkel wrote:

Could you use matchSelectPattern (from ValueTracking) here instead of matching the individual min/max patterns? That will also give us absolute value and the floating-point min/max as well. I think that you can hash_combine the fields of SelectPatternResult, and the logic might be simpler overall too (hopefully).

Yes, that's better. I was purposely postponing FP in this patch because it's not clear to me what we need to handle in those cases (intrinsics and/or IR patterns). But either way, using value tracking should simplify the code here.

Patch updated:
Use ValueTracking's matchSelectPattern() rather than pattern matchers for min/max. This:

Makes the patch smaller
Catches more cases (extra test added)
Allows follow-up patches to enable abs() and FP min/max more easily
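As a rough illustration of why hashing a matched pattern is simpler than matching each min/max form by hand, here is a toy C++ sketch. Everything in it is a stand-in: `Flavor`, `MinMax`, `hashMinMax`, and `equalMinMax` are hypothetical names, operands are modelled as integer ids, and the hash mixing is arbitrary. It is not the code in this patch and does not use LLVM's `SelectPatternResult` or `hash_combine`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Toy stand-ins, NOT LLVM's types. An operand is just an integer id here.
enum class Flavor { Unknown, SMin, SMax, UMin, UMax };

struct MinMax {
  Flavor flavor;          // what the pattern computes, independent of predicate
  std::uintptr_t lhs, rhs;
};

// Put the operands in a fixed order before hashing so that min(a, b) and
// min(b, a) deliberately land in the same bucket.
static std::size_t hashMinMax(MinMax m) {
  if (m.lhs > m.rhs)
    std::swap(m.lhs, m.rhs);
  std::size_t h = std::hash<int>{}(static_cast<int>(m.flavor));
  h = h * 31 + std::hash<std::uintptr_t>{}(m.lhs);
  h = h * 31 + std::hash<std::uintptr_t>{}(m.rhs);
  return h;
}

// Equality has to accept exactly the variations that the hash ignores.
static bool equalMinMax(const MinMax &x, const MinMax &y) {
  if (x.flavor != y.flavor || x.flavor == Flavor::Unknown)
    return false;
  return (x.lhs == y.lhs && x.rhs == y.rhs) ||
         (x.lhs == y.rhs && x.rhs == y.lhs);
}
```

The invariant a hash-table-based CSE relies on is that any two keys that compare equal also hash equal; canonicalizing the operand order inside the hash is one simple way to get that.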

hfinkel accepted this revision. Dec 13 2017, 12:52 PM

hfinkel added inline comments.

lib/Transforms/Scalar/EarlyCSE.cpp
150	Is there anything to do to address this TODO except for removing this check, the corresponding check below, and, to be good, adding a few test cases? I don't want this artificially restricted, but I'm fine enabling the others in a follow-up commit (so that we can revert separately if necessary). This LGTM, but please do take care of the other cases in a follow-up commit, or add a comment explaining why that's non-trivial.

This revision is now accepted and ready to land. Dec 13 2017, 12:52 PM

spatel added inline comments. Dec 13 2017, 1:37 PM

lib/Transforms/Scalar/EarlyCSE.cpp
150	For abs, it should just be fixing the check and adding tests. I need to look closer at which FP variants are commutable. But yes, I'll do both of those after this patch.

Closed by commit rL320640: [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent… (authored by spatel). Dec 13 2017, 1:59 PM

This revision was automatically updated to reflect the committed changes.

hfinkel added inline comments. Dec 13 2017, 3:59 PM

lib/Transforms/Scalar/EarlyCSE.cpp
150	> I need to look closer at which FP variants are commutable. But yes, I'll do both of those after this patch.

Ah, I should have mentioned. I looked through the code before I made the suggestion. I think that you only need to exclude SPF_UNKNOWN. It looks like all of the others commute because matchSelectPattern specifically excludes matching floating-point forms that don't commute (it specifically excludes cases where signed-zeros could cause problems, NaNs might cause problems, etc.).

spatel added inline comments. Dec 13 2017, 4:03 PM

lib/Transforms/Scalar/EarlyCSE.cpp
150	Great - I was coming to that conclusion too now that I'm reading the code. Just need to come up with N test cases (and probably a pile of negative tests too). :) I think we will need to handle the 'num' intrinsics separately (but similarly):

define double @maxnum_intrinsic(double %x, double %y) {
  %m1 = call double @llvm.maxnum.f64(double %x, double %y)
  %m2 = call double @llvm.maxnum.f64(double %y, double %x)
  %r = fadd double %m1, %m2
  ret double %r
}

hfinkel added inline comments. Dec 13 2017, 4:09 PM

lib/Transforms/Scalar/EarlyCSE.cpp
150	Cool. Regarding [max|min]num, I agree.
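For intuition on why the two `llvm.maxnum` calls with commuted operands should be treated as one value, here is a standalone C++ sketch that uses `std::fmax` as a rough analogue of the intrinsic. That correspondence is an assumption made for illustration, and `fmaxCommutes` and `kNaN` are made-up names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

static const double kNaN = std::numeric_limits<double>::quiet_NaN();

// std::fmax stands in for llvm.maxnum here; this is only an analogue.
// Returns true if swapping the operands yields the same result.
static bool fmaxCommutes(double x, double y) {
  double m1 = std::fmax(x, y);
  double m2 = std::fmax(y, x);
  // NaN never compares equal to itself, so handle it explicitly.
  if (std::isnan(m1) || std::isnan(m2))
    return std::isnan(m1) && std::isnan(m2);
  return m1 == m2; // note: == does not distinguish +0.0 from -0.0
}
```

The comparison deliberately ignores the sign of zero, which is the part of the floating-point story that needs separate care.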

On closer inspection, I think matchSelectPattern is either not sufficient or broken because it doesn't handle -0.0 as we would require here:

declare void @self_destruct_if_neg_zero(double)  
define double @fmin_any_ordered_commute(double %x, double %y) {
  %cmp1 = fcmp nnan olt double %x, %y                            ; if x=+0.0 and y=-0.0, returns false
  %cmp2 = fcmp nnan olt double %y, %x                            ; if x=+0.0 and y=-0.0, returns false
  %neg_zero_if_false = select i1 %cmp1, double %x, double %y     ; returns -0.0
  %pos_zero_if_false = select i1 %cmp2, double %y, double %x     ; returns +0.0
  call void @self_destruct_if_neg_zero(double %pos_zero_if_false)
  ret double %neg_zero_if_false
}

If we use what is returned by matchSelectPattern ( {SPF_FMINNUM, SPNB_RETURNS_ANY, false} ), we get:

$ ./opt -early-cse fminmax.ll -S

define double @fmin_any_ordered_commute(double %x, double %y) {
  %cmp1 = fcmp nnan olt double %x, %y
  %cmp2 = fcmp nnan olt double %y, %x
  %neg_zero_if_false = select i1 %cmp1, double %x, double %y
  call void @self_destruct_if_neg_zero(double %neg_zero_if_false)  ; boom!
  ret double %neg_zero_if_false
}
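The same effect can be reproduced outside of LLVM. The following standalone C++ sketch (function names are invented for illustration) writes the two select shapes from the IR above as ternaries and shows that, for x = +0.0 and y = -0.0, they return zeros of opposite sign even though the results compare equal.

```cpp
#include <cassert>
#include <cmath>

// select (fcmp olt x, y), x, y   -- the shape of %neg_zero_if_false
static double fminForm1(double x, double y) { return x < y ? x : y; }

// select (fcmp olt y, x), y, x   -- the shape of %pos_zero_if_false
static double fminForm2(double x, double y) { return y < x ? y : x; }

// True if the two forms return bit-for-bit different zeros for (+0.0, -0.0).
static bool formsDisagreeOnSignedZero() {
  double r1 = fminForm1(0.0, -0.0); // compare is false, so this picks y
  double r2 = fminForm2(0.0, -0.0); // compare is false, so this picks x
  return std::signbit(r1) != std::signbit(r2);
}
```

Any consumer that can observe the sign bit (a division, `copysign`, or the hypothetical `@self_destruct_if_neg_zero` above) can therefore tell the two forms apart, so replacing one with the other is only valid when signed zeros do not matter.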

In D41136#955455, @spatel wrote:

On closer inspection, I think matchSelectPattern is either not sufficient or broken because it doesn't handle -0.0 as we would require here:

declare void @self_destruct_if_neg_zero(double)  
define double @fmin_any_ordered_commute(double %x, double %y) {
  %cmp1 = fcmp nnan olt double %x, %y                            ; if x=+0.0 and y=-0.0, returns false
  %cmp2 = fcmp nnan olt double %y, %x                            ; if x=+0.0 and y=-0.0, returns false
  %neg_zero_if_false = select i1 %cmp1, double %x, double %y     ; returns -0.0
  %pos_zero_if_false = select i1 %cmp2, double %y, double %x     ; returns +0.0
  call void @self_destruct_if_neg_zero(double %pos_zero_if_false)
  ret double %neg_zero_if_false
}

If we use what is returned by matchSelectPattern ( {SPF_FMINNUM, SPNB_RETURNS_ANY, false} ), we get:

$ ./opt -early-cse fminmax.ll -S

define double @fmin_any_ordered_commute(double %x, double %y) {
  %cmp1 = fcmp nnan olt double %x, %y
  %cmp2 = fcmp nnan olt double %y, %x
  %neg_zero_if_false = select i1 %cmp1, double %x, double %y
  call void @self_destruct_if_neg_zero(double %neg_zero_if_false)  ; boom!
  ret double %neg_zero_if_false
}

This seems inconsistent with the intent of matchSelectPattern, at least based on the comment:

// If the predicate is an "or-equal"  (FP) predicate, then signed zeroes may
// return inconsistent results between implementations.
//   (0.0 <= -0.0) ? 0.0 : -0.0 // Returns 0.0
//   minNum(0.0, -0.0)          // May return -0.0 or 0.0 (IEEE 754-2008 5.3.1)
// Therefore we behave conservatively and only proceed if at least one of the
// operands is known to not be zero, or if we don't care about signed zeroes.
switch (Pred) {
default: break;
case CmpInst::FCMP_OGE: case CmpInst::FCMP_OLE:
case CmpInst::FCMP_UGE: case CmpInst::FCMP_ULE:
  if (!FMF.noSignedZeros() && !isKnownNonZero(CmpLHS) &&
      !isKnownNonZero(CmpRHS))
    return {SPF_UNKNOWN, SPNB_NA, false};
}

It seems like there are cases missing here (e.g., this does not apply only to the GE/LE predicates)? Based on your example, it seems like we have the same problem with the GT/LT predicates.

In D41136#955627, @hfinkel wrote:

// If the predicate is an "or-equal"  (FP) predicate, then signed zeroes may
// return inconsistent results between implementations.
//   (0.0 <= -0.0) ? 0.0 : -0.0 // Returns 0.0
//   minNum(0.0, -0.0)          // May return -0.0 or 0.0 (IEEE 754-2008 5.3.1)
// Therefore we behave conservatively and only proceed if at least one of the
// operands is known to not be zero, or if we don't care about signed zeroes.
switch (Pred) {
default: break;
case CmpInst::FCMP_OGE: case CmpInst::FCMP_OLE:
case CmpInst::FCMP_UGE: case CmpInst::FCMP_ULE:
  if (!FMF.noSignedZeros() && !isKnownNonZero(CmpLHS) &&
      !isKnownNonZero(CmpRHS))
    return {SPF_UNKNOWN, SPNB_NA, false};
}

It seems like there are cases missing here (e.g., this does not apply only to the GE/LE predicates)? Based on your example, it seems like we have the same problem with the GT/LT predicates.

Yep - that's the quick fix; add the other preds. A better solution might be to distinguish signed zero behavior in the same way that we're doing for NAN. But let me try the easy patch first and see if anything breaks. :)
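A toy model of that quick fix, written as standalone C++ rather than as a change to ValueTracking: `FPred` and `signedZeroSafe` are hypothetical names, and the boolean parameters stand in for `FMF.noSignedZeros()` and the two `isKnownNonZero` queries in the quoted snippet. The only difference from the quoted guard is that the strict predicates are listed as well.

```cpp
#include <cassert>

// Illustrative predicate enum; it does not mirror CmpInst::Predicate exactly.
enum class FPred { OGT, OGE, OLT, OLE, UGT, UGE, ULT, ULE, Other };

// Returns true if a cmp+select using this predicate may still be treated as
// fmin/fmax when the sign of zero is observable.
static bool signedZeroSafe(FPred p, bool noSignedZeros, bool lhsKnownNonZero,
                           bool rhsKnownNonZero) {
  switch (p) {
  // Predicates covered by the existing guard.
  case FPred::OGE: case FPred::OLE: case FPred::UGE: case FPred::ULE:
  // The "quick fix": apply the same guard to the strict predicates.
  case FPred::OGT: case FPred::OLT: case FPred::UGT: case FPred::ULT:
    return noSignedZeros || lhsKnownNonZero || rhsKnownNonZero;
  default:
    return true;
  }
}
```

With this guard, the `fcmp nnan olt` example would no longer be matched unless `nsz` is set or one operand is known to be non-zero.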

In D41136#955784, @spatel wrote:

It seems like there are cases missing here (e.g., this does not apply only to the GE/LE predicates)? Based on your example, it seems like we have the same problem with the GT/LT predicates.

Yep - that's the quick fix; add the other preds. A better solution might be to distinguish signed zero behavior in the same way that we're doing for NAN. But let me try the easy patch first and see if anything breaks. :)

Unfortunately, we would regress this test:

define i8 @t9(float %a) {
  %t1 = fcmp ult float %a, 0.0
  %t2 = fptosi float %a to i8
  %t3 = select i1 %t1, i8 %t2, i8 0
  ret i8 %t3
}

Currently, we recognize that as fmin with a cast and canonicalize to:

define i8 @t9(float %a) {
  %.inv = fcmp oge float %a, 0.0
  %t31 = select i1 %.inv, float 0.0, float %a
  %1 = fptosi float %t31 to i8
  ret i8 %1
}

We could set the 'nsz' bit that we're sending to the matcher to avoid the regression in the case of a min/max with cast.
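The reason ignoring signed zeros is harmless in the cast case can be seen in a standalone C++ sketch (function names are made up; inputs are assumed to be non-NaN and in range for i8, since the cast is otherwise undefined). Converting +0.0 or -0.0 to an integer gives 0 either way, so the sign of a zero produced by the min/max cannot reach the result.

```cpp
#include <cassert>
#include <cstdint>

// Shape of the original @t9: select (a < 0.0), fptosi(a), 0
static int8_t castThenSelect(float a) {
  return a < 0.0f ? static_cast<int8_t>(a) : int8_t{0};
}

// Shape of the canonicalized @t9: fptosi(select (a >= 0.0), 0.0, a)
static int8_t selectThenCast(float a) {
  float m = a >= 0.0f ? 0.0f : a;
  return static_cast<int8_t>(m);
}
```

For example, both functions return 0 for -0.0f and -3 for -3.7f.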

spatel mentioned this in D41333: [ValueTracking] ignore FP signed-zero when detecting a casted-to-integer fmin/fmax pattern. Dec 17 2017, 11:09 AM

spatel mentioned this in rL321456: [ValueTracking] ignore FP signed-zero when detecting a casted-to-integer…. Dec 26 2017, 7:10 AM

spatel mentioned this in D41603: [InstCombine] fold min/max tree with common operand (PR35717). Jan 8 2018, 7:00 AM

Revision Contents

Path	Size
lib/Transforms/Scalar/EarlyCSE.cpp	41 lines
test/Transforms/EarlyCSE/commute.ll	30 lines

Diff 126629

lib/Transforms/Scalar/EarlyCSE.cpp

[... 136 lines not shown ...]
   if (CmpInst *CI = dyn_cast<CmpInst>(Inst)) {
     CmpInst::Predicate Pred = CI->getPredicate();
     if (Inst->getOperand(0) > Inst->getOperand(1)) {
       std::swap(LHS, RHS);
       Pred = CI->getSwappedPredicate();
     }
     return hash_combine(Inst->getOpcode(), Pred, LHS, RHS);
   }

+  // Hash min/max (cmp + select) to allow for both commuted operands and
+  // non-canonical compare predicate (eg, the compare for smin may use 'sgt'
+  // rather than 'slt').
+  Value *A, *B;
+  if (match(Inst, m_SMin(m_Value(A), m_Value(B)))) {
+    if (A > B)
+      std::swap(A, B);
+    return hash_combine(Inst->getOpcode(), ICmpInst::ICMP_SLT, A, B);
+  }
+  if (match(Inst, m_SMax(m_Value(A), m_Value(B)))) {
+    if (A > B)
+      std::swap(A, B);
+    return hash_combine(Inst->getOpcode(), ICmpInst::ICMP_SGT, A, B);
+  }
+  if (match(Inst, m_UMin(m_Value(A), m_Value(B)))) {
+    if (A > B)
+      std::swap(A, B);
+    return hash_combine(Inst->getOpcode(), ICmpInst::ICMP_ULT, A, B);
+  }
+  if (match(Inst, m_UMax(m_Value(A), m_Value(B)))) {
+    if (A > B)
+      std::swap(A, B);
+    return hash_combine(Inst->getOpcode(), ICmpInst::ICMP_UGT, A, B);
+  }
+
   if (CastInst *CI = dyn_cast<CastInst>(Inst))
     return hash_combine(CI->getOpcode(), CI->getType(), CI->getOperand(0));

   if (const ExtractValueInst *EVI = dyn_cast<ExtractValueInst>(Inst))
     return hash_combine(EVI->getOpcode(), EVI->getOperand(0),
                         hash_combine_range(EVI->idx_begin(), EVI->idx_end()));

   if (const InsertValueInst *IVI = dyn_cast<InsertValueInst>(Inst))
[... 42 lines not shown ...]
     assert(isa<CmpInst>(RHSI) &&
            "same opcode, but different instruction type?");
     CmpInst *RHSCmp = cast<CmpInst>(RHSI);
     // Commuted equality
     return LHSCmp->getOperand(0) == RHSCmp->getOperand(1) &&
            LHSCmp->getOperand(1) == RHSCmp->getOperand(0) &&
            LHSCmp->getSwappedPredicate() == RHSCmp->getPredicate();
   }

+  // The commutative min/max matchers allow for both commuted operands and
+  // different compare predicates.
+  Value *A, *B;
+  if (match(LHSI, m_SMin(m_Value(A), m_Value(B))) &&
+      match(RHSI, m_c_SMin(m_Specific(A), m_Specific(B))))
+    return true;
+  if (match(LHSI, m_SMax(m_Value(A), m_Value(B))) &&
+      match(RHSI, m_c_SMax(m_Specific(A), m_Specific(B))))
+    return true;
+  if (match(LHSI, m_UMin(m_Value(A), m_Value(B))) &&
+      match(RHSI, m_c_UMin(m_Specific(A), m_Specific(B))))
+    return true;
+  if (match(LHSI, m_UMax(m_Value(A), m_Value(B))) &&
+      match(RHSI, m_c_UMax(m_Specific(A), m_Specific(B))))
+    return true;
+
   return false;
 }

 //===----------------------------------------------------------------------===//
 // CallValue
 //===----------------------------------------------------------------------===//

 namespace {
[... 945 lines not shown ...]

test/Transforms/EarlyCSE/commute.ll

[... 72 lines not shown ...]

 ; Min/max operands may be commuted in the compare and select.

 define i8 @smin_commute(i8 %a, i8 %b) {
 ; CHECK-LABEL: @smin_commute(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i8 %b, %a
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %a, i8 %b
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %b, i8 %a
-; CHECK-NEXT: [[R:%.*]] = mul i8 [[M1]], [[M2]]
+; CHECK-NEXT: [[R:%.*]] = mul i8 [[M1]], [[M1]]
 ; CHECK-NEXT: ret i8 [[R]]
 ;
   %cmp1 = icmp slt i8 %a, %b
   %cmp2 = icmp slt i8 %b, %a
   %m1 = select i1 %cmp1, i8 %a, i8 %b
   %m2 = select i1 %cmp2, i8 %b, i8 %a
   %r = mul i8 %m1, %m2
   ret i8 %r
 }

 ; Min/max can also have a swapped predicate and select operands.

 define i1 @smin_swapped(i8 %a, i8 %b) {
 ; CHECK-LABEL: @smin_swapped(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i8 %a, %b
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %b, i8 %a
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %a, i8 %b
-; CHECK-NEXT: [[R:%.*]] = icmp eq i8 [[M2]], [[M1]]
-; CHECK-NEXT: ret i1 [[R]]
+; CHECK-NEXT: ret i1 true
 ;
   %cmp1 = icmp sgt i8 %a, %b
   %cmp2 = icmp slt i8 %a, %b
   %m1 = select i1 %cmp1, i8 %b, i8 %a
   %m2 = select i1 %cmp2, i8 %a, i8 %b
   %r = icmp eq i8 %m2, %m1
   ret i1 %r
 }

 define i8 @smax_commute(i8 %a, i8 %b) {
 ; CHECK-LABEL: @smax_commute(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp sgt i8 %b, %a
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %a, i8 %b
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %b, i8 %a
-; CHECK-NEXT: [[R:%.*]] = urem i8 [[M2]], [[M1]]
-; CHECK-NEXT: ret i8 [[R]]
+; CHECK-NEXT: ret i8 0
 ;
   %cmp1 = icmp sgt i8 %a, %b
   %cmp2 = icmp sgt i8 %b, %a
   %m1 = select i1 %cmp1, i8 %a, i8 %b
   %m2 = select i1 %cmp2, i8 %b, i8 %a
   %r = urem i8 %m2, %m1
   ret i8 %r
 }

 define i8 @smax_swapped(i8 %a, i8 %b) {
 ; CHECK-LABEL: @smax_swapped(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp sgt i8 %a, %b
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %b, i8 %a
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %a, i8 %b
-; CHECK-NEXT: [[R:%.*]] = sdiv i8 [[M1]], [[M2]]
-; CHECK-NEXT: ret i8 [[R]]
+; CHECK-NEXT: ret i8 1
 ;
   %cmp1 = icmp slt i8 %a, %b
   %cmp2 = icmp sgt i8 %a, %b
   %m1 = select i1 %cmp1, i8 %b, i8 %a
   %m2 = select i1 %cmp2, i8 %a, i8 %b
   %r = sdiv i8 %m1, %m2
   ret i8 %r
 }

 define i8 @umin_commute(i8 %a, i8 %b) {
 ; CHECK-LABEL: @umin_commute(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp ult i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp ult i8 %b, %a
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %a, i8 %b
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %b, i8 %a
-; CHECK-NEXT: [[R:%.*]] = sub i8 [[M2]], [[M1]]
-; CHECK-NEXT: ret i8 [[R]]
+; CHECK-NEXT: ret i8 0
 ;
   %cmp1 = icmp ult i8 %a, %b
   %cmp2 = icmp ult i8 %b, %a
   %m1 = select i1 %cmp1, i8 %a, i8 %b
   %m2 = select i1 %cmp2, i8 %b, i8 %a
   %r = sub i8 %m2, %m1
   ret i8 %r
 }

 ; Choose a vector type just to show that works.

 define <2 x i8> @umin_swapped(<2 x i8> %a, <2 x i8> %b) {
 ; CHECK-LABEL: @umin_swapped(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt <2 x i8> %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp ult <2 x i8> %a, %b
 ; CHECK-NEXT: [[M1:%.*]] = select <2 x i1> [[CMP1]], <2 x i8> %b, <2 x i8> %a
-; CHECK-NEXT: [[M2:%.*]] = select <2 x i1> [[CMP2]], <2 x i8> %a, <2 x i8> %b
-; CHECK-NEXT: [[R:%.*]] = sub <2 x i8> [[M2]], [[M1]]
-; CHECK-NEXT: ret <2 x i8> [[R]]
+; CHECK-NEXT: ret <2 x i8> zeroinitializer
 ;
   %cmp1 = icmp ugt <2 x i8> %a, %b
   %cmp2 = icmp ult <2 x i8> %a, %b
   %m1 = select <2 x i1> %cmp1, <2 x i8> %b, <2 x i8> %a
   %m2 = select <2 x i1> %cmp2, <2 x i8> %a, <2 x i8> %b
   %r = sub <2 x i8> %m2, %m1
   ret <2 x i8> %r
 }

 define i8 @umax_commute(i8 %a, i8 %b) {
 ; CHECK-LABEL: @umax_commute(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp ugt i8 %b, %a
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %a, i8 %b
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %b, i8 %a
-; CHECK-NEXT: [[R:%.*]] = udiv i8 [[M1]], [[M2]]
-; CHECK-NEXT: ret i8 [[R]]
+; CHECK-NEXT: ret i8 1
 ;
   %cmp1 = icmp ugt i8 %a, %b
   %cmp2 = icmp ugt i8 %b, %a
   %m1 = select i1 %cmp1, i8 %a, i8 %b
   %m2 = select i1 %cmp2, i8 %b, i8 %a
   %r = udiv i8 %m1, %m2
   ret i8 %r
 }

 define i8 @umax_swapped(i8 %a, i8 %b) {
 ; CHECK-LABEL: @umax_swapped(
 ; CHECK-NEXT: [[CMP1:%.*]] = icmp ult i8 %a, %b
 ; CHECK-NEXT: [[CMP2:%.*]] = icmp ugt i8 %a, %b
 ; CHECK-NEXT: [[M1:%.*]] = select i1 [[CMP1]], i8 %b, i8 %a
-; CHECK-NEXT: [[M2:%.*]] = select i1 [[CMP2]], i8 %a, i8 %b
-; CHECK-NEXT: [[R:%.*]] = add i8 [[M2]], [[M1]]
+; CHECK-NEXT: [[R:%.*]] = add i8 [[M1]], [[M1]]
 ; CHECK-NEXT: ret i8 [[R]]
 ;
   %cmp1 = icmp ult i8 %a, %b
   %cmp2 = icmp ugt i8 %a, %b
   %m1 = select i1 %cmp1, i8 %b, i8 %a
   %m2 = select i1 %cmp2, i8 %a, i8 %b
   %r = add i8 %m2, %m1
   ret i8 %r
 }