This is an archive of the discontinued LLVM Phabricator instance.


Differential D107148

[InstCombine] Fold two-value clamp patterns
Abandoned · Public

Authored by qiucf on Jul 30 2021, 3:27 AM.


Details

Reviewers

spatel
xbolva00
RKSimon
dmgreen
nikic
mkazantsev

Summary

This patch addresses PR43053. The following pattern:

%cmp1 = icmp slt i32 %num, C1
%s1 = select i1 %cmp1, i32 %num, i32 C1
%cmp2 = icmp sgt i32 %s1, C2
%r = select i1 %cmp2, i32 %s1, i32 C2
; C2 = C1 - 1

can be folded into

%cmp3 = icmp sgt i32 %num, C2
%r = select i1 %cmp3, i32 C1, i32 C2

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
1,030 ms	x64 debian > libarcher.races::task-dependency.c

Event Timeline

qiucf created this revision. · Jul 30 2021, 3:27 AM

Herald added a subscriber: hiraditya. · Jul 30 2021, 3:27 AM

qiucf requested review of this revision. · Jul 30 2021, 3:27 AM

Herald added a project: Restricted Project. · Jul 30 2021, 3:27 AM

Herald added a subscriber: llvm-commits.

RKSimon added inline comments. · Jul 30 2021, 3:45 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3110	(style) auto *
llvm/test/Transforms/InstCombine/minmax-fold.ll
1511	pre-commit these tests and rebase to show the diffs in the patch

Harbormaster completed remote builds in B117140: Diff 363020. · Jul 30 2021, 4:30 AM

The initial code

// s1 = (n < C1) ? n : C1
// s2 = (s1 > C1 - 1) ? s1 : C1 - 1

had the clear semantics of

s1 = min(n, C1)
s2 = max(s1, C1 - 1)

This pattern can be analyzed by other passes that understand the semantics of min and max, such as passes that use SCEV. If it participates in another min or max expression, there are many ways to simplify it.

The new form, despite having fewer instructions, is completely arcane. Just by looking at it, it is not easy to understand what it actually means, and no pass will understand it either.

I'd rather suggest making changes like this as close to codegen as possible, for example in CodeGenPrepare.

In D107148#2916097, @mkazantsev wrote:
The initial code
// s1 = (n < C1) ? n : C1
// s2 = (s1 > C1 - 1) ? s1 : C1 - 1
had the clear semantics of
s1 = min(n, C1)
s2 = max(s1, C1 - 1)
This pattern can be analyzed by other passes that understand the semantics of min and max, such as passes that use SCEV. If it participates in another min or max expression, there are many ways to simplify it.

The new form, despite having fewer instructions, is completely arcane. Just by looking at it, it is not easy to understand what it actually means, and no pass will understand it either.

I'd rather suggest making changes like this as close to codegen as possible, for example in CodeGenPrepare.

Sounds like SCEV should be aware of this way to parse such a select-of-constants as min-max.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3102	I think you should just produce select of constants here.
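
For context, the first revision (Diff 363020) built the result arithmetically, as `C2 + zext(n > C2)` or `C2 - zext(n < C2)`, whereas this comment asks for a select between the two constants. The following is a rough C++ model, not LLVM code, showing that the two shapes compute the same value; all function names are made up for illustration, and the constants are the ones used in the patch's tests.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shape from the first revision: C2 + zext(n > C2).
int32_t clampLtArithmetic(int32_t n, int32_t c2) {
  return c2 + static_cast<int32_t>(n > c2);
}

// Requested shape: select between the constants C1 = C2 + 1 and C2.
int32_t clampLtSelect(int32_t n, int32_t c2) {
  return (n > c2) ? c2 + 1 : c2;
}

// Mirrored case (outer slt, inner sgt, C2 = C1 + 1): C2 - zext(n < C2).
int32_t clampGtArithmetic(int32_t n, int32_t c2) {
  return c2 - static_cast<int32_t>(n < c2);
}

// Requested shape for the mirrored case: select between C1 = C2 - 1 and C2.
int32_t clampGtSelect(int32_t n, int32_t c2) {
  return (n < c2) ? c2 - 1 : c2;
}

// Compares both shapes on a window of inputs around the test constants.
bool shapesAgree() {
  for (int32_t n = 13000; n <= 14500; ++n) {
    if (clampLtArithmetic(n, 13767) != clampLtSelect(n, 13767))
      return false;
    if (clampGtArithmetic(n, 13768) != clampGtSelect(n, 13768))
      return false;
  }
  return true;
}
```

The select shape keeps both constants visible in the IR, which is presumably why it was preferred over the zext-and-add sequence.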

In D107148#2916113, @lebedev.ri wrote:
Sounds like SCEV should be aware of this way to parse such a select-of-constants as min-max.

I know it's a different case, but I would consider the saturation case to be canonical as:

%m = call i32 @llvm.smin.i32(i32 %num, i32 127)
%r = call i32 @llvm.smax.i32(i32 %m, i32 -128)

And I believe we have code that relies on that.

But if this does indeed simplify to

%1 = icmp sgt i32 %0, 126
%r = select i1 %1, i32 127, i32 126

that doesn't sound worse than the min and max in this case.

The code might be better if it made use of matchSelectPattern. There are "SPF" matchers for matching min/max patterns, like in factorizeMinMaxTree and moveNotAfterMinMax.

qiucf updated this revision to Diff 366228. · Aug 13 2021, 3:21 AM

qiucf marked 3 inline comments as done.

qiucf edited the summary of this revision.

Harbormaster completed remote builds in B119418: Diff 366228. · Aug 13 2021, 3:59 AM

Ping.

RKSimon added a reviewer: nikic. · Aug 25 2021, 5:33 AM

RKSimon added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3100	It'd be better if we could use m_APInt here to support vectors (including handling undef element cases).
3110	Can we use m_SMin etc. ? Ideally we'd have something that matches min/max intrinsics as well as select patterns.

Sorry for missing this review earlier.
I implemented the corresponding folds for intrinsics with:
https://reviews.llvm.org/rG025bb5290379

Can you confirm that the pattern matching and tests here correspond to those (replace the intrinsics with cmp+sel or vice-versa)?

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3110	`m_MaxOrMin()` will match either pattern (intrinsic or cmp+sel). We're starting from a select here, so I'm not sure if it's worth including tests with mismatched patterns (one of each). Also see getInverseMinMaxFlavor() or getInverseMinMaxPred() in ValueTracking.h.

In D107148#2964910, @spatel wrote:

Sorry for missing this review earlier.
I implemented the corresponding folds for intrinsics with:
https://reviews.llvm.org/rG025bb5290379

Can you confirm that the pattern matching and tests here correspond to those (replace the intrinsics with cmp+sel or vice-versa)?

Yes, it does a similar thing. So would it be better to fold such cmp-select patterns to the intrinsic?

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3100	Hmm. Does `match(xxx, m_APInt(&x))` support vector types?

In D107148#2975017, @qiucf wrote:

Yes, it does a similar thing. So would it be better to fold such cmp-select patterns to the intrinsic?

We intend to canonicalize cmp-select to min/max intrinsics, but we need to address some more regressions that are visible in:
D98152

So it's a question of timing/duplication - when we get D98152 done, then this patch probably becomes obsolete because we won't see the cmp-select pattern any more.

Ah - I just realized the tests here are pre-committed, so we can see what effect converting to intrinsics will have on them in D98152.
It looks like we get some of the examples, but miss some others. So we will need to improve the analysis of the intrinsics for 2-way clamps either way. If you want to investigate that, that would be great.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3100	Partly - m_APInt will automatically match a splat constant <42, 42, 42,...>, but not an arbitrary vector constant. For this fold, matching a splat is probably sufficient. m_APInt is basically free in terms of coding compared to m_ConstantInt, so we prefer to use m_APInt over m_ConstantInt for more general matching.
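
To make the splat-versus-arbitrary-vector distinction concrete, here is a small standalone C++ model. It is not LLVM's `m_APInt`; `matchSplat` is a hypothetical helper that only mimics the behaviour described in this comment: it binds a single value when every element is identical and fails otherwise.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of splat matching (not an LLVM API).
// On success, writes the common element to `out` and returns true.
bool matchSplat(const std::vector<int32_t> &elts, int32_t &out) {
  if (elts.empty())
    return false;
  for (int32_t e : elts)
    if (e != elts.front())
      return false; // arbitrary vector constant: no single value to bind
  out = elts.front();
  return true;
}
```

Under this model, `<42, 42, 42, 42>` binds 42, while `<13767, 13768>` does not match at all, so a fold written against a single bound constant would simply not fire for it.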

Matt added a subscriber: Matt. · Oct 2 2021, 6:42 AM

Sounds like SCEV should be aware of this way to parse such a select-of-constants as min-max.

We can, theoretically, teach SCEV to recognize min and max in all kinds of arcane expressions involving arithmetic, zexts, xors, shifts, and so on. It's not going to happen in reality, of course.

In D107148#2916113, @lebedev.ri wrote:
Sounds like SCEV should be aware of this way to parse such a select-of-constants as min-max.

@qiucf could you please add a test with -analyze -scalar-evolution to see if SCEV is able to recognize this as max/min? If not, please add a FIXME in this test; this would be a good improvement to make.

My only concern here is whether SCEV recognizes this pattern. I don't see many other problems, but I don't feel comfortable approving this in InstCombine, so I'm leaving it to the other reviewers' judgement.

@spatel Any thoughts?

In D107148#3080892, @RKSimon wrote:

@spatel Any thoughts?

Not with the direction - as noted earlier, we're already trying this with the intrinsics, so doing it with cmp+select just makes things consistent. There are a few implementation/test questions:

What logic differences are there between this and 025bb5290379?
Add/adjust tests based on those diffs.
Use m_APInt so we get splat vectors.

@qiucf - will you continue this patch soon?

@qiucf - will you continue this patch soon?

Sure. Sorry for leaving this patch for some time. I'll address these issues in the next few days.

Use m_APInt to match vector splats.

In D107148#3081328, @spatel wrote:

Not with the direction - as noted earlier, we're already trying this with the intrinsics, so doing it with cmp+select just makes things consistent. There are a few implementation/test questions:

What logic diffs are there between this and 025bb5290379 ?

Add/adjust tests based on those diffs.

Use m_APInt so we get splat vectors.

@qiucf - will you continue this patch soon?

I applied D98152 and tried it. The two currently affected tests here (minmax-fold.ll, icmp-dom.ll) still pass with it, even without my patch. But for the vector case below:

define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %cmp1 = icmp sgt <4 x i32> %num, <i32 13767, i32 13767, i32 13767, i32 13767>
  %s1 = select <4 x i1> %cmp1, <4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>
  %cmp2 = icmp slt <4 x i32> %s1, <i32 13768, i32 13768, i32 13768, i32 13768>
  %r = select <4 x i1> %cmp2, <4 x i32> %s1, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
  ret <4 x i32> %r
}

; Got
define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %0 = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>)
  %1 = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %0, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>)
  ret <4 x i32> %1
}

; Not
define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %0 = icmp slt <4 x i32> %num, <i32 13768, i32 13768, i32 13768, i32 13768>
  %r = select <4 x i1> %0, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
  ret <4 x i32> %r
}

Seems not covered by that?

Harbormaster completed remote builds in B131710: Diff 383754. · Nov 1 2021, 3:31 AM

In D107148#3099856, @qiucf wrote:
I applied D98152 and tried it. The two currently affected tests here (minmax-fold.ll, icmp-dom.ll) still pass with it, even without my patch. But for the vector case below:
define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %cmp1 = icmp sgt <4 x i32> %num, <i32 13767, i32 13767, i32 13767, i32 13767>
  %s1 = select <4 x i1> %cmp1, <4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>
  %cmp2 = icmp slt <4 x i32> %s1, <i32 13768, i32 13768, i32 13768, i32 13768>
  %r = select <4 x i1> %cmp2, <4 x i32> %s1, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
  ret <4 x i32> %r
}

; Got
define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %0 = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>)
  %1 = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %0, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>)
  ret <4 x i32> %1
}

; Not
define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
entry:
  %0 = icmp slt <4 x i32> %num, <i32 13768, i32 13768, i32 13768, i32 13768>
  %r = select <4 x i1> %0, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
  ret <4 x i32> %r
}
Seems not covered by that?

We may need some generalization for mismatched signedness of min/max like this:
https://alive2.llvm.org/ce/z/NV_bKr
(That should translate for all 4 combinations for min/max?)
After that, we will recognize the special case for clamp of 2 values.

But that doesn't need to hold this patch up unless I'm misunderstanding the comment.
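
The Alive2 proof linked above is not reproduced in this archive, so the following is only one reading of the "mismatched signedness" remark, under the assumption that both clamp bounds are non-negative (as with 13767 and 13768 in the example). In that case the result of the signed max is itself non-negative, so an unsigned min over it behaves like a signed min. A minimal C++ check over `int8_t`, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

int8_t smax8(int8_t a, int8_t b) { return a > b ? a : b; }
int8_t smin8(int8_t a, int8_t b) { return a < b ? a : b; }

// Unsigned min on the same bit patterns.
int8_t umin8(int8_t a, int8_t b) {
  return static_cast<uint8_t>(a) < static_cast<uint8_t>(b) ? a : b;
}

// Assumption: 0 <= c1 and 0 <= c2. Then smax(x, c1) >= 0, and signed and
// unsigned ordering agree on non-negative values.
bool uminMatchesSminForNonNegativeBounds() {
  for (int c1 = 0; c1 <= INT8_MAX; ++c1)
    for (int c2 = 0; c2 <= INT8_MAX; ++c2)
      for (int x = INT8_MIN; x <= INT8_MAX; ++x) {
        const int8_t m = smax8(static_cast<int8_t>(x), static_cast<int8_t>(c1));
        if (umin8(m, static_cast<int8_t>(c2)) != smin8(m, static_cast<int8_t>(c2)))
          return false;
      }
  return true;
}
```

With a negative lower bound the two differ (for example, `umin8(-1, 5)` is 5 while `smin8(-1, 5)` is -1), so any such generalization would need a sign condition on the constants.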

RKSimon added inline comments. · Nov 1 2021, 10:34 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
3184	You've reused C1 and not referred to C2?

qiucf updated this revision to Diff 384358. · Nov 3 2021, 1:29 AM

qiucf marked an inline comment as done.

Harbormaster completed remote builds in B132160: Diff 384358. · Nov 3 2021, 2:21 AM

rGa266af721153fab6452094207b09ed265ab0be7b does similar work.

Herald added a project: Restricted Project. · Sep 28 2023, 12:00 AM

Herald added a subscriber: StephenFan.

Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp	27 lines
llvm/test/Transforms/InstCombine/icmp-dom.ll	14 lines
llvm/test/Transforms/InstCombine/minmax-fold.ll	58 lines

Diff 363020

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

[...]
       // example).
       if (TrueSI->getFalseValue() == FalseVal && TrueSI->hasOneUse()) {
         Value *And = Builder.CreateLogicalAnd(CondVal, TrueSI->getCondition());
         replaceOperand(SI, 0, And);
         replaceOperand(SI, 1, TrueSI->getTrueValue());
         return &SI;
       }
     }

+    // Fold two-way clamps:
+    // s1 = (n < C1) ? n : C1
+    // s2 = (s1 > C1 - 1) ? s1 : C1 - 1
+    // into:
+    // s2 = C1 - 1 + (n > C1 - 1)
+    CmpInst::Predicate InnerPred;
+    if (match(CondVal,
+              m_ICmp(Pred, m_Specific(TrueVal), m_Specific(FalseVal))) &&
+        match(TrueSI->getCondition(),
+              m_ICmp(InnerPred, m_Specific(TrueSI->getTrueValue()),
+                     m_Specific(TrueSI->getFalseValue())))) {
+      ConstantInt *C1 = dyn_cast<ConstantInt>(TrueSI->getFalseValue());
+      ConstantInt *C2 = dyn_cast<ConstantInt>(FalseVal);
+      if (C1 && C2 &&
+          ((Pred == CmpInst::ICMP_SGT && InnerPred == CmpInst::ICMP_SLT &&
+            C2->getValue() == C1->getValue() - 1) ||
+           (Pred == CmpInst::ICMP_SLT && InnerPred == CmpInst::ICMP_SGT &&
+            C2->getValue() == C1->getValue() + 1))) {
+        Value *NewCmp = Builder.CreateICmp(Pred, TrueSI->getTrueValue(), C2);
+        Value *Ext = Builder.CreateZExt(NewCmp, TrueVal->getType());
+        Instruction *Add = BinaryOperator::Create(
+            Pred == CmpInst::ICMP_SGT ? Instruction::Add : Instruction::Sub, C2,
+            Ext);
+        return Add;
+      }
+    }
   }
   if (SelectInst *FalseSI = dyn_cast<SelectInst>(FalseVal)) {
     if (FalseSI->getCondition()->getType() == CondVal->getType()) {
       // select(C, a, select(C, b, c)) -> select(C, a, c)
       if (FalseSI->getCondition() == CondVal) {
         if (SI.getFalseValue() == FalseSI->getFalseValue())
           return nullptr;
         return replaceOperand(SI, 2, FalseSI->getFalseValue());
[...]
   }

   // select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W))
   BinaryOperator *FalseBO;
   if (match(FalseVal, m_OneUse(m_BinOp(FalseBO))) &&
       canMergeSelectThroughBinop(FalseBO)) {
     if (auto *FalseBOSI = dyn_cast<SelectInst>(FalseBO->getOperand(0))) {
       if (FalseBOSI->getCondition() == CondVal) {
         replaceOperand(*FalseBO, 0, FalseBOSI->getFalseValue());
         Worklist.push(FalseBO);
         return &SI;
       }
     }
     if (auto *FalseBOSI = dyn_cast<SelectInst>(FalseBO->getOperand(1))) {
       if (FalseBOSI->getCondition() == CondVal) {
         replaceOperand(*FalseBO, 1, FalseBOSI->getFalseValue());
         Worklist.push(FalseBO);
[...]

llvm/test/Transforms/InstCombine/icmp-dom.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -instcombine -S | FileCheck %s

 define void @idom_sign_bit_check_edge_dominates(i64 %a) {
 ; CHECK-LABEL: @idom_sign_bit_check_edge_dominates(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[CMP:%.*]] = icmp slt i64 [[A:%.*]], 0
 ; CHECK-NEXT: br i1 [[CMP]], label [[LAND_LHS_TRUE:%.*]], label [[LOR_RHS:%.*]]
 ; CHECK: land.lhs.true:
 ; CHECK-NEXT: br label [[LOR_END:%.*]]
 ; CHECK: lor.rhs:
-; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i64 [[A]], 0
-; CHECK-NEXT: br i1 [[CMP2]], label [[LOR_END]], label [[LAND_RHS:%.*]]
+; CHECK-NEXT: [[CMP2_NOT:%.*]] = icmp eq i64 [[A]], 0
+; CHECK-NEXT: br i1 [[CMP2_NOT]], label [[LOR_END]], label [[LAND_RHS:%.*]]
 ; CHECK: land.rhs:
 ; CHECK-NEXT: br label [[LOR_END]]
 ; CHECK: lor.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   %cmp = icmp slt i64 %a, 0
   br i1 %cmp, label %land.lhs.true, label %lor.rhs
[...]
   ret i32 0
 }

 ; This used to infinite loop because of a conflict
 ; with min/max canonicalization.

 define i8 @PR48900_alt(i8 %i, i1* %p) {
 ; CHECK-LABEL: @PR48900_alt(
-; CHECK-NEXT: [[MAXCMP:%.*]] = icmp sgt i8 [[I:%.*]], -127
-; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[MAXCMP]], i8 [[I]], i8 -127
-; CHECK-NEXT: [[I4:%.*]] = icmp ugt i8 [[SMAX]], -128
+; CHECK-NEXT: [[MAXCMP:%.*]] = icmp slt i8 [[I:%.*]], -126
+; CHECK-NEXT: [[I41:%.*]] = icmp ugt i8 [[I]], -128
+; CHECK-NEXT: [[I4:%.*]] = or i1 [[MAXCMP]], [[I41]]
 ; CHECK-NEXT: br i1 [[I4]], label [[TRUELABEL:%.*]], label [[FALSELABEL:%.*]]
 ; CHECK: truelabel:
-; CHECK-NEXT: [[MINCMP:%.*]] = icmp slt i8 [[SMAX]], -126
-; CHECK-NEXT: [[UMIN:%.*]] = select i1 [[MINCMP]], i8 [[SMAX]], i8 -126
+; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i8 [[I]], -126
+; CHECK-NEXT: [[UMIN:%.*]] = select i1 [[TMP1]], i8 -127, i8 -126
 ; CHECK-NEXT: ret i8 [[UMIN]]
 ; CHECK: falselabel:
 ; CHECK-NEXT: ret i8 0
 ;
   %maxcmp = icmp sgt i8 %i, -127
   %smax = select i1 %maxcmp, i8 %i, i8 -127
   %i4 = icmp ugt i8 %smax, 128
   br i1 %i4, label %truelabel, label %falselabel
[...]

llvm/test/Transforms/InstCombine/minmax-fold.ll

[...]
   %res = select i1 %cmp2, i32 %sel1, i32 0
   ret i32 %res
 }

 ; Check that there is no infinite loop because of reverse cmp transformation:
 ; (icmp slt smax(PositiveA, B) 2) -> (icmp eq B 1)
 define i32 @clamp_check_for_no_infinite_loop3(i32 %i) {
 ; CHECK-LABEL: @clamp_check_for_no_infinite_loop3(
-; CHECK-NEXT: [[I2:%.*]] = icmp sgt i32 [[I:%.*]], 1
-; CHECK-NEXT: [[I3:%.*]] = select i1 [[I2]], i32 [[I]], i32 1
 ; CHECK-NEXT: br i1 true, label [[TRUELABEL:%.*]], label [[FALSELABEL:%.*]]
 ; CHECK: truelabel:
-; CHECK-NEXT: [[I5:%.*]] = icmp slt i32 [[I3]], 2
-; CHECK-NEXT: [[I6:%.*]] = select i1 [[I5]], i32 [[I3]], i32 2
-; CHECK-NEXT: [[I7:%.*]] = shl nuw nsw i32 [[I6]], 2
-; CHECK-NEXT: ret i32 [[I7]]
+; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[I:%.*]], 2
+; CHECK-NEXT: [[I6:%.*]] = select i1 [[TMP1]], i32 4, i32 8
+; CHECK-NEXT: ret i32 [[I6]]
 ; CHECK: falselabel:
 ; CHECK-NEXT: ret i32 0
 ;

   %i2 = icmp sgt i32 %i, 1
   %i3 = select i1 %i2, i32 %i, i32 1
   %i4 = icmp sgt i32 %i3, 0
   br i1 %i4, label %truelabel, label %falselabel
[...]
 ; CHECK-NEXT: ret i8 [[R]]
 ;
   %a = icmp sgt <2 x i8> %x, <i8 -1, i8 -1>
   %b = select <2 x i1> %a, <2 x i8> %x, <2 x i8> <i8 undef, i8 -1>
   %not = xor <2 x i8> %b, <i8 undef, i8 -1>
   %r = extractelement <2 x i8> %not, i32 1
   ret i8 %r
 }

+define i32 @twoway_clamp_lt(i32 %num) {
+; CHECK-LABEL: @twoway_clamp_lt(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = icmp sgt i32 [[NUM:%.*]], 13767
+; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP0]], i32 13768, i32 13767
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %cmp1 = icmp slt i32 %num, 13768
+  %s1 = select i1 %cmp1, i32 %num, i32 13768
+  %cmp2 = icmp sgt i32 %s1, 13767
+  %r = select i1 %cmp2, i32 %s1, i32 13767
+  ret i32 %r
+}
+
+define i32 @twoway_clamp_gt(i32 %num) {
+; CHECK-LABEL: @twoway_clamp_gt(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = icmp slt i32 [[NUM:%.*]], 13768
+; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP0]], i32 13767, i32 13768
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %cmp1 = icmp sgt i32 %num, 13767
+  %s1 = select i1 %cmp1, i32 %num, i32 13767
+  %cmp2 = icmp slt i32 %s1, 13768
+  %r = select i1 %cmp2, i32 %s1, i32 13768
+  ret i32 %r
+}
+
+define i32 @twoway_clamp_gt_nonconst(i32 %num, i32 %k) {
+; CHECK-LABEL: @twoway_clamp_gt_nonconst(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[K1:%.*]] = add i32 [[K:%.*]], 1
+; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[NUM:%.*]], [[K]]
+; CHECK-NEXT: [[S1:%.*]] = select i1 [[CMP1]], i32 [[NUM]], i32 [[K]]
+; CHECK-NEXT: [[CMP2:%.*]] = icmp slt i32 [[S1]], [[K1]]
+; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP2]], i32 [[S1]], i32 [[K1]]
+; CHECK-NEXT: ret i32 [[R]]
+;
+entry:
+  %k1 = add i32 %k, 1
+  %cmp1 = icmp sgt i32 %num, %k
+  %s1 = select i1 %cmp1, i32 %num, i32 %k
+  %cmp2 = icmp slt i32 %s1, %k1
+  %r = select i1 %cmp2, i32 %s1, i32 %k1
+  ret i32 %r
+}