This is an archive of the discontinued LLVM Phabricator instance.

Differential D108049

[InstCombine] Extend canonicalizeClampLike to handle truncated inputs
Closed, Public

Authored by dmgreen on Aug 13 2021, 11:59 AM.

Details

Reviewers

spatel
efriedma
lebedev.ri
RKSimon
nikic

Commits

rG9358384fd646: [InstCombine] Extend canonicalizeClampLike to handle truncated inputs

Summary

This extends the canonicalizeClampLike function to handle cases where the input is truncated, while still matching on the types of the ICmps. For example:

%t = trunc i32 %X to i8
%a = add i32 %X, 128
%cmp = icmp ult i32 %a, 256
%c = icmp sgt i32 %X, -1
%f = select i1 %c, i8 High, i8 Low
%r = select i1 %cmp, i8 %t, i8 %f

becomes

%c1 = icmp slt i32 %X, -128
%c2 = icmp sge i32 %X, 128
%s1 = select i1 %c1, i32 sext(Low), i32 %X
%s2 = select i1 %c2, i32 sext(High), i32 %s1
%t = trunc i32 %s2 to i8

https://alive2.llvm.org/ce/z/vPzfxH

We limit the transform to constant High and Low values, where we know the sexts are free.
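As a rough sanity check of the summary's example (not a substitute for the Alive2 proof), the two IR sequences can be modelled with fixed-width integers in C++. This is only a sketch: the function names `truncToI8`, `clampSrc`, `clampTgt` and `truncatedClampAgrees` are hypothetical, the thresholds 128 and 256 are the ones from the example, and only a sample of i32 inputs is checked.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc i32 %v to i8`: keep the low byte and reinterpret it as signed.
int8_t truncToI8(int32_t V) {
  int Bits = static_cast<int>(static_cast<uint32_t>(V) & 0xFFu);
  return static_cast<int8_t>(Bits < 128 ? Bits : Bits - 256);
}

// Source pattern: select on (X + 128) u< 256, with a truncated X.
int8_t clampSrc(int32_t X, int8_t High, int8_t Low) {
  int8_t T = truncToI8(X);                      // %t = trunc i32 %X to i8
  uint32_t A = static_cast<uint32_t>(X) + 128u; // %a = add i32 %X, 128 (wraps)
  bool Cmp = A < 256u;                          // %cmp = icmp ult i32 %a, 256
  bool C = X > -1;                              // %c = icmp sgt i32 %X, -1
  int8_t F = C ? High : Low;                    // %f = select
  return Cmp ? T : F;                           // %r = select
}

// Target pattern: clamp in i32 with sign-extended constants, then truncate.
int8_t clampTgt(int32_t X, int8_t High, int8_t Low) {
  bool C1 = X < -128;                                // icmp slt i32 %X, -128
  bool C2 = X >= 128;                                // icmp sge i32 %X, 128
  int32_t S1 = C1 ? static_cast<int32_t>(Low) : X;   // sext(Low)
  int32_t S2 = C2 ? static_cast<int32_t>(High) : S1; // sext(High)
  return truncToI8(S2);                              // trunc i32 %s2 to i8
}

// Compares both forms on boundary values and a small range around zero.
bool truncatedClampAgrees(int8_t High, int8_t Low) {
  const int32_t Edges[] = {INT32_MIN, INT32_MIN + 1, -65536, -32769, -32768,
                           32767,     32768,         65535,  65536,  INT32_MAX};
  for (int32_t X : Edges)
    if (clampSrc(X, High, Low) != clampTgt(X, High, Low))
      return false;
  for (int32_t X = -1024; X <= 1024; ++X)
    if (clampSrc(X, High, Low) != clampTgt(X, High, Low))
      return false;
  return true;
}
```

Calling `truncatedClampAgrees(127, -128)` from a `main` exercises the saturating case. The model cannot express undef or poison, which is where the Alive2 links in this review matter.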

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dmgreen created this revision. Aug 13 2021, 11:59 AM

Herald added a subscriber: hiraditya. Aug 13 2021, 11:59 AM

dmgreen requested review of this revision. Aug 13 2021, 11:59 AM

Herald added a project: Restricted Project. Aug 13 2021, 11:59 AM

Harbormaster completed remote builds in B119482: Diff 366325. Aug 13 2021, 12:00 PM

vector test cases?

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
922	If you're going to keep this in code, please can you simplify it (no entry: + better var names).
934	How is BW different to Ty->getScalarSizeInBits() ?

Thanks for taking a look. This does some cleanup as suggested, and renames the method to be consistent with others nearby.

There is a vector test in testv4i16i8. Is that what you had in mind, or would something more be preferable?

Harbormaster completed remote builds in B120085: Diff 367156. Aug 18 2021, 2:48 AM

More cleanup

Harbormaster completed remote builds in B120739: Diff 368040. Aug 23 2021, 12:09 AM

Rebase and ping. Any opinions on doing this here vs. in aggressive instcombine?

Harbormaster completed remote builds in B121780: Diff 369500. Aug 30 2021, 11:26 AM

That's a big fold!
The larger the pattern match, the more fragile the optimization tends to be because we might eventually find sub-patterns that can be reduced.
Does it make things harder or easier if we fold that icmp? We're checking if the top N/2 + 1 bits are all set or clear, so I visualized it like this:

define i1 @src(i16 %x) {
  %t0 = lshr i16 %x, 8
  %conv.i = trunc i16 %t0 to i8
  %conv1.i = trunc i16 %x to i8
  %shr2.i = ashr i8 %conv1.i, 7
  %r = icmp eq i8 %shr2.i, %conv.i
  ret i1 %r
}

define i1 @tgt(i16 %x) {
  %mask = ashr i16 %x, 7
  %ones = icmp eq i16 %mask, -1
  %zero = icmp eq i16 %mask, 0
  %r = or i1 %ones, %zero 
  ret i1 %r
}

https://alive2.llvm.org/ce/z/reQjDv

But existing combines get that down to just 2 instructions:

define i1 @tgt(i16 %x) {
  %x.off = add i16 %x, 65408
  %r = icmp ugt i16 %x.off, 65279
  ret i1 %r
}

https://alive2.llvm.org/ce/z/8Fh23s

I don't know exactly what the generalization for this will be, but it seems like we should try that first?
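The three i16 forms above (the original `@src`, the `ashr` mask version, and the two-instruction add/compare version) can be compared exhaustively, since there are only 65536 inputs. The sketch below models them with unsigned arithmetic in C++. The function names are hypothetical, and the shifts are written out by hand to avoid relying on implementation-defined signed shifts.

```cpp
#include <cassert>
#include <cstdint>

// @src: the high byte equals the sign-fill of the low byte.
bool topBitsSrc(uint16_t X) {
  uint8_t Hi = static_cast<uint8_t>(X >> 8);                      // trunc(lshr x, 8)
  uint8_t SignFill = static_cast<uint8_t>((X & 0x80u) ? 0xFFu : 0u); // ashr i8 (trunc x), 7
  return SignFill == Hi;
}

// First @tgt: `ashr i16 %x, 7` is either all-ones or zero.
bool topBitsMask(uint16_t X) {
  // Logical shift, then fill the top 7 bits with the sign bit.
  uint16_t Mask =
      static_cast<uint16_t>((X >> 7) | ((X & 0x8000u) ? 0xFE00u : 0u));
  return Mask == 0xFFFFu || Mask == 0u;
}

// Second @tgt: the add + unsigned-compare range check.
bool topBitsRange(uint16_t X) {
  uint16_t Off = static_cast<uint16_t>(X + 65408u); // add i16 %x, 65408 (wraps)
  return Off > 65279u;                              // icmp ugt i16 %x.off, 65279
}

// Checks that all three forms agree on every i16 value.
bool topBitsFormsAgree() {
  for (uint32_t I = 0; I <= 0xFFFFu; ++I) {
    uint16_t X = static_cast<uint16_t>(I);
    bool S = topBitsSrc(X);
    if (S != topBitsMask(X) || S != topBitsRange(X))
      return false;
  }
  return true;
}
```

All three amount to asking whether `%x`, read as a signed i16, lies in [-128, 127], which is why the range check needs only an add and a compare.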

In D108049#2974587, @spatel wrote:

That's a big fold!
The larger the pattern match, the more fragile the optimization tends to be because we might eventually find sub-patterns that can be reduced.

Yeah OK, Sounds good. I'll try it as a number of smaller folds, like you say it should be more reliable and that way the final pattern should at least be smaller. I half remember from a very long time ago trying the same thing (including extends/add to make a sadd.sat pattern), but couldn't do it without some combines that individually increased instruction count (or large combines).

I can give it another go though. From looking at it again, this gets us there, but the last fold involves moving truncates in the wrong direction:
https://godbolt.org/z/dKjdfhe1s

I'll try and figure out if there's anything more sensible to do as that last fold. Suggestions welcome if you have any :)

The problem with smaller intermediate folds is that the larger final fold
may still be wanted iff the intermediate folds may be blocked due to the use counts..

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
925–926	I'm not sure bitwidth cutoff should be here? It should be easy to legalize in backend, in fact it already has to.

In D108049#2975003, @lebedev.ri wrote:

The problem with smaller intermediate folds is that the larger final fold
may still be wanted iff the intermediate folds may be blocked due to the use counts..

Yes - that reminds me of the min/max pixel patterns from cmyk benchmarks. We ended up needing to match a larger-than-normal pattern to get those because of uses. (I'm trying to deal with the intrinsic versions of those patterns now to help D98152.)
So it depends on the motivating case here - if there are enough extra uses that we can't get the big pattern, then we might as well add this. If not, we can go for smaller matches and confirm that we get the sequence to work on the larger example(s).

dmgreen mentioned this in D109151: [InstCombine] Convert xor (ashr X, BW-1), C -> select(X >=s 0, C, ~C). Sep 2 2021, 5:59 AM

dmgreen added a parent revision: D109151: [InstCombine] Convert xor (ashr X, BW-1), C -> select(X >=s 0, C, ~C).

dmgreen added a parent revision: D109155: [InstCombine] Fold BW/2+1 tops bits are same pattern. Sep 2 2021, 6:58 AM

Change to now extend canonicalizeClampLike to handle truncated inputs, which further gets folded to a saturating min/max pattern. With the other combines added in D109151 and D109155 the original case becomes a sadd.sat.

Harbormaster completed remote builds in B122507: Diff 370557. Sep 3 2021, 6:08 AM

Rebase onto main, making sure this has some testing independent of other patches.

Harbormaster completed remote builds in B127351: Diff 377621. Oct 6 2021, 11:37 AM

Ping

lebedev.ri added inline comments. Oct 15 2021, 7:19 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1441	We didn't need `sext` originally. Why do we always need one now? Should this use `CreateSExtOrBitCast()`?
1445–1452	I would recommend something along the lines of using `CreateSExtOrBitCast()`, only calling `CreateSelect` in a single place, and returning through `CreateTruncOrBitCast()`.

Rejig.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1441	It is relying on the CreateSExt doing `if (V->getType() == DestTy) return V;`. I can change that to CreateSExtOrBitCast but it shouldn't ever bitcast anything, the types will always be integers.
1445–1452	Oh yeah I was awkwardly working around it returning an instruction. I've changed it, let me know if it looks wrong or they should be changed to `OrBitcast` versions.

Harbormaster completed remote builds in B129073: Diff 380014. Oct 15 2021, 8:33 AM

I notice that all the changed tests perform signed clamping,
and likewise, you sext. Is sext always the right choice?
Please add at least one test with unsigned clamp,
and an alive proof.

In D108049#3067063, @lebedev.ri wrote:

I notice that all the changed tests perform signed clamping,
and likewise, you sext. Is sext always the right choice?
Please add at least one test with unsigned clamp,
and an alive proof.

This method won't perform unsigned clamping (depending on what unsigned clamping means). It transforms code of the form:
select(icmp ult(add(X, C1), C0), X, select(icmp slt(X, C2), L, H))
And always produces signed clamp outputs. So there is an unsigned comparison in there (and a signed one), but they are not interchangeable.

I can change either the first icmp to signed, or the second to unsigned, but then the method won't match them and this patch doesn't alter the codegen. I'll add them to the list of tests, but the codegen here won't change:

define i16 @testi32i16i8_s(i32 %add) {
; CHECK-LABEL: @testi32i16i8_s(
; CHECK-NEXT:    [[A:%.*]] = add i32 [[ADD:%.*]], 128
; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[A]], 256
; CHECK-NEXT:    [[T:%.*]] = trunc i32 [[ADD]] to i16
; CHECK-NEXT:    [[C:%.*]] = icmp sgt i32 [[ADD]], -1
; CHECK-NEXT:    [[F:%.*]] = select i1 [[C]], i16 127, i16 -128
; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP]], i16 [[T]], i16 [[F]]
; CHECK-NEXT:    ret i16 [[R]]
;
  %a = add i32 %add, 128
  %cmp = icmp slt i32 %a, 256
  %t = trunc i32 %add to i16
  %c = icmp sgt i32 %add, -1
  %f = select i1 %c, i16 127, i16 -128
  %r = select i1 %cmp, i16 %t, i16 %f
  ret i16 %r
}

define i16 @testi32i16i8_u(i32 %add) {
; CHECK-LABEL: @testi32i16i8_u(
; CHECK-NEXT:    [[A:%.*]] = add i32 [[ADD:%.*]], 128
; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i32 [[A]], 256
; CHECK-NEXT:    [[T:%.*]] = trunc i32 [[ADD]] to i16
; CHECK-NEXT:    [[R:%.*]] = select i1 [[CMP]], i16 [[T]], i16 -128
; CHECK-NEXT:    ret i16 [[R]]
;
  %a = add i32 %add, 128
  %cmp = icmp ult i32 %a, 256
  %t = trunc i32 %add to i16
  %c = icmp ugt i32 %add, -1
  %f = select i1 %c, i16 127, i16 -128
  %r = select i1 %cmp, i16 %t, i16 %f
  ret i16 %r
}

This is a proof that includes the sexts: https://alive2.llvm.org/ce/z/Y_Q3yQ
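To make the ULT+SLT pattern and its preconditions concrete, here is a small C++ model over i16. It is a sketch under stated assumptions: the pattern shape (compare `X + C1` against `C0`, select `X` itself when in range) follows the code comment on canonicalizeClampLike, the i16 width and the constant sets are illustrative choices, and `ClampConsts`, `clampOld`, `clampNew`, `preconditionsHold` and `clampFormsAgree` are hypothetical names. The constants are assumed small enough that `-C1` and `C0 - C1` do not wrap.

```cpp
#include <cassert>
#include <cstdint>

// Constants of the pattern; all are assumed to fit in i16 without wrapping.
struct ClampConsts {
  int32_t C0, C1, C2, Low, High;
};

// Old form: select(icmp ult(add(X, C1), C0), X, select(icmp slt(X, C2), L, H))
int16_t clampOld(int16_t X, const ClampConsts &K) {
  uint16_t Offset = static_cast<uint16_t>(static_cast<int32_t>(X) + K.C1);
  bool InRange = Offset < static_cast<uint16_t>(K.C0); // unsigned compare
  int32_t Replacement = X < K.C2 ? K.Low : K.High;     // signed compare
  return InRange ? X : static_cast<int16_t>(Replacement);
}

// New form: two signed compares against -C1 and C0-C1.
int16_t clampNew(int16_t X, const ClampConsts &K) {
  int32_t ThresholdLowIncl = -K.C1;
  int32_t ThresholdHighExcl = K.C0 - K.C1;
  int32_t MaybeReplacedLow = X < ThresholdLowIncl ? K.Low : X;
  int32_t R = X >= ThresholdHighExcl ? K.High : MaybeReplacedLow;
  return static_cast<int16_t>(R);
}

// -C1 s<= C2 s<= C0-C1. The C0 != 0 part only matters for undef inputs,
// which this integer model cannot express.
bool preconditionsHold(const ClampConsts &K) {
  return K.C0 != 0 && -K.C1 <= K.C2 && K.C2 <= K.C0 - K.C1;
}

// Exhaustively compares both forms over every i16 value of X.
bool clampFormsAgree(const ClampConsts &K) {
  for (int32_t I = INT16_MIN; I <= INT16_MAX; ++I) {
    int16_t X = static_cast<int16_t>(I);
    if (clampOld(X, K) != clampNew(X, K))
      return false;
  }
  return true;
}
```

With `{256, 128, 0, -128, 127}` the two forms agree on every input. Moving `C2` outside the `[-C1, C0-C1]` window, for example `{1000, 500, 600, -7, 9}`, makes them disagree (at `X = 550` the old form yields `Low` and the new form yields `High`), which is the case the precondition rules out.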

Uhm, okay. Then, could you please post the generalized (with variables) alive2 proof for ULT+SLT predicate pair?

In D108049#3067499, @lebedev.ri wrote:

Uhm, okay. Then, could you please post the generalized (with variables) alive2 proof for ULT+SLT predicate pair?

Hmm, What do you mean? The proof above is ult+slt. Do you mean in some other way?

In D108049#3068061, @dmgreen wrote:

In D108049#3067499, @lebedev.ri wrote:

Uhm, okay. Then, could you please post the generalized (with variables) alive2 proof for ULT+SLT predicate pair?

Hmm, What do you mean? The proof above is ult+slt. Do you mean in some other way?

I mean it is for a hardcoded set of constants, while i'd like to see its general form.

Here's what i wanted to see: https://alive2.llvm.org/ce/z/2XpHkB
Did i mess that proof up, or are some new preconditions needed?

In D108049#3068075, @lebedev.ri wrote:

Here's what i wanted to see: https://alive2.llvm.org/ce/z/2XpHkB
Did i mess that proof up, or are some new preconditions needed?

Oh I see. It appears to need a check that the C0 in icmp ult %a, C0 isn't zero, to prevent undef being an issue, unless I got this wrong:
https://alive2.llvm.org/ce/z/Fxjsv6
Same thing for the original case:
https://alive2.llvm.org/ce/z/BBdqhZ

It sounds fine to me to assume that the icmp ult %a, 0 will have already been simplified away, but let me know if you think an extra check should be added in case.

Aha, so the initial transformation is already a miscompile:
https://alive2.llvm.org/ce/z/AsQQBJ

Could you please fix it first?
You need to be freezing the %x: https://alive2.llvm.org/ce/z/ekNz8E

I don't think we need freeze for the general case, just that C0 != 0 (which will pretty much always be true as icmp ult i32 %t2, 0 isn't a very useful thing to check :) )
https://alive2.llvm.org/ce/z/VNnGSy

I'll add an explicit check for it though.

Added a check for C0 being zero. This is the proof for the UGT part:
https://alive2.llvm.org/ce/z/tr8drB

I'm not sure if this case is possible to test. The icmp will usually have been simplified away by the time we visit this pattern.

Harbormaster completed remote builds in B129172: Diff 380162. Oct 16 2021, 3:20 AM

lebedev.ri added inline comments. Oct 21 2021, 4:55 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1310–1324	What happens if one element of a constant vector is all-ones/zero?

The constants aren't splats! Switch back to using the m_SpecificInt_ICMP method for checking constant elements.

Harbormaster completed remote builds in B129931: Diff 381243. Oct 21 2021, 6:40 AM

Any further comments?

Sorry, this should not be taking so long.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1350–1351	`match(&X, m_TruncOrSelf(m_Value(X)));` ?
1426	This accepts constant exprs too. You probably want `m_ImmConstant()` matcher.
1448	Hm, so this now creates one more instruction than before, but i'm not seeing one more one-use check being added. Perhaps the original `trunc` should be one-use?

Use m_ImmConstant, match(X, m_TruncOrSelf(m_Value(X))) (I hope that's OK) and add a one use check for the truncate.

Harbormaster completed remote builds in B131171: Diff 382985. Oct 28 2021, 4:33 AM

LG to me, unless @spatel has further thoughts.

This revision is now accepted and ready to land. Oct 28 2021, 4:35 AM

spatel added inline comments. Oct 28 2021, 6:20 AM

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1425	It would be clearer to follow the optional casting logic if we consolidate it into one block with something like this:

  if (X->getType() != Sel0.getType()) {
    Constant *LowC, *HighC;
    if (!match(ReplacementLow, m_ImmConstant(LowC)) ||
        !match(ReplacementHigh, m_ImmConstant(HighC)))
      return nullptr;
    ReplacementLow = ConstantExpr::getSExt(LowC, X->getType());
    ReplacementHigh = ConstantExpr::getSExt(HighC, X->getType());
  }

...and then get rid of the CreateSext diffs below here?
llvm/test/Transforms/InstCombine/truncating-saturate.ll
665	Double-checking my understanding: this is a miscompile and is fixed by the new ULT check (m_SpecificInt_ICMP)? If so, let's commit that fix first?

dmgreen mentioned this in rG79011c705b58: [InstCombine] Fix rare condition violation in canonicalizeClampLike. Oct 28 2021, 7:03 AM

Thanks - updated as per comments along with rG79011c705b58 for the initial part.

Harbormaster completed remote builds in B131198: Diff 383025. Oct 28 2021, 7:08 AM

lebedev.ri accepted this revision. Oct 28 2021, 7:10 AM

LGTM

Cheers

This revision was landed with ongoing or failed builds. Oct 28 2021, 7:47 AM

Closed by commit rG9358384fd646: [InstCombine] Extend canonicalizeClampLike to handle truncated inputs (authored by dmgreen).

This revision was automatically updated to reflect the committed changes.

dmgreen added a commit: rG9358384fd646: [InstCombine] Extend canonicalizeClampLike to handle truncated inputs.

this is causing crashes, e.g.

$ cat reduced.ll 
define i8 @f(i32 %value, i8 %call.i) {
entry:
  %cmp.i = icmp slt i32 %value, 0
  %cond.i = select i1 %cmp.i, i8 %call.i, i8 0
  %cmp.i.i = icmp ult i32 %value, 256
  %conv4 = trunc i32 %value to i8
  %cond = select i1 %cmp.i.i, i8 %conv4, i8 %cond.i
  ret i8 %cond
}
$ ./build/rel/bin/opt -passes=instcombine -disable-output reduced.ll
opt: ../../llvm/lib/IR/Constants.cpp:2304: static llvm::Constant *llvm::ConstantExpr::get(unsigned int, llvm::Constant *, llvm::Constant *, unsigned int, llvm::Type *): Assertion `C1->getType() == C2->getType() && "Operand types in binary constant expression should match"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ./build/rel/bin/opt -passes=instcombine -disable-output reduced.ll
 #0 0x0000000001f1a023 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:565:13
 #1 0x0000000001f17e9e llvm::sys::RunSignalHandlers() /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Signals.cpp:98:18
 #2 0x0000000001f1a38f SignalHandler(int) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Support/Unix/Signals.inc:407:1
 #3 0x00007f45c95198e0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x138e0)
 #4 0x00007f45c8ff3e71 raise ./signal/../sysdeps/unix/sysv/linux/raise.c:50:1
 #5 0x00007f45c8fdd536 abort ./stdlib/abort.c:81:7
 #6 0x00007f45c8fdd41f get_sysdep_segment_value ./intl/loadmsgcat.c:509:8
 #7 0x00007f45c8fdd41f _nl_load_domain ./intl/loadmsgcat.c:970:34
 #8 0x00007f45c8fec7f2 (/lib/x86_64-linux-gnu/libc.so.6+0x357f2)
 #9 0x0000000001ba8eed llvm::ConstantExpr::get(unsigned int, llvm::Constant*, llvm::Constant*, unsigned int, llvm::Type*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/IR/Constants.cpp:0:0
#10 0x0000000002929f7e canonicalizeClampLike /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:1409:29
#11 0x0000000002929f7e llvm::InstCombinerImpl::foldSelectInstWithICmp(llvm::SelectInst&, llvm::ICmpInst*) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:1535:18
#12 0x000000000292f3b2 llvm::InstCombinerImpl::visitSelectInst(llvm::SelectInst&) /usr/local/google/home/aeubanks/repos/llvm-project/build/rel/../../llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:2994:31

A revert is causing llvm/test/Transforms/InstCombine/truncating-saturate.ll to fail, could you fix forward or perhaps revert whatever stack of patches needs to be reverted?

In D108049#3098052, @aeubanks wrote:

A revert is causing llvm/test/Transforms/InstCombine/truncating-saturate.ll to fail, could you fix forward or perhaps revert whatever stack of patches needs to be reverted?

Thanks for the report. I'll fix it now.

dmgreen mentioned this in rG66281baea1df: [InstCombine] Fix type of constant in canonicalizeClampLike. Oct 30 2021, 1:06 AM

Revision Contents

Path                                                       Size
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp      33 lines
llvm/test/Transforms/InstCombine/truncating-saturate.ll    114 lines

Diff 383035

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

[913 lines not shown]
 static Instruction *foldSelectCtlzToCttz(ICmpInst *ICI, Value *TrueVal,
                                          Value *FalseVal,
                                          InstCombiner::BuilderTy &Builder) {
   unsigned BitWidth = TrueVal->getType()->getScalarSizeInBits();
   if (!ICI->isEquality() || !match(ICI->getOperand(1), m_Zero()))
     return nullptr;

   if (ICI->getPredicate() == ICmpInst::ICMP_NE)
     std::swap(TrueVal, FalseVal);

   if (!match(FalseVal,
              m_Xor(m_Deferred(TrueVal), m_SpecificInt(BitWidth - 1))))
     return nullptr;

   if (!match(TrueVal, m_Intrinsic<Intrinsic::ctlz>()))
     return nullptr;

   Value *X = ICI->getOperand(0);
   auto *II = cast<IntrinsicInst>(TrueVal);
   if (!match(II->getOperand(0), m_c_And(m_Specific(X), m_Neg(m_Specific(X)))))
     return nullptr;

   Function *F = Intrinsic::getDeclaration(II->getModule(), Intrinsic::cttz,
                                           II->getType());
   return CallInst::Create(F, {X, II->getArgOperand(1)});
 }

 /// Attempt to fold a cttz/ctlz followed by a icmp plus select into a single
 /// call to cttz/ctlz with flag 'is_zero_undef' cleared.
[341 lines not shown]
 // This can be rewritten as more canonical pattern:
 //   %new_cmp1 = icmp slt i32 %x, -C1
 //   %new_cmp2 = icmp sge i32 %x, C0-C1
 //   %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x
 //   %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low
 // Iff -C1 s<= C2 s<= C0-C1
 // Also ULT predicate can also be UGT iff C0 != -1 (+invert result)
 //      SLT predicate can also be SGT iff C2 != INT_MAX (+invert res.)
-static Instruction *canonicalizeClampLike(SelectInst &Sel0, ICmpInst &Cmp0,
-                                          InstCombiner::BuilderTy &Builder) {
+static Value *canonicalizeClampLike(SelectInst &Sel0, ICmpInst &Cmp0,
+                                    InstCombiner::BuilderTy &Builder) {
   Value *X = Sel0.getTrueValue();
   Value *Sel1 = Sel0.getFalseValue();

   // First match the condition of the outermost select.
   // Said condition must be one-use.
   if (!Cmp0.hasOneUse())
     return nullptr;
   Value *Cmp00 = Cmp0.getOperand(0);
   Constant *C0;
   if (!match(Cmp0.getOperand(1),
              m_CombineAnd(m_AnyIntegralConstant(), m_Constant(C0))))
     return nullptr;
   // Canonicalize Cmp0 into the form we expect.
   // FIXME: we shouldn't care about lanes that are 'undef' in the end?
   switch (Cmp0.getPredicate()) {
   case ICmpInst::Predicate::ICMP_ULT:
     // Although icmp ult %x, 0 is an unusual thing to try and should generally
     // have been simplified, it does not verify with undef inputs so ensure we
     // are not in a strange state.
     if (!match(C0, m_SpecificInt_ICMP(
                        ICmpInst::Predicate::ICMP_NE,
                        APInt::getZero(C0->getType()->getScalarSizeInBits()))))
       return nullptr;
     break; // Great!
   case ICmpInst::Predicate::ICMP_ULE:
     // We'd have to increment C0 by one, and for that it must not have all-ones
     // element, but then it would have been canonicalized to 'ult' before
     // we get here. So we can't do anything useful with 'ule'.
     return nullptr;
   case ICmpInst::Predicate::ICMP_UGT:
     // We want to canonicalize it to 'ult', so we'll need to increment C0,
     // which again means it must not have any all-ones elements.
     if (!match(C0,
                m_SpecificInt_ICMP(
                    ICmpInst::Predicate::ICMP_NE,
                    APInt::getAllOnes(C0->getType()->getScalarSizeInBits()))))
       return nullptr; // Can't do, have all-ones element[s].
     C0 = InstCombiner::AddOne(C0);
     std::swap(X, Sel1);
     break;
   case ICmpInst::Predicate::ICMP_UGE:
     // The only way we'd get this predicate if this `icmp` has extra uses,
     // but then we won't be able to do this fold.
     return nullptr;
   default:
     return nullptr; // Unknown predicate.
   }

   // Now that we've canonicalized the ICmp, we know the X we expect;
   // the select in other hand should be one-use.
   if (!Sel1->hasOneUse())
     return nullptr;

+  // If the types do not match, look through any truncs to the underlying
+  // instruction.
+  if (Cmp00->getType() != X->getType() && X->hasOneUse())
+    match(X, m_TruncOrSelf(m_Value(X)));
+
   // We now can finish matching the condition of the outermost select:
   // it should either be the X itself, or an addition of some constant to X.
   Constant *C1;
   if (Cmp00 == X)
     C1 = ConstantInt::getNullValue(Sel0.getType());
   else if (!match(Cmp00,
                   m_Add(m_Specific(X),
                         m_CombineAnd(m_AnyIntegralConstant(), m_Constant(C1)))))
[54 lines not shown] static Value *canonicalizeClampLike(SelectInst &Sel0, ICmpInst &Cmp0,
   if (!match(Precond1, m_One()))
     return nullptr;
   // The fold has a precondition 2: C2 s<= ThresholdHigh
   auto *Precond2 = ConstantExpr::getICmp(ICmpInst::Predicate::ICMP_SLE, C2,
                                          ThresholdHighExcl);
   if (!match(Precond2, m_One()))
     return nullptr;

+  // If we are matching from a truncated input, we need to sext the
+  // ReplacementLow and ReplacementHigh values. Only do the transform if they
+  // are free to extend due to being constants.
+  if (X->getType() != Sel0.getType()) {
+    Constant *LowC, *HighC;
+    if (!match(ReplacementLow, m_ImmConstant(LowC)) ||
+        !match(ReplacementHigh, m_ImmConstant(HighC)))
+      return nullptr;
+    ReplacementLow = ConstantExpr::getSExt(LowC, X->getType());
+    ReplacementHigh = ConstantExpr::getSExt(HighC, X->getType());
+  }
+
   // All good, finally emit the new pattern.
   Value *ShouldReplaceLow = Builder.CreateICmpSLT(X, ThresholdLowIncl);
   Value *ShouldReplaceHigh = Builder.CreateICmpSGE(X, ThresholdHighExcl);
   Value *MaybeReplacedLow =
       Builder.CreateSelect(ShouldReplaceLow, ReplacementLow, X);
-  Instruction *MaybeReplacedHigh =
-      SelectInst::Create(ShouldReplaceHigh, ReplacementHigh, MaybeReplacedLow);

-  return MaybeReplacedHigh;
+  // Create the final select. If we looked through a truncate above, we will
+  // need to retruncate the result.
+  Value *MaybeReplacedHigh = Builder.CreateSelect(
+      ShouldReplaceHigh, ReplacementHigh, MaybeReplacedLow);
+  return Builder.CreateTrunc(MaybeReplacedHigh, Sel0.getType());
 }

 // If we have
 //   %cmp = icmp [canonical predicate] i32 %x, C0
 //   %r = select i1 %cmp, i32 %y, i32 C1
 // Where C0 != C1 and %x may be different from %y, see if the constant that we
 // will have if we flip the strictness of the predicate (i.e. without changing
 // the result) is identical to the C1 in select. If it matches we can change
 // original comparison to one with swapped predicate, reuse the constant,
 // and swap the hands of select.
 static Instruction *
 tryToReuseConstantFromSelectInComparison(SelectInst &Sel, ICmpInst &Cmp,
                                          InstCombinerImpl &IC) {
   ICmpInst::Predicate Pred;
   Value *X;
   Constant *C0;
[...]
if (Instruction *NewSel = foldSelectValueEquivalence(SI, *ICI))		if (Instruction *NewSel = foldSelectValueEquivalence(SI, *ICI))
return NewSel;		return NewSel;

if (Instruction *NewSel = canonicalizeMinMaxWithConstant(SI, *ICI, *this))		if (Instruction *NewSel = canonicalizeMinMaxWithConstant(SI, *ICI, *this))
return NewSel;		return NewSel;

if (Instruction *NewAbs = canonicalizeAbsNabs(SI, *ICI, *this))		if (Instruction *NewAbs = canonicalizeAbsNabs(SI, *ICI, *this))
return NewAbs;		return NewAbs;

if (Instruction *NewAbs = canonicalizeClampLike(SI, *ICI, Builder))		if (Value *V = canonicalizeClampLike(SI, *ICI, Builder))
return NewAbs;		return replaceInstUsesWith(SI, V);

if (Instruction *NewSel =		if (Instruction *NewSel =
tryToReuseConstantFromSelectInComparison(SI, *ICI, *this))		tryToReuseConstantFromSelectInComparison(SI, *ICI, *this))
return NewSel;		return NewSel;

bool Changed = adjustMinMax(SI, *ICI);		bool Changed = adjustMinMax(SI, *ICI);

if (Value *V = foldSelectICmpAnd(SI, ICI, Builder))		if (Value *V = foldSelectICmpAnd(SI, ICI, Builder))
[...]

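As a rough sanity check of what the extended canonicalizeClampLike does for the truncated case, the following is a minimal standalone C++ sketch (not LLVM code) that models the input and output IR of the `@testi32i16i8` test below and compares them by brute force. The function names `clampLikeOriginal`, `clampLikeCanonical`, and `formsAgree` are hypothetical and exist only for this illustration; the constants are taken from that test.

```cpp
#include <cassert>
#include <cstdint>

// Models the input IR of @testi32i16i8 (illustrative names, not LLVM API).
int16_t clampLikeOriginal(int32_t X) {
  uint32_t A = static_cast<uint32_t>(X) + 128u; // %a = add i32 %add, 128 (wrapping)
  bool Cmp = A < 256u;                          // %cmp = icmp ult i32 %a, 256
  // %t = trunc i32 %add to i16: keep the low 16 bits. The uint16_t -> int16_t
  // step is modular on mainstream compilers and guaranteed since C++20.
  int16_t T = static_cast<int16_t>(static_cast<uint16_t>(X));
  bool C = X > -1;                              // %c = icmp sgt i32 %add, -1
  int16_t F = static_cast<int16_t>(C ? 127 : -128); // %f = select %c, 127, -128
  return Cmp ? T : F;                           // %r = select %cmp, %t, %f
}

// Models the expected output: clamp in i32 first, then truncate once.
int16_t clampLikeCanonical(int32_t X) {
  int32_t Lo = X > -128 ? X : -128;             // icmp sgt + select
  int32_t Hi = Lo < 127 ? Lo : 127;             // icmp slt + select
  return static_cast<int16_t>(Hi);              // trunc; value already fits in i16
}

// Returns true if both forms agree on every X in [First, Last].
bool formsAgree(int32_t First, int32_t Last) {
  for (int64_t I = First; I <= Last; ++I) {     // int64_t avoids overflow at INT32_MAX
    int32_t X = static_cast<int32_t>(I);
    if (clampLikeOriginal(X) != clampLikeCanonical(X))
      return false;
  }
  return true;
}
```

Calling `formsAgree(-70000, 70000)` exercises values on both sides of the i16 range, where a naive truncate-then-clamp would wrap. This is only a spot check; the Alive2 link in the summary is the actual proof of the transform.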
llvm/test/Transforms/InstCombine/truncating-saturate.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -instcombine -S | FileCheck %s		; RUN: opt < %s -instcombine -S | FileCheck %s
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"		target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"

declare void @use(i32)		declare void @use(i32)
		declare void @use16(i16)
declare void @use1(i1)		declare void @use1(i1)

define i8 @testi16i8(i16 %add) {		define i8 @testi16i8(i16 %add) {
; CHECK-LABEL: @testi16i8(		; CHECK-LABEL: @testi16i8(
; CHECK-NEXT: [[SH:%.*]] = lshr i16 [[ADD:%.*]], 8		; CHECK-NEXT: [[SH:%.*]] = lshr i16 [[ADD:%.*]], 8
; CHECK-NEXT: [[CONV_I:%.*]] = trunc i16 [[SH]] to i8		; CHECK-NEXT: [[CONV_I:%.*]] = trunc i16 [[SH]] to i8
; CHECK-NEXT: [[CONV1_I:%.*]] = trunc i16 [[ADD]] to i8		; CHECK-NEXT: [[CONV1_I:%.*]] = trunc i16 [[ADD]] to i8
; CHECK-NEXT: [[SHR2_I:%.*]] = ashr i8 [[CONV1_I]], 7		; CHECK-NEXT: [[SHR2_I:%.*]] = ashr i8 [[CONV1_I]], 7
[...]
;		;
%conv5.i = trunc i64 %shr4.i to i32		%conv5.i = trunc i64 %shr4.i to i32
%xor.i = xor i32 %conv5.i, 2147483647		%xor.i = xor i32 %conv5.i, 2147483647
%cond.i = select i1 %cmp.not.i, i32 %conv1.i, i32 %xor.i		%cond.i = select i1 %cmp.not.i, i32 %conv1.i, i32 %xor.i
ret i32 %cond.i		ret i32 %cond.i
}		}

define i16 @testi32i16i8(i32 %add) {		define i16 @testi32i16i8(i32 %add) {
; CHECK-LABEL: @testi32i16i8(		; CHECK-LABEL: @testi32i16i8(
; CHECK-NEXT: [[A:%.*]] = add i32 [[ADD:%.*]], 128		; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[ADD:%.*]], -128
; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A]], 256		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[ADD]], i32 -128
; CHECK-NEXT: [[T:%.*]] = trunc i32 [[ADD]] to i16		; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i32 [[TMP2]], 127
; CHECK-NEXT: [[C:%.*]] = icmp sgt i32 [[ADD]], -1		; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 127
; CHECK-NEXT: [[F:%.*]] = select i1 [[C]], i16 127, i16 -128		; CHECK-NEXT: [[TMP5:%.*]] = trunc i32 [[TMP4]] to i16
; CHECK-NEXT: [[R:%.*]] = select i1 [[CMP]], i16 [[T]], i16 [[F]]		; CHECK-NEXT: ret i16 [[TMP5]]
; CHECK-NEXT: ret i16 [[R]]
;		;
%a = add i32 %add, 128		%a = add i32 %add, 128
%cmp = icmp ult i32 %a, 256		%cmp = icmp ult i32 %a, 256
%t = trunc i32 %add to i16		%t = trunc i32 %add to i16
%c = icmp sgt i32 %add, -1		%c = icmp sgt i32 %add, -1
%f = select i1 %c, i16 127, i16 -128		%f = select i1 %c, i16 127, i16 -128
%r = select i1 %cmp, i16 %t, i16 %f		%r = select i1 %cmp, i16 %t, i16 %f
ret i16 %r		ret i16 %r
}		}

define <4 x i16> @testv4i32i16i8(<4 x i32> %add) {		define <4 x i16> @testv4i32i16i8(<4 x i32> %add) {
; CHECK-LABEL: @testv4i32i16i8(		; CHECK-LABEL: @testv4i32i16i8(
; CHECK-NEXT: [[A:%.*]] = add <4 x i32> [[ADD:%.*]], <i32 128, i32 128, i32 128, i32 128>		; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt <4 x i32> [[ADD:%.*]], <i32 -128, i32 -128, i32 -128, i32 -128>
; CHECK-NEXT: [[CMP:%.*]] = icmp ult <4 x i32> [[A]], <i32 256, i32 256, i32 256, i32 256>		; CHECK-NEXT: [[TMP2:%.*]] = select <4 x i1> [[TMP1]], <4 x i32> [[ADD]], <4 x i32> <i32 -128, i32 -128, i32 -128, i32 -128>
; CHECK-NEXT: [[T:%.*]] = trunc <4 x i32> [[ADD]] to <4 x i16>		; CHECK-NEXT: [[TMP3:%.*]] = icmp slt <4 x i32> [[TMP2]], <i32 127, i32 127, i32 127, i32 127>
; CHECK-NEXT: [[C:%.*]] = icmp sgt <4 x i32> [[ADD]], <i32 -1, i32 -1, i32 -1, i32 -1>		; CHECK-NEXT: [[TMP4:%.*]] = select <4 x i1> [[TMP3]], <4 x i32> [[TMP2]], <4 x i32> <i32 127, i32 127, i32 127, i32 127>
; CHECK-NEXT: [[F:%.*]] = select <4 x i1> [[C]], <4 x i16> <i16 127, i16 127, i16 127, i16 127>, <4 x i16> <i16 -128, i16 -128, i16 -128, i16 -128>		; CHECK-NEXT: [[TMP5:%.*]] = trunc <4 x i32> [[TMP4]] to <4 x i16>
; CHECK-NEXT: [[R:%.*]] = select <4 x i1> [[CMP]], <4 x i16> [[T]], <4 x i16> [[F]]		; CHECK-NEXT: ret <4 x i16> [[TMP5]]
; CHECK-NEXT: ret <4 x i16> [[R]]
;		;
%a = add <4 x i32> %add, <i32 128, i32 128, i32 128, i32 128>		%a = add <4 x i32> %add, <i32 128, i32 128, i32 128, i32 128>
%cmp = icmp ult <4 x i32> %a, <i32 256, i32 256, i32 256, i32 256>		%cmp = icmp ult <4 x i32> %a, <i32 256, i32 256, i32 256, i32 256>
%t = trunc <4 x i32> %add to <4 x i16>		%t = trunc <4 x i32> %add to <4 x i16>
%c = icmp sgt <4 x i32> %add, <i32 -1, i32 -1, i32 -1, i32 -1>		%c = icmp sgt <4 x i32> %add, <i32 -1, i32 -1, i32 -1, i32 -1>
%f = select <4 x i1> %c, <4 x i16> <i16 127, i16 127, i16 127, i16 127>, <4 x i16> <i16 -128, i16 -128, i16 -128, i16 -128>		%f = select <4 x i1> %c, <4 x i16> <i16 127, i16 127, i16 127, i16 127>, <4 x i16> <i16 -128, i16 -128, i16 -128, i16 -128>
%r = select <4 x i1> %cmp, <4 x i16> %t, <4 x i16> %f		%r = select <4 x i1> %cmp, <4 x i16> %t, <4 x i16> %f
ret <4 x i16> %r		ret <4 x i16> %r
}		}

define i32 @testi32i32i8(i32 %add) {		define i32 @testi32i32i8(i32 %add) {
; CHECK-LABEL: @testi32i32i8(		; CHECK-LABEL: @testi32i32i8(
; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[ADD:%.*]], -128		; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[ADD:%.*]], -128
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[ADD]], i32 -128		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[ADD]], i32 -128
; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i32 [[TMP2]], 127		; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i32 [[TMP2]], 127
; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 127		; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 127
; CHECK-NEXT: ret i32 [[R]]		; CHECK-NEXT: ret i32 [[TMP4]]
;		;
%a = add i32 %add, 128		%a = add i32 %add, 128
%cmp = icmp ult i32 %a, 256		%cmp = icmp ult i32 %a, 256
%c = icmp sgt i32 %add, -1		%c = icmp sgt i32 %add, -1
%f = select i1 %c, i32 127, i32 -128		%f = select i1 %c, i32 127, i32 -128
%r = select i1 %cmp, i32 %add, i32 %f		%r = select i1 %cmp, i32 %add, i32 %f
ret i32 %r		ret i32 %r
}		}

define i16 @test_truncfirst(i32 %add) {		define i16 @test_truncfirst(i32 %add) {
; CHECK-LABEL: @test_truncfirst(		; CHECK-LABEL: @test_truncfirst(
; CHECK-NEXT: [[T:%.*]] = trunc i32 [[ADD:%.*]] to i16		; CHECK-NEXT: [[T:%.*]] = trunc i32 [[ADD:%.*]] to i16
; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i16 [[T]], -128		; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i16 [[T]], -128
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i16 [[T]], i16 -128		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i16 [[T]], i16 -128
; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i16 [[TMP2]], 127		; CHECK-NEXT: [[TMP3:%.*]] = icmp slt i16 [[TMP2]], 127
; CHECK-NEXT: [[R:%.*]] = select i1 [[TMP3]], i16 [[TMP2]], i16 127		; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[TMP3]], i16 [[TMP2]], i16 127
; CHECK-NEXT: ret i16 [[R]]		; CHECK-NEXT: ret i16 [[TMP4]]
;		;
%t = trunc i32 %add to i16		%t = trunc i32 %add to i16
%a = add i16 %t, 128		%a = add i16 %t, 128
%cmp = icmp ult i16 %a, 256		%cmp = icmp ult i16 %a, 256
%c = icmp sgt i16 %t, -1		%c = icmp sgt i16 %t, -1
%f = select i1 %c, i16 127, i16 -128		%f = select i1 %c, i16 127, i16 -128
%r = select i1 %cmp, i16 %t, i16 %f		%r = select i1 %cmp, i16 %t, i16 %f
ret i16 %r		ret i16 %r
[...]
;		;
%conv5.i = trunc i32 %shr4.i to i8		%conv5.i = trunc i32 %shr4.i to i8
%xor.i = xor i8 %conv5.i, 127		%xor.i = xor i8 %conv5.i, 127
%cond.i = select i1 %cmp.not.i, i8 %conv1.i, i8 %xor.i		%cond.i = select i1 %cmp.not.i, i8 %conv1.i, i8 %xor.i
ret i8 %cond.i		ret i8 %cond.i
}		}

define i16 @differentconsts(i32 %x, i16 %replacement_low, i16 %replacement_high) {		define i16 @differentconsts(i32 %x, i16 %replacement_low, i16 %replacement_high) {
; CHECK-LABEL: @differentconsts(		; CHECK-LABEL: @differentconsts(
; CHECK-NEXT: [[T0:%.*]] = icmp slt i32 [[X:%.*]], 128		; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[X:%.*]], -16
; CHECK-NEXT: [[T1:%.*]] = select i1 [[T0]], i16 256, i16 -1		; CHECK-NEXT: [[TMP2:%.*]] = icmp sgt i32 [[X]], 127
; CHECK-NEXT: [[T2:%.*]] = add i32 [[X]], 16		; CHECK-NEXT: [[TMP3:%.*]] = trunc i32 [[X]] to i16
; CHECK-NEXT: [[T3:%.*]] = icmp ult i32 [[T2]], 144		; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[TMP1]], i16 256, i16 [[TMP3]]
; CHECK-NEXT: [[T4:%.*]] = trunc i32 [[X]] to i16		; CHECK-NEXT: [[TMP5:%.*]] = select i1 [[TMP2]], i16 -1, i16 [[TMP4]]
; CHECK-NEXT: [[R:%.*]] = select i1 [[T3]], i16 [[T4]], i16 [[T1]]		; CHECK-NEXT: ret i16 [[TMP5]]
; CHECK-NEXT: ret i16 [[R]]
;		;
%t0 = icmp slt i32 %x, 128		%t0 = icmp slt i32 %x, 128
%t1 = select i1 %t0, i16 256, i16 65535		%t1 = select i1 %t0, i16 256, i16 65535
%t2 = add i32 %x, 16		%t2 = add i32 %x, 16
%t3 = icmp ult i32 %t2, 144		%t3 = icmp ult i32 %t2, 144
%t4 = trunc i32 %x to i16		%t4 = trunc i32 %x to i16
%r = select i1 %t3, i16 %t4, i16 %t1		%r = select i1 %t3, i16 %t4, i16 %t1
ret i16 %r		ret i16 %r
[...]
;		;
%xor.i = xor i32 %conv5.i, 2147483647		%xor.i = xor i32 %conv5.i, 2147483647
%cond.i = select i1 %cmp.not.i, i32 %conv1.i, i32 %xor.i		%cond.i = select i1 %cmp.not.i, i32 %conv1.i, i32 %xor.i
call void @use(i32 %xor.i)		call void @use(i32 %xor.i)
call void @use(i32 %conv1.i)		call void @use(i32 %conv1.i)
call void @use1(i1 %cmp.not.i)		call void @use1(i1 %cmp.not.i)
ret i32 %cond.i		ret i32 %cond.i
}		}

		define i16 @differentconsts_usetrunc(i32 %x, i16 %replacement_low, i16 %replacement_high) {
		; CHECK-LABEL: @differentconsts_usetrunc(
		; CHECK-NEXT: [[T0:%.*]] = icmp slt i32 [[X:%.*]], 128
		; CHECK-NEXT: [[T1:%.*]] = select i1 [[T0]], i16 256, i16 -1
		; CHECK-NEXT: [[T2:%.*]] = add i32 [[X]], 16
		; CHECK-NEXT: [[T3:%.*]] = icmp ult i32 [[T2]], 144
		; CHECK-NEXT: [[T4:%.*]] = trunc i32 [[X]] to i16
		; CHECK-NEXT: [[R:%.*]] = select i1 [[T3]], i16 [[T4]], i16 [[T1]]
		; CHECK-NEXT: call void @use16(i16 [[T4]])
		; CHECK-NEXT: ret i16 [[R]]
		;
		%t0 = icmp slt i32 %x, 128
		%t1 = select i1 %t0, i16 256, i16 65535
		%t2 = add i32 %x, 16
		%t3 = icmp ult i32 %t2, 144
		%t4 = trunc i32 %x to i16
		%r = select i1 %t3, i16 %t4, i16 %t1
		call void @use16(i16 %t4)
		ret i16 %r
		}

		define i16 @differentconsts_useadd(i32 %x, i16 %replacement_low, i16 %replacement_high) {
		; CHECK-LABEL: @differentconsts_useadd(
		; CHECK-NEXT: [[T2:%.*]] = add i32 [[X:%.*]], 16
		; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[X]], -16
		; CHECK-NEXT: [[TMP2:%.*]] = icmp sgt i32 [[X]], 127
		; CHECK-NEXT: [[TMP3:%.*]] = trunc i32 [[X]] to i16
		; CHECK-NEXT: [[TMP4:%.*]] = select i1 [[TMP1]], i16 256, i16 [[TMP3]]
		; CHECK-NEXT: [[TMP5:%.*]] = select i1 [[TMP2]], i16 -1, i16 [[TMP4]]
		; CHECK-NEXT: call void @use(i32 [[T2]])
		; CHECK-NEXT: ret i16 [[TMP5]]
		;
		%t0 = icmp slt i32 %x, 128
		%t1 = select i1 %t0, i16 256, i16 65535
		%t2 = add i32 %x, 16
		%t3 = icmp ult i32 %t2, 144
		%t4 = trunc i32 %x to i16
		%r = select i1 %t3, i16 %t4, i16 %t1
		call void @use(i32 %t2)
		ret i16 %r
		}

		define i16 @differentconsts_useaddtrunc(i32 %x, i16 %replacement_low, i16 %replacement_high) {
		; CHECK-LABEL: @differentconsts_useaddtrunc(
		; CHECK-NEXT: [[T0:%.*]] = icmp slt i32 [[X:%.*]], 128
		; CHECK-NEXT: [[T1:%.*]] = select i1 [[T0]], i16 256, i16 -1
		; CHECK-NEXT: [[T2:%.*]] = add i32 [[X]], 16
		; CHECK-NEXT: [[T3:%.*]] = icmp ult i32 [[T2]], 144
		; CHECK-NEXT: [[T4:%.*]] = trunc i32 [[X]] to i16
		; CHECK-NEXT: [[R:%.*]] = select i1 [[T3]], i16 [[T4]], i16 [[T1]]
		; CHECK-NEXT: call void @use16(i16 [[T4]])
		; CHECK-NEXT: call void @use(i32 [[T2]])
		; CHECK-NEXT: ret i16 [[R]]
		;
		%t0 = icmp slt i32 %x, 128
		%t1 = select i1 %t0, i16 256, i16 65535
		%t2 = add i32 %x, 16
		%t3 = icmp ult i32 %t2, 144
		%t4 = trunc i32 %x to i16
		%r = select i1 %t3, i16 %t4, i16 %t1
		call void @use16(i16 %t4)
		call void @use(i32 %t2)
		ret i16 %r
		}


define i8 @C0zero(i8 %X, i8 %y, i8 %z) {		define i8 @C0zero(i8 %X, i8 %y, i8 %z) {
; CHECK-LABEL: @C0zero(		; CHECK-LABEL: @C0zero(
; CHECK-NEXT: [[C:%.*]] = icmp slt i8 [[X:%.*]], -10		; CHECK-NEXT: [[C:%.*]] = icmp slt i8 [[X:%.*]], -10
; CHECK-NEXT: [[F:%.*]] = select i1 [[C]], i8 [[Y:%.*]], i8 [[Z:%.*]]		; CHECK-NEXT: [[F:%.*]] = select i1 [[C]], i8 [[Y:%.*]], i8 [[Z:%.*]]
; CHECK-NEXT: ret i8 [[F]]		; CHECK-NEXT: ret i8 [[F]]
;		;
%a = add i8 %X, 10		%a = add i8 %X, 10
%cmp = icmp ult i8 %a, 0		%cmp = icmp ult i8 %a, 0
[...]
;		;
%a = add <2 x i8> %X, <i8 10, i8 10>		%a = add <2 x i8> %X, <i8 10, i8 10>
%cmp = icmp ult <2 x i8> %a, zeroinitializer		%cmp = icmp ult <2 x i8> %a, zeroinitializer
%c = icmp slt <2 x i8> %X, <i8 -10, i8 -10>		%c = icmp slt <2 x i8> %X, <i8 -10, i8 -10>
%f = select <2 x i1> %c, <2 x i8> %y, <2 x i8> %z		%f = select <2 x i1> %c, <2 x i8> %y, <2 x i8> %z
%r = select <2 x i1> %cmp, <2 x i8> %X, <2 x i8> %f		%r = select <2 x i1> %cmp, <2 x i8> %X, <2 x i8> %f
ret <2 x i8> %r		ret <2 x i8> %r
}		}

define <2 x i8> @C0zeroVu(<2 x i8> %X, <2 x i8> %y, <2 x i8> %z) {		define <2 x i8> @C0zeroVu(<2 x i8> %X, <2 x i8> %y, <2 x i8> %z) {
		spatel: Double-checking my understanding: this is a miscompile and is fixed by the new ULT check (m_SpecificInt_ICMP)? If so, let's commit that fix first?
; CHECK-LABEL: @C0zeroVu(		; CHECK-LABEL: @C0zeroVu(
; CHECK-NEXT: [[A:%.*]] = add <2 x i8> [[X:%.*]], <i8 10, i8 10>		; CHECK-NEXT: [[A:%.*]] = add <2 x i8> [[X:%.*]], <i8 10, i8 10>
; CHECK-NEXT: [[CMP:%.*]] = icmp ult <2 x i8> [[A]], <i8 0, i8 10>		; CHECK-NEXT: [[CMP:%.*]] = icmp ult <2 x i8> [[A]], <i8 0, i8 10>
; CHECK-NEXT: [[C:%.*]] = icmp slt <2 x i8> [[X]], <i8 -10, i8 -10>		; CHECK-NEXT: [[C:%.*]] = icmp slt <2 x i8> [[X]], <i8 -10, i8 -10>
; CHECK-NEXT: [[F:%.*]] = select <2 x i1> [[C]], <2 x i8> [[Y:%.*]], <2 x i8> [[Z:%.*]]		; CHECK-NEXT: [[F:%.*]] = select <2 x i1> [[C]], <2 x i8> [[Y:%.*]], <2 x i8> [[Z:%.*]]
; CHECK-NEXT: [[R:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[X]], <2 x i8> [[F]]		; CHECK-NEXT: [[R:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[X]], <2 x i8> [[F]]
; CHECK-NEXT: ret <2 x i8> [[R]]		; CHECK-NEXT: ret <2 x i8> [[R]]
;		;
%a = add <2 x i8> %X, <i8 10, i8 10>		%a = add <2 x i8> %X, <i8 10, i8 10>
%cmp = icmp ult <2 x i8> %a, <i8 0, i8 10>		%cmp = icmp ult <2 x i8> %a, <i8 0, i8 10>
%c = icmp slt <2 x i8> %X, <i8 -10, i8 -10>		%c = icmp slt <2 x i8> %X, <i8 -10, i8 -10>
%f = select <2 x i1> %c, <2 x i8> %y, <2 x i8> %z		%f = select <2 x i1> %c, <2 x i8> %y, <2 x i8> %z
%r = select <2 x i1> %cmp, <2 x i8> %X, <2 x i8> %f		%r = select <2 x i1> %cmp, <2 x i8> %X, <2 x i8> %f
ret <2 x i8> %r		ret <2 x i8> %r
}		}