
Details

Reviewers

spatel
RKSimon
zvi
craig.topper
igorb
hfinkel

Commits

rG16b20d2fc5df: [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization
rL300422: [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization

Summary

This patch is one of two reviews and is based on https://reviews.llvm.org/D31396.

This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b))
to the instCombineCall pass; it was written specifically for X86 CMP intrinsics.

Diff Detail

Event Timeline

m_zuckerman created this revision. Mar 27 2017, 8:42 AM

m_zuckerman added a parent revision: D31396: [X86][LLVM][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics.

Please make sure you always include llvm-commits as a subscriber in future patches.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2401 ↗	(On Diff #93138)	Pull out these repeated calls to getFastMathFlags: FastMathFlags FMFs = I->getFastMathFlags();
test/Transforms/InstCombine/X86FsubCmpCombine.ll
1 ↗	(On Diff #93138)	Regenerate with utils/update_test_checks.py
5 ↗	(On Diff #93138)	maybe split these into one test per intrinsic? also do tests for safe algebra (or maybe one of the other fmfs each per test).
12 ↗	(On Diff #93138)	clean this up if possible

m_zuckerman updated this revision to Diff 93545. Mar 30 2017, 2:41 PM

m_zuckerman marked 4 inline comments as done.

If this transform is valid (cc'ing @scanon), then should we also do this for general IR?

define i1 @fcmpsub(double %x, double %y) {
  %sub = fsub nnan ninf nsz double %x, %y
  %cmp = fcmp nnan ninf nsz ugt double %sub, 0.0
  ret i1 %cmp
}

define i1 @fcmpsub(double %x, double %y) {
  %cmp = fcmp nnan ninf nsz ugt double %x, %y
  ret i1 %cmp
}

lib/Transforms/InstCombine/InstCombineCalls.cpp
2401 ↗	(On Diff #93545)	Don't we need to check the FMF of the intrinsic too?
2402–2403 ↗	(On Diff #93545)	Currently, unsafeAlgebra implies all of the other FMF bits, so checking that bit is redundant here. If we change the definition of unsafeAlgebra in the future (there was a proposal for this recently), then this check will be wrong. Either way, remove the unsafeAlgebra predicate (unless I'm misunderstanding the constraints of this transform).

(x-y) == 0 --> x == y does not require nsz (zeros of any sign compare equal), nor does it require nnan (if x or y is NaN, both comparisons are false). It *does* require ninf (because inf-inf = NaN), and it also requires that subnormals are not flushed by the CPU. There's been some churn around flushing recently, Hal may have thoughts (+Hal).
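The three cases in this comment can be checked with ordinary scalar doubles. The following is a minimal sketch, not code from the patch: the helper names (`subEqZero`, `directEq`, `foldAgrees`, `checkFoldCases`) are made up for illustration, it covers only the equality predicate discussed here, and it assumes the program is compiled without fast-math options.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Original form: compare the difference against zero.
bool subEqZero(double x, double y) { return (x - y) == 0.0; }

// Folded form: compare the operands directly.
bool directEq(double x, double y) { return x == y; }

// True when the original and the folded form give the same answer.
bool foldAgrees(double x, double y) {
  return subEqZero(x, y) == directEq(x, y);
}

// Hypothetical self-check that walks through the cases from the comment.
void checkFoldCases() {
  const double Inf = std::numeric_limits<double>::infinity();
  const double NaN = std::numeric_limits<double>::quiet_NaN();

  // Signed zeros: both forms are true, so nsz is not required.
  assert(foldAgrees(0.0, -0.0));

  // NaN operands: both forms are false, so nnan is not required.
  assert(foldAgrees(NaN, 1.0));
  assert(foldAgrees(NaN, NaN));

  // Infinities: inf - inf is NaN, so (x - y) == 0 is false while
  // x == y is true. This is why ninf is required.
  assert(std::isnan(Inf - Inf));
  assert(!foldAgrees(Inf, Inf));
}
```

Calling `checkFoldCases()` from a test driver exercises all three situations; only the infinity case makes the two forms disagree.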

In D31398#715296, @spatel wrote:
If this transform is valid (cc'ing @scanon), then should we also do this for general IR?
define i1 @fcmpsub(double %x, double %y) {
  %sub = fsub nnan ninf nsz double %x, %y
  %cmp = fcmp nnan ninf nsz ugt double %sub, 0.0
  ret i1 %cmp
}

define i1 @fcmpsub(double %x, double %y) {
  %cmp = fcmp nnan ninf nsz ugt double %x, %y
  ret i1 %cmp
}

You are absolutely right, your transform is valid and we will do it after this patch.
Since the intrinsics are lowered with generic IR, my patch is still valid and we will need them both for a complete solution.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2401 ↗	(On Diff #93545)	No, we don't need to check; this is implied by the flags on the sub. According to http://llvm.org/docs/LangRef.html#fast-math-flags, optimizations can assume that the arguments and the result behave as the flags promise. Since the compare uses the result and the fold splits it back into two arguments (the same arguments as in the sub), we are still working under the earlier assumption. The assumption continues to hold as long as we work with the same arguments or the same result.

m_zuckerman updated this revision to Diff 93747. Apr 1 2017, 7:07 AM

In D31398#716002, @m_zuckerman wrote:

You are absolutely right, your transform is valid and we will do it after this patch.
Since the intrinsics are lowered with generic IR, my patch is still valid and we will need them both for a complete solution.

We need to be conservative for the general case...speaking from experience :). As @scanon mentioned, we need some way to tell whether denorms are flushed to zero or not. I think this patch is safe currently because we assume the default FP ENV, and on x86 that would not have DAZ/FTZ.
We don't need nsz or nnan for this fold (see @scanon comment).
I'd prefer to put all of the tests in one file since they are just variants of the same fold.

In D31398#716031, @spatel wrote:

In D31398#716002, @m_zuckerman wrote:

You are absolutely right, your transform is valid and we will do it after this patch.
Since the intrinsics are lowered with generic IR, my patch is still valid and we will need them both for a complete solution.

We need to be conservative for the general case...speaking from experience :). As @scanon mentioned, we need some way to tell whether denorms are flushed to zero or not. I think this patch is safe currently because we assume the default FP ENV, and on x86 that would not have DAZ/FTZ.

We don't need nsz or nnan for this fold (see @scanon comment).

I'd prefer to put all of the tests in one file since they are just variants of the same fold.

1. I agree with you: in the general case we need to be cautious, but this transformation is feasible.
Regarding X86, this transformation is safe (as you wrote) from all perspectives. We know the status of the flags and we know the target.

3. I am with you on that.

m_zuckerman added a reviewer: llvm-commits. Apr 2 2017, 4:45 AM

m_zuckerman removed a reviewer: llvm-commits.

m_zuckerman updated this revision to Diff 93795. Apr 2 2017, 9:15 AM

spatel added inline comments. Apr 3 2017, 7:39 AM

lib/Transforms/InstCombine/InstCombineCalls.cpp
2396 ↗	(On Diff #93795)	You can use 'auto *' with dyn_cast because the type is obvious.
2398–2400 ↗	(On Diff #93795)	This comment should specify the non-obvious constraints that we've discussed here:
// This fold requires NINF because inf minus inf is nan.
// NSZ is not needed because zeros of any sign are equal for both compares.
// NNAN is not needed because nans compare the same for both compares.
// FMF are not needed on the compare intrinsic because...
test/Transforms/InstCombine/X86FsubCmpCombine.ll
6 ↗	(On Diff #93795)	Misspelling in function name. Also, as suggested earlier, I'd really prefer to have one test per intrinsic rather than everything in one function. It makes it a lot easier to see the simple pattern that's getting folded.

m_zuckerman updated this revision to Diff 94199. Apr 5 2017, 4:38 AM

Ping

LGTM - see inline comments for a couple of cleanups.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2407–2410 ↗	(On Diff #94199)	Giving local names to the operands doesn't add value here IMO, but if you want to do that, I prefer to use "A" and "B" to match the formula in the comment.
2396 ↗	(On Diff #93795)	You missed this nit.

This revision is now accepted and ready to land. Apr 10 2017, 4:36 PM

In D31398#715307, @scanon wrote:

(x-y) == 0 --> x == y does not require nsz (zeros of any sign compare equal), nor does it require nnan (if x or y is NaN, both comparisons are false). It *does* require ninf (because inf-inf = NaN), and it also requires that subnormals are not flushed by the CPU. There's been some churn around flushing recently, Hal may have thoughts (+Hal).

We still assume standard denormal handling. There's ongoing work to add intrinsics for operations that are sensitive to this behavior (and rounding modes, etc.), but that shouldn't affect this.
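To make the subnormal requirement concrete, here is a small sketch (hypothetical helper names, plain scalar doubles, not part of the patch). It assumes the default floating-point environment, that is, no flush-to-zero or denormals-are-zero mode, and no fast-math compilation. Two distinct normal values are subtracted and the difference lands in the subnormal range; with gradual underflow it is nonzero, so `(x - y) == 0` and `x == y` agree. If the CPU flushed that result to zero, the first comparison would become true while the second stayed false.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Smallest positive normal double.
constexpr double MinNormal = std::numeric_limits<double>::min();

// Subtracts two distinct normal values whose exact difference
// (0.5 * MinNormal) is below the normal range. The volatile locals keep
// the subtraction at run time so that it reflects the active FP mode.
double differenceOfCloseNormals() {
  volatile double x = 1.5 * MinNormal;
  volatile double y = MinNormal;
  return x - y;
}

// True when the difference is kept as a subnormal instead of flushed.
bool differenceIsSubnormal() {
  return std::fpclassify(differenceOfCloseNormals()) == FP_SUBNORMAL;
}

// Hypothetical self-check under the default FP environment.
void checkGradualUnderflow() {
  assert(differenceIsSubnormal());
  assert(differenceOfCloseNormals() != 0.0);
}
```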

Hi,
I made a small modification to the code since I am going to retire from the canonical compare representation review.
I added the ability to fold cmp(0, fsub(a,b)) in addition to the reviewed cmp(fsub(a,b),0).

Thanks,
Michael Zuckerman

spatel added inline comments. Apr 14 2017, 7:26 AM

Transforms/InstCombine/InstCombineCalls.cpp

2348–2366

There are many ways to deal with commuted patterns, and a 2-loop is my least favorite. Would you consider using std::swap and matchers instead? Something like:

Value *Arg0 = II->getArgOperand(0);
Value *Arg1 = II->getArgOperand(1);
bool Arg0IsZero = match(Arg0, m_Zero());
if (Arg0IsZero)
  std::swap(Arg0, Arg1);
Value *A, *B;
if ((match(Arg0, m_OneUse(m_FSub(m_Value(A), m_Value(B)))) &&
     match(Arg1, m_Zero()) &&
     cast<Instruction>(Arg0)->getFastMathFlags().noInfs())) {
  if (Arg0IsZero)
    std::swap(A, B);
  II->setArgOperand(0, A);
  II->setArgOperand(1, B);
  return II;
}
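The swap-based handling of the commuted pattern can be tried outside of LLVM with a toy model. The sketch below is only an illustration: `Node`, `Cmp`, the `make*` helpers and `foldCmpOfSub` are invented stand-ins rather than LLVM classes or matchers, and the one-use check from the suggestion is left out for brevity.

```cpp
#include <cassert>
#include <utility>

// Toy stand-in for an IR value; this is NOT an LLVM class.
struct Node {
  enum Kind { Leaf, Zero, FSub } kind;
  const Node *lhs;  // only meaningful for FSub
  const Node *rhs;  // only meaningful for FSub
  bool noInfs;      // models the ninf flag on the subtraction
};

Node makeLeaf() { return Node{Node::Leaf, nullptr, nullptr, false}; }
Node makeZero() { return Node{Node::Zero, nullptr, nullptr, false}; }
Node makeFSub(const Node *a, const Node *b, bool noInfs) {
  return Node{Node::FSub, a, b, noInfs};
}

// Toy stand-in for the compare intrinsic call with two operands.
struct Cmp {
  const Node *op0;
  const Node *op1;
};

// Folds cmp(sub(a,b), 0) -> cmp(a,b) and cmp(0, sub(a,b)) -> cmp(b,a).
// Returns true if the compare was rewritten.
bool foldCmpOfSub(Cmp &C) {
  const Node *Arg0 = C.op0;
  const Node *Arg1 = C.op1;
  bool Arg0IsZero = Arg0->kind == Node::Zero;
  if (Arg0IsZero)
    std::swap(Arg0, Arg1);  // put the subtraction candidate first
  if (Arg0->kind != Node::FSub || Arg1->kind != Node::Zero || !Arg0->noInfs)
    return false;
  const Node *A = Arg0->lhs;
  const Node *B = Arg0->rhs;
  if (Arg0IsZero)
    std::swap(A, B);  // 0 < a - b is the same question as b < a
  C.op0 = A;
  C.op1 = B;
  return true;
}

// Hypothetical self-check covering both operand orders.
void checkFoldCmpOfSub() {
  Node a = makeLeaf(), b = makeLeaf(), zero = makeZero();
  Node sub = makeFSub(&a, &b, /*noInfs=*/true);
  Cmp direct{&sub, &zero};
  assert(foldCmpOfSub(direct) && direct.op0 == &a && direct.op1 == &b);
  Cmp commuted{&zero, &sub};
  assert(foldCmpOfSub(commuted) && commuted.op0 == &b && commuted.op1 == &a);
}
```

Canonicalizing the operand order once means the match itself is written a single time; the only extra work for the commuted case is the second swap that restores the correct operand order.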

I agree with you that your code is more elegant.

LGTM.

Thanks

Closed by commit rL300422: [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization (authored by mzuckerm). Apr 16 2017, 6:39 AM

This revision was automatically updated to reflect the committed changes.

Diff 95377

Transforms/InstCombine/InstCombineCalls.cpp

    if (Value *V = SimplifyDemandedVectorEltsLow(Arg1, VWidth, 1)) {
      II->setArgOperand(1, V);
      MadeChange = true;
    }
    if (MadeChange)
      return II;
    break;
  }
  case Intrinsic::x86_avx512_mask_cmp_pd_128:
  case Intrinsic::x86_avx512_mask_cmp_pd_256:
  case Intrinsic::x86_avx512_mask_cmp_pd_512:
  case Intrinsic::x86_avx512_mask_cmp_ps_128:
  case Intrinsic::x86_avx512_mask_cmp_ps_256:
  case Intrinsic::x86_avx512_mask_cmp_ps_512: {
    // Folding cmp(sub(a,b),0) -> cmp(a,b) and cmp(0,sub(a,b)) -> cmp(b,a)
    Value *Arg0 = II->getArgOperand(0);
    Value *Arg1 = II->getArgOperand(1);
    bool Arg0IsZero = match(Arg0, m_Zero());
    if (Arg0IsZero)
      std::swap(Arg0, Arg1);
    Value *A, *B;
    // This fold requires only the NINF(not +/- inf) since inf minus
    // inf is nan.
    // NSZ(No Signed Zeros) is not needed because zeros of any sign are
    // equal for both compares.
    // NNAN is not needed because nans compare the same for both compares.
    // The compare intrinsic uses the above assumptions and therefore
    // doesn't require additional flags.
    if ((match(Arg0, m_OneUse(m_FSub(m_Value(A), m_Value(B)))) &&
         match(Arg1, m_Zero()) &&
         cast<Instruction>(Arg0)->getFastMathFlags().noInfs())) {
      if (Arg0IsZero)
        std::swap(A, B);
      II->setArgOperand(0, A);
      II->setArgOperand(1, B);
      return II;
    }
    break;
  }

  case Intrinsic::x86_avx512_mask_add_ps_512:
  case Intrinsic::x86_avx512_mask_div_ps_512:
  case Intrinsic::x86_avx512_mask_mul_ps_512:
  case Intrinsic::x86_avx512_mask_sub_ps_512:
  case Intrinsic::x86_avx512_mask_add_pd_512:
  case Intrinsic::x86_avx512_mask_div_pd_512:
  case Intrinsic::x86_avx512_mask_mul_pd_512:

Transforms/InstCombine/X86FsubCmpCombine.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -instcombine -S | FileCheck %s

				; The test checks the folding of cmp(sub(a,b),0) into cmp(a,b).

				define i8 @sub_compare_foldingPD128_safe(<2 x double> %a, <2 x double> %b){
				; CHECK-LABEL: @sub_compare_foldingPD128_safe(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[SUB_SAFE:%.*]] = fsub <2 x double> [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> [[SUB_SAFE]], <2 x double> zeroinitializer, i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.safe = fsub <2 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> %sub.safe , <2 x double> zeroinitializer, i32 5, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_foldingPD128(<2 x double> %a, <2 x double> %b){
				; CHECK-LABEL: @sub_compare_foldingPD128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> [[A:%.*]], <2 x double> [[B:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <2 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> %sub.i , <2 x double> zeroinitializer, i32 5, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_foldingPD256(<4 x double> %a, <4 x double> %b){
				; CHECK-LABEL: @sub_compare_foldingPD256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> [[A:%.*]], <4 x double> [[B:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i1 = fsub ninf <4 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> %sub.i1, <4 x double> zeroinitializer, i32 5, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_foldingPD512(<8 x double> %a, <8 x double> %b){
				; CHECK-LABEL: @sub_compare_foldingPD512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> [[A:%.*]], <8 x double> [[B:%.*]], i32 11, i8 -1, i32 4)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i2 = fsub ninf <8 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> %sub.i2, <8 x double> zeroinitializer, i32 11, i8 -1, i32 4)
				ret i8 %0
				}


				define i8 @sub_compare_foldingPS128(<4 x float> %a, <4 x float> %b){
				; CHECK-LABEL: @sub_compare_foldingPS128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> [[A:%.*]], <4 x float> [[B:%.*]], i32 12, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i3 = fsub ninf <4 x float> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> %sub.i3, <4 x float> zeroinitializer, i32 12, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_foldingPS256(<8 x float> %a, <8 x float> %b){
				; CHECK-LABEL: @sub_compare_foldingPS256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> [[A:%.*]], <8 x float> [[B:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i4 = fsub ninf <8 x float> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> %sub.i4, <8 x float> zeroinitializer, i32 5, i8 -1)
				ret i8 %0
				}


				define i16 @sub_compare_foldingPS512(<16 x float> %a, <16 x float> %b){
				; CHECK-LABEL: @sub_compare_foldingPS512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> [[A:%.*]], <16 x float> [[B:%.*]], i32 11, i16 -1, i32 4)
				; CHECK-NEXT: ret i16 [[TMP0]]
				;
				entry:
				%sub.i5 = fsub ninf <16 x float> %a, %b
				%0 = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> %sub.i5, <16 x float> zeroinitializer, i32 11, i16 -1, i32 4)
				ret i16 %0
				}



				define i8 @sub_compare_folding_swapPD128(<2 x double> %a, <2 x double> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPD128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> [[B:%.*]], <2 x double> [[A:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <2 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> zeroinitializer, <2 x double> %sub.i, i32 5, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_folding_swapPD256(<4 x double> %a, <4 x double> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPD256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> [[B:%.*]], <4 x double> [[A:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <4 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> zeroinitializer, <4 x double> %sub.i, i32 5, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_folding_swapPD512(<8 x double> %a, <8 x double> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPD512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> [[B:%.*]], <8 x double> [[A:%.*]], i32 11, i8 -1, i32 4)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <8 x double> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> zeroinitializer, <8 x double> %sub.i, i32 11, i8 -1, i32 4)
				ret i8 %0
				}


				define i8 @sub_compare_folding_swapPS128(<4 x float> %a, <4 x float> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPS128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> [[B:%.*]], <4 x float> [[A:%.*]], i32 12, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <4 x float> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> zeroinitializer, <4 x float> %sub.i, i32 12, i8 -1)
				ret i8 %0
				}


				define i8 @sub_compare_folding_swapPS256(<8 x float> %a, <8 x float> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPS256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> [[B:%.*]], <8 x float> [[A:%.*]], i32 5, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <8 x float> %a, %b
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> zeroinitializer, <8 x float> %sub.i, i32 5, i8 -1)
				ret i8 %0
				}


				define i16 @sub_compare_folding_swapPS512(<16 x float> %a, <16 x float> %b){
				; CHECK-LABEL: @sub_compare_folding_swapPS512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> [[B:%.*]], <16 x float> [[A:%.*]], i32 11, i16 -1, i32 4)
				; CHECK-NEXT: ret i16 [[TMP0]]
				;
				entry:
				%sub.i = fsub ninf <16 x float> %a, %b
				%0 = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> zeroinitializer, <16 x float> %sub.i, i32 11, i16 -1, i32 4)
				ret i16 %0
				}

				declare i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double>, <2 x double>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double>, <4 x double>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double>, <8 x double>, i32, i8, i32)
				declare i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float>, <4 x float>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float>, <8 x float>, i32, i8)
				declare i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float>, <16 x float>, i32, i16, i32)

This is an archive of the discontinued LLVM Phabricator instance.

[X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization
ClosedPublic
