
Details

Reviewers

RKSimon
zvi
guyblank
craig.topper
igorb

Summary

This patch adds a canonical representation for the X86 compare intrinsics to InstCombine (InstCombineCalls.cpp).
When the first operand of a compare intrinsic is a constant and the second is not, the operands are swapped and the
comparison predicate is adjusted to match, so that these compares share one canonical form:
CMP (A, CONST)
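
As a rough illustration of the transformation described above, the sketch below models a compare call in plain C++ without any LLVM types. `Operand`, `CmpCall`, `isDirectional` and `canonicalize` are hypothetical names used only for this example. The flip rule (XOR with 0xF when the low two bits of the immediate are 01 or 10) is the one used in the patch; it is not the patch's actual implementation.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical stand-in for an intrinsic operand; not an LLVM type.
struct Operand {
  bool IsConstant;
  int Id; // arbitrary tag so that a swap is observable
};

// Hypothetical model of a VEX/EVEX compare call: cmp(LHS, RHS, Imm).
struct CmpCall {
  Operand LHS;
  Operand RHS;
  uint32_t Imm; // comparison predicate immediate
};

// True for the "direction" predicates (LT, LE, NLT, NLE and their
// GT, GE, NGT, NGE mirrors), whose low two bits are 01 or 10.
bool isDirectional(uint32_t Imm) {
  uint32_t Low = Imm & 0x3;
  return Low == 1 || Low == 2;
}

// Rewrites cmp(const, x, p) as cmp(x, const, flip(p)).
// Returns true if the call was changed.
bool canonicalize(CmpCall &C) {
  if (!C.LHS.IsConstant || C.RHS.IsConstant)
    return false;
  if (isDirectional(C.Imm))
    C.Imm ^= 0xF; // e.g. NLT_US (5) becomes NGT_US (10)
  std::swap(C.LHS, C.RHS);
  return true;
}
```

Symmetric predicates such as EQ (0) or FALSE (11) keep their immediate and only have their operands swapped, which matches the `i32 11` and `i32 12` cases in the test file of this revision.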

Diff Detail

Event Timeline

m_zuckerman created this revision. Mar 27 2017, 6:55 AM

m_zuckerman edited the summary of this revision. (Show Details)

m_zuckerman added reviewers: RKSimon, craig.topper, igorb. Mar 27 2017, 6:58 AM

m_zuckerman retitled this revision from [X86][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics to [X86][LLVM][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics.

m_zuckerman added a child revision: D31398: [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization. Mar 27 2017, 8:42 AM

craig.topper added a subscriber: llvm-commits. Mar 27 2017, 9:09 AM

craig.topper added inline comments. Mar 27 2017, 10:57 AM

lib/Transforms/InstCombine/InstCombineCalls.cpp
1563	How are you ensuring the VEX encoding is valid?
1569	Unfortunately, there's nothing in the backend IR parsing that guarantees that only a constant for the last intrinsic argument can get here. If you write a bad IR file you can fail this cast. Use a dyn_cast and check it defensively. A bad intrinsic will fail isel later and throw a graceful error, but a bad value here would just cause a crash.
1592	Why are you using ConstantExpr::getIntegerValue? You should be able to use ConstantInt::get right?
2392	What about sse_cmp_ps, sse2_cmp_pd and their avx2 equivalents?
2505	These don't use an i32 for the comparison type immediate, but the X86CreateCanonicalCMP assumes they do when it creates the new Constant.
test/Transforms/InstCombine/X86CanonicCmp.ll
2	Add test cases for sse_cmp_ss and sse2_cmp_sd.

Why are we doing this in InstCombine instead of DAG combine? This won't enable any additional InstCombine optimizations.

igorb added reviewers: zvi, guyblank. Mar 29 2017, 7:28 AM

m_zuckerman updated this revision to Diff 93515. Mar 30 2017, 11:53 AM

m_zuckerman updated this revision to Diff 93516. Mar 30 2017, 11:55 AM

m_zuckerman updated this revision to Diff 93520. Mar 30 2017, 12:21 PM

m_zuckerman marked 2 inline comments as done.

In D31396#711463, @craig.topper wrote:

Why are we doing this in InstCombine instead of DAG combine? This won't enable any additional InstCombine optimizations.

First, we are doing this in InstCombine because we are working on intrinsics and I don't make any special changes; I only reorder the operands.

Second, we could do it in the lowering, but we prefer that this canonical representation be established at an earlier step.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2392	As I wrote in the comments, we are only working on AVX intrinsics.
2505	In the main function, we only accept AVX and above intrinsics. This ensures that we only work with i32 operands.

m_zuckerman marked an inline comment as done. Mar 30 2017, 12:54 PM

m_zuckerman added inline comments.

lib/Transforms/InstCombine/InstCombineCalls.cpp
1563	I added an assert that checks that the intrinsic name contains "avx".

filcab added a subscriber: filcab. Apr 3 2017, 8:23 AM

filcab added inline comments.

lib/Transforms/InstCombine/InstCombineCalls.cpp
1590	Spaces after commas, not before. Please be consistent in capitalization and hyphenation.
test/Transforms/InstCombine/X86CanonicCmp.ll
6	Typo in the function name (it could also be a bit more specific). Please explain what a canonical compare is in the comment. Or simply something like: Transform compare(constant, variable, comparison, ...) into compare(variable, constant, flip(comparison), ...)
49	What's happening here? Can you get more meaningful names and maybe do a log2 `and` chain?

m_zuckerman updated this revision to Diff 94182. Apr 5 2017, 3:10 AM

m_zuckerman marked 2 inline comments as done.

m_zuckerman marked an inline comment as done.

m_zuckerman updated this revision to Diff 94196. Apr 5 2017, 4:35 AM

m_zuckerman updated this revision to Diff 94197.

Ping

What is the plan for supporting the SSE intrinsics?

What does this canonicalization enable if we can't properly do it for the SSE intrinsics? Are we getting worse codegen for the scalar and 128-bit intrinsics on AVX targets just because we can't know we're an AVX target in InstCombine?

lib/Transforms/InstCombine/InstCombineCalls.cpp
1570	IntrinsicName is unused in release builds and will throw a warning. Probably need to wrap it in #ifndef NDEBUG
1578	You need to check for nullptr after a dyn_cast.
2402	What about x86_avx_cmp_ps_256 and x86_avx_cmp_pd_256?
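
The two points about `IntrinsicName` and the unchecked `dyn_cast` could be addressed roughly as sketched below. This is a self-contained illustration, not LLVM code: standard `dynamic_cast` stands in for `dyn_cast`, and `Value`, `ConstantInt`, `Argument` and `getPredicateImm` are hypothetical miniature types and helpers invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical miniature value hierarchy standing in for llvm::Value.
struct Value {
  virtual ~Value() = default;
};
struct ConstantInt : Value {
  explicit ConstantInt(uint64_t V) : Val(V) {}
  uint64_t Val;
};
struct Argument : Value {};

// Reads the predicate immediate from Op. Returns false (leaving Imm
// untouched) when Op is not a constant integer, instead of dereferencing
// a null pointer.
bool getPredicateImm(const Value *Op, const std::string &CalleeName,
                     uint64_t &Imm) {
#ifndef NDEBUG
  // The name is only needed by the assert, so it is only declared in
  // builds where asserts are enabled.
  const std::string &IntrinsicName = CalleeName;
  assert(IntrinsicName.find("avx") != std::string::npos &&
         "expected a VEX-encoded compare intrinsic");
#endif
  (void)CalleeName; // silence unused-parameter warnings in release builds

  const auto *CI = dynamic_cast<const ConstantInt *>(Op);
  if (!CI)
    return false; // malformed input: bail out and let later stages diagnose
  Imm = CI->Val;
  return true;
}
```

Returning early on a failed cast mirrors the reviewer's point that a bad intrinsic should be left for instruction selection to reject gracefully rather than crash the optimizer.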

In D31396#724945, @craig.topper wrote:

What is the plan for supporting the SSE intrinsics?

What does this canonicalization enable if we can't properly do it for the SSE intrinsics? Are we getting worse codegen for the scalar and 128-bit intrinsics on AVX targets just because we can't know we're an AVX target in InstCombine?

IIUC, the issue here is that calls to the @llvm.x86.sse.cmp.* intrinsics can be lowered either to SSE-encoded instructions or to AVX instructions. The difference is not only in the encoding, but also in the predicates the instructions support.
The SSE variants support 8 predicates and the AVX variants support a richer set of 32 predicates.
The concern is that at InstCombine time we don't know which subtarget features will be enabled, so we can't replace an SSE immediate with an AVX-only immediate: if we end up lowering for an SSE target, we don't expect the backend (or at least it currently doesn't) to convert it back to an SSE-legal form.
But what if the function carries "target-cpu" or "target-features" attributes that allow us to assume what the backend will use? Would it be OK to perform the canonicalization then?
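
To make the concern concrete, here is a small sketch of why the flip is not always representable in the legacy SSE encoding. The function names and the `HasAVX` flag are assumptions made for this example; the facts relied on are that the legacy SSE immediate only has a 3-bit predicate field (values 0 to 7) and that the patch flips directional predicates with XOR 0xF.

```cpp
#include <cstdint>

// Low two bits 01 or 10 mark the directional predicates (LT, LE, NLT, NLE
// and their mirrors).
constexpr bool isDirectional(uint32_t Imm) {
  return (Imm & 0x3) == 1 || (Imm & 0x3) == 2;
}

// Predicate to use after the two compare operands are swapped.
constexpr uint32_t swappedPredicate(uint32_t Imm) {
  return isDirectional(Imm) ? (Imm ^ 0xF) : Imm;
}

// The legacy SSE encoding only defines predicate values 0..7.
constexpr bool isSSELegalPredicate(uint32_t Imm) { return Imm < 8; }

// Hypothetical guard: swapping is safe if AVX is known to be available,
// or if the rewritten immediate still fits the SSE encoding.
constexpr bool canSwapOperands(uint32_t Imm, bool HasAVX) {
  return HasAVX || isSSELegalPredicate(swappedPredicate(Imm));
}
```

For example, swapping the operands of an SSE less-than compare (immediate 1) would require immediate 14, which exists only in the VEX encoding, so the guard rejects it unless AVX is known to be enabled. Symmetric predicates such as equal (0) or unordered (3) remain legal either way.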

m_zuckerman abandoned this revision. Oct 7 2017, 11:41 PM

Diff 94197

lib/Transforms/InstCombine/InstCombineCalls.cpp

	Show First 20 Lines • Show All 992 Lines • ▼ Show 20 Lines
	}			}
	}			}
	break;			break;
	}			}

	return false;			return false;
	}			}

				// Convert Cmp(Const,a) into canonical representation Cmp(a,Const)
				// This function creates a canonical compare by reordering its operands if needed.
				// as a result, the compare direction may also be changed in order to preserve the original compare meaning.
				// Example(a < b) -> (b > a)
				// Otherwise, the comparison value is still valid for the new operands order.
				// Example(a == b) -> (b == a)
				//
				// TODO: Add canonical representation for SSE's compare intrinsics.
				// Target must have VEX encoding!!!

				static bool X86CreateCanonicalCMP(IntrinsicInst *II) {
				Value *LHS = II->getOperand(0);
				Value *RHS = II->getOperand(1);
				StringRef IntrinsicName = II->getCalledFunction()->getName();
				// This Assertion ensures that only avx and above intrinsics are passing
				// the compare canonical representation.
				assert(IntrinsicName.contains("avx") &&
				"Canonical representation support only intrinsics with VEX encoding");
				if (isa<Constant>(LHS) && !isa<Constant>(RHS)) {
				assert((II->getOperand(2)->getType() == Type::getInt32Ty(II->getContext()))
				&& "Operand must defined by int32 type" );
				ConstantInt *ComparisonValue = dyn_cast<ConstantInt>(II->getOperand(2));
				uint64_t ConstantValue = ComparisonValue->getZExtValue();
				// When the lower bits of the compare are "01" or "10" (e.g. "1" or "2"),
				// they represent a "direction" ('<','<='...) types of comparisons
				// for which we need to change the direction of the compare when operators
				// are exchanged.
				//
				// In the 128-bit Legacy SSE version: The comparison predicate operand is
				// an 8 - bit immediate, bits 2:0 of the immediate define the type of
				// comparison to be performed. Bits 7 : 3 of the immediate is reserved.
				//
				// The three first bits represent:
				// Equal, Less-than, Less-than-or-equal, Unordered, Not-equal,
				// Not less-than, Not less-than-or-equal and Ordered.
				//
				// In the VEX version: Two more bits were added to the immediate. For the
				// "relation" types (<,<=,!<,!<=) the fourth bit represent the
				// "greater relation". By using "Bit flipping" operation on the 4 low bits
				// of the "relation" types. We are creating a one to one match between
				// less to the equivalent upside greater.
				// ( xor(immediate('<'),0x0F) = immediate('>'))
				// This behaviour is true also for immediate with 0X10h.
				if ((ConstantValue & 0x3) == 1 || (ConstantValue & 0x3) == 2) {
				const APInt NewComparison(32, (ConstantValue ^ 0xf));
				II->setOperand(2, ConstantInt::get(Type::getInt32Ty(II->getContext()),
				NewComparison));
				}
				II->setArgOperand(0, RHS);
				II->setArgOperand(1, LHS);
				return true;
				}
				return false;
				}

	// Convert NVVM intrinsics to target-generic LLVM code where possible.			// Convert NVVM intrinsics to target-generic LLVM code where possible.
	static Instruction *SimplifyNVVMIntrinsic(IntrinsicInst *II, InstCombiner &IC) {			static Instruction *SimplifyNVVMIntrinsic(IntrinsicInst *II, InstCombiner &IC) {
	// Each NVVM intrinsic we can simplify can be replaced with one of:			// Each NVVM intrinsic we can simplify can be replaced with one of:
	//			//
	// * an LLVM intrinsic,			// * an LLVM intrinsic,
	// * an LLVM cast operation,			// * an LLVM cast operation,
	// * an LLVM binary operation, or			// * an LLVM binary operation, or
	// * ad-hoc LLVM IR for the particular operation.			// * ad-hoc LLVM IR for the particular operation.
	▲ Show 20 Lines • Show All 720 Lines • ▼ Show 20 Lines
	case Intrinsic::x86_avx_movmsk_pd_256:			case Intrinsic::x86_avx_movmsk_pd_256:
	case Intrinsic::x86_avx_movmsk_ps_256:			case Intrinsic::x86_avx_movmsk_ps_256:
	case Intrinsic::x86_avx2_pmovmskb: {			case Intrinsic::x86_avx2_pmovmskb: {
	if (Value *V = simplifyX86movmsk(*II, *Builder))			if (Value *V = simplifyX86movmsk(*II, *Builder))
	return replaceInstUsesWith(*II, V);			return replaceInstUsesWith(*II, V);
	break;			break;
	}			}

				case Intrinsic::x86_avx512_mask_cmp_ss:
				case Intrinsic::x86_avx512_mask_cmp_sd:
				if (X86CreateCanonicalCMP(II))
				return II;
				LLVM_FALLTHROUGH;
	case Intrinsic::x86_sse_comieq_ss:			case Intrinsic::x86_sse_comieq_ss:
	case Intrinsic::x86_sse_comige_ss:			case Intrinsic::x86_sse_comige_ss:
	case Intrinsic::x86_sse_comigt_ss:			case Intrinsic::x86_sse_comigt_ss:
	case Intrinsic::x86_sse_comile_ss:			case Intrinsic::x86_sse_comile_ss:
	case Intrinsic::x86_sse_comilt_ss:			case Intrinsic::x86_sse_comilt_ss:
	case Intrinsic::x86_sse_comineq_ss:			case Intrinsic::x86_sse_comineq_ss:
	case Intrinsic::x86_sse_ucomieq_ss:			case Intrinsic::x86_sse_ucomieq_ss:
	case Intrinsic::x86_sse_ucomige_ss:			case Intrinsic::x86_sse_ucomige_ss:
	Show All 9 Lines
	case Intrinsic::x86_sse2_comineq_sd:			case Intrinsic::x86_sse2_comineq_sd:
	case Intrinsic::x86_sse2_ucomieq_sd:			case Intrinsic::x86_sse2_ucomieq_sd:
	case Intrinsic::x86_sse2_ucomige_sd:			case Intrinsic::x86_sse2_ucomige_sd:
	case Intrinsic::x86_sse2_ucomigt_sd:			case Intrinsic::x86_sse2_ucomigt_sd:
	case Intrinsic::x86_sse2_ucomile_sd:			case Intrinsic::x86_sse2_ucomile_sd:
	case Intrinsic::x86_sse2_ucomilt_sd:			case Intrinsic::x86_sse2_ucomilt_sd:
	case Intrinsic::x86_sse2_ucomineq_sd:			case Intrinsic::x86_sse2_ucomineq_sd:
	case Intrinsic::x86_avx512_vcomi_ss:			case Intrinsic::x86_avx512_vcomi_ss:
	case Intrinsic::x86_avx512_vcomi_sd:			case Intrinsic::x86_avx512_vcomi_sd: {
	case Intrinsic::x86_avx512_mask_cmp_ss:
	case Intrinsic::x86_avx512_mask_cmp_sd: {
	// These intrinsics only demand the 0th element of their input vectors. If			// These intrinsics only demand the 0th element of their input vectors. If
	// we can simplify the input based on that, do so now.			// we can simplify the input based on that, do so now.
	bool MadeChange = false;			bool MadeChange = false;
	Value *Arg0 = II->getArgOperand(0);			Value *Arg0 = II->getArgOperand(0);
	Value *Arg1 = II->getArgOperand(1);			Value *Arg1 = II->getArgOperand(1);
	unsigned VWidth = Arg0->getType()->getVectorNumElements();			unsigned VWidth = Arg0->getType()->getVectorNumElements();
	if (Value *V = SimplifyDemandedVectorEltsLow(Arg0, VWidth, 1)) {			if (Value *V = SimplifyDemandedVectorEltsLow(Arg0, VWidth, 1)) {
	II->setArgOperand(0, V);			II->setArgOperand(0, V);
	MadeChange = true;			MadeChange = true;
	}			}
	if (Value *V = SimplifyDemandedVectorEltsLow(Arg1, VWidth, 1)) {			if (Value *V = SimplifyDemandedVectorEltsLow(Arg1, VWidth, 1)) {
	II->setArgOperand(1, V);			II->setArgOperand(1, V);
	MadeChange = true;			MadeChange = true;
	}			}
	if (MadeChange)			if (MadeChange)
	return II;			return II;
	break;			break;
	}			}
				case Intrinsic::x86_avx512_mask_cmp_pd_128:
				case Intrinsic::x86_avx512_mask_cmp_pd_256:
				case Intrinsic::x86_avx512_mask_cmp_pd_512:
				case Intrinsic::x86_avx512_mask_cmp_ps_128:
				case Intrinsic::x86_avx512_mask_cmp_ps_256:
				case Intrinsic::x86_avx512_mask_cmp_ps_512:
				if(X86CreateCanonicalCMP(II))
				return II;
				break;

	case Intrinsic::x86_avx512_mask_add_ps_512:			case Intrinsic::x86_avx512_mask_add_ps_512:
	case Intrinsic::x86_avx512_mask_div_ps_512:			case Intrinsic::x86_avx512_mask_div_ps_512:
	case Intrinsic::x86_avx512_mask_mul_ps_512:			case Intrinsic::x86_avx512_mask_mul_ps_512:
	case Intrinsic::x86_avx512_mask_sub_ps_512:			case Intrinsic::x86_avx512_mask_sub_ps_512:
	case Intrinsic::x86_avx512_mask_add_pd_512:			case Intrinsic::x86_avx512_mask_add_pd_512:
	case Intrinsic::x86_avx512_mask_div_pd_512:			case Intrinsic::x86_avx512_mask_div_pd_512:
	case Intrinsic::x86_avx512_mask_mul_pd_512:			case Intrinsic::x86_avx512_mask_mul_pd_512:
	▲ Show 20 Lines • Show All 83 Lines • ▼ Show 20 Lines
	cast<IntegerType>(Mask->getType())->getBitWidth());			cast<IntegerType>(Mask->getType())->getBitWidth());
	Mask = Builder->CreateBitCast(Mask, MaskTy);			Mask = Builder->CreateBitCast(Mask, MaskTy);
	Mask = Builder->CreateExtractElement(Mask, (uint64_t)0);			Mask = Builder->CreateExtractElement(Mask, (uint64_t)0);
	// Extract the lowest element from the passthru operand.			// Extract the lowest element from the passthru operand.
	Value *Passthru = Builder->CreateExtractElement(II->getArgOperand(2),			Value *Passthru = Builder->CreateExtractElement(II->getArgOperand(2),
	(uint64_t)0);			(uint64_t)0);
	V = Builder->CreateSelect(Mask, V, Passthru);			V = Builder->CreateSelect(Mask, V, Passthru);
	}			}

	// Insert the result back into the original argument 0.			// Insert the result back into the original argument 0.
	V = Builder->CreateInsertElement(Arg0, V, (uint64_t)0);			V = Builder->CreateInsertElement(Arg0, V, (uint64_t)0);

	return replaceInstUsesWith(*II, V);			return replaceInstUsesWith(*II, V);
	}			}
	}			}
	LLVM_FALLTHROUGH;			LLVM_FALLTHROUGH;

	▲ Show 20 Lines • Show All 892 Lines • Show Last 20 Lines

test/Transforms/InstCombine/X86CanonicCmp.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -instcombine -S | FileCheck %s

				; This test checks that the canonical representation of compare intrinsics is working
				; and converts the Cmp(const, a, comparison) into Cmp(a, const, flip(comparison)) with correct comparison immediate.
				; This representation is valid only for AVX and above intrinsics.

				define i8 @canonical_compare_representationPD128(<2 x double> %a){
				; CHECK-LABEL: @canonical_compare_representationPD128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> [[A:%.*]], <2 x double> zeroinitializer, i32 10, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double> zeroinitializer, <2 x double> %a, i32 5, i8 -1)
				ret i8 %0
				}

				define i8 @canonical_compare_representationPD256(<4 x double> %a){
				; CHECK-LABEL: @canonical_compare_representationPD256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> [[A:%.*]], <4 x double> zeroinitializer, i32 10, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double> zeroinitializer, <4 x double> %a, i32 5, i8 -1)
				ret i8 %0
				}

				define i8 @canonical_compare_representationPD512(<8 x double> %a){
				; CHECK-LABEL: @canonical_compare_representationPD512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> [[A:%.*]], <8 x double> zeroinitializer, i32 11, i8 -1, i32 4)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double> zeroinitializer, <8 x double> %a, i32 11, i8 -1, i32 4)
				ret i8 %0
				}

				define i8 @canonical_compare_representationPS128(<4 x float> %a){
				; CHECK-LABEL: @canonical_compare_representationPS128(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> [[A:%.*]], <4 x float> zeroinitializer, i32 12, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float> zeroinitializer, <4 x float> %a, i32 12, i8 -1)
				ret i8 %0
				}

				define i8 @canonical_compare_representationPS256(<8 x float> %a){
				; CHECK-LABEL: @canonical_compare_representationPS256(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> [[A:%.*]], <8 x float> zeroinitializer, i32 10, i8 -1)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float> zeroinitializer, <8 x float> %a, i32 5, i8 -1)
				ret i8 %0
				}

				define i16 @canonical_compare_representationPS512(<16 x float> %a){
				; CHECK-LABEL: @canonical_compare_representationPS512(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> [[A:%.*]], <16 x float> zeroinitializer, i32 11, i16 -1, i32 4)
				; CHECK-NEXT: ret i16 [[TMP0]]
				;
				entry:
				%0 = tail call i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float> zeroinitializer, <16 x float> %a, i32 11, i16 -1, i32 4)
				ret i16 %0
				}

				define i8 @canonical_compare_representationSS(<4 x float> %a){
				; CHECK-LABEL: @canonical_compare_representationSS(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.ss(<4 x float> [[A:%.*]], <4 x float> <float 0.000000e+00, float undef, float undef, float undef>, i32 10, i8 -1, i32 4)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.ss(<4 x float> zeroinitializer, <4 x float> %a, i32 5, i8 -1, i32 4)
				ret i8 %0
				}

				define i8 @canonical_compare_representationSD(<2 x double> %a){
				; CHECK-LABEL: @canonical_compare_representationSD(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @llvm.x86.avx512.mask.cmp.sd(<2 x double> [[A:%.*]], <2 x double> <double 0.000000e+00, double undef>, i32 10, i8 -1, i32 4)
				; CHECK-NEXT: ret i8 [[TMP0]]
				;
				entry:
				%0 = tail call i8 @llvm.x86.avx512.mask.cmp.sd(<2 x double> zeroinitializer, <2 x double> %a, i32 5, i8 -1, i32 4)
				ret i8 %0
				}


				declare i8 @llvm.x86.avx512.mask.cmp.pd.128(<2 x double>, <2 x double>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.pd.256(<4 x double>, <4 x double>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.pd.512(<8 x double>, <8 x double>, i32, i8, i32)
				declare i8 @llvm.x86.avx512.mask.cmp.ps.128(<4 x float>, <4 x float>, i32, i8)
				declare i8 @llvm.x86.avx512.mask.cmp.ps.256(<8 x float>, <8 x float>, i32, i8)
				declare i16 @llvm.x86.avx512.mask.cmp.ps.512(<16 x float>, <16 x float>, i32, i16, i32)
				declare i8 @llvm.x86.avx512.mask.cmp.sd(<2 x double>, <2 x double>, i32, i8, i32)
				declare i8 @llvm.x86.avx512.mask.cmp.ss(<4 x float>, <4 x float>, i32, i8, i32)

This is an archive of the discontinued LLVM Phabricator instance.

[X86][LLVM][Canonical Compare Intrinsics] Creating a canonical representation for X86 CMP intrinsics
AbandonedPublic
