This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Teach DAGCombine to fold the aext + select pattern
Closed · Public

Authored by steven.zhang on Jun 13 2019, 10:56 PM.


Details

Reviewers

craig.topper
jsji
nemanjai
hfinkel
kbarton
RKSimon
spatel

Commits

rGe0e7d4c3662e: Teach the DAGCombine to fold this pattern(c1 and c2 is constant).
rL364382: Teach the DAGCombine to fold this pattern(c1 and c2 is constant).

Summary

Teach DAGCombine to fold the following patterns (c1 and c2 are constants).

// fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
// fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
// fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)

For any_extend, sign-extend the operands to preserve their signedness, so that the other combine rules can still apply. (An any_extend of a constant is handled as a zero extension.) For example:

t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
t2: i64 = any_extend t1
 -->
t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
 -->
 t4: i64 = sign_extend_inreg t3
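
The effect of choosing sign extension for the any_extend case can be checked with plain integer arithmetic. The following is a minimal sketch, not LLVM code: `sextI8ToI64`, `zextI8ToI64` and `lowByteMatches` are hypothetical helpers that only model how an i8 constant could be widened to i64. any_extend leaves the upper bits unspecified, so either widening is a valid result; sign extension is the one that keeps -1 as all-ones.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling constant extension from i8 to i64.
// They are illustrations only and are not part of LLVM.
inline std::uint64_t sextI8ToI64(std::int8_t c) {
  // Sign extension replicates the sign bit into the upper 56 bits.
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(c));
}

inline std::uint64_t zextI8ToI64(std::int8_t c) {
  // Zero extension clears the upper 56 bits.
  return static_cast<std::uint64_t>(static_cast<std::uint8_t>(c));
}

// Both widenings agree on the low 8 bits, which is all any_extend promises.
inline bool lowByteMatches(std::uint64_t wide, std::int8_t c) {
  return static_cast<std::uint8_t>(wide) == static_cast<std::uint8_t>(c);
}
```

Under this model, `sextI8ToI64(-1)` is all-ones while `zextI8ToI64(-1)` is 0xFF, yet both satisfy `lowByteMatches`, so picking the sign-extended constants is a legal refinement of any_extend.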

Diff Detail

Event Timeline

steven.zhang created this revision. Jun 13 2019, 10:56 PM

Herald added a project: Restricted Project. Jun 13 2019, 10:56 PM

Herald added a subscriber: hiraditya.

craig.topper added inline comments. Jun 14 2019, 12:13 AM

llvm/test/CodeGen/X86/cmov-promotion.ll
131	This is slightly worse. Maybe don't do this when zext is free?

steven.zhang marked an inline comment as done. Jun 14 2019, 3:50 AM

steven.zhang added inline comments.

llvm/test/CodeGen/X86/cmov-promotion.ll
131	I checked the scheduling information for cmovneq and cmovnel, and both have a latency of 1. I didn't catch what you mean by "slightly worse". Could you explain it in more detail? I am new to X86 instructions. Thank you!

craig.topper added inline comments. Jun 14 2019, 8:16 AM

llvm/test/CodeGen/X86/cmov-promotion.ll
131	Cmovneq’s encoding is 1 byte longer than cmovnel. The 64-bit size requires a REX prefix.
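
The size difference can be seen from the instruction bytes. The sketch below hard-codes what are assumed to be the encodings of `cmovnel %ecx, %eax` and `cmovneq %rcx, %rax` (opcode 0F 45 /r, with a REX.W prefix of 0x48 for the 64-bit form). The names `kCmovne32`, `kCmovne64` and `rexOverheadBytes` are made up for illustration, and the bytes should be confirmed with an assembler before relying on them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// cmovnel %ecx, %eax : 0F 45 /r, ModRM = 0xC1 (mod=11, reg=eax, rm=ecx).
constexpr std::array<std::uint8_t, 3> kCmovne32 = {0x0F, 0x45, 0xC1};

// cmovneq %rcx, %rax : same opcode and ModRM, preceded by REX.W (0x48).
constexpr std::array<std::uint8_t, 4> kCmovne64 = {0x48, 0x0F, 0x45, 0xC1};

// Number of extra bytes the 64-bit form costs over the 32-bit form.
constexpr std::size_t rexOverheadBytes() {
  return kCmovne64.size() - kCmovne32.size();
}
```

The only difference between the two byte sequences is the leading prefix, which is the one-byte cost referred to above.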

steven.zhang marked an inline comment as done. Jun 18 2019, 12:12 AM

steven.zhang added inline comments.

llvm/test/CodeGen/X86/cmov-promotion.ll
131	I wonder if we can do this inside the X86 target, as it seems to be a valid improvement for x86: for cmovneq, if the high 32 bits are zero, use cmovnel?

craig.topper added a reviewer: spatel. Jun 18 2019, 12:35 AM

spatel added inline comments. Jun 18 2019, 7:25 AM

llvm/test/CodeGen/X86/cmov-promotion.ll
131	The problem is larger than just this transform or x86. We do the same transform in instcombine, so we need to check constant values to reverse it. But there's no reason to make the problem worse by not using the existing TLI hook suggested by Craig. If we just add one more clause to the 'if' check, we avoid this test diff without changing any others:
if (isa<ConstantSDNode>(Op1) && isa<ConstantSDNode>(Op2) &&
    (Opcode != ISD::ZERO_EXTEND || !TLI.isZExtFree(N0.getValueType(), VT))) {

steven.zhang marked an inline comment as done. Jun 18 2019, 9:52 PM

steven.zhang added inline comments.

llvm/test/CodeGen/X86/cmov-promotion.ll
131	Ah, sorry, I didn't get the point. Sure, it makes sense.

Add the isZExtFree check.

steven.zhang added a comment. Jun 18 2019, 11:22 PM

This comment was removed by steven.zhang.

X86 looks ok to me other than the i8->i64 zext problem.

llvm/test/CodeGen/X86/cmov-promotion.ll
59	The isZExtFree check isn't enough to fix this :( It's an i8->i64 zext, which isn't free. I guess we'll have to handle this in the x86 backend.
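
Why the extra clause helps for one zext but not the other can be sketched with a small model of the guard. This is not the LLVM code: `ExtOpcode`, `isZExtFreeModel` and `shouldFoldExtendOfSelect` are hypothetical names, and the hook model assumes, purely for illustration, a target on which only the i32 -> i64 zero extension is reported as free.

```cpp
#include <cassert>

enum class ExtOpcode { SignExtend, ZeroExtend, AnyExtend };

// Hypothetical stand-in for a type-based isZExtFree hook. Assumption for
// this sketch: only i32 -> i64 counts as a free zero extension.
inline bool isZExtFreeModel(unsigned srcBits, unsigned dstBits) {
  return srcBits == 32 && dstBits == 64;
}

// Mirrors the shape of the guard in the patch: fold only when both select
// arms are constants and, for zero_extend, the extension is not free.
inline bool shouldFoldExtendOfSelect(ExtOpcode opc, bool bothArmsConstant,
                                     unsigned srcBits, unsigned dstBits) {
  return bothArmsConstant &&
         (opc != ExtOpcode::ZeroExtend || !isZExtFreeModel(srcBits, dstBits));
}
```

Under these assumptions, a zero_extend from i32 to i64 is left alone, while a zero_extend from i8 to i64 still passes the guard and is folded, because a purely type-based query sees it as a non-free extension.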

steven.zhang marked an inline comment as done. Jun 19 2019, 12:35 AM

steven.zhang added inline comments.

llvm/test/CodeGen/X86/cmov-promotion.ll
59	Yes. The hook only checks for the i32->i64 type pair. Do we need to pass the value instead of the type to isZExtFree to fix this issue?

spatel added inline comments. Jun 19 2019, 6:58 AM

llvm/test/CodeGen/X86/cmov-promotion.ll
59	I don't think changing the TLI call will be enough to solve the general problem (the IR is likely already in the form that we're trying to avoid).

Since the X86 review is done (this patch exposed another issue; will that be handled in the x86 backend?), let's continue the review of the PowerPC backend and the DAGCombine logic.

llvm/test/CodeGen/X86/cmov-promotion.ll
59	OK. That sounds like a separate issue.

spatel added inline comments. Jun 20 2019, 6:00 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8846–8847	This sentence is not clear to me. Is this better? "For any_extend, choose sign extension of the constants to allow a possible further transform to sign_extend_inreg." I'm not sure if using sign extension is the best choice in all cases, but if there are no visible test regressions, I guess that's ok for now.

Update the comments.

LGTM

This revision is now accepted and ready to land. Jun 25 2019, 8:40 AM

Closed by commit rL364382: Teach the DAGCombine to fold this pattern(c1 and c2 is constant). (authored by qshanz). Jun 25 2019, 10:13 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp	30 lines
llvm/test/CodeGen/PowerPC/bool-math.ll	4 lines
llvm/test/CodeGen/PowerPC/select_const.ll	269 lines
llvm/test/CodeGen/X86/avx512-insert-extract.ll	36 lines
llvm/test/CodeGen/X86/cmov-promotion.ll	71 lines
llvm/test/CodeGen/X86/select.ll	12 lines

Diff 205512

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp


	[... 8,816 lines not shown ...]
	/// dag nodes (see for example method DAGCombiner::visitSIGN_EXTEND).			/// dag nodes (see for example method DAGCombiner::visitSIGN_EXTEND).
	/// Vector extends are not folded if operations are legal; this is to			/// Vector extends are not folded if operations are legal; this is to
	/// avoid introducing illegal build_vector dag nodes.			/// avoid introducing illegal build_vector dag nodes.
	static SDValue tryToFoldExtendOfConstant(SDNode *N, const TargetLowering &TLI,			static SDValue tryToFoldExtendOfConstant(SDNode *N, const TargetLowering &TLI,
	SelectionDAG &DAG, bool LegalTypes) {			SelectionDAG &DAG, bool LegalTypes) {
	unsigned Opcode = N->getOpcode();			unsigned Opcode = N->getOpcode();
	SDValue N0 = N->getOperand(0);			SDValue N0 = N->getOperand(0);
	EVT VT = N->getValueType(0);			EVT VT = N->getValueType(0);
				SDLoc DL(N);

	assert((Opcode == ISD::SIGN_EXTEND || Opcode == ISD::ZERO_EXTEND ||			assert((Opcode == ISD::SIGN_EXTEND || Opcode == ISD::ZERO_EXTEND ||
	Opcode == ISD::ANY_EXTEND || Opcode == ISD::SIGN_EXTEND_VECTOR_INREG ||			Opcode == ISD::ANY_EXTEND || Opcode == ISD::SIGN_EXTEND_VECTOR_INREG ||
	Opcode == ISD::ZERO_EXTEND_VECTOR_INREG)			Opcode == ISD::ZERO_EXTEND_VECTOR_INREG)
	&& "Expected EXTEND dag node in input!");			&& "Expected EXTEND dag node in input!");

	// fold (sext c1) -> c1			// fold (sext c1) -> c1
	// fold (zext c1) -> c1			// fold (zext c1) -> c1
	// fold (aext c1) -> c1			// fold (aext c1) -> c1
	if (isa<ConstantSDNode>(N0))			if (isa<ConstantSDNode>(N0))
	return DAG.getNode(Opcode, SDLoc(N), VT, N0);			return DAG.getNode(Opcode, DL, VT, N0);

				// fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
				// fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
				// fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
				if (N0->getOpcode() == ISD::SELECT) {
				SDValue Op1 = N0->getOperand(1);
				SDValue Op2 = N0->getOperand(2);
				if (isa<ConstantSDNode>(Op1) && isa<ConstantSDNode>(Op2) &&
				(Opcode != ISD::ZERO_EXTEND || !TLI.isZExtFree(N0.getValueType(), VT))) {
				// Sign extend the operands if it is any_extend, to keep the signess
				// of the operands that, the other combine rule would apply. i.e.
				//
				// t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
				// t2: i64 = any_extend t1
				// -->
				// t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
				// -->
				// t4: i64 = sign_extend_inreg t3
				unsigned FoldOpc = Opcode;
				if (FoldOpc == ISD::ANY_EXTEND)
				FoldOpc = ISD::SIGN_EXTEND;
				return DAG.getSelect(DL, VT, N0->getOperand(0),
				DAG.getNode(FoldOpc, DL, VT, Op1),
				DAG.getNode(FoldOpc, DL, VT, Op2));
				}
				}

	// fold (sext (build_vector AllConstants) -> (build_vector AllConstants)			// fold (sext (build_vector AllConstants) -> (build_vector AllConstants)
	// fold (zext (build_vector AllConstants) -> (build_vector AllConstants)			// fold (zext (build_vector AllConstants) -> (build_vector AllConstants)
	// fold (aext (build_vector AllConstants) -> (build_vector AllConstants)			// fold (aext (build_vector AllConstants) -> (build_vector AllConstants)
	EVT SVT = VT.getScalarType();			EVT SVT = VT.getScalarType();
	if (!(VT.isVector() && (!LegalTypes || TLI.isTypeLegal(SVT)) &&			if (!(VT.isVector() && (!LegalTypes || TLI.isTypeLegal(SVT)) &&
	ISD::isBuildVectorOfConstantSDNodes(N0.getNode())))			ISD::isBuildVectorOfConstantSDNodes(N0.getNode())))
	return SDValue();			return SDValue();

	// We can fold this node into a build_vector.			// We can fold this node into a build_vector.
	unsigned VTBits = SVT.getSizeInBits();			unsigned VTBits = SVT.getSizeInBits();
	unsigned EVTBits = N0->getValueType(0).getScalarSizeInBits();			unsigned EVTBits = N0->getValueType(0).getScalarSizeInBits();
	SmallVector<SDValue, 8> Elts;			SmallVector<SDValue, 8> Elts;
	unsigned NumElts = VT.getVectorNumElements();			unsigned NumElts = VT.getVectorNumElements();
	SDLoc DL(N);

	// For zero-extensions, UNDEF elements still guarantee to have the upper			// For zero-extensions, UNDEF elements still guarantee to have the upper
	// bits set to zero.			// bits set to zero.
	bool IsZext =			bool IsZext =
	Opcode == ISD::ZERO_EXTEND || Opcode == ISD::ZERO_EXTEND_VECTOR_INREG;			Opcode == ISD::ZERO_EXTEND || Opcode == ISD::ZERO_EXTEND_VECTOR_INREG;

	for (unsigned i = 0; i != NumElts; ++i) {			for (unsigned i = 0; i != NumElts; ++i) {
	SDValue Op = N0.getOperand(i);			SDValue Op = N0.getOperand(i);
	[... 11,756 lines not shown ...]

llvm/test/CodeGen/PowerPC/bool-math.ll

[... 113 lines not shown ...]	; CHECK-NEXT: blr
%c = icmp eq i32 %a, 0		%c = icmp eq i32 %a, 0
%r = select i1 %c, i16 36, i16 37		%r = select i1 %c, i16 36, i16 37
ret i16 %r		ret i16 %r
}		}

define i8 @low_bit_select_constants_bigger_true_same_size_result(i8 %x) {		define i8 @low_bit_select_constants_bigger_true_same_size_result(i8 %x) {
; CHECK-LABEL: low_bit_select_constants_bigger_true_same_size_result:		; CHECK-LABEL: low_bit_select_constants_bigger_true_same_size_result:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: rlwinm 3, 3, 0, 31, 31		; CHECK-NEXT: clrldi 3, 3, 63
; CHECK-NEXT: subfic 3, 3, -29		; CHECK-NEXT: subfic 3, 3, -29
; CHECK-NEXT: blr		; CHECK-NEXT: blr
%a = and i8 %x, 1		%a = and i8 %x, 1
%c = icmp eq i8 %a, 0		%c = icmp eq i8 %a, 0
%r = select i1 %c, i8 227, i8 226		%r = select i1 %c, i8 227, i8 226
ret i8 %r		ret i8 %r
}		}

define i32 @low_bit_select_constants_bigger_true_wider_result(i8 %x) {		define i32 @low_bit_select_constants_bigger_true_wider_result(i8 %x) {
; CHECK-LABEL: low_bit_select_constants_bigger_true_wider_result:		; CHECK-LABEL: low_bit_select_constants_bigger_true_wider_result:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: clrldi 3, 3, 63		; CHECK-NEXT: clrldi 3, 3, 63
; CHECK-NEXT: subfic 3, 3, 227		; CHECK-NEXT: subfic 3, 3, 227
; CHECK-NEXT: blr		; CHECK-NEXT: blr
%a = and i8 %x, 1		%a = and i8 %x, 1
%c = icmp eq i8 %a, 0		%c = icmp eq i8 %a, 0
%r = select i1 %c, i32 227, i32 226		%r = select i1 %c, i32 227, i32 226
ret i32 %r		ret i32 %r
}		}

define i8 @low_bit_select_constants_bigger_true_narrower_result(i16 %x) {		define i8 @low_bit_select_constants_bigger_true_narrower_result(i16 %x) {
; CHECK-LABEL: low_bit_select_constants_bigger_true_narrower_result:		; CHECK-LABEL: low_bit_select_constants_bigger_true_narrower_result:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: rlwinm 3, 3, 0, 31, 31		; CHECK-NEXT: clrldi 3, 3, 63
; CHECK-NEXT: subfic 3, 3, 41		; CHECK-NEXT: subfic 3, 3, 41
; CHECK-NEXT: blr		; CHECK-NEXT: blr
%a = and i16 %x, 1		%a = and i16 %x, 1
%c = icmp eq i16 %a, 0		%c = icmp eq i16 %a, 0
%r = select i1 %c, i8 41, i8 40		%r = select i1 %c, i8 41, i8 40
ret i8 %r		ret i8 %r
}		}

llvm/test/CodeGen/PowerPC/select_const.ll

[... 61 lines not shown ...]
; ALL-NEXT: blr		; ALL-NEXT: blr
%sel = select i1 %cond, i32 1, i32 0		%sel = select i1 %cond, i32 1, i32 0
ret i32 %sel		ret i32 %sel
}		}

; select Cond, 0, -1 --> sext (!Cond)		; select Cond, 0, -1 --> sext (!Cond)

define i32 @select_0_or_neg1(i1 %cond) {		define i32 @select_0_or_neg1(i1 %cond) {
; ISEL-LABEL: select_0_or_neg1:		; ALL-LABEL: select_0_or_neg1:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: not 3, 3
; ISEL-NEXT: andi. 3, 3, 1		; ALL-NEXT: clrldi 3, 3, 63
; ISEL-NEXT: oris 3, 4, 65535		; ALL-NEXT: neg 3, 3
; ISEL-NEXT: ori 3, 3, 65535		; ALL-NEXT: blr
; ISEL-NEXT: isel 3, 0, 3, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_0_or_neg1:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bc 12, 1, .LBB6_1
; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: .LBB6_1:
; NO_ISEL-NEXT: addi 3, 0, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 0, i32 -1		%sel = select i1 %cond, i32 0, i32 -1
ret i32 %sel		ret i32 %sel
}		}

define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {		define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
; ISEL-LABEL: select_0_or_neg1_zeroext:		; ALL-LABEL: select_0_or_neg1_zeroext:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: xori 3, 3, 1
; ISEL-NEXT: andi. 3, 3, 1		; ALL-NEXT: neg 3, 3
; ISEL-NEXT: oris 3, 4, 65535		; ALL-NEXT: blr
; ISEL-NEXT: ori 3, 3, 65535
; ISEL-NEXT: isel 3, 0, 3, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_0_or_neg1_zeroext:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bc 12, 1, .LBB7_1
; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: .LBB7_1:
; NO_ISEL-NEXT: addi 3, 0, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 0, i32 -1		%sel = select i1 %cond, i32 0, i32 -1
ret i32 %sel		ret i32 %sel
}		}

define i32 @select_0_or_neg1_signext(i1 signext %cond) {		define i32 @select_0_or_neg1_signext(i1 signext %cond) {
; ISEL-LABEL: select_0_or_neg1_signext:		; ALL-LABEL: select_0_or_neg1_signext:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: not 3, 3
; ISEL-NEXT: andi. 3, 3, 1		; ALL-NEXT: blr
; ISEL-NEXT: oris 3, 4, 65535
; ISEL-NEXT: ori 3, 3, 65535
; ISEL-NEXT: isel 3, 0, 3, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_0_or_neg1_signext:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bc 12, 1, .LBB8_1
; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: .LBB8_1:
; NO_ISEL-NEXT: addi 3, 0, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 0, i32 -1		%sel = select i1 %cond, i32 0, i32 -1
ret i32 %sel		ret i32 %sel
}		}

; select Cond, -1, 0 --> sext (Cond)		; select Cond, -1, 0 --> sext (Cond)

define i32 @select_neg1_or_0(i1 %cond) {		define i32 @select_neg1_or_0(i1 %cond) {
; ISEL-LABEL: select_neg1_or_0:		; ALL-LABEL: select_neg1_or_0:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: clrldi 3, 3, 63
; ISEL-NEXT: andi. 3, 3, 1		; ALL-NEXT: neg 3, 3
; ISEL-NEXT: oris 3, 4, 65535		; ALL-NEXT: blr
; ISEL-NEXT: ori 3, 3, 65535
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_neg1_or_0:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bclr 12, 1, 0
; NO_ISEL-NEXT: # %bb.1:
; NO_ISEL-NEXT: ori 3, 4, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 -1, i32 0		%sel = select i1 %cond, i32 -1, i32 0
ret i32 %sel		ret i32 %sel
}		}

define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) {		define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) {
; ISEL-LABEL: select_neg1_or_0_zeroext:		; ALL-LABEL: select_neg1_or_0_zeroext:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: neg 3, 3
; ISEL-NEXT: andi. 3, 3, 1		; ALL-NEXT: blr
; ISEL-NEXT: oris 3, 4, 65535
; ISEL-NEXT: ori 3, 3, 65535
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_neg1_or_0_zeroext:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bclr 12, 1, 0
; NO_ISEL-NEXT: # %bb.1:
; NO_ISEL-NEXT: ori 3, 4, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 -1, i32 0		%sel = select i1 %cond, i32 -1, i32 0
ret i32 %sel		ret i32 %sel
}		}

define i32 @select_neg1_or_0_signext(i1 signext %cond) {		define i32 @select_neg1_or_0_signext(i1 signext %cond) {
; ISEL-LABEL: select_neg1_or_0_signext:		; ALL-LABEL: select_neg1_or_0_signext:
; ISEL: # %bb.0:		; ALL: # %bb.0:
; ISEL-NEXT: li 4, 0		; ALL-NEXT: blr
; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: oris 3, 4, 65535
; ISEL-NEXT: ori 3, 3, 65535
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr
;
; NO_ISEL-LABEL: select_neg1_or_0_signext:
; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535
; NO_ISEL-NEXT: ori 3, 3, 65535
; NO_ISEL-NEXT: bclr 12, 1, 0
; NO_ISEL-NEXT: # %bb.1:
; NO_ISEL-NEXT: ori 3, 4, 0
; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i32 -1, i32 0		%sel = select i1 %cond, i32 -1, i32 0
ret i32 %sel		ret i32 %sel
}		}

; select Cond, C+1, C --> add (zext Cond), C		; select Cond, C+1, C --> add (zext Cond), C

define i32 @select_Cplus1_C(i1 %cond) {		define i32 @select_Cplus1_C(i1 %cond) {
; ALL-LABEL: select_Cplus1_C:		; ALL-LABEL: select_Cplus1_C:
[... 149 lines not shown ...]	; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = add i8 %sel, 5		%bo = add i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_sub_constant(i1 %cond) {		define i8 @sel_constants_sub_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_sub_constant:		; ISEL-LABEL: sel_constants_sub_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: li 4, 0
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: oris 3, 4, 65535		; ISEL-NEXT: li 4, -9
; ISEL-NEXT: li 4, 18		; ISEL-NEXT: li 3, 18
; ISEL-NEXT: ori 3, 3, 65527		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_sub_constant:		; NO_ISEL-LABEL: sel_constants_sub_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535		; NO_ISEL-NEXT: li 4, -9
; NO_ISEL-NEXT: li 4, 18		; NO_ISEL-NEXT: li 3, 18
; NO_ISEL-NEXT: ori 3, 3, 65527		; NO_ISEL-NEXT: bc 12, 1, .LBB22_1
; NO_ISEL-NEXT: bclr 12, 1, 0		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: # %bb.1:		; NO_ISEL-NEXT: .LBB22_1:
; NO_ISEL-NEXT: ori 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = sub i8 %sel, 5		%bo = sub i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_sub_constant_sel_constants(i1 %cond) {		define i8 @sel_constants_sub_constant_sel_constants(i1 %cond) {
; ISEL-LABEL: sel_constants_sub_constant_sel_constants:		; ISEL-LABEL: sel_constants_sub_constant_sel_constants:
[... 17 lines not shown ...]	; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 3		%sel = select i1 %cond, i8 -4, i8 3
%bo = sub i8 5, %sel		%bo = sub i8 5, %sel
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_mul_constant(i1 %cond) {		define i8 @sel_constants_mul_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_mul_constant:		; ISEL-LABEL: sel_constants_mul_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: lis 4, 16383
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: ori 3, 4, 65531		; ISEL-NEXT: li 4, -20
; ISEL-NEXT: li 4, 115		; ISEL-NEXT: li 3, 115
; ISEL-NEXT: sldi 3, 3, 2		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_mul_constant:		; NO_ISEL-LABEL: sel_constants_mul_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: lis 4, 16383
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: ori 3, 4, 65531		; NO_ISEL-NEXT: li 4, -20
; NO_ISEL-NEXT: li 4, 115		; NO_ISEL-NEXT: li 3, 115
; NO_ISEL-NEXT: sldi 3, 3, 2		; NO_ISEL-NEXT: bc 12, 1, .LBB24_1
; NO_ISEL-NEXT: bclr 12, 1, 0		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: # %bb.1:		; NO_ISEL-NEXT: .LBB24_1:
; NO_ISEL-NEXT: ori 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = mul i8 %sel, 5		%bo = mul i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_sdiv_constant(i1 %cond) {		define i8 @sel_constants_sdiv_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_sdiv_constant:		; ISEL-LABEL: sel_constants_sdiv_constant:
[... 83 lines not shown ...]	; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = udiv i8 120, %sel		%bo = udiv i8 120, %sel
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_srem_constant(i1 %cond) {		define i8 @sel_constants_srem_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_srem_constant:		; ISEL-LABEL: sel_constants_srem_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: lis 4, 16383
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: ori 3, 4, 65535		; ISEL-NEXT: li 4, -4
; ISEL-NEXT: li 4, 3		; ISEL-NEXT: li 3, 3
; ISEL-NEXT: sldi 3, 3, 2		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_srem_constant:		; NO_ISEL-LABEL: sel_constants_srem_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: lis 4, 16383
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: ori 3, 4, 65535		; NO_ISEL-NEXT: li 4, -4
; NO_ISEL-NEXT: li 4, 3		; NO_ISEL-NEXT: li 3, 3
; NO_ISEL-NEXT: sldi 3, 3, 2		; NO_ISEL-NEXT: bc 12, 1, .LBB29_1
; NO_ISEL-NEXT: bclr 12, 1, 0		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: # %bb.1:		; NO_ISEL-NEXT: .LBB29_1:
; NO_ISEL-NEXT: ori 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = srem i8 %sel, 5		%bo = srem i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @srem_constant_sel_constants(i1 %cond) {		define i8 @srem_constant_sel_constants(i1 %cond) {
; ISEL-LABEL: srem_constant_sel_constants:		; ISEL-LABEL: srem_constant_sel_constants:
[... 17 lines not shown ...]	; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 121, i8 23		%sel = select i1 %cond, i8 121, i8 23
%bo = srem i8 120, %sel		%bo = srem i8 120, %sel
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_urem_constant(i1 %cond) {		define i8 @sel_constants_urem_constant(i1 %cond) {
; ALL-LABEL: sel_constants_urem_constant:		; ALL-LABEL: sel_constants_urem_constant:
; ALL: # %bb.0:		; ALL: # %bb.0:
; ALL-NEXT: rlwinm 3, 3, 0, 31, 31		; ALL-NEXT: clrldi 3, 3, 63
; ALL-NEXT: subfic 3, 3, 3		; ALL-NEXT: subfic 3, 3, 3
; ALL-NEXT: blr		; ALL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = urem i8 %sel, 5		%bo = urem i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @urem_constant_sel_constants(i1 %cond) {		define i8 @urem_constant_sel_constants(i1 %cond) {
[... 18 lines not shown ...]	; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = urem i8 120, %sel		%bo = urem i8 120, %sel
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_and_constant(i1 %cond) {		define i8 @sel_constants_and_constant(i1 %cond) {
; ALL-LABEL: sel_constants_and_constant:		; ALL-LABEL: sel_constants_and_constant:
; ALL: # %bb.0:		; ALL: # %bb.0:
; ALL-NEXT: rlwinm 3, 3, 0, 31, 31		; ALL-NEXT: clrldi 3, 3, 63
; ALL-NEXT: subfic 3, 3, 5		; ALL-NEXT: subfic 3, 3, 5
; ALL-NEXT: blr		; ALL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = and i8 %sel, 5		%bo = and i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_or_constant(i1 %cond) {		define i8 @sel_constants_or_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_or_constant:		; ISEL-LABEL: sel_constants_or_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: li 4, 0
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: oris 3, 4, 65535		; ISEL-NEXT: li 4, -3
; ISEL-NEXT: li 4, 23		; ISEL-NEXT: li 3, 23
; ISEL-NEXT: ori 3, 3, 65533		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_or_constant:		; NO_ISEL-LABEL: sel_constants_or_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535		; NO_ISEL-NEXT: li 4, -3
; NO_ISEL-NEXT: li 4, 23		; NO_ISEL-NEXT: li 3, 23
; NO_ISEL-NEXT: ori 3, 3, 65533		; NO_ISEL-NEXT: bc 12, 1, .LBB34_1
; NO_ISEL-NEXT: bclr 12, 1, 0		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: # %bb.1:		; NO_ISEL-NEXT: .LBB34_1:
; NO_ISEL-NEXT: ori 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = or i8 %sel, 5		%bo = or i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_xor_constant(i1 %cond) {		define i8 @sel_constants_xor_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_xor_constant:		; ISEL-LABEL: sel_constants_xor_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: li 4, 0
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: oris 3, 4, 65535		; ISEL-NEXT: li 4, -7
; ISEL-NEXT: li 4, 18		; ISEL-NEXT: li 3, 18
; ISEL-NEXT: ori 3, 3, 65529		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: isel 3, 3, 4, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_xor_constant:		; NO_ISEL-LABEL: sel_constants_xor_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: li 4, 0
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: oris 3, 4, 65535		; NO_ISEL-NEXT: li 4, -7
; NO_ISEL-NEXT: li 4, 18		; NO_ISEL-NEXT: li 3, 18
; NO_ISEL-NEXT: ori 3, 3, 65529		; NO_ISEL-NEXT: bc 12, 1, .LBB35_1
; NO_ISEL-NEXT: bclr 12, 1, 0		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: # %bb.1:		; NO_ISEL-NEXT: .LBB35_1:
; NO_ISEL-NEXT: ori 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = xor i8 %sel, 5		%bo = xor i8 %sel, 5
ret i8 %bo		ret i8 %bo
}		}

define i8 @sel_constants_shl_constant(i1 %cond) {		define i8 @sel_constants_shl_constant(i1 %cond) {
; ISEL-LABEL: sel_constants_shl_constant:		; ISEL-LABEL: sel_constants_shl_constant:
; ISEL: # %bb.0:		; ISEL: # %bb.0:
; ISEL-NEXT: lis 4, 2047
; ISEL-NEXT: lis 5, 511
; ISEL-NEXT: andi. 3, 3, 1		; ISEL-NEXT: andi. 3, 3, 1
; ISEL-NEXT: ori 3, 4, 65535		; ISEL-NEXT: li 4, -128
; ISEL-NEXT: ori 4, 5, 65535		; ISEL-NEXT: li 3, -32
; ISEL-NEXT: sldi 3, 3, 5
; ISEL-NEXT: sldi 4, 4, 7
; ISEL-NEXT: isel 3, 4, 3, 1		; ISEL-NEXT: isel 3, 4, 3, 1
; ISEL-NEXT: blr		; ISEL-NEXT: blr
;		;
; NO_ISEL-LABEL: sel_constants_shl_constant:		; NO_ISEL-LABEL: sel_constants_shl_constant:
; NO_ISEL: # %bb.0:		; NO_ISEL: # %bb.0:
; NO_ISEL-NEXT: lis 4, 2047
; NO_ISEL-NEXT: lis 5, 511
; NO_ISEL-NEXT: andi. 3, 3, 1		; NO_ISEL-NEXT: andi. 3, 3, 1
; NO_ISEL-NEXT: ori 3, 4, 65535		; NO_ISEL-NEXT: li 4, -128
; NO_ISEL-NEXT: ori 4, 5, 65535		; NO_ISEL-NEXT: li 3, -32
; NO_ISEL-NEXT: sldi 3, 3, 5
; NO_ISEL-NEXT: sldi 4, 4, 7
; NO_ISEL-NEXT: bc 12, 1, .LBB36_1		; NO_ISEL-NEXT: bc 12, 1, .LBB36_1
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
; NO_ISEL-NEXT: .LBB36_1:		; NO_ISEL-NEXT: .LBB36_1:
; NO_ISEL-NEXT: addi 3, 4, 0		; NO_ISEL-NEXT: addi 3, 4, 0
; NO_ISEL-NEXT: blr		; NO_ISEL-NEXT: blr
%sel = select i1 %cond, i8 -4, i8 23		%sel = select i1 %cond, i8 -4, i8 23
%bo = shl i8 %sel, 5		%bo = shl i8 %sel, 5
ret i8 %bo		ret i8 %bo
[... 311 lines not shown ...]
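
The shorter CHECK sequences in select_const.ll rest on simple identities between a select of constants and arithmetic on the zero-extended condition bit, as stated in the test's own comments. The sketch below checks those identities in plain C++. The function names are hypothetical, and the code models only the arithmetic, not the PowerPC instructions.

```cpp
#include <cassert>
#include <cstdint>

// select cond, -1, 0  ==  -(zext cond)
inline std::int32_t selectNeg1Or0(bool cond) {
  return -static_cast<std::int32_t>(cond);
}

// select cond, 0, -1  ==  -(zext !cond)
inline std::int32_t select0OrNeg1(bool cond) {
  return -static_cast<std::int32_t>(!cond);
}

// select cond, C+1, C  ==  C + (zext cond)
inline std::int32_t selectCplus1OrC(bool cond, std::int32_t c) {
  return c + static_cast<std::int32_t>(cond);
}
```

Each helper produces the same value as the corresponding select for both values of `cond`, which is why a mask of the low bit followed by a negate or an add can stand in for a conditional move or branch.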

llvm/test/CodeGen/X86/avx512-insert-extract.ll

	[... 902 lines not shown ...]
	}			}

	define zeroext i8 @test_extractelement_v2i1(<2 x i64> %a, <2 x i64> %b) {			define zeroext i8 @test_extractelement_v2i1(<2 x i64> %a, <2 x i64> %b) {
	; KNL-LABEL: test_extractelement_v2i1:			; KNL-LABEL: test_extractelement_v2i1:
	; KNL: ## %bb.0:			; KNL: ## %bb.0:
	; KNL-NEXT: ## kill: def $xmm1 killed $xmm1 def $zmm1			; KNL-NEXT: ## kill: def $xmm1 killed $xmm1 def $zmm1
	; KNL-NEXT: ## kill: def $xmm0 killed $xmm0 def $zmm0			; KNL-NEXT: ## kill: def $xmm0 killed $xmm0 def $zmm0
	; KNL-NEXT: vpcmpnleuq %zmm1, %zmm0, %k0			; KNL-NEXT: vpcmpnleuq %zmm1, %zmm0, %k0
	; KNL-NEXT: kmovw %k0, %eax			; KNL-NEXT: kmovw %k0, %ecx
	; KNL-NEXT: andb $1, %al			; KNL-NEXT: andl $1, %ecx
	; KNL-NEXT: movb $4, %cl			; KNL-NEXT: movl $4, %eax
	; KNL-NEXT: subb %al, %cl			; KNL-NEXT: subl %ecx, %eax
	; KNL-NEXT: movzbl %cl, %eax
	; KNL-NEXT: vzeroupper			; KNL-NEXT: vzeroupper
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: test_extractelement_v2i1:			; SKX-LABEL: test_extractelement_v2i1:
	; SKX: ## %bb.0:			; SKX: ## %bb.0:
	; SKX-NEXT: vpcmpnleuq %xmm1, %xmm0, %k0			; SKX-NEXT: vpcmpnleuq %xmm1, %xmm0, %k0
	; SKX-NEXT: kmovd %k0, %eax			; SKX-NEXT: kmovd %k0, %ecx
	; SKX-NEXT: andb $1, %al			; SKX-NEXT: andl $1, %ecx
	; SKX-NEXT: movb $4, %cl			; SKX-NEXT: movl $4, %eax
	; SKX-NEXT: subb %al, %cl			; SKX-NEXT: subl %ecx, %eax
	; SKX-NEXT: movzbl %cl, %eax
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%t1 = icmp ugt <2 x i64> %a, %b			%t1 = icmp ugt <2 x i64> %a, %b
	%t2 = extractelement <2 x i1> %t1, i32 0			%t2 = extractelement <2 x i1> %t1, i32 0
	%res = select i1 %t2, i8 3, i8 4			%res = select i1 %t2, i8 3, i8 4
	ret i8 %res			ret i8 %res
	}			}

	define zeroext i8 @extractelement_v2i1_alt(<2 x i64> %a, <2 x i64> %b) {			define zeroext i8 @extractelement_v2i1_alt(<2 x i64> %a, <2 x i64> %b) {
	[... 84 lines not shown ...]
	; KNL: ## %bb.0:			; KNL: ## %bb.0:
	; KNL-NEXT: vpminub %ymm3, %ymm1, %ymm0			; KNL-NEXT: vpminub %ymm3, %ymm1, %ymm0
	; KNL-NEXT: vpcmpeqb %ymm0, %ymm1, %ymm0			; KNL-NEXT: vpcmpeqb %ymm0, %ymm1, %ymm0
	; KNL-NEXT: vextracti128 $1, %ymm0, %xmm0			; KNL-NEXT: vextracti128 $1, %ymm0, %xmm0
	; KNL-NEXT: vpternlogq $15, %zmm0, %zmm0, %zmm0			; KNL-NEXT: vpternlogq $15, %zmm0, %zmm0, %zmm0
	; KNL-NEXT: vpmovsxbd %xmm0, %zmm0			; KNL-NEXT: vpmovsxbd %xmm0, %zmm0
	; KNL-NEXT: vptestmd %zmm0, %zmm0, %k0			; KNL-NEXT: vptestmd %zmm0, %zmm0, %k0
	; KNL-NEXT: kshiftrw $15, %k0, %k0			; KNL-NEXT: kshiftrw $15, %k0, %k0
	; KNL-NEXT: kmovw %k0, %eax			; KNL-NEXT: kmovw %k0, %ecx
	; KNL-NEXT: andb $1, %al			; KNL-NEXT: andl $1, %ecx
	; KNL-NEXT: movb $4, %cl			; KNL-NEXT: movl $4, %eax
	; KNL-NEXT: subb %al, %cl			; KNL-NEXT: subl %ecx, %eax
	; KNL-NEXT: movzbl %cl, %eax
	; KNL-NEXT: vzeroupper			; KNL-NEXT: vzeroupper
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: test_extractelement_v64i1:			; SKX-LABEL: test_extractelement_v64i1:
	; SKX: ## %bb.0:			; SKX: ## %bb.0:
	; SKX-NEXT: vpcmpnleub %zmm1, %zmm0, %k0			; SKX-NEXT: vpcmpnleub %zmm1, %zmm0, %k0
	; SKX-NEXT: kshiftrq $63, %k0, %k0			; SKX-NEXT: kshiftrq $63, %k0, %k0
	; SKX-NEXT: kmovd %k0, %eax			; SKX-NEXT: kmovd %k0, %ecx
	; SKX-NEXT: andb $1, %al			; SKX-NEXT: andl $1, %ecx
	; SKX-NEXT: movb $4, %cl			; SKX-NEXT: movl $4, %eax
	; SKX-NEXT: subb %al, %cl			; SKX-NEXT: subl %ecx, %eax
	; SKX-NEXT: movzbl %cl, %eax
	; SKX-NEXT: vzeroupper			; SKX-NEXT: vzeroupper
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%t1 = icmp ugt <64 x i8> %a, %b			%t1 = icmp ugt <64 x i8> %a, %b
	%t2 = extractelement <64 x i1> %t1, i32 63			%t2 = extractelement <64 x i1> %t1, i32 63
	%res = select i1 %t2, i8 3, i8 4			%res = select i1 %t2, i8 3, i8 4
	ret i8 %res			ret i8 %res
	}			}

	[1,307 lines not shown]

llvm/test/CodeGen/X86/cmov-promotion.ll

	[9 lines not shown]
	; CMOV-NEXT: movl $237, %eax			; CMOV-NEXT: movl $237, %eax
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovnel %ecx, %eax
	; CMOV-NEXT: # kill: def $ax killed $ax killed $eax			; CMOV-NEXT: # kill: def $ax killed $ax killed $eax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_zpromotion_8_to_16:			; NO_CMOV-LABEL: cmov_zpromotion_8_to_16:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $117, %al			; NO_CMOV-NEXT: movl $117, %eax
	; NO_CMOV-NEXT: jne .LBB0_2			; NO_CMOV-NEXT: jne .LBB0_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movb $-19, %al			; NO_CMOV-NEXT: movl $237, %eax
	; NO_CMOV-NEXT: .LBB0_2:			; NO_CMOV-NEXT: .LBB0_2:
	; NO_CMOV-NEXT: movzbl %al, %eax
	; NO_CMOV-NEXT: # kill: def $ax killed $ax killed $eax			; NO_CMOV-NEXT: # kill: def $ax killed $ax killed $eax
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 117, i8 -19			%t0 = select i1 %c, i8 117, i8 -19
	%ret = zext i8 %t0 to i16			%ret = zext i8 %t0 to i16
	ret i16 %ret			ret i16 %ret
	}			}

	define i32 @cmov_zpromotion_8_to_32(i1 %c) {			define i32 @cmov_zpromotion_8_to_32(i1 %c) {
	; CMOV-LABEL: cmov_zpromotion_8_to_32:			; CMOV-LABEL: cmov_zpromotion_8_to_32:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $126, %ecx			; CMOV-NEXT: movl $126, %ecx
	; CMOV-NEXT: movl $255, %eax			; CMOV-NEXT: movl $255, %eax
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovnel %ecx, %eax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_zpromotion_8_to_32:			; NO_CMOV-LABEL: cmov_zpromotion_8_to_32:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $126, %al			; NO_CMOV-NEXT: movl $126, %eax
	; NO_CMOV-NEXT: jne .LBB1_2			; NO_CMOV-NEXT: jne .LBB1_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movb $-1, %al			; NO_CMOV-NEXT: movl $255, %eax
	; NO_CMOV-NEXT: .LBB1_2:			; NO_CMOV-NEXT: .LBB1_2:
	; NO_CMOV-NEXT: movzbl %al, %eax
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 12414, i8 -1			%t0 = select i1 %c, i8 12414, i8 -1
	%ret = zext i8 %t0 to i32			%ret = zext i8 %t0 to i32
	ret i32 %ret			ret i32 %ret
	}			}

	define i64 @cmov_zpromotion_8_to_64(i1 %c) {			define i64 @cmov_zpromotion_8_to_64(i1 %c) {
	; CMOV-LABEL: cmov_zpromotion_8_to_64:			; CMOV-LABEL: cmov_zpromotion_8_to_64:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $126, %ecx			; CMOV-NEXT: movl $126, %ecx
	; CMOV-NEXT: movl $255, %eax			; CMOV-NEXT: movl $255, %eax
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovneq %rcx, %rax
				craig.topper: The isZExtFree check isn't enough to fix this :( It's an i8->i64 zext, which isn't free. I guess we'll have to handle this in the x86 backend.
				steven.zhang (author): Yes. The hook only checks the types, i32->i64. Do we need to pass the value instead of the type to isZExtFree to fix this issue?
				spatel: I don't think changing the TLI call will be enough to solve the general problem (the IR is likely already in the form that we're trying to avoid).
				steven.zhang (author): OK. That sounds like a separate issue.
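The diff that remains on the CHECK line above swaps `cmovnel %ecx, %eax` for `cmovneq %rcx, %rax`. As a rough illustration of what that costs, the sketch below lists the two encodings side by side. The byte values are hand-assembled from the `CMOVNE` opcode (`0F 45 /r`) and are not taken from this review, so treat them as an assumption to check against a real assembler; the helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hand-assembled encodings (illustrative, not copied from the review).
// ModRM 0xC1 = mod 11, reg 000 (eax/rax, destination), rm 001 (ecx/rcx, source).
constexpr std::uint8_t kCmovnel[] = {0x0F, 0x45, 0xC1};        // cmovnel %ecx, %eax
constexpr std::uint8_t kCmovneq[] = {0x48, 0x0F, 0x45, 0xC1};  // cmovneq %rcx, %rax

// REX prefixes occupy 0x40..0x4F; the W bit (bit 3) selects 64-bit operand
// size, so a REX.W prefix is any byte in 0x48..0x4F.
constexpr bool hasRexW(std::uint8_t firstByte) {
  return (firstByte & 0xF8) == 0x48;
}

// Extra bytes the 64-bit form needs compared with the 32-bit form.
constexpr std::size_t encodingSizeDelta() {
  return sizeof(kCmovneq) - sizeof(kCmovnel);
}
```

Under these assumptions the only difference between the two forms is the leading REX.W byte, which is the one extra byte mentioned in this review.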
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_zpromotion_8_to_64:			; NO_CMOV-LABEL: cmov_zpromotion_8_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $126, %al			; NO_CMOV-NEXT: movl $126, %eax
	; NO_CMOV-NEXT: jne .LBB2_2			; NO_CMOV-NEXT: jne .LBB2_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movb $-1, %al			; NO_CMOV-NEXT: movl $255, %eax
	; NO_CMOV-NEXT: .LBB2_2:			; NO_CMOV-NEXT: .LBB2_2:
	; NO_CMOV-NEXT: movzbl %al, %eax
	; NO_CMOV-NEXT: xorl %edx, %edx			; NO_CMOV-NEXT: xorl %edx, %edx
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 12414, i8 -1			%t0 = select i1 %c, i8 12414, i8 -1
	%ret = zext i8 %t0 to i64			%ret = zext i8 %t0 to i64
	ret i64 %ret			ret i64 %ret
	}			}

	define i32 @cmov_zpromotion_16_to_32(i1 %c) {			define i32 @cmov_zpromotion_16_to_32(i1 %c) {
	[20 lines not shown]
	}			}

	define i64 @cmov_zpromotion_16_to_64(i1 %c) {			define i64 @cmov_zpromotion_16_to_64(i1 %c) {
	; CMOV-LABEL: cmov_zpromotion_16_to_64:			; CMOV-LABEL: cmov_zpromotion_16_to_64:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E			; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E
	; CMOV-NEXT: movl $65535, %eax # imm = 0xFFFF			; CMOV-NEXT: movl $65535, %eax # imm = 0xFFFF
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovneq %rcx, %rax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_zpromotion_16_to_64:			; NO_CMOV-LABEL: cmov_zpromotion_16_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E			; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E
	; NO_CMOV-NEXT: jne .LBB4_2			; NO_CMOV-NEXT: jne .LBB4_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movl $65535, %eax # imm = 0xFFFF			; NO_CMOV-NEXT: movl $65535, %eax # imm = 0xFFFF
	; NO_CMOV-NEXT: .LBB4_2:			; NO_CMOV-NEXT: .LBB4_2:
	; NO_CMOV-NEXT: xorl %edx, %edx			; NO_CMOV-NEXT: xorl %edx, %edx
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i16 12414, i16 -1			%t0 = select i1 %c, i16 12414, i16 -1
	%ret = zext i16 %t0 to i64			%ret = zext i16 %t0 to i64
	ret i64 %ret			ret i64 %ret
	}			}

	define i64 @cmov_zpromotion_32_to_64(i1 %c) {			define i64 @cmov_zpromotion_32_to_64(i1 %c) {
	; CMOV-LABEL: cmov_zpromotion_32_to_64:			; CMOV-LABEL: cmov_zpromotion_32_to_64:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E			; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E
	; CMOV-NEXT: movl $-1, %eax			; CMOV-NEXT: movl $-1, %eax
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovnel %ecx, %eax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
				craig.topper: This is slightly worse. Maybe don't do this when zext is free?
				steven.zhang (author): I checked the scheduling information for cmovneq and cmovnel; both have a latency of 1. I didn't catch what you mean by "slightly worse". Could you explain it more, as I am new to X86 instructions? Thank you!
				craig.topper: Cmovneq's encoding is 1 byte longer than cmovnel. The 64-bit size requires a REX prefix.
				steven.zhang (author): I wonder if we can do this inside the X86 target, as it seems a valid improvement for x86. For cmovneq, if the high 32 bits are zero, use cmovnel?
				spatel: The problem is larger than just this transform or x86. We do the same transform in instcombine, so we need to check constant values to reverse it. But there's no reason to make the problem worse by not using the existing TLI hook suggested by Craig. If we just add one more clause to the 'if' check, we avoid this test diff without changing any others: if (isa<ConstantSDNode>(Op1) && isa<ConstantSDNode>(Op2) && (Opcode != ISD::ZERO_EXTEND || !TLI.isZExtFree(N0.getValueType(), VT))) {
				steven.zhang (author): Ah, sorry, I didn't get the point. Sure, it makes sense.
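The effect of the extra clause discussed in this thread can be modelled outside LLVM. The sketch below is a toy stand-in, not the DAGCombiner code: `ExtOpcode`, `isZExtFreeModel`, and `shouldFoldExtOfSelect` are hypothetical names, and the model assumes (as stated elsewhere in this review) that the x86-64 hook reports only an i32 -> i64 zero extension as free.

```cpp
// Toy stand-ins for the ISD extension opcodes; not LLVM's real types.
enum class ExtOpcode { SignExtend, ZeroExtend, AnyExtend };

// Hypothetical model of a type-based hook in the spirit of
// TLI.isZExtFree(SrcVT, DstVT): only i32 -> i64 is treated as free.
constexpr bool isZExtFreeModel(unsigned srcBits, unsigned dstBits) {
  return srcBits == 32 && dstBits == 64;
}

// Mirrors the shape of the proposed 'if': fold ext(select c, C1, C2) only
// when both operands are constants and, for zero extension, the extension
// would not be free anyway.
constexpr bool shouldFoldExtOfSelect(ExtOpcode opcode, bool bothOperandsConstant,
                                     unsigned srcBits, unsigned dstBits) {
  return bothOperandsConstant &&
         (opcode != ExtOpcode::ZeroExtend || !isZExtFreeModel(srcBits, dstBits));
}
```

In this model a zext from i32 to i64 is left alone, which matches the unchanged `cmovnel` in cmov_zpromotion_32_to_64, while a zext from i8 to i64 is still folded because the type-based hook does not consider it free.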
	;			;
	; NO_CMOV-LABEL: cmov_zpromotion_32_to_64:			; NO_CMOV-LABEL: cmov_zpromotion_32_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E			; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E
	; NO_CMOV-NEXT: jne .LBB5_2			; NO_CMOV-NEXT: jne .LBB5_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movl $-1, %eax			; NO_CMOV-NEXT: movl $-1, %eax
	[13 lines not shown]
	; CMOV-NEXT: movl $65517, %eax # imm = 0xFFED			; CMOV-NEXT: movl $65517, %eax # imm = 0xFFED
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovnel %ecx, %eax
	; CMOV-NEXT: # kill: def $ax killed $ax killed $eax			; CMOV-NEXT: # kill: def $ax killed $ax killed $eax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_spromotion_8_to_16:			; NO_CMOV-LABEL: cmov_spromotion_8_to_16:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $117, %al			; NO_CMOV-NEXT: movl $117, %eax
	; NO_CMOV-NEXT: jne .LBB6_2			; NO_CMOV-NEXT: jne .LBB6_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movb $-19, %al			; NO_CMOV-NEXT: movl $65517, %eax # imm = 0xFFED
	; NO_CMOV-NEXT: .LBB6_2:			; NO_CMOV-NEXT: .LBB6_2:
	; NO_CMOV-NEXT: movsbl %al, %eax
	; NO_CMOV-NEXT: # kill: def $ax killed $ax killed $eax			; NO_CMOV-NEXT: # kill: def $ax killed $ax killed $eax
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 117, i8 -19			%t0 = select i1 %c, i8 117, i8 -19
	%ret = sext i8 %t0 to i16			%ret = sext i8 %t0 to i16
	ret i16 %ret			ret i16 %ret
	}			}
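The new immediates in the 8-to-16 NO_CMOV checks above ($237 for the zext case, $65517 for the sext case) are what one would expect from extending the i8 select operands at compile time instead of emitting movzbl/movsbl afterwards. A minimal sketch of that arithmetic follows; the helper names are hypothetical and the code only reproduces the integer conversions, not anything in the DAGCombiner.

```cpp
#include <cstdint>

// Zero-extend an i8 value to 16 bits: reinterpret as unsigned, then widen.
constexpr std::uint16_t zext8to16(std::int8_t v) {
  return static_cast<std::uint8_t>(v);
}

// Sign-extend an i8 value to 16 bits, returned as the raw 16-bit pattern.
constexpr std::uint16_t sext8to16(std::int8_t v) {
  return static_cast<std::uint16_t>(static_cast<std::int16_t>(v));
}

// Low 8 bits of a wider literal, e.g. the 'i8 12414' operand used in this test.
constexpr std::uint8_t lowByte(std::uint32_t v) {
  return static_cast<std::uint8_t>(v & 0xFF);
}
```

With these helpers, `zext8to16(-19)` is 237 and `sext8to16(-19)` is 65517 (0xFFED), and `lowByte(12414)` is 126, which is consistent with the `$126` immediates shown for the `i8 12414` operand.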

	define i32 @cmov_spromotion_8_to_32(i1 %c) {			define i32 @cmov_spromotion_8_to_32(i1 %c) {
	; CMOV-LABEL: cmov_spromotion_8_to_32:			; CMOV-LABEL: cmov_spromotion_8_to_32:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $126, %ecx			; CMOV-NEXT: movl $126, %ecx
	; CMOV-NEXT: movl $-1, %eax			; CMOV-NEXT: movl $-1, %eax
	; CMOV-NEXT: cmovnel %ecx, %eax			; CMOV-NEXT: cmovnel %ecx, %eax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_spromotion_8_to_32:			; NO_CMOV-LABEL: cmov_spromotion_8_to_32:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $126, %al			; NO_CMOV-NEXT: movl $126, %eax
	; NO_CMOV-NEXT: jne .LBB7_2			; NO_CMOV-NEXT: jne .LBB7_2
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movb $-1, %al			; NO_CMOV-NEXT: movl $-1, %eax
	; NO_CMOV-NEXT: .LBB7_2:			; NO_CMOV-NEXT: .LBB7_2:
	; NO_CMOV-NEXT: movsbl %al, %eax
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 12414, i8 -1			%t0 = select i1 %c, i8 12414, i8 -1
	%ret = sext i8 %t0 to i32			%ret = sext i8 %t0 to i32
	ret i32 %ret			ret i32 %ret
	}			}

	define i64 @cmov_spromotion_8_to_64(i1 %c) {			define i64 @cmov_spromotion_8_to_64(i1 %c) {
	; CMOV-LABEL: cmov_spromotion_8_to_64:			; CMOV-LABEL: cmov_spromotion_8_to_64:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $126, %ecx			; CMOV-NEXT: movl $126, %ecx
	; CMOV-NEXT: movq $-1, %rax			; CMOV-NEXT: movq $-1, %rax
	; CMOV-NEXT: cmovneq %rcx, %rax			; CMOV-NEXT: cmovneq %rcx, %rax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_spromotion_8_to_64:			; NO_CMOV-LABEL: cmov_spromotion_8_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movb $126, %al			; NO_CMOV-NEXT: jne .LBB8_1
	; NO_CMOV-NEXT: jne .LBB8_2			; NO_CMOV-NEXT: # %bb.2:
	; NO_CMOV-NEXT: # %bb.1:			; NO_CMOV-NEXT: movl $-1, %eax
	; NO_CMOV-NEXT: movb $-1, %al			; NO_CMOV-NEXT: movl $-1, %edx
	; NO_CMOV-NEXT: .LBB8_2:			; NO_CMOV-NEXT: retl
	; NO_CMOV-NEXT: movsbl %al, %eax			; NO_CMOV-NEXT: .LBB8_1:
	; NO_CMOV-NEXT: movl %eax, %edx			; NO_CMOV-NEXT: xorl %edx, %edx
	; NO_CMOV-NEXT: sarl $31, %edx			; NO_CMOV-NEXT: movl $126, %eax
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i8 12414, i8 -1			%t0 = select i1 %c, i8 12414, i8 -1
	%ret = sext i8 %t0 to i64			%ret = sext i8 %t0 to i64
	ret i64 %ret			ret i64 %ret
	}			}

	define i32 @cmov_spromotion_16_to_32(i1 %c) {			define i32 @cmov_spromotion_16_to_32(i1 %c) {
	; CMOV-LABEL: cmov_spromotion_16_to_32:			; CMOV-LABEL: cmov_spromotion_16_to_32:
	[25 lines not shown]
	; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E			; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E
	; CMOV-NEXT: movq $-1, %rax			; CMOV-NEXT: movq $-1, %rax
	; CMOV-NEXT: cmovneq %rcx, %rax			; CMOV-NEXT: cmovneq %rcx, %rax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_spromotion_16_to_64:			; NO_CMOV-LABEL: cmov_spromotion_16_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E			; NO_CMOV-NEXT: jne .LBB10_1
	; NO_CMOV-NEXT: jne .LBB10_2			; NO_CMOV-NEXT: # %bb.2:
	; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movl $-1, %eax			; NO_CMOV-NEXT: movl $-1, %eax
	; NO_CMOV-NEXT: .LBB10_2:			; NO_CMOV-NEXT: movl $-1, %edx
	; NO_CMOV-NEXT: movl %eax, %edx			; NO_CMOV-NEXT: retl
	; NO_CMOV-NEXT: sarl $31, %edx			; NO_CMOV-NEXT: .LBB10_1:
				; NO_CMOV-NEXT: xorl %edx, %edx
				; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i16 12414, i16 -1			%t0 = select i1 %c, i16 12414, i16 -1
	%ret = sext i16 %t0 to i64			%ret = sext i16 %t0 to i64
	ret i64 %ret			ret i64 %ret
	}			}

	define i64 @cmov_spromotion_32_to_64(i1 %c) {			define i64 @cmov_spromotion_32_to_64(i1 %c) {
	; CMOV-LABEL: cmov_spromotion_32_to_64:			; CMOV-LABEL: cmov_spromotion_32_to_64:
	; CMOV: # %bb.0:			; CMOV: # %bb.0:
	; CMOV-NEXT: testb $1, %dil			; CMOV-NEXT: testb $1, %dil
	; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E			; CMOV-NEXT: movl $12414, %ecx # imm = 0x307E
	; CMOV-NEXT: movq $-1, %rax			; CMOV-NEXT: movq $-1, %rax
	; CMOV-NEXT: cmovneq %rcx, %rax			; CMOV-NEXT: cmovneq %rcx, %rax
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NO_CMOV-LABEL: cmov_spromotion_32_to_64:			; NO_CMOV-LABEL: cmov_spromotion_32_to_64:
	; NO_CMOV: # %bb.0:			; NO_CMOV: # %bb.0:
	; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)			; NO_CMOV-NEXT: testb $1, {{[0-9]+}}(%esp)
	; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E			; NO_CMOV-NEXT: jne .LBB11_1
	; NO_CMOV-NEXT: jne .LBB11_2			; NO_CMOV-NEXT: # %bb.2:
	; NO_CMOV-NEXT: # %bb.1:
	; NO_CMOV-NEXT: movl $-1, %eax			; NO_CMOV-NEXT: movl $-1, %eax
	; NO_CMOV-NEXT: .LBB11_2:			; NO_CMOV-NEXT: movl $-1, %edx
	; NO_CMOV-NEXT: movl %eax, %edx			; NO_CMOV-NEXT: retl
	; NO_CMOV-NEXT: sarl $31, %edx			; NO_CMOV-NEXT: .LBB11_1:
				; NO_CMOV-NEXT: xorl %edx, %edx
				; NO_CMOV-NEXT: movl $12414, %eax # imm = 0x307E
	; NO_CMOV-NEXT: retl			; NO_CMOV-NEXT: retl
	%t0 = select i1 %c, i32 12414, i32 -1			%t0 = select i1 %c, i32 12414, i32 -1
	%ret = sext i32 %t0 to i64			%ret = sext i32 %t0 to i64
	ret i64 %ret			ret i64 %ret
	}			}

llvm/test/CodeGen/X86/select.ll

	[49 lines not shown]
	; PR2139			; PR2139
	define i32 @test2() nounwind {			define i32 @test2() nounwind {
	; GENERIC-LABEL: test2:			; GENERIC-LABEL: test2:
	; GENERIC: ## %bb.0: ## %entry			; GENERIC: ## %bb.0: ## %entry
	; GENERIC-NEXT: pushq %rax			; GENERIC-NEXT: pushq %rax
	; GENERIC-NEXT: callq _return_false			; GENERIC-NEXT: callq _return_false
	; GENERIC-NEXT: xorl %ecx, %ecx			; GENERIC-NEXT: xorl %ecx, %ecx
	; GENERIC-NEXT: testb $1, %al			; GENERIC-NEXT: testb $1, %al
	; GENERIC-NEXT: movl $-480, %eax ## imm = 0xFE20			; GENERIC-NEXT: movl $-3840, %eax ## imm = 0xF100
	; GENERIC-NEXT: cmovnel %ecx, %eax			; GENERIC-NEXT: cmovnel %ecx, %eax
	; GENERIC-NEXT: shll $3, %eax
	; GENERIC-NEXT: cmpl $32768, %eax ## imm = 0x8000			; GENERIC-NEXT: cmpl $32768, %eax ## imm = 0x8000
	; GENERIC-NEXT: jge LBB1_1			; GENERIC-NEXT: jge LBB1_1
	; GENERIC-NEXT: ## %bb.2: ## %bb91			; GENERIC-NEXT: ## %bb.2: ## %bb91
	; GENERIC-NEXT: xorl %eax, %eax			; GENERIC-NEXT: xorl %eax, %eax
	; GENERIC-NEXT: popq %rcx			; GENERIC-NEXT: popq %rcx
	; GENERIC-NEXT: retq			; GENERIC-NEXT: retq
	; GENERIC-NEXT: LBB1_1: ## %bb90			; GENERIC-NEXT: LBB1_1: ## %bb90
	; GENERIC-NEXT: ud2			; GENERIC-NEXT: ud2
	;			;
	; ATOM-LABEL: test2:			; ATOM-LABEL: test2:
	; ATOM: ## %bb.0: ## %entry			; ATOM: ## %bb.0: ## %entry
	; ATOM-NEXT: pushq %rax			; ATOM-NEXT: pushq %rax
	; ATOM-NEXT: callq _return_false			; ATOM-NEXT: callq _return_false
	; ATOM-NEXT: xorl %ecx, %ecx			; ATOM-NEXT: xorl %ecx, %ecx
	; ATOM-NEXT: movl $-480, %edx ## imm = 0xFE20			; ATOM-NEXT: movl $-3840, %edx ## imm = 0xF100
	; ATOM-NEXT: testb $1, %al			; ATOM-NEXT: testb $1, %al
	; ATOM-NEXT: cmovnel %ecx, %edx			; ATOM-NEXT: cmovnel %ecx, %edx
	; ATOM-NEXT: shll $3, %edx
	; ATOM-NEXT: cmpl $32768, %edx ## imm = 0x8000			; ATOM-NEXT: cmpl $32768, %edx ## imm = 0x8000
	; ATOM-NEXT: jge LBB1_1			; ATOM-NEXT: jge LBB1_1
	; ATOM-NEXT: ## %bb.2: ## %bb91			; ATOM-NEXT: ## %bb.2: ## %bb91
	; ATOM-NEXT: xorl %eax, %eax			; ATOM-NEXT: xorl %eax, %eax
	; ATOM-NEXT: popq %rcx			; ATOM-NEXT: popq %rcx
	; ATOM-NEXT: retq			; ATOM-NEXT: retq
	; ATOM-NEXT: LBB1_1: ## %bb90			; ATOM-NEXT: LBB1_1: ## %bb90
	; ATOM-NEXT: ud2			; ATOM-NEXT: ud2
	;			;
	; ATHLON-LABEL: test2:			; ATHLON-LABEL: test2:
	; ATHLON: ## %bb.0: ## %entry			; ATHLON: ## %bb.0: ## %entry
	; ATHLON-NEXT: subl $12, %esp			; ATHLON-NEXT: subl $12, %esp
	; ATHLON-NEXT: calll _return_false			; ATHLON-NEXT: calll _return_false
	; ATHLON-NEXT: xorl %ecx, %ecx			; ATHLON-NEXT: xorl %ecx, %ecx
	; ATHLON-NEXT: testb $1, %al			; ATHLON-NEXT: testb $1, %al
	; ATHLON-NEXT: movl $-480, %eax ## imm = 0xFE20			; ATHLON-NEXT: movl $-3840, %eax ## imm = 0xF100
	; ATHLON-NEXT: cmovnel %ecx, %eax			; ATHLON-NEXT: cmovnel %ecx, %eax
	; ATHLON-NEXT: shll $3, %eax
	; ATHLON-NEXT: cmpl $32768, %eax ## imm = 0x8000			; ATHLON-NEXT: cmpl $32768, %eax ## imm = 0x8000
	; ATHLON-NEXT: jge LBB1_1			; ATHLON-NEXT: jge LBB1_1
	; ATHLON-NEXT: ## %bb.2: ## %bb91			; ATHLON-NEXT: ## %bb.2: ## %bb91
	; ATHLON-NEXT: xorl %eax, %eax			; ATHLON-NEXT: xorl %eax, %eax
	; ATHLON-NEXT: addl $12, %esp			; ATHLON-NEXT: addl $12, %esp
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	; ATHLON-NEXT: LBB1_1: ## %bb90			; ATHLON-NEXT: LBB1_1: ## %bb90
	; ATHLON-NEXT: ud2			; ATHLON-NEXT: ud2
	;			;
	; MCU-LABEL: test2:			; MCU-LABEL: test2:
	; MCU: # %bb.0: # %entry			; MCU: # %bb.0: # %entry
	; MCU-NEXT: calll return_false			; MCU-NEXT: calll return_false
	; MCU-NEXT: xorl %ecx, %ecx			; MCU-NEXT: xorl %ecx, %ecx
	; MCU-NEXT: testb $1, %al			; MCU-NEXT: testb $1, %al
	; MCU-NEXT: jne .LBB1_2			; MCU-NEXT: jne .LBB1_2
	; MCU-NEXT: # %bb.1: # %entry			; MCU-NEXT: # %bb.1: # %entry
	; MCU-NEXT: movl $-480, %ecx # imm = 0xFE20			; MCU-NEXT: movl $-3840, %ecx # imm = 0xF100
	; MCU-NEXT: .LBB1_2: # %entry			; MCU-NEXT: .LBB1_2: # %entry
	; MCU-NEXT: shll $3, %ecx
	; MCU-NEXT: cmpl $32768, %ecx # imm = 0x8000			; MCU-NEXT: cmpl $32768, %ecx # imm = 0x8000
	; MCU-NEXT: jge .LBB1_3			; MCU-NEXT: jge .LBB1_3
	; MCU-NEXT: # %bb.4: # %bb91			; MCU-NEXT: # %bb.4: # %bb91
	; MCU-NEXT: xorl %eax, %eax			; MCU-NEXT: xorl %eax, %eax
	; MCU-NEXT: retl			; MCU-NEXT: retl
	; MCU-NEXT: .LBB1_3: # %bb90			; MCU-NEXT: .LBB1_3: # %bb90
	entry:			entry:
	%tmp73 = tail call i1 @return_false()			%tmp73 = tail call i1 @return_false()
	[1,412 lines not shown]

Revision Contents

Diff 205512

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

llvm/test/CodeGen/PowerPC/bool-math.ll

llvm/test/CodeGen/PowerPC/select_const.ll

llvm/test/CodeGen/X86/avx512-insert-extract.ll

llvm/test/CodeGen/X86/cmov-promotion.ll

llvm/test/CodeGen/X86/select.ll
