This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] add missing folds for scalar select of {-1,0,1}
Closed · Public

Authored by spatel on Feb 20 2017, 2:32 PM.

Details

Reviewers

jholewinski
t.p.northover
jlebar
craig.topper
kparzysz
arsenm
hfinkel
tstellar
efriedma

Commits

rG832b1622d8b4: [DAGCombiner] add missing folds for scalar select of {-1,0,1}
rL296137: [DAGCombiner] add missing folds for scalar select of {-1,0,1}

Summary

I think all of the test changes here are wins or neutral, but anyone who knows AMDGPU, Hexagon, and NVPTX should take a look at those diffs. If it makes it easier, I can post the full before/after asm of the affected tests for those targets.

The motivation for filling out these select-of-constants cases goes back to D24480, where we discussed removing an IR fold that turns add(zext) into a select. That discussion in turn goes back to:
https://reviews.llvm.org/rL75531
https://reviews.llvm.org/rL159230

The idea is that we should always canonicalize patterns like this to a select-of-constants in IR because that's the smallest IR and the best for value tracking. Note that we currently do the opposite in some cases (like the cases in *this* patch). I.e., the proposed folds in this patch already exist in InstCombine today:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151

As this patch shows, most targets generate better machine code for simple ext/add/not ops than for a select of constants. So the follow-up steps to make this less of a patchwork of special-case folds and missing IR canonicalization are:

1. Have DAGCombiner convert any select of constants into ext/add/not ops.
2. Have InstCombine canonicalize in the other direction (create more selects).
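
To make the four {-1,0,1} folds in this patch concrete, here is a minimal C++ sketch that models an i1 condition as `bool` and the 32-bit result as `int32_t`, then checks each select-of-constants against its ext/not form. It uses no LLVM API; the names `selectOf`, `zextI1`, `sextI1`, and `checkSelectFolds` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical scalar models; none of these names come from LLVM.
// Models: select Cond, T, F
int32_t selectOf(bool cond, int32_t t, int32_t f) { return cond ? t : f; }
// Models: zext i1 to i32 (true becomes 1).
int32_t zextI1(bool b) { return b ? 1 : 0; }
// Models: sext i1 to i32 (true becomes -1, i.e. all ones).
int32_t sextI1(bool b) { return b ? -1 : 0; }

// Asserts that each select form equals its folded form for both
// possible values of the condition.
void checkSelectFolds() {
  for (bool cond : {false, true}) {
    assert(selectOf(cond, 0, 1) == zextI1(!cond));   // select Cond, 0, 1  --> zext (!Cond)
    assert(selectOf(cond, 0, -1) == sextI1(!cond));  // select Cond, 0, -1 --> sext (!Cond)
    assert(selectOf(cond, 1, 0) == zextI1(cond));    // select Cond, 1, 0  --> zext (Cond)
    assert(selectOf(cond, -1, 0) == sextI1(cond));   // select Cond, -1, 0 --> sext (Cond)
  }
}
```

Calling `checkSelectFolds()` from a test driver exercises all eight cases. This only checks the arithmetic identity; it says nothing about which form a given target lowers more cheaply.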

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Feb 20 2017, 2:32 PM

Herald added subscribers: tpr, nhaehnle, wdng and 2 others. Feb 20 2017, 2:32 PM

NVPTX test change is fine.

This revision is now accepted and ready to land. Feb 20 2017, 3:34 PM

The Hexagon changes are fine as well.

@tstellar @arsenm or anyone else with AMDGPU experience, do the test/CodeGen/AMDGPU/trunc.ll diffs look ok?

tstellar accepted this revision. Feb 23 2017, 10:14 AM

Closed by commit rL296137: [DAGCombiner] add missing folds for scalar select of {-1,0,1} (authored by spatel). Feb 24 2017, 9:29 AM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D30502: [DAGCombiner] fold binops with constant into select-of-constants. Mar 1 2017, 8:35 AM

spatel mentioned this in D31944: [DAGCombiner] add (sext i1 X), 1 --> zext (not i1 X). Apr 11 2017, 9:03 AM

spatel mentioned this in rL301457: [DAGCombiner] add (sext i1 X), 1 --> zext (not i1 X). Apr 26 2017, 1:40 PM
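
The follow-up fold named in D31944, add (sext i1 X), 1 --> zext (not i1 X), can be checked with the same kind of scalar model. The sketch below is illustrative only; `sextBit`, `zextBit`, and `checkAddSextOne` are hypothetical names, and the code does not reflect how DAGCombiner implements the fold.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical models of sext/zext from i1 to i32.
int32_t sextBit(bool x) { return x ? -1 : 0; }
int32_t zextBit(bool x) { return x ? 1 : 0; }

// add (sext i1 X), 1 --> zext (not i1 X), checked for both values of X.
void checkAddSextOne() {
  assert(sextBit(false) + 1 == zextBit(!false));  // 0 + 1 equals zext(true)
  assert(sextBit(true) + 1 == zextBit(!true));    // -1 + 1 equals zext(false)
}
```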

Revision Contents

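Path                                                   Size
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp    35 lines
llvm/trunk/test/CodeGen/AMDGPU/trunc.ll                12 lines
llvm/trunk/test/CodeGen/ARM/select_const.ll            35 lines
llvm/trunk/test/CodeGen/Hexagon/adde.ll                9 lines
llvm/trunk/test/CodeGen/Hexagon/sube.ll                8 lines
llvm/trunk/test/CodeGen/NVPTX/add-128bit.ll            2 lines
llvm/trunk/test/CodeGen/PowerPC/select_const.ll        65 lines
llvm/trunk/test/CodeGen/X86/select_const.ll            18 lines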

Diff 89684

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[... 5,593 lines not shown; context: SDValue DAGCombiner::foldSelectOfConstants(SDNode *N) { ...]
   SDLoc DL(N);

   if (!VT.isInteger())
     return SDValue();

   if (!isa<ConstantSDNode>(N1) || !isa<ConstantSDNode>(N2))
     return SDValue();

-  // TODO: We should handle other cases of selecting between {-1,0,1} here.
-  if (CondVT == MVT::i1) {
+  // Only do this before legalization to avoid conflicting with target-specific
+  // transforms in the other direction (create a select from a zext/sext). There
+  // is also a target-independent combine here in DAGCombiner in the other
+  // direction for (select Cond, -1, 0) when the condition is not i1.
+  // TODO: This could be generalized for any 2 constants that differ by 1:
+  // add ({s/z}ext Cond), C
+  if (CondVT == MVT::i1 && !LegalOperations) {
     if (isNullConstant(N1) && isOneConstant(N2)) {
       // select Cond, 0, 1 --> zext (!Cond)
       SDValue NotCond = DAG.getNOT(DL, Cond, MVT::i1);
       if (VT != MVT::i1)
         NotCond = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, NotCond);
       return NotCond;
     }
+    if (isNullConstant(N1) && isAllOnesConstant(N2)) {
+      // select Cond, 0, -1 --> sext (!Cond)
+      SDValue NotCond = DAG.getNOT(DL, Cond, MVT::i1);
+      if (VT != MVT::i1)
+        NotCond = DAG.getNode(ISD::SIGN_EXTEND, DL, VT, NotCond);
+      return NotCond;
+    }
+    if (isOneConstant(N1) && isNullConstant(N2)) {
+      // select Cond, 1, 0 --> zext (Cond)
+      if (VT != MVT::i1)
+        Cond = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, Cond);
+      return Cond;
+    }
+    if (isAllOnesConstant(N1) && isNullConstant(N2)) {
+      // select Cond, -1, 0 --> sext (Cond)
+      if (VT != MVT::i1)
+        Cond = DAG.getNode(ISD::SIGN_EXTEND, DL, VT, Cond);
+      return Cond;
+    }
     return SDValue();
   }

   // fold (select Cond, 0, 1) -> (xor Cond, 1)
   // We can't do this reliably if integer based booleans have different contents
   // to floating point based booleans. This is because we can't tell whether we
   // have an integer-based boolean or a floating-point-based boolean unless we
   // can find the SETCC that produced it and inspect its operands. This is
[... 1,142 lines not shown; context: SDValue ExtTrueVal = (SetCCWidth == 1) ? DAG.getAllOnesConstant(DL, VT) ...]
                                                : TLI.getConstTrueVal(DAG, VT, DL);
     SDValue Zero = DAG.getConstant(0, DL, VT);
     if (SDValue SCC =
             SimplifySelectCC(DL, N00, N01, ExtTrueVal, Zero, CC, true))
       return SCC;

     if (!VT.isVector()) {
       EVT SetCCVT = getSetCCResultType(N00VT);
-      if (!LegalOperations || TLI.isOperationLegal(ISD::SETCC, N00VT)) {
+      // Don't do this transform for i1 because there's a select transform
+      // that would reverse it.
+      // TODO: We should not do this transform at all without a target hook
+      // because a sext is likely cheaper than a select?
+      if (SetCCVT.getScalarSizeInBits() != 1 &&
+          (!LegalOperations || TLI.isOperationLegal(ISD::SETCC, N00VT))) {
         SDValue SetCC = DAG.getSetCC(DL, SetCCVT, N00, N01, CC);
         return DAG.getSelect(DL, VT, SetCC, ExtTrueVal, Zero);
       }
     }
   }

   // fold (sext x) -> (zext x) if the sign bit is known zero.
   if ((!LegalOperations || TLI.isOperationLegal(ISD::ZERO_EXTEND, VT)) &&
[... 9,233 lines not shown ...]
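
The TODO in the new comment suggests generalizing to any two constants that differ by 1, using add ({s/z}ext Cond), C. A minimal sketch of that identity follows, assuming 32-bit wrapping arithmetic modeled with `uint32_t`; the helper names are hypothetical and this is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical models of zext/sext from i1 to i32, using unsigned
// values so that wraparound is well defined.
uint32_t zextCond(bool c) { return c ? 1u : 0u; }
uint32_t sextCond(bool c) { return c ? 0xFFFFFFFFu : 0u; }

// select Cond, C+1, C --> add (zext Cond), C
// select Cond, C-1, C --> add (sext Cond), C
void checkDifferByOne() {
  for (uint32_t C : {0u, 1u, 41u, 0x7FFFFFFFu, 0xFFFFFFFFu}) {
    for (bool cond : {false, true}) {
      uint32_t selUp = cond ? C + 1u : C;    // true arm is C+1 (wraps)
      uint32_t selDown = cond ? C - 1u : C;  // true arm is C-1 (wraps)
      assert(selUp == zextCond(cond) + C);
      assert(selDown == sextCond(cond) + C);
    }
  }
}
```

The boundary values are included because the identity relies on modular arithmetic, which matches the wrapping behavior of a plain integer add.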

llvm/trunk/test/CodeGen/AMDGPU/trunc.ll

[... 50 lines not shown; context: define void @trunc_shl_i64(i64 addrspace(1)* %out2, i32 addrspace(1)* %out, i64 %a) { ...]
   %b = shl i64 %aa, 2
   %result = trunc i64 %b to i32
   store i32 %result, i32 addrspace(1)* %out, align 4
   store i64 %b, i64 addrspace(1)* %out2, align 8 ; Prevent reducing ops to 32-bits
   ret void
 }

 ; GCN-LABEL: {{^}}trunc_i32_to_i1:
-; GCN: v_and_b32_e32 v{{[0-9]+}}, 1, v{{[0-9]+}}
-; GCN: v_cmp_eq_u32
+; GCN: v_and_b32_e32 [[VREG:v[0-9]+]], 1, v{{[0-9]+}}
 define void @trunc_i32_to_i1(i32 addrspace(1)* %out, i32 addrspace(1)* %ptr) {
   %a = load i32, i32 addrspace(1)* %ptr, align 4
   %trunc = trunc i32 %a to i1
   %result = select i1 %trunc, i32 1, i32 0
   store i32 %result, i32 addrspace(1)* %out, align 4
   ret void
 }

 ; GCN-LABEL: {{^}}trunc_i8_to_i1:
-; GCN: v_and_b32_e32 v{{[0-9]+}}, 1, v{{[0-9]+}}
-; GCN: v_cmp_eq_u32
+; GCN: v_and_b32_e32 [[VREG:v[0-9]+]], 1, v{{[0-9]+}}
 define void @trunc_i8_to_i1(i8 addrspace(1)* %out, i8 addrspace(1)* %ptr) {
   %a = load i8, i8 addrspace(1)* %ptr, align 4
   %trunc = trunc i8 %a to i1
   %result = select i1 %trunc, i8 1, i8 0
   store i8 %result, i8 addrspace(1)* %out, align 4
   ret void
 }

 ; GCN-LABEL: {{^}}sgpr_trunc_i16_to_i1:
-; GCN: s_and_b32 s{{[0-9]+}}, 1, s{{[0-9]+}}
-; GCN: v_cmp_eq_u32
+; GCN: s_and_b32 s{{[0-9]+}}, s{{[0-9]+}}, 1
 define void @sgpr_trunc_i16_to_i1(i16 addrspace(1)* %out, i16 %a) {
   %trunc = trunc i16 %a to i1
   %result = select i1 %trunc, i16 1, i16 0
   store i16 %result, i16 addrspace(1)* %out, align 4
   ret void
 }

 ; GCN-LABEL: {{^}}sgpr_trunc_i32_to_i1:
-; GCN: s_and_b32 s{{[0-9]+}}, 1, s{{[0-9]+}}
-; GCN: v_cmp_eq_u32
+; GCN: s_and_b32 s{{[0-9]+}}, s{{[0-9]+}}, 1
 define void @sgpr_trunc_i32_to_i1(i32 addrspace(1)* %out, i32 %a) {
   %trunc = trunc i32 %a to i1
   %result = select i1 %trunc, i32 1, i32 0
   store i32 %result, i32 addrspace(1)* %out, align 4
   ret void
 }

 ; GCN-LABEL: {{^}}s_trunc_i64_to_i1:
[... 29 lines not shown ...]

llvm/trunk/test/CodeGen/ARM/select_const.ll

[... 34 lines not shown; context: ; CHECK-NEXT: mov pc, lr ...]
   ret i32 %sel
 }

 ; select Cond, 1, 0 --> zext (Cond)

 define i32 @select_1_or_0(i1 %cond) {
 ; CHECK-LABEL: select_1_or_0:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: ands r0, r0, #1
-; CHECK-NEXT: movne r0, #1
+; CHECK-NEXT: and r0, r0, #1
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_1_or_0_zeroext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: cmp r0, #0
-; CHECK-NEXT: movne r0, #1
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_1_or_0_signext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: ands r0, r0, #1
-; CHECK-NEXT: movne r0, #1
+; CHECK-NEXT: and r0, r0, #1
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 ; select Cond, 0, -1 --> sext (!Cond)

 define i32 @select_0_or_neg1(i1 %cond) {
 ; CHECK-LABEL: select_0_or_neg1:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: mvn r1, #0
-; CHECK-NEXT: tst r0, #1
-; CHECK-NEXT: movne r1, #0
-; CHECK-NEXT: mov r0, r1
+; CHECK-NEXT: mov r1, #1
+; CHECK-NEXT: bic r0, r1, r0
+; CHECK-NEXT: rsb r0, r0, #0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_zeroext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: mvn r1, #0
-; CHECK-NEXT: cmp r0, #0
-; CHECK-NEXT: movne r1, #0
-; CHECK-NEXT: mov r0, r1
+; CHECK-NEXT: eor r0, r0, #1
+; CHECK-NEXT: rsb r0, r0, #0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_signext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: mvn r1, #0
-; CHECK-NEXT: tst r0, #1
-; CHECK-NEXT: movne r1, #0
-; CHECK-NEXT: mov r0, r1
+; CHECK-NEXT: mvn r0, r0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 ; select Cond, -1, 0 --> sext (Cond)

 define i32 @select_neg1_or_0(i1 %cond) {
 ; CHECK-LABEL: select_neg1_or_0:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: ands r0, r0, #1
-; CHECK-NEXT: mvnne r0, #0
+; CHECK-NEXT: and r0, r0, #1
+; CHECK-NEXT: rsb r0, r0, #0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_zeroext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: cmp r0, #0
-; CHECK-NEXT: mvnne r0, #0
+; CHECK-NEXT: rsb r0, r0, #0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_signext:
 ; CHECK: @ BB#0:
-; CHECK-NEXT: ands r0, r0, #1
-; CHECK-NEXT: mvnne r0, #0
 ; CHECK-NEXT: mov pc, lr
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 ; select Cond, C+1, C --> add (zext Cond), C

 define i32 @select_Cplus1_C(i1 %cond) {
[... 115 lines not shown ...]

llvm/trunk/test/CodeGen/Hexagon/adde.ll

 ; RUN: llc -march=hexagon -disable-hsdr -hexagon-expand-condsets=0 -hexagon-bit=0 -disable-post-ra < %s | FileCheck %s

-; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,#1)
-; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,#0)
 ; CHECK: r{{[0-9]+:[0-9]+}} = add(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
+; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,#1)
 ; CHECK: p{{[0-9]+}} = cmp.gtu(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
 ; CHECK: p{{[0-9]+}} = cmp.gtu(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
-; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
-; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
-; CHECK: r{{[0-9]+:[0-9]+}} = combine(r{{[0-9]+}},r{{[0-9]+}})
+; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},#1,#0)
+; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,r{{[0-9]+}})
+; CHECK: r{{[0-9]+:[0-9]+}} = add(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
 ; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
 ; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
 ; CHECK: r{{[0-9]+:[0-9]+}} = combine(r{{[0-9]+}},r{{[0-9]+}})
 ; CHECK: r{{[0-9]+:[0-9]+}} = add(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})


 define void @check_adde_addc (i64 %AL, i64 %AH, i64 %BL, i64 %BH, i64* %RL, i64* %RH) {
 entry:
[... 16 lines not shown ...]

llvm/trunk/test/CodeGen/Hexagon/sube.ll

 ; RUN: llc -march=hexagon -disable-hsdr -hexagon-expand-condsets=0 -hexagon-bit=0 -disable-post-ra < %s | FileCheck %s

-; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,#0)
-; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,#1)
 ; CHECK: p{{[0-9]+}} = cmp.gtu(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
 ; CHECK: r{{[0-9]+:[0-9]+}} = sub(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
-; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
-; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},r{{[0-9]+}},r{{[0-9]+}})
+; CHECK: r{{[0-9]+}} = mux(p{{[0-9]+}},#1,#0
+; CHECK: r{{[0-9]+:[0-9]+}} = sub(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
+; CHECK: r{{[0-9]+:[0-9]+}} = combine(#0,r{{[0-9]+}})
 ; CHECK: r{{[0-9]+:[0-9]+}} = sub(r{{[0-9]+:[0-9]+}},r{{[0-9]+:[0-9]+}})
-; CHECK: r{{[0-9]+:[0-9]+}} = combine(r{{[0-9]+}},r{{[0-9]+}})

 define void @check_sube_subc(i64 %AL, i64 %AH, i64 %BL, i64 %BH, i64* %RL, i64* %RH) {
 entry:
   %tmp1 = zext i64 %AL to i128
   %tmp23 = zext i64 %AH to i128
   %tmp4 = shl i128 %tmp23, 64
   %tmp5 = or i128 %tmp4, %tmp1
   %tmp67 = zext i64 %BL to i128
[... 11 lines not shown ...]

llvm/trunk/test/CodeGen/NVPTX/add-128bit.ll

 ; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s

 target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"



 define void @foo(i64 %a, i64 %add, i128* %retptr) {
 ; CHECK: add.s64
 ; CHECK: setp.lt.u64
 ; CHECK: setp.lt.u64
-; CHECK: selp.b64
+; CHECK: selp.u64
 ; CHECK: selp.b64
 ; CHECK: add.s64
   %t1 = sext i64 %a to i128
   %add2 = zext i64 %add to i128
   %val = add i128 %t1, %add2
   store i128 %val, i128* %retptr
   ret void
 }

llvm/trunk/test/CodeGen/PowerPC/select_const.ll

[... 33 lines not shown ...]
 ; ALL-NEXT: blr
   %sel = select i1 %cond, i32 0, i32 1
   ret i32 %sel
 }

 ; select Cond, 1, 0 --> zext (Cond)

 define i32 @select_1_or_0(i1 %cond) {
-; ISEL-LABEL: select_1_or_0:
-; ISEL: # BB#0:
-; ISEL-NEXT: andi. 3, 3, 1
-; ISEL-NEXT: li 4, 1
-; ISEL-NEXT: li 3, 0
-; ISEL-NEXT: isel 3, 4, 3, 1
-; ISEL-NEXT: blr
-;
-; NO_ISEL-LABEL: select_1_or_0:
-; NO_ISEL: # BB#0:
-; NO_ISEL-NEXT: andi. 3, 3, 1
-; NO_ISEL-NEXT: li 4, 1
-; NO_ISEL-NEXT: li 3, 0
-; NO_ISEL-NEXT: bc 12, 1, .LBB3_1
-; NO_ISEL-NEXT: blr
-; NO_ISEL-NEXT: .LBB3_1:
-; NO_ISEL-NEXT: addi 3, 4, 0
-; NO_ISEL-NEXT: blr
+; ALL-LABEL: select_1_or_0:
+; ALL: # BB#0:
+; ALL-NEXT: clrldi 3, 3, 63
+; ALL-NEXT: blr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_zeroext(i1 zeroext %cond) {
-; ISEL-LABEL: select_1_or_0_zeroext:
-; ISEL: # BB#0:
-; ISEL-NEXT: andi. 3, 3, 1
-; ISEL-NEXT: li 4, 1
-; ISEL-NEXT: li 3, 0
-; ISEL-NEXT: isel 3, 4, 3, 1
-; ISEL-NEXT: blr
-;
-; NO_ISEL-LABEL: select_1_or_0_zeroext:
-; NO_ISEL: # BB#0:
-; NO_ISEL-NEXT: andi. 3, 3, 1
-; NO_ISEL-NEXT: li 4, 1
-; NO_ISEL-NEXT: li 3, 0
-; NO_ISEL-NEXT: bc 12, 1, .LBB4_1
-; NO_ISEL-NEXT: blr
-; NO_ISEL-NEXT: .LBB4_1:
-; NO_ISEL-NEXT: addi 3, 4, 0
-; NO_ISEL-NEXT: blr
+; ALL-LABEL: select_1_or_0_zeroext:
+; ALL: # BB#0:
+; ALL-NEXT: blr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 define i32 @select_1_or_0_signext(i1 signext %cond) {
-; ISEL-LABEL: select_1_or_0_signext:
-; ISEL: # BB#0:
-; ISEL-NEXT: andi. 3, 3, 1
-; ISEL-NEXT: li 4, 1
-; ISEL-NEXT: li 3, 0
-; ISEL-NEXT: isel 3, 4, 3, 1
-; ISEL-NEXT: blr
-;
-; NO_ISEL-LABEL: select_1_or_0_signext:
-; NO_ISEL: # BB#0:
-; NO_ISEL-NEXT: andi. 3, 3, 1
-; NO_ISEL-NEXT: li 4, 1
-; NO_ISEL-NEXT: li 3, 0
-; NO_ISEL-NEXT: bc 12, 1, .LBB5_1
-; NO_ISEL-NEXT: blr
-; NO_ISEL-NEXT: .LBB5_1:
-; NO_ISEL-NEXT: addi 3, 4, 0
-; NO_ISEL-NEXT: blr
+; ALL-LABEL: select_1_or_0_signext:
+; ALL: # BB#0:
+; ALL-NEXT: clrldi 3, 3, 63
+; ALL-NEXT: blr
   %sel = select i1 %cond, i32 1, i32 0
   ret i32 %sel
 }

 ; select Cond, 0, -1 --> sext (!Cond)

 define i32 @select_0_or_neg1(i1 %cond) {
 ; ISEL-LABEL: select_0_or_neg1:
[... 361 lines not shown ...]

llvm/trunk/test/CodeGen/X86/select_const.ll

[... 102 lines not shown; context: ; CHECK-NEXT: retq ...]
   ret i32 %sel
 }

 ; select Cond, -1, 0 --> sext (Cond)

 define i32 @select_neg1_or_0(i1 %cond) {
 ; CHECK-LABEL: select_neg1_or_0:
 ; CHECK: # BB#0:
-; CHECK-NEXT: xorl %ecx, %ecx
-; CHECK-NEXT: testb $1, %dil
-; CHECK-NEXT: movl $-1, %eax
-; CHECK-NEXT: cmovel %ecx, %eax
+; CHECK-NEXT: andl $1, %edi
+; CHECK-NEXT: negl %edi
+; CHECK-NEXT: movl %edi, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_zeroext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: xorl %ecx, %ecx
-; CHECK-NEXT: testb %dil, %dil
-; CHECK-NEXT: movl $-1, %eax
-; CHECK-NEXT: cmovel %ecx, %eax
+; CHECK-NEXT: movzbl %dil, %eax
+; CHECK-NEXT: negl %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 define i32 @select_neg1_or_0_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_neg1_or_0_signext:
 ; CHECK: # BB#0:
-; CHECK-NEXT: xorl %ecx, %ecx
-; CHECK-NEXT: testb $1, %dil
-; CHECK-NEXT: movl $-1, %eax
-; CHECK-NEXT: cmovel %ecx, %eax
+; CHECK-NEXT: movsbl %dil, %eax
 ; CHECK-NEXT: retq
   %sel = select i1 %cond, i32 -1, i32 0
   ret i32 %sel
 }

 ; select Cond, C+1, C --> add (zext Cond), C

 define i32 @select_Cplus1_C(i1 %cond) {
[... 122 lines not shown ...]
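
The updated X86 checks (`andl $1` followed by `negl`) and ARM checks (`and r0, r0, #1` followed by `rsb r0, r0, #0`) both compute sext (Cond) as a mask of the low bit followed by a negation. A minimal C++ sketch of that pattern, using unsigned arithmetic so the negation is well defined, is shown here; `sextLowBit` and `checkSextLowBit` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Mask the low bit, then negate: the result is 0 or all ones.
uint32_t sextLowBit(uint32_t x) { return 0u - (x & 1u); }

// Compares mask-and-negate against an explicit select of -1 or 0
// keyed on the low bit, for inputs with arbitrary upper bits.
void checkSextLowBit() {
  for (uint32_t x : {0u, 1u, 2u, 3u, 0xFFFFFFFEu, 0xFFFFFFFFu}) {
    uint32_t viaSelect = (x & 1u) ? 0xFFFFFFFFu : 0u;
    assert(sextLowBit(x) == viaSelect);
  }
}
```

The mask is what lets the plain `i1 %cond` case ignore whatever sits in the upper bits of the incoming register, which is why the zeroext variants in the tests can skip it.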
