This is an archive of the discontinued LLVM Phabricator instance.

[X86] Only reorder srl/and on last DAG combiner run
ClosedPublic

Authored by craig.topper on Feb 12 2018, 10:15 AM.

Details

Reviewers

spatel
RKSimon
zvi
reames
davezarzycki

Commits

rGde565fc73e90: [X86] Only reorder srl/and on last DAG combiner run
rL325371: [X86] Only reorder srl/and on last DAG combiner run

Summary

This seems to interfere with a target independent brcond combine that looks for the (srl (and X, C1), C2) pattern to enable TEST instructions. Once we flip, that combine doesn't fire and we end up exposing it to the X86 specific BT combine which causes us to emit a BT instruction. BT has lower throughput than TEST.
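
As a rough illustration of the pattern in question (not code from the patch), the flip rewrites srl (and X, C1), C2 as and (srl X, C2), (C1 >> C2). The C++ sketch below models both forms on plain 64-bit integers. The function names are made up for this example, and it assumes a logical shift with C2 < 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the form the generic brcond combine looks for,
// srl (and X, C1), C2.
constexpr uint64_t srlOfAnd(uint64_t X, uint64_t C1, unsigned C2) {
  return (X & C1) >> C2;
}

// Hypothetical helper: the flipped form, and (srl X, C2), (C1 >> C2).
constexpr uint64_t andOfSrl(uint64_t X, uint64_t C1, unsigned C2) {
  return (X >> C2) & (C1 >> C2);
}

// Compile-time spot checks: both forms agree, and the mask shrinks
// from 0x800 to 0x1 after the flip.
static_assert(srlOfAnd(0xFFFF, 0x800, 11) == andOfSrl(0xFFFF, 0x800, 11),
              "forms agree");
static_assert((0x800 >> 11) == 0x1, "flipped mask is smaller");

// Runtime check for an arbitrary input; aborts if the two forms disagree.
inline void checkFlip(uint64_t X, uint64_t C1, unsigned C2) {
  assert(C2 < 64 && "shift amount must be in range");
  assert(srlOfAnd(X, C1, C2) == andOfSrl(X, C1, C2));
}
```

The smaller constant after the flip is presumably where the code size saving comes from, while the unflipped form is the one the generic brcond combine recognizes.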

We could try to make the brcond combine aware of the alternate pattern, but since the flip was just a code size reduction and not likely to enable other combines, it seemed easier to just delay it until after lowering.
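
The approach chosen here amounts to gating the reorder on which DAG combiner run is in progress. Below is a toy C++ model of that idea, not LLVM code: the `Phase` enum, the two structs, and `reorderSrlAnd` are all hypothetical stand-ins for the combiner level, the DAG nodes, and the X86 combine. It omits the other conditions the real combine checks.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the DAG combiner runs; only Last models the
// final run that the patch restricts the reorder to.
enum class Phase { First, Middle, Last };

// Models srl (and X, Mask), Shift.
struct SrlOfAnd {
  uint64_t Mask;
  unsigned Shift;
};

// Models and (srl X, Shift), Mask.
struct AndOfSrl {
  unsigned Shift;
  uint64_t Mask;
};

// Returns the reordered node only during the last run; an empty optional
// plays the role of returning an empty SDValue ("no change").
inline std::optional<AndOfSrl> reorderSrlAnd(const SrlOfAnd &N, Phase P) {
  if (P != Phase::Last)
    return std::nullopt; // leave the pattern intact for earlier combines
  assert(N.Shift < 64 && "shift amount must be in range");
  return AndOfSrl{N.Shift, N.Mask >> N.Shift};
}
```

In this model, a combine that matches the original shape in an earlier run still sees it untouched, which is the effect the summary is after.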

Diff Detail

Repository: rL LLVM

Event Timeline

craig.topper created this revision. Feb 12 2018, 10:15 AM

Does this also imply that we're abandoning D37418? I.e., we're effectively blacklisting 'bt' in isel?

BT has lower throughput than TEST.

This isn't true universally based on Agner's tables. Ryzen has 1 uop and full throughput for a plain 'bt' (btc/btr/bts need 2 uops).

I don't have any actual perf evidence of 'bt' being a factor in a benchmark, but in D37418, it was suggested that a 'btc' would likely be an improvement. See also the comments in:
https://bugs.llvm.org/show_bug.cgi?id=35911

Given that and the fact that gcc/icc already choose 'bt' more often than llvm (again based on earlier comments in the previous review and the PR), I think we'd be better off choosing bt and friends by default in isel and transforming to something with a bigger constant as a specialization at a later stage.

All valid points. I don't know of any benchmarks that care either. I guess my goal with this patch is to get DAG going into isel to be consistent. The test cases in test-shrink.ll look very similar, but some use test and some use bt. This patch gets them all back to using test. We can isel them back to bt for D37418.
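
For readers comparing the two instruction choices, here is a small C++ model (an illustration, not part of the patch) of the branch conditions involved: `testl $mask` followed by `jne`, and `btl $bit` followed by `jb`. The helper names are hypothetical, and the model assumes a single-bit mask and a bit index below 32.

```cpp
#include <cassert>
#include <cstdint>

// Models `testl $Mask, Reg` + `jne`: TEST sets ZF when (Reg & Mask) == 0,
// and jne is taken when ZF is clear.
constexpr bool takenAfterTest(uint32_t Reg, uint32_t Mask) {
  return (Reg & Mask) != 0;
}

// Models `btl $Bit, Reg` + `jb`: BT copies the selected bit into CF,
// and jb is taken when CF is set.
constexpr bool takenAfterBt(uint32_t Reg, unsigned Bit) {
  return ((Reg >> Bit) & 1u) != 0;
}

static_assert(takenAfterTest(0x800, 0x800) == takenAfterBt(0x800, 11),
              "bit set");
static_assert(takenAfterTest(0x7FF, 0x800) == takenAfterBt(0x7FF, 11),
              "bit clear");

// Runtime check that both sequences branch the same way for one input.
inline void checkSameBranch(uint32_t Reg, unsigned Bit) {
  assert(Bit < 32 && "bit index must fit the 32-bit register");
  assert(takenAfterTest(Reg, uint32_t{1} << Bit) == takenAfterBt(Reg, Bit));
}
```

Both sequences take the branch for the same inputs, so the choice between them is about encoding size and throughput rather than semantics.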

In D43201#1008837, @craig.topper wrote:

All valid points. I don't know of any benchmarks that care either. I guess my goal with this patch is to get DAG going into isel to be consistent. The test cases in test-shrink.ll look very similar, but some use test and some use bt. This patch gets them all back to using test. We can isel them back to bt for D37418.

Yes, I'm all for more consistency; it's a mess.

If I'm seeing it correctly, D37418 would need to be enhanced to handle these cases because it's only looking for 64-bit constants right now?

So let's try to move these forward. AFAICT, there's no opposition to D37418 with -Os. Start there?

I don't have any benchmarks. Anecdotally speaking from my experience at Apple: improving code density tends to improve overall system performance. That being said, microbenchmarks sometimes suffer.

(Beyond the scope of this patch…) I wish LLVM or the x86 backend could change its instruction selection on the fly. Then it could prefer "reasonable density" (not microcoded) by default and "throughput" in loops (or otherwise obvious hot paths). This would probably yield the best overall/pragmatic performance.

It looks like the behavior of the cases in test-shrink are currently dependent on the order of the operands to the branch...

Rebase with changes to test-vs-bittest.ll

In D43201#1008877, @davezarzycki wrote:

(Beyond the scope of this patch…) I wish LLVM or the x86 backend could change its instruction selection on the fly. Then it could prefer "reasonable density" (not microcoded) by default and "throughput" in loops (or otherwise obvious hot paths). This would probably yield the best overall/pragmatic performance.

Just to second this, I think there's a lot of potential gain to be had by being able to select cold blocks for size and hot blocks for speed. As stated, definitely off topic for this review though!

Rebase now that we use BT under optsize.

In D43201#1009165, @reames wrote:

In D43201#1008877, @davezarzycki wrote:

(Beyond the scope of this patch…) I wish LLVM or the x86 backend could change its instruction selection on the fly. Then it could prefer "reasonable density" (not microcoded) by default and "throughput" in loops (or otherwise obvious hot paths). This would probably yield the best overall/pragmatic performance.

Just to second this, I think there's a lot of potential gain to be had by being able to select cold blocks for size and hot blocks for speed. As stated, definitely off topic for this review though!

@davezarzycki @reames Any chance that you could put some of this request into a bugzilla please? It's the kind of thing that keeps getting bounced around for scheduler-driven MC optimizations.

Hi @RKSimon — Here you go: https://bugs.llvm.org/show_bug.cgi?id=36404

Not sure where we're at - rL325287 was supposed to include a new test file, but it didn't get committed yet? Will this patch affect those tests?

Are we converting all >8-bit constants to bt with optsize yet, or is that another patch?

Looks like I forgot to 'svn add' the test when I applied the patch before commit.

This patch should not affect those tests, but I'll double check.

We are not converting >8-bit or/and/xor to bts/btr/btc under optsize yet. Just >32-bit. Using bts/btr/btc will cripple our load folding ability, so it's not a clear win.

I've committed the test case, and this patch had no effect on it.

test/CodeGen/X86/test-vs-bittest.ll
55 ↗	(On Diff #134493)	Note this test is identical to test64 above with the operands of the 'br' instruction reversed. We were always using 'test'. It seems the initial SelectionDAG in one of the cases (I forgot which one) has an ISD::XOR inverting the setcc result before the branch. This somehow causes a difference in DAG combine ordering or something that leads to different results.

LGTM.

test/CodeGen/X86/test-vs-bittest.ll
8 ↗	(On Diff #134493)	Nit: all of the tests in this file have .cfi noise; use 'nounwind' attribute to remove that.
55 ↗	(On Diff #134493)	Mysteries of SDAG... :) Please add this as a test comment here or the other test, so we have a record of it.

This revision is now accepted and ready to land. Feb 16 2018, 7:31 AM

Closed by commit rL325371: [X86] Only reorder srl/and on last DAG combiner run (authored by ctopper). Feb 16 2018, 10:53 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

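Path	Size
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp	10 lines
llvm/trunk/test/CodeGen/X86/live-out-reg-info.ll	4 lines
llvm/trunk/test/CodeGen/X86/test-shrink.ll	36 lines
llvm/trunk/test/CodeGen/X86/test-vs-bittest.ll	17 lines
llvm/trunk/test/CodeGen/X86/xor-icmp.ll	4 lines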

Diff 134654

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

[... 32,935 lines not shown (context: else if (SarConst.isNegative())) ...]
                            DAG.getConstant(-SarConst, DL, CVT));
       else
         return DAG.getNode(ISD::SRA, DL, VT, NN,
                            DAG.getConstant(SarConst, DL, CVT));
     }
     return SDValue();
   }

-static SDValue combineShiftRightLogical(SDNode *N, SelectionDAG &DAG) {
+static SDValue combineShiftRightLogical(SDNode *N, SelectionDAG &DAG,
+                                        TargetLowering::DAGCombinerInfo &DCI) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
   EVT VT = N0.getValueType();

+  // Only do this on the last DAG combine as it can interfere with other
+  // combines.
+  if (!DCI.isAfterLegalizeVectorOps())
+    return SDValue();
+
   // Try to improve a sequence of srl (and X, C1), C2 by inverting the order.
   // TODO: This is a generic DAG combine that became an x86-only combine to
   // avoid shortcomings in other folds such as bswap, bit-test ('bt'), and
   // and-not ('andn').
   if (N0.getOpcode() != ISD::AND || !N0.hasOneUse())
     return SDValue();

   auto *ShiftC = dyn_cast<ConstantSDNode>(N1);
[... 34 lines not shown (context: if (N->getOpcode() == ISD::SHL)) ...]
     if (SDValue V = combineShiftLeft(N, DAG))
       return V;

   if (N->getOpcode() == ISD::SRA)
     if (SDValue V = combineShiftRightArithmetic(N, DAG))
       return V;

   if (N->getOpcode() == ISD::SRL)
-    if (SDValue V = combineShiftRightLogical(N, DAG))
+    if (SDValue V = combineShiftRightLogical(N, DAG, DCI))
       return V;

   return SDValue();
 }

 static SDValue combineVectorPack(SDNode *N, SelectionDAG &DAG,
                                  TargetLowering::DAGCombinerInfo &DCI,
                                  const X86Subtarget &Subtarget) {
[... 6,064 lines not shown ...]

llvm/trunk/test/CodeGen/X86/live-out-reg-info.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s

 ; Make sure dagcombine doesn't eliminate the comparison due
 ; to an off-by-one bug with computeKnownBits information.

 declare void @qux()

 define void @foo(i32 %a) {
 ; CHECK-LABEL: foo:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: pushq %rax
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: shrl $23, %edi
-; CHECK-NEXT: btl $8, %edi
-; CHECK-NEXT: jb .LBB0_2
+; CHECK-NEXT: testl $256, %edi # imm = 0x100
+; CHECK-NEXT: jne .LBB0_2
 ; CHECK-NEXT: # %bb.1: # %true
 ; CHECK-NEXT: callq qux
 ; CHECK-NEXT: .LBB0_2: # %false
 ; CHECK-NEXT: popq %rax
 ; CHECK-NEXT: retq
   %t0 = lshr i32 %a, 23
   br label %next
 next:
[... 10 lines not shown ...]

llvm/trunk/test/CodeGen/X86/test-shrink.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=CHECK-LINUX64
 ; RUN: llc < %s -mtriple=x86_64-win32 | FileCheck %s --check-prefix=CHECK-WIN32-64
 ; RUN: llc < %s -mtriple=i686-- | FileCheck %s --check-prefix=CHECK-X86

 define void @g64xh(i64 inreg %x) nounwind {
 ; CHECK-LINUX64-LABEL: g64xh:
 ; CHECK-LINUX64: # %bb.0:
-; CHECK-LINUX64-NEXT: btl $11, %edi
-; CHECK-LINUX64-NEXT: jb .LBB0_2
+; CHECK-LINUX64-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-LINUX64-NEXT: jne .LBB0_2
 ; CHECK-LINUX64-NEXT: # %bb.1: # %yes
 ; CHECK-LINUX64-NEXT: pushq %rax
 ; CHECK-LINUX64-NEXT: callq bar
 ; CHECK-LINUX64-NEXT: popq %rax
 ; CHECK-LINUX64-NEXT: .LBB0_2: # %no
 ; CHECK-LINUX64-NEXT: retq
 ;
 ; CHECK-WIN32-64-LABEL: g64xh:
 ; CHECK-WIN32-64: # %bb.0:
 ; CHECK-WIN32-64-NEXT: subq $40, %rsp
-; CHECK-WIN32-64-NEXT: btl $11, %ecx
-; CHECK-WIN32-64-NEXT: jb .LBB0_2
+; CHECK-WIN32-64-NEXT: testl $2048, %ecx # imm = 0x800
+; CHECK-WIN32-64-NEXT: jne .LBB0_2
 ; CHECK-WIN32-64-NEXT: # %bb.1: # %yes
 ; CHECK-WIN32-64-NEXT: callq bar
 ; CHECK-WIN32-64-NEXT: .LBB0_2: # %no
 ; CHECK-WIN32-64-NEXT: addq $40, %rsp
 ; CHECK-WIN32-64-NEXT: retq
 ;
 ; CHECK-X86-LABEL: g64xh:
 ; CHECK-X86: # %bb.0:
-; CHECK-X86-NEXT: btl $11, %eax
-; CHECK-X86-NEXT: jb .LBB0_2
+; CHECK-X86-NEXT: testl $2048, %eax # imm = 0x800
+; CHECK-X86-NEXT: jne .LBB0_2
 ; CHECK-X86-NEXT: # %bb.1: # %yes
 ; CHECK-X86-NEXT: calll bar
 ; CHECK-X86-NEXT: .LBB0_2: # %no
 ; CHECK-X86-NEXT: retl
   %t = and i64 %x, 2048
   %s = icmp eq i64 %t, 0
   br i1 %s, label %yes, label %no

[... 44 lines not shown (context: yes:) ...]
   ret void
 no:
   ret void
 }

 define void @g32xh(i32 inreg %x) nounwind {
 ; CHECK-LINUX64-LABEL: g32xh:
 ; CHECK-LINUX64: # %bb.0:
-; CHECK-LINUX64-NEXT: btl $11, %edi
-; CHECK-LINUX64-NEXT: jb .LBB2_2
+; CHECK-LINUX64-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-LINUX64-NEXT: jne .LBB2_2
 ; CHECK-LINUX64-NEXT: # %bb.1: # %yes
 ; CHECK-LINUX64-NEXT: pushq %rax
 ; CHECK-LINUX64-NEXT: callq bar
 ; CHECK-LINUX64-NEXT: popq %rax
 ; CHECK-LINUX64-NEXT: .LBB2_2: # %no
 ; CHECK-LINUX64-NEXT: retq
 ;
 ; CHECK-WIN32-64-LABEL: g32xh:
 ; CHECK-WIN32-64: # %bb.0:
 ; CHECK-WIN32-64-NEXT: subq $40, %rsp
-; CHECK-WIN32-64-NEXT: btl $11, %ecx
-; CHECK-WIN32-64-NEXT: jb .LBB2_2
+; CHECK-WIN32-64-NEXT: testl $2048, %ecx # imm = 0x800
+; CHECK-WIN32-64-NEXT: jne .LBB2_2
 ; CHECK-WIN32-64-NEXT: # %bb.1: # %yes
 ; CHECK-WIN32-64-NEXT: callq bar
 ; CHECK-WIN32-64-NEXT: .LBB2_2: # %no
 ; CHECK-WIN32-64-NEXT: addq $40, %rsp
 ; CHECK-WIN32-64-NEXT: retq
 ;
 ; CHECK-X86-LABEL: g32xh:
 ; CHECK-X86: # %bb.0:
-; CHECK-X86-NEXT: btl $11, %eax
-; CHECK-X86-NEXT: jb .LBB2_2
+; CHECK-X86-NEXT: testl $2048, %eax # imm = 0x800
+; CHECK-X86-NEXT: jne .LBB2_2
 ; CHECK-X86-NEXT: # %bb.1: # %yes
 ; CHECK-X86-NEXT: calll bar
 ; CHECK-X86-NEXT: .LBB2_2: # %no
 ; CHECK-X86-NEXT: retl
   %t = and i32 %x, 2048
   %s = icmp eq i32 %t, 0
   br i1 %s, label %yes, label %no

[... 44 lines not shown (context: yes:) ...]
   ret void
 no:
   ret void
 }

 define void @g16xh(i16 inreg %x) nounwind {
 ; CHECK-LINUX64-LABEL: g16xh:
 ; CHECK-LINUX64: # %bb.0:
-; CHECK-LINUX64-NEXT: btl $11, %edi
-; CHECK-LINUX64-NEXT: jb .LBB4_2
+; CHECK-LINUX64-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-LINUX64-NEXT: jne .LBB4_2
 ; CHECK-LINUX64-NEXT: # %bb.1: # %yes
 ; CHECK-LINUX64-NEXT: pushq %rax
 ; CHECK-LINUX64-NEXT: callq bar
 ; CHECK-LINUX64-NEXT: popq %rax
 ; CHECK-LINUX64-NEXT: .LBB4_2: # %no
 ; CHECK-LINUX64-NEXT: retq
 ;
 ; CHECK-WIN32-64-LABEL: g16xh:
 ; CHECK-WIN32-64: # %bb.0:
 ; CHECK-WIN32-64-NEXT: subq $40, %rsp
-; CHECK-WIN32-64-NEXT: btl $11, %ecx
-; CHECK-WIN32-64-NEXT: jb .LBB4_2
+; CHECK-WIN32-64-NEXT: testl $2048, %ecx # imm = 0x800
+; CHECK-WIN32-64-NEXT: jne .LBB4_2
 ; CHECK-WIN32-64-NEXT: # %bb.1: # %yes
 ; CHECK-WIN32-64-NEXT: callq bar
 ; CHECK-WIN32-64-NEXT: .LBB4_2: # %no
 ; CHECK-WIN32-64-NEXT: addq $40, %rsp
 ; CHECK-WIN32-64-NEXT: retq
 ;
 ; CHECK-X86-LABEL: g16xh:
 ; CHECK-X86: # %bb.0:
-; CHECK-X86-NEXT: btl $11, %eax
-; CHECK-X86-NEXT: jb .LBB4_2
+; CHECK-X86-NEXT: testl $2048, %eax # imm = 0x800
+; CHECK-X86-NEXT: jne .LBB4_2
 ; CHECK-X86-NEXT: # %bb.1: # %yes
 ; CHECK-X86-NEXT: calll bar
 ; CHECK-X86-NEXT: .LBB4_2: # %no
 ; CHECK-X86-NEXT: retl
   %t = and i16 %x, 2048
   %s = icmp eq i16 %t, 0
   br i1 %s, label %yes, label %no

[... 366 lines not shown ...]

llvm/trunk/test/CodeGen/X86/test-vs-bittest.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s

 define void @test64(i64 inreg %x) {
 ; CHECK-LABEL: test64:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: pushq %rax
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: btl $11, %edi
-; CHECK-NEXT: jb .LBB0_2
+; CHECK-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-NEXT: jne .LBB0_2
 ; CHECK-NEXT: # %bb.1: # %yes
 ; CHECK-NEXT: callq bar
 ; CHECK-NEXT: .LBB0_2: # %no
 ; CHECK-NEXT: popq %rax
 ; CHECK-NEXT: retq
   %t = and i64 %x, 2048
   %s = icmp eq i64 %t, 0
   br i1 %s, label %yes, label %no
[... 23 lines not shown ...]

 yes:
   call void @bar()
   ret void
 no:
   ret void
 }

+; This test is identical to test64 above with only the destination of the br
+; reversed. This somehow causes the two functions to get slightly different
+; initial IR. One has an extra invert of the setcc. This previous caused one
+; the functions to use a BT while the other used a TEST due to another DAG
+; combine messing with an expected canonical form.
 define void @test64_2(i64 inreg %x) {
 ; CHECK-LABEL: test64_2:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: pushq %rax
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: testl $2048, %edi # imm = 0x800
 ; CHECK-NEXT: je .LBB2_2
 ; CHECK-NEXT: # %bb.1: # %yes
[... 127 lines not shown (context: no:) ...]
   ret void
 }

 define void @test32(i32 inreg %x) {
 ; CHECK-LABEL: test32:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: pushq %rax
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: btl $11, %edi
-; CHECK-NEXT: jb .LBB8_2
+; CHECK-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-NEXT: jne .LBB8_2
 ; CHECK-NEXT: # %bb.1: # %yes
 ; CHECK-NEXT: callq bar
 ; CHECK-NEXT: .LBB8_2: # %no
 ; CHECK-NEXT: popq %rax
 ; CHECK-NEXT: retq
   %t = and i32 %x, 2048
   %s = icmp eq i32 %t, 0
   br i1 %s, label %yes, label %no
[... 74 lines not shown (context: no:) ...]
   ret void
 }

 define void @test16(i16 inreg %x) {
 ; CHECK-LABEL: test16:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: pushq %rax
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
-; CHECK-NEXT: btl $11, %edi
-; CHECK-NEXT: jb .LBB12_2
+; CHECK-NEXT: testl $2048, %edi # imm = 0x800
+; CHECK-NEXT: jne .LBB12_2
 ; CHECK-NEXT: # %bb.1: # %yes
 ; CHECK-NEXT: callq bar
 ; CHECK-NEXT: .LBB12_2: # %no
 ; CHECK-NEXT: popq %rax
 ; CHECK-NEXT: retq
   %t = and i16 %x, 2048
   %s = icmp eq i16 %t, 0
   br i1 %s, label %yes, label %no
[... 78 lines not shown ...]

llvm/trunk/test/CodeGen/X86/xor-icmp.ll

[... 13 lines not shown ...]
 ; X32-NEXT: jmp bar # TAILCALL
 ; X32-NEXT: .LBB0_1: # %bb
 ; X32-NEXT: jmp foo # TAILCALL
 ;
 ; X64-LABEL: t:
 ; X64: # %bb.0: # %entry
 ; X64-NEXT: xorl %esi, %edi
 ; X64-NEXT: xorl %eax, %eax
-; X64-NEXT: btl $14, %edi
-; X64-NEXT: jae .LBB0_1
+; X64-NEXT: testl $16384, %edi # imm = 0x4000
+; X64-NEXT: je .LBB0_1
 ; X64-NEXT: # %bb.2: # %bb1
 ; X64-NEXT: jmp bar # TAILCALL
 ; X64-NEXT: .LBB0_1: # %bb
 ; X64-NEXT: jmp foo # TAILCALL
 entry:
   %0 = and i32 %a, 16384
   %1 = icmp ne i32 %0, 0
   %2 = and i32 %b, 16384
[... 58 lines not shown ...]