This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Eliminate compares - add i64 sext/zext handling for SETNE
Closed, Public

Authored by nemanjai on May 31 2017, 5:32 AM.

Details

Reviewers

echristo
hfinkel
kbarton

Commits

rGd8623f0825d5: [PowerPC] Eliminate integer compare instructions - vol. 5
rL304907: [PowerPC] Eliminate integer compare instructions - vol. 5

Summary

This patch adds handling for SETNE patterns with i64 inputs.

Diff Detail

Repository: rL LLVM

Event Timeline

nemanjai created this revision. May 31 2017, 5:32 AM

Feel free to commit any patches that look like this to this area :)

This revision is now accepted and ready to land. Jun 5 2017, 1:01 PM

Closed by commit rL304907: [PowerPC] Eliminate integer compare instructions - vol. 5 (authored by nemanjai). Jun 7 2017, 6:18 AM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D34005: [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion. Jun 8 2017, 6:48 AM

SDValue Xor = IsRHSZero ? LHS :
  SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
SDValue AC =
  SDValue(CurDAG->getMachineNode(PPC::ADDIC8, dl, MVT::i64, MVT::Glue,
                                 Xor, getI32Imm(~0U, dl)), 0);
return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, AC,
                                      Xor, AC.getValue(1)), 0);

Suppose RHS is the literal 0 and LHS is a variable whose runtime value is also 0. The expected value of the expression is then 0, but the runtime values are:

Xor  = LHS = 0
AC = Xor + ~0 = ~0, no carry
result = Xor - AC - carry = 0 - (~0) = 1

So the code sequence computes the wrong result.

In D33720#788613, @Carrot wrote:
SDValue Xor = IsRHSZero ? LHS :
SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
SDValue AC =
  SDValue(CurDAG->getMachineNode(PPC::ADDIC8, dl, MVT::i64, MVT::Glue,
                                 Xor, getI32Imm(~0U, dl)), 0);
return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, AC,
                                      Xor, AC.getValue(1)), 0);
Suppose RHS is the literal 0 and LHS is a variable whose runtime value is also 0. The expected value of the expression is then 0, but the runtime values are:
Xor  = LHS = 0
AC = Xor + ~0 = ~0, no carry
result = Xor - AC - carry = 0 - (~0) = 1
So the code sequence computes the wrong result.

That's actually not what the SUBFE instruction does. As per the ISA:

subfe RT,RA,RB
computes
RT = ~(RA) + (RB) + CA

So in your example, it will be:

~AC + XOR + CARRY
~(~0) + 0 + 0

Which of course will just be a zero.
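
The arithmetic above can be checked with a small emulation. What follows is a minimal sketch and not code from the patch: `WithCarry`, `addic`, `subfe`, `zextSetNE`, and `checkZextSetNE` are hypothetical helper names, and the helpers model only the 64-bit result and the CA bit, assuming the `subfe` formula quoted from the ISA.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 64-bit result plus the CA (carry) bit.
struct WithCarry {
  uint64_t value;
  bool carry;
};

// Models addic RT,RA,SI: RT = RA + SI, CA = carry out of the 64-bit add.
WithCarry addic(uint64_t ra, int64_t si) {
  uint64_t imm = static_cast<uint64_t>(si);  // sign-extended immediate
  uint64_t sum = ra + imm;
  return {sum, sum < ra};  // unsigned wraparound means a carry out
}

// Models subfe RT,RA,RB: RT = ~RA + RB + CA (result only).
uint64_t subfe(uint64_t ra, uint64_t rb, bool ca) {
  return ~ra + rb + (ca ? 1 : 0);
}

// Mirrors the ADDIC8/SUBFE8 sequence of the SETNE case under review.
uint64_t zextSetNE(uint64_t lhs, uint64_t rhs) {
  uint64_t x = lhs ^ rhs;               // zero only when lhs == rhs
  WithCarry ac = addic(x, -1);          // CA is set only when x != 0
  return subfe(ac.value, x, ac.carry);  // ~(x - 1) + x + CA
}

void checkZextSetNE() {
  assert(zextSetNE(0, 0) == 0);  // the LHS == RHS == 0 case discussed above
  assert(zextSetNE(7, 7) == 0);
  assert(zextSetNE(1, 2) == 1);
  assert(zextSetNE(UINT64_MAX, 0) == 1);
}
```

For a nonzero `x`, `~(x - 1) + x` is all ones plus one, which wraps to 0, so the set CA bit supplies the final 1. For `x == 0` there is no carry and the sum stays 0.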

In D33720#788925, @nemanjai wrote:
That's actually not what the SUBFE instruction does. As per the ISA:
subfe RT,RA,RB
computes
RT = ~(RA) + (RB) + CA
So in your example, it will be:
~AC + XOR + CARRY
~(~0) + 0 + 0
Which of course will just be a zero.

Then the document is wrong.

Mathematically, subtraction is
RB - RA = RB + ~RA + 1
just like the description in instruction subf.

A subtraction that takes a borrow from the lower-order bits into account is
RB - RA - CA = RB + ~RA + 1 - CA = RB + ~RA + !CA

In D33720#789251, @Carrot wrote:
Then the document is wrong.

Mathematically, subtraction is
RB - RA = RB + ~RA + 1
just like the description in instruction subf.

A subtraction that takes a borrow from the lower-order bits into account is
RB - RA - CA = RB + ~RA + 1 - CA = RB + ~RA + !CA

I wrote a small program to test it, and you are right: the subfe instruction computes
RT = ~(RA) + (RB) + CA

so
RT = (RB) + ~(RA) + CA = (RB) + ~(RA) + 1 - 1 + CA = RB - RA + CA - 1
I can't understand the mathematical meaning of the right-hand side, or how it can be used to implement high-precision integer subtraction.

In D33720#789655, @Carrot wrote:
I wrote a small program to test it, and you are right: the subfe instruction computes
RT = ~(RA) + (RB) + CA

so
RT = (RB) + ~(RA) + CA = (RB) + ~(RA) + 1 - 1 + CA = RB - RA + CA - 1
I can't understand the mathematical meaning of the right-hand side, or how it can be used to implement high-precision integer subtraction.

I understand it now.
The CA flag on PPC has the opposite sense to the CF flag that the SUB instruction sets on x86.

On x86, the CF flag means there was a borrow out of the highest bit during the subtraction.
On PPC, the CA flag is the carry out of the addition that implements the subtraction:

B - A = B + ~A + 1

so CF = !CA

so B - A - Borrow = B + ~A + 1 - CF = B + ~A + 1 - !CA = B + ~A + 1 - (1 - CA) = B + ~A + CA
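
The identity above also answers the earlier question about multi-word subtraction. As a rough sketch under stated assumptions, the snippet below subtracts two 128-bit values held as 64-bit halves, once with PPC-style carries (`subfc` then `subfe`) and once with x86-style borrows. All names (`Flagged`, `U128`, `addWithCarry`, `subWithBorrow`, `sub128PPC`, `sub128X86`, `checkSub128`) are hypothetical and introduced only for illustration. The helpers model just the result and the flag bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 64-bit result plus a carry or borrow flag.
struct Flagged {
  uint64_t value;
  bool flag;
};

// Computes x + y + cin and the carry out of bit 63.
Flagged addWithCarry(uint64_t x, uint64_t y, bool cin) {
  uint64_t partial = x + y;
  uint64_t sum = partial + (cin ? 1 : 0);
  bool carry = (partial < x) || (sum < partial);
  return {sum, carry};
}

// PPC style: subfc computes ~RA + RB + 1, subfe computes ~RA + RB + CA.
Flagged subfc(uint64_t ra, uint64_t rb) { return addWithCarry(~ra, rb, true); }
Flagged subfe(uint64_t ra, uint64_t rb, bool ca) {
  return addWithCarry(~ra, rb, ca);
}

// x86 style: b - a - borrowIn, where the flag means "a borrow occurred".
Flagged subWithBorrow(uint64_t b, uint64_t a, bool borrowIn) {
  uint64_t diff = b - a - (borrowIn ? 1 : 0);
  bool borrow = (b < a) || (b == a && borrowIn);
  return {diff, borrow};
}

struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// b - a using the PPC carry convention.
U128 sub128PPC(U128 b, U128 a) {
  Flagged lo = subfc(a.lo, b.lo);
  Flagged hi = subfe(a.hi, b.hi, lo.flag);
  return {hi.value, lo.value};
}

// b - a using the x86 borrow convention.
U128 sub128X86(U128 b, U128 a) {
  Flagged lo = subWithBorrow(b.lo, a.lo, false);
  Flagged hi = subWithBorrow(b.hi, a.hi, lo.flag);
  return {hi.value, lo.value};
}

void checkSub128() {
  U128 b{1, 0}, a{0, 1};  // 2^64 - 1 needs a borrow from the high half
  U128 p = sub128PPC(b, a), x = sub128X86(b, a);
  assert(p.hi == 0 && p.lo == UINT64_MAX);
  assert(p.hi == x.hi && p.lo == x.lo);
  // The two flags are complements of each other: CA == !CF.
  assert(subfc(1, 0).flag == !subWithBorrow(0, 1, false).flag);
}
```

Both routines produce the same halves. The only difference is that the PPC chain passes along "no borrow happened" while the x86 chain passes along "a borrow happened".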

Carrot mentioned this in D35699: [PPC] Add Defs = [CARRY] to MIR SRADI_32. Jul 20 2017, 2:07 PM

Carrot mentioned this in rL308780: [PPC] Add Defs = [CARRY] to MIR SRADI_32. Jul 21 2017, 2:06 PM

Revision Contents

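Path                                                        Size
llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp           26 lines
llvm/trunk/test/CodeGen/PowerPC/logic-ops-on-compares.ll    73 lines
llvm/trunk/test/CodeGen/PowerPC/testComparesinesll.ll       125 lines
llvm/trunk/test/CodeGen/PowerPC/testComparesineull.ll       125 lines
llvm/trunk/test/CodeGen/PowerPC/testComparesllnesll.ll      125 lines
llvm/trunk/test/CodeGen/PowerPC/testComparesllneull.ll      125 lines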

Diff 101723

llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp

case ISD::SETEQ: {		case ISD::SETEQ: {
SDValue Xor = IsRHSZero ? LHS :		SDValue Xor = IsRHSZero ? LHS :
SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);		SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
SDValue Clz =		SDValue Clz =
SDValue(CurDAG->getMachineNode(PPC::CNTLZD, dl, MVT::i64, Xor), 0);		SDValue(CurDAG->getMachineNode(PPC::CNTLZD, dl, MVT::i64, Xor), 0);
return SDValue(CurDAG->getMachineNode(PPC::RLDICL, dl, MVT::i64, Clz,		return SDValue(CurDAG->getMachineNode(PPC::RLDICL, dl, MVT::i64, Clz,
getI64Imm(58, dl), getI64Imm(63, dl)),		getI64Imm(58, dl), getI64Imm(63, dl)),
0);		0);
}		}
		case ISD::SETNE: {
		// {addc.reg, addc.CA} = (addcarry (xor %a, %b), -1)
		// (zext (setcc %a, %b, setne)) -> (sube addc.reg, addc.reg, addc.CA)
		// {addcz.reg, addcz.CA} = (addcarry %a, -1)
		// (zext (setcc %a, 0, setne)) -> (sube addcz.reg, addcz.reg, addcz.CA)
		SDValue Xor = IsRHSZero ? LHS :
		SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
		SDValue AC =
		SDValue(CurDAG->getMachineNode(PPC::ADDIC8, dl, MVT::i64, MVT::Glue,
		Xor, getI32Imm(~0U, dl)), 0);
		return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, AC,
		Xor, AC.getValue(1)), 0);
		}
}		}
}		}

/// Produces a sign-extended result of comparing two 64-bit values according to		/// Produces a sign-extended result of comparing two 64-bit values according to
/// the passed condition code.		/// the passed condition code.
SDValue PPCDAGToDAGISel::get64BitSExtCompare(SDValue LHS, SDValue RHS,		SDValue PPCDAGToDAGISel::get64BitSExtCompare(SDValue LHS, SDValue RHS,
ISD::CondCode CC,		ISD::CondCode CC,
int64_t RHSValue, SDLoc dl) {		int64_t RHSValue, SDLoc dl) {
bool IsRHSZero = RHSValue == 0;		bool IsRHSZero = RHSValue == 0;
switch (CC) {		switch (CC) {
default: return SDValue();		default: return SDValue();
case ISD::SETEQ: {		case ISD::SETEQ: {
// {addc.reg, addc.CA} = (addcarry (xor %a, %b), -1)		// {addc.reg, addc.CA} = (addcarry (xor %a, %b), -1)
// (sext (setcc %a, %b, seteq)) -> (sube addc.reg, addc.reg, addc.CA)		// (sext (setcc %a, %b, seteq)) -> (sube addc.reg, addc.reg, addc.CA)
// {addcz.reg, addcz.CA} = (addcarry %a, -1)		// {addcz.reg, addcz.CA} = (addcarry %a, -1)
// (sext (setcc %a, 0, seteq)) -> (sube addcz.reg, addcz.reg, addcz.CA)		// (sext (setcc %a, 0, seteq)) -> (sube addcz.reg, addcz.reg, addcz.CA)
SDValue AddInput = IsRHSZero ? LHS :		SDValue AddInput = IsRHSZero ? LHS :
SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);		SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
SDValue Addic =		SDValue Addic =
SDValue(CurDAG->getMachineNode(PPC::ADDIC8, dl, MVT::i64, MVT::Glue,		SDValue(CurDAG->getMachineNode(PPC::ADDIC8, dl, MVT::i64, MVT::Glue,
AddInput, getI32Imm(~0U, dl)), 0);		AddInput, getI32Imm(~0U, dl)), 0);
return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, Addic,		return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, Addic,
Addic, Addic.getValue(1)), 0);		Addic, Addic.getValue(1)), 0);
}		}
		case ISD::SETNE: {
		// {subfc.reg, subfc.CA} = (subcarry 0, (xor %a, %b))
		// (sext (setcc %a, %b, setne)) -> (sube subfc.reg, subfc.reg, subfc.CA)
		// {subfcz.reg, subfcz.CA} = (subcarry 0, %a)
		// (sext (setcc %a, 0, setne)) -> (sube subfcz.reg, subfcz.reg, subfcz.CA)
		SDValue Xor = IsRHSZero ? LHS :
		SDValue(CurDAG->getMachineNode(PPC::XOR8, dl, MVT::i64, LHS, RHS), 0);
		SDValue SC =
		SDValue(CurDAG->getMachineNode(PPC::SUBFIC8, dl, MVT::i64, MVT::Glue,
		Xor, getI32Imm(0, dl)), 0);
		return SDValue(CurDAG->getMachineNode(PPC::SUBFE8, dl, MVT::i64, SC,
		SC, SC.getValue(1)), 0);
		}
}		}
}		}

/// Does this SDValue have any uses for which keeping the value in a GPR is		/// Does this SDValue have any uses for which keeping the value in a GPR is
/// appropriate. This is meant to be used on values that have type i1 since		/// appropriate. This is meant to be used on values that have type i1 since
/// it is somewhat meaningless to ask if values of other types can be kept in		/// it is somewhat meaningless to ask if values of other types can be kept in
/// GPR's.		/// GPR's.
static bool allUsesExtend(SDValue Compare, SelectionDAG *CurDAG) {		static bool allUsesExtend(SDValue Compare, SelectionDAG *CurDAG) {
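
The sign-extending SETNE case in the diff above pairs SUBFIC8 with SUBFE8. As a rough sketch of why that sequence yields either 0 or all ones, the emulation below models `subfic` as `~RA + SI + 1` with CA as the carry out, and `subfe` as `~RA + RB + CA`. The names `WithCarry`, `subfic`, `subfe`, `sextSetNE`, and `checkSextSetNE` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 64-bit result plus the CA (carry) bit.
struct WithCarry {
  uint64_t value;
  bool carry;
};

// Models subfic RT,RA,SI: RT = ~RA + SI + 1, CA = carry out.
WithCarry subfic(uint64_t ra, int64_t si) {
  uint64_t partial = ~ra + static_cast<uint64_t>(si);
  uint64_t sum = partial + 1;
  bool carry = (partial < ~ra) || (sum < partial);
  return {sum, carry};
}

// Models subfe RT,RA,RB: RT = ~RA + RB + CA (result only).
uint64_t subfe(uint64_t ra, uint64_t rb, bool ca) {
  return ~ra + rb + (ca ? 1 : 0);
}

// Mirrors the SUBFIC8/SUBFE8 sequence of the sign-extending SETNE case.
uint64_t sextSetNE(uint64_t lhs, uint64_t rhs) {
  uint64_t x = lhs ^ rhs;                      // zero only when lhs == rhs
  WithCarry sc = subfic(x, 0);                 // 0 - x; CA set only when x == 0
  return subfe(sc.value, sc.value, sc.carry);  // ~v + v + CA = all ones + CA
}

void checkSextSetNE() {
  assert(sextSetNE(4, 4) == 0);
  assert(sextSetNE(0, 0) == 0);
  assert(sextSetNE(1, 2) == UINT64_MAX);  // all ones, i.e. -1 as a signed i64
  assert(sextSetNE(UINT64_MAX, 0) == UINT64_MAX);
}
```

Because `~v + v` is all ones for any `v`, the result depends only on CA: a set carry (equal inputs) wraps the sum to 0, and a clear carry leaves all ones.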

llvm/trunk/test/CodeGen/PowerPC/logic-ops-on-compares.ll

if.end: ; preds = %entry		if.end: ; preds = %entry
%call4 = tail call signext i32 @bar(i32 signext %b) #2		%call4 = tail call signext i32 @bar(i32 signext %b) #2
br label %return		br label %return

return: ; preds = %if.end, %if.then		return: ; preds = %if.end, %if.then
%retval.0 = phi i32 [ %call4, %if.end ], [ %call, %if.then ]		%retval.0 = phi i32 [ %call4, %if.end ], [ %call, %if.then ]
ret i32 %retval.0		ret i32 %retval.0
}		}

define void @neg_truncate_i32(i32 *%ptr) {		define void @neg_truncate_i32_eq(i32 *%ptr) {
; CHECK-LABEL: neg_truncate_i32:		; CHECK-LABEL: neg_truncate_i32_eq:
; CHECK: # BB#0: # %entry		; CHECK: # BB#0: # %entry
; CHECK-NEXT: lwz r3, 0(r3)		; CHECK-NEXT: lwz r3, 0(r3)
; CHECK-NEXT: rldicl. r3, r3, 0, 63		; CHECK-NEXT: rldicl. r3, r3, 0, 63
; CHECK-NEXT: bclr 12, 2, 0		; CHECK-NEXT: bclr 12, 2, 0
; CHECK-NEXT: # BB#1: # %if.end29.thread136		; CHECK-NEXT: # BB#1: # %if.end29.thread136
; CHECK-NEXT: .LBB1_2: # %if.end29		; CHECK-NEXT: .LBB1_2: # %if.end29
entry:		entry:
%0 = load i32, i32* %ptr, align 4		%0 = load i32, i32* %ptr, align 4
%rem17127 = and i32 %0, 1		%rem17127 = and i32 %0, 1
%cmp18 = icmp eq i32 %rem17127, 0		%cmp18 = icmp eq i32 %rem17127, 0
br label %if.else		br label %if.else

if.else: ; preds = %entry		if.else: ; preds = %entry
br i1 %cmp18, label %if.end29, label %if.end29.thread136		br i1 %cmp18, label %if.end29, label %if.end29.thread136

if.end29.thread136: ; preds = %if.else		if.end29.thread136: ; preds = %if.else
unreachable		unreachable

if.end29: ; preds = %if.else		if.end29: ; preds = %if.else
ret void		ret void

}		}

; Function Attrs: nounwind		; Function Attrs: nounwind
define i64 @logic_ne_64(i64 %a, i64 %b, i64 %c) {		define i64 @logic_eq_64(i64 %a, i64 %b, i64 %c) {
; CHECK-LABEL: logic_ne_64:		; CHECK-LABEL: logic_eq_64:
; CHECK: xor r7, r3, r4		; CHECK: xor r7, r3, r4
; CHECK-NEXT: li r6, 55		; CHECK-NEXT: li r6, 55
; CHECK-NEXT: xor r5, r5, r6		; CHECK-NEXT: xor r5, r5, r6
; CHECK-NEXT: or r7, r7, r4		; CHECK-NEXT: or r7, r7, r4
; CHECK-NEXT: cntlzd r6, r7		; CHECK-NEXT: cntlzd r6, r7
; CHECK-NEXT: cntlzd r5, r5		; CHECK-NEXT: cntlzd r5, r5
; CHECK-NEXT: rldicl r6, r6, 58, 63		; CHECK-NEXT: rldicl r6, r6, 58, 63
; CHECK-NEXT: rldicl r5, r5, 58, 63		; CHECK-NEXT: rldicl r5, r5, 58, 63
if.end: ; preds = %entry		if.end: ; preds = %entry
%call4 = tail call i64 @bar64(i64 %b) #2		%call4 = tail call i64 @bar64(i64 %b) #2
br label %return		br label %return

return: ; preds = %if.end, %if.then		return: ; preds = %if.end, %if.then
%retval.0 = phi i64 [ %call4, %if.end ], [ %call, %if.then ]		%retval.0 = phi i64 [ %call4, %if.end ], [ %call, %if.then ]
ret i64 %retval.0		ret i64 %retval.0
}		}

define void @neg_truncate_i64(i64 *%ptr) {		define void @neg_truncate_i64_eq(i64 *%ptr) {
; CHECK-LABEL: neg_truncate_i64:		; CHECK-LABEL: neg_truncate_i64_eq:
; CHECK: # BB#0: # %entry		; CHECK: # BB#0: # %entry
; CHECK-NEXT: ld r3, 0(r3)		; CHECK-NEXT: ld r3, 0(r3)
; CHECK-NEXT: rldicl. r3, r3, 0, 63		; CHECK-NEXT: rldicl. r3, r3, 0, 63
; CHECK-NEXT: bclr 12, 2, 0		; CHECK-NEXT: bclr 12, 2, 0
; CHECK-NEXT: # BB#1: # %if.end29.thread136		; CHECK-NEXT: # BB#1: # %if.end29.thread136
; CHECK-NEXT: .LBB3_2: # %if.end29		; CHECK-NEXT: .LBB3_2: # %if.end29
entry:		entry:
%0 = load i64, i64* %ptr, align 4		%0 = load i64, i64* %ptr, align 4
%rem17127 = and i64 %0, 1		%rem17127 = and i64 %0, 1
%cmp18 = icmp eq i64 %rem17127, 0		%cmp18 = icmp eq i64 %rem17127, 0
br label %if.else		br label %if.else

if.else: ; preds = %entry		if.else: ; preds = %entry
br i1 %cmp18, label %if.end29, label %if.end29.thread136		br i1 %cmp18, label %if.end29, label %if.end29.thread136

if.end29.thread136: ; preds = %if.else		if.end29.thread136: ; preds = %if.else
unreachable		unreachable

if.end29: ; preds = %if.else		if.end29: ; preds = %if.else
ret void		ret void

}		}

		; Function Attrs: nounwind
		define i64 @logic_ne_64(i64 %a, i64 %b, i64 %c) {
		; CHECK-LABEL: logic_ne_64:
		; CHECK: xor r7, r3, r4
		; CHECK-NEXT: li r6, 55
		; CHECK-NEXT: addic r8, r7, -1
		; CHECK-NEXT: xor r5, r5, r6
		; CHECK-NEXT: subfe r7, r8, r7
		; CHECK-NEXT: cntlzd r5, r5
		; CHECK-NEXT: addic r12, r4, -1
		; CHECK-NEXT: rldicl r5, r5, 58, 63
		; CHECK-NEXT: subfe r6, r12, r4
		; CHECK-NEXT: and r6, r7, r6
		; CHECK-NEXT: or. r5, r6, r5
		; CHECK-NEXT: bc 4, 1
		entry:
		%tobool = icmp ne i64 %a, %b
		%tobool1 = icmp ne i64 %b, 0
		%or.cond = and i1 %tobool, %tobool1
		%tobool3 = icmp eq i64 %c, 55
		%or.cond5 = or i1 %or.cond, %tobool3
		br i1 %or.cond5, label %if.end, label %if.then

		if.then: ; preds = %entry
		%call = tail call i64 @foo64(i64 %a) #2
		br label %return

		if.end: ; preds = %entry
		%call4 = tail call i64 @bar64(i64 %b) #2
		br label %return

		return: ; preds = %if.end, %if.then
		%retval.0 = phi i64 [ %call4, %if.end ], [ %call, %if.then ]
		ret i64 %retval.0
		}

		define void @neg_truncate_i64_ne(i64 *%ptr) {
		; CHECK-LABEL: neg_truncate_i64_ne:
		; CHECK: # BB#0: # %entry
		; CHECK-NEXT: ld r3, 0(r3)
		; CHECK-NEXT: andi. r3, r3, 1
		; CHECK-NEXT: bclr 12, 1, 0
		; CHECK-NEXT: # BB#1: # %if.end29.thread136
		; CHECK-NEXT: .LBB5_2: # %if.end29
		entry:
		%0 = load i64, i64* %ptr, align 4
		%rem17127 = and i64 %0, 1
		%cmp18 = icmp ne i64 %rem17127, 0
		br label %if.else

		if.else: ; preds = %entry
		br i1 %cmp18, label %if.end29, label %if.end29.thread136

		if.end29.thread136: ; preds = %if.else
		unreachable

		if.end29: ; preds = %if.else
		ret void

		}

declare signext i32 @foo(i32 signext)		declare signext i32 @foo(i32 signext)
declare signext i32 @bar(i32 signext)		declare signext i32 @bar(i32 signext)
declare i64 @foo64(i64)		declare i64 @foo64(i64)
declare i64 @bar64(i64)		declare i64 @bar64(i64)

llvm/trunk/test/CodeGen/PowerPC/testComparesinesll.ll

				; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py

				@glob = common local_unnamed_addr global i64 0, align 8

				define signext i32 @test_inesll(i64 %a, i64 %b) {
				; CHECK-LABEL: test_inesll:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv = zext i1 %cmp to i32
				ret i32 %conv
				}

				define signext i32 @test_inesll_sext(i64 %a, i64 %b) {
				; CHECK-LABEL: test_inesll_sext:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%sub = sext i1 %cmp to i32
				ret i32 %sub
				}

				define signext i32 @test_inesll_z(i64 %a) {
				; CHECK-LABEL: test_inesll_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv = zext i1 %cmp to i32
				ret i32 %conv
				}

				define signext i32 @test_inesll_sext_z(i64 %a) {
				; CHECK-LABEL: test_inesll_sext_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%sub = sext i1 %cmp to i32
				ret i32 %sub
				}

				define void @test_inesll_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_inesll_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_inesll_sext_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_inesll_sext_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_inesll_z_store(i64 %a) {
				; CHECK-LABEL: test_inesll_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_inesll_sext_z_store(i64 %a) {
				; CHECK-LABEL: test_inesll_sext_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

llvm/trunk/test/CodeGen/PowerPC/testComparesineull.ll

				; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py

				@glob = common local_unnamed_addr global i64 0, align 8

				define signext i32 @test_ineull(i64 %a, i64 %b) {
				; CHECK-LABEL: test_ineull:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv = zext i1 %cmp to i32
				ret i32 %conv
				}

				define signext i32 @test_ineull_sext(i64 %a, i64 %b) {
				; CHECK-LABEL: test_ineull_sext:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%sub = sext i1 %cmp to i32
				ret i32 %sub
				}

				define signext i32 @test_ineull_z(i64 %a) {
				; CHECK-LABEL: test_ineull_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv = zext i1 %cmp to i32
				ret i32 %conv
				}

				define signext i32 @test_ineull_sext_z(i64 %a) {
				; CHECK-LABEL: test_ineull_sext_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%sub = sext i1 %cmp to i32
				ret i32 %sub
				}

				define void @test_ineull_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_ineull_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_ineull_sext_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_ineull_sext_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_ineull_z_store(i64 %a) {
				; CHECK-LABEL: test_ineull_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_ineull_sext_z_store(i64 %a) {
				; CHECK-LABEL: test_ineull_sext_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

llvm/trunk/test/CodeGen/PowerPC/testComparesllnesll.ll

				; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py

				@glob = common local_unnamed_addr global i64 0, align 8

				define i64 @test_llnesll(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llnesll:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llnesll_sext(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llnesll_sext:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llnesll_z(i64 %a) {
				; CHECK-LABEL: test_llnesll_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llnesll_sext_z(i64 %a) {
				; CHECK-LABEL: test_llnesll_sext_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				ret i64 %conv1
				}

				define void @test_llnesll_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llnesll_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llnesll_sext_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llnesll_sext_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llnesll_z_store(i64 %a) {
				; CHECK-LABEL: test_llnesll_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llnesll_sext_z_store(i64 %a) {
				; CHECK-LABEL: test_llnesll_sext_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

llvm/trunk/test/CodeGen/PowerPC/testComparesllneull.ll

				; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu -O2 \
				; RUN: -ppc-asm-full-reg-names -mcpu=pwr8 < %s \| FileCheck %s \
				; RUN: --implicit-check-not cmpw --implicit-check-not cmpd --implicit-check-not cmpl
				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py

				@glob = common local_unnamed_addr global i64 0, align 8

				define i64 @test_llneull(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llneull:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llneull_sext(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llneull_sext:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llneull_z(i64 %a) {
				; CHECK-LABEL: test_llneull_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addic r4, r3, -1
				; CHECK-NEXT: subfe r3, r4, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				ret i64 %conv1
				}

				define i64 @test_llneull_sext_z(i64 %a) {
				; CHECK-LABEL: test_llneull_sext_z:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				ret i64 %conv1
				}

				define void @test_llneull_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llneull_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llneull_sext_store(i64 %a, i64 %b) {
				; CHECK-LABEL: test_llneull_sext_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r5, r2, .LC0@toc@ha
				; CHECK-NEXT: xor r3, r3, r4
				; CHECK-NEXT: ld r12, .LC0@toc@l(r5)
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r12)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, %b
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llneull_z_store(i64 %a) {
				; CHECK-LABEL: test_llneull_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: addic r5, r3, -1
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r5, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = zext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}

				define void @test_llneull_sext_z_store(i64 %a) {
				; CHECK-LABEL: test_llneull_sext_z_store:
				; CHECK: # BB#0: # %entry
				; CHECK-NEXT: addis r4, r2, .LC0@toc@ha
				; CHECK-NEXT: subfic r3, r3, 0
				; CHECK-NEXT: ld r4, .LC0@toc@l(r4)
				; CHECK-NEXT: subfe r3, r3, r3
				; CHECK-NEXT: std r3, 0(r4)
				; CHECK-NEXT: blr
				entry:
				%cmp = icmp ne i64 %a, 0
				%conv1 = sext i1 %cmp to i64
				store i64 %conv1, i64* @glob, align 8
				ret void
				}
