This is an archive of the discontinued LLVM Phabricator instance.

lib/Target/PowerPC/PPCISelLowering.h
746	Perhaps a comment about what the use of this function is since this patch has none.
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
5	I don't understand the changes to this test case. How is the VSX floating point comparison replacing the move of a field within the condition register and why? I don't really understand the intent of the original test either, but nonetheless checking for these instructions does not seem like the equivalent test.
test/CodeGen/PowerPC/pzero-fp-xored.ll
1	I think it might be a good idea to include `--implicit-check-not` on this line and specify which patterns (loading a zero from constant pool?) you want to not see.
2	I don't think it's a particularly good idea to use a builtin check string (`CHECK-NOT`) as a check prefix. In fact, I'm not sure what the semantics of this actually are. And in any case, using different check prefixes for the two run lines should probably include equivalent checking (including CHECK-WHATEVER-PREFIX-LABEL:).

amehsan added inline comments. Aug 17 2016, 12:50 PM

lib/Target/PowerPC/PPCISelLowering.h
746	The name is clear. Isn't it?
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
5	I was not able to understand the intention of this testcase. (That said, I didn't check git blame; there might be some info there.) The mcrf here follows an fcmpu and copies BF 0 to BF 7. The new VSX compare instruction puts the result directly in BF 7, so that was the closest thing I could put here. I will check git blame, and if I find something useful to add, I will modify it. But for a testcase whose intention is not known, I think this change should be fine.
test/CodeGen/PowerPC/pzero-fp-xored.ll
1	Yes, I can add that.
2	I had forgotten that CHECK-NOT is built-in. Will change that.

nemanjai added inline comments. Aug 17 2016, 1:14 PM

lib/Target/PowerPC/PPCISelLowering.h
746	Oh, I see. This is used by the target-independent transformations.
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
5	Yeah ok. I agree. I think this is kind of a weird test case.

It does not need to be in the same commit as adds this scalar support, but we should also add a pattern to catch the vector case. For

define <2 x double> @foo() local_unnamed_addr #0 {
  ret <2 x double> zeroinitializer
}

we currently also load from memory. Actually, for this case on powerpc64le, there's another opportunity as well:

	addis 3, 2, .LCPI0_0@toc@ha
	addi 3, 3, .LCPI0_0@toc@l
	lxvd2x 0, 0, 3
	xxswapd	 34, 0
	blr

Even in cases where we do need to load from memory, we should be able to fold the swap into the constant in cases where it can't otherwise be eliminated.
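The point about folding the swap into the constant can be illustrated with a small standalone model. This is only a sketch, not backend code: the type `V2` and the helper names are hypothetical, and the only hardware behaviour assumed is the one visible in the assembly above, namely that on powerpc64le `lxvd2x` delivers the two doublewords in swapped order and `xxswapd` exchanges them back.

```cpp
#include <array>
#include <cassert>

// Two-element vector standing in for <2 x double> (hypothetical model type).
using V2 = std::array<double, 2>;

// Model of xxswapd: exchange the two doublewords.
inline V2 swapDoublewords(const V2 &v) { return {v[1], v[0]}; }

// Model of lxvd2x on little-endian: the doublewords arrive in swapped order.
inline V2 loadSwappedLE(const V2 &mem) { return {mem[1], mem[0]}; }

// Sequence shown above: load from the constant pool, then swap at run time.
inline V2 loadThenSwap(const V2 &constantPool) {
  return swapDoublewords(loadSwappedLE(constantPool));
}

// Folded variant: the constant pool entry is stored pre-swapped, so the
// load alone already yields the wanted value and no run-time swap is needed.
inline V2 loadPreSwapped(const V2 &wanted) {
  const V2 poolEntry = swapDoublewords(wanted); // would happen at compile time
  return loadSwappedLE(poolEntry);
}
```

Both sequences return the same vector for any constant. For a constant whose two halves are equal, such as zeroinitializer, the swap is a no-op, which is why that case needs no swap at all.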

In D23614#519175, @hfinkel wrote:

It does not need to be in the same commit as adds this scalar support, but we should also add a pattern to catch the vector case. For

Yes. I intentionally left it out because it required some extra work and I wanted to look at other issues that my reduced testcase for Eigen revealed. I will open a bugzilla item with your example.

In D23614#519255, @amehsan wrote:

In D23614#519175, @hfinkel wrote:

It does not need to be in the same commit as adds this scalar support, but we should also add a pattern to catch the vector case. For

Yes. I intentionally left it out because it required some extra work and I wanted to look at other issues that my reduced testcase for Eigen revealed. I will open a bugzilla item with your example.

I took a look. It seems to be a simple oversight for double and long long. For the other cases we generate vxor, so changing it to generate vxor for the two missing cases is easy. I will leave changing it to exploit xxlxor for the future.
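Why an xor of a register with itself is a valid way to materialize positive zero, and only positive zero, can be checked with a short standalone sketch. It assumes IEEE-754 binary64 for `double`; the helper names are hypothetical and nothing here is LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a double as a 64-bit integer and back.
inline std::uint64_t doubleToBits(double d) {
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits;
}

inline double bitsToDouble(std::uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Model of "xxlxor XT, XT, XT": whatever the register held, x ^ x is all zeros.
inline double xorWithSelf(double registerContents) {
  const std::uint64_t bits = doubleToBits(registerContents);
  return bitsToDouble(bits ^ bits);
}

// Positive zero is the all-zero bit pattern; negative zero has the sign bit set.
inline bool isPosZero(double d) { return doubleToBits(d) == 0; }
```

Because the result is the all-zero pattern regardless of the previous register contents, the instruction needs no real inputs. Negative zero has a non-zero encoding, so it cannot be produced this way, which matches the `isPosZero()` and `isExactlyValue(+0.0)` checks in the patch.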

echristo added a subscriber: echristo. Aug 23 2016, 4:54 PM

echristo added inline comments.

lib/Target/PowerPC/PPCISelLowering.cpp
12117	Here's a good point for a comment on things that you expect to work :)

amehsan added inline comments. Aug 24 2016, 7:11 PM

lib/Target/PowerPC/PPCISelLowering.cpp
12117	VT!

amehsan added inline comments. Aug 24 2016, 7:21 PM

lib/Target/PowerPC/PPCISelLowering.cpp
12117	I don't really know what we do with f16, f80, etc. (need to check a test case). But I think it does not hurt to guard against them.

amehsan updated this revision to Diff 72089. Sep 21 2016, 11:16 AM

amehsan edited edge metadata.

We need to handle fp16 and fp80, or at least document why they are omitted.

lib/Target/PowerPC/PPCISelLowering.cpp
12117	I agree with Eric. I think a general comment would be useful here. Something like: Single-precision and double precision FP immediates can be loaded when VSX instructions are available and the Immediate has value 0. Half-precision and 80-bit are excluded because...
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
5	Any insight from git blame?

This revision now requires changes to proceed. Sep 21 2016, 11:56 AM

In D23614#549008, @kbarton wrote:

We need to handle fp16 and fp80, or at least document why they are omitted.

Currently we have problems handling fp16 and fp80. Without knowing how generally we want to handle them, I prefer to exclude them here.

For this example

define half @t1(half %x) local_unnamed_addr #0 {
entry:
  %cmp = fadd half %x, 0.000000e+00
  ret half %cmp
}

I am getting this output

OUTPUT OF: llc -mcpu=pwr8 < test.ll

        .text
        .abiversion 2
        .file   "<stdin>"
LLVM ERROR: Cannot select: t20: f32 = fp16_to_fp t18
  t18: i32 = fp_to_fp16 t2
    t2: f32,ch = CopyFromReg t0, Register:f32 %vreg0
      t1: f32 = Register %vreg0
In function: t1

amehsan added inline comments. Oct 14 2016, 9:35 AM

test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll
5	Yes. It turns out that this patch exposes a downstream codegen bug. I have opened a PR for it. Since fixing that bug will affect the code generated for this testcase, I prefer to leave this test unchanged. I have explicitly mentioned in the second comment of the PR that a fix for that bug should make sure we generate good code for this testcase.

amehsan added inline comments. Oct 14 2016, 9:43 AM

lib/Target/PowerPC/PPCISelLowering.cpp
12117	I have explained in another comment why this does not support f16 and f80. If you don't mind approving this, I will add a summary of that comment to the code before committing the change.
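The guard being discussed can be sketched as a standalone model. This is not the committed code: `FPType` and `isFPImmLegalModel` are hypothetical names, and the set of accepted types (single and double precision only) is an assumption taken from the comment wording suggested in the review rather than from the final patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the value types mentioned in the review.
enum class FPType { F16, F32, F64, F80 };

// Model of the predicate: an FP immediate is "legal" (can be materialized
// without a constant-pool load) only if VSX is available, the type is one
// the model supports, and the value is exactly +0.0.
inline bool isFPImmLegalModel(double imm, FPType vt, bool hasVSX) {
  if (!hasVSX)
    return false;
  switch (vt) {
  case FPType::F32:
  case FPType::F64:
    // +0.0 only: -0.0 compares equal to 0.0 but has the sign bit set.
    return imm == 0.0 && !std::signbit(imm);
  default:
    // f16 and f80 are excluded because they are not handled today (assumption
    // based on the discussion above).
    return false;
  }
}
```

Writing the type check as an explicit switch makes the excluded cases visible at the point where a reader would look for them, which is what the reviewers asked the code comment to convey.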

amehsan requested a review of this revision. Oct 14 2016, 9:48 AM

amehsan edited edge metadata.

LGTM

This revision is now accepted and ready to land. Oct 20 2016, 11:58 AM

amehsan added inline comments. Oct 24 2016, 10:35 AM

test/CodeGen/PowerPC/tail-dup-analyzable-fallthrough.ll
8 ↗	(On Diff #72089)	Apparently due to some other change, we now generate xxlxor here, which is better. So I'll change this line of the code before committing.

Committed rL284995

Could you move the testcase into the right directory?

In D23614#580123, @kparzysz wrote:

Could you move the testcase into the right directory?

Sorry I made a mistake when committing. Will fix it right now.

Revision Contents

Path	Size
lib/Target/PowerPC/	1 line, 4 lines, 7 lines, 3 lines, 11 lines
test/CodeGen/PowerPC/crbits.ll	6 lines
test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll	12 lines
test/CodeGen/PowerPC/pzero-fp-xored.ll	50 lines

Diff 68380

lib/Target/PowerPC/PPCISelLowering.h

[... 737 lines not shown ...]	public:
/// exception typeid on entry to a landing pad.		/// exception typeid on entry to a landing pad.
unsigned		unsigned
getExceptionSelectorRegister(const Constant *PersonalityFn) const override;		getExceptionSelectorRegister(const Constant *PersonalityFn) const override;

/// Override to support customized stack guard loading.		/// Override to support customized stack guard loading.
bool useLoadStackGuardNode() const override;		bool useLoadStackGuardNode() const override;
void insertSSPDeclarations(Module &M) const override;		void insertSSPDeclarations(Module &M) const override;

		bool isFPImmLegal(const APFloat &Imm, EVT VT) const override;
private:		private:
struct ReuseLoadInfo {		struct ReuseLoadInfo {
SDValue Ptr;		SDValue Ptr;
SDValue Chain;		SDValue Chain;
SDValue ResChain;		SDValue ResChain;
MachinePointerInfo MPI;		MachinePointerInfo MPI;
bool IsInvariant;		bool IsInvariant;
unsigned Alignment;		unsigned Alignment;
[... 217 lines not shown ...]

lib/Target/PowerPC/PPCISelLowering.cpp

[... 12,107 lines not shown ...]	bool PPCTargetLowering::useLoadStackGuardNode() const {
return true;		return true;
}		}

// Override to disable global variable loading on Linux.		// Override to disable global variable loading on Linux.
void PPCTargetLowering::insertSSPDeclarations(Module &M) const {		void PPCTargetLowering::insertSSPDeclarations(Module &M) const {
if (!Subtarget.isTargetLinux())		if (!Subtarget.isTargetLinux())
return TargetLowering::insertSSPDeclarations(M);		return TargetLowering::insertSSPDeclarations(M);
}		}

		bool PPCTargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT) const {
		return Imm.isPosZero() && Subtarget.hasVSX();
		}

lib/Target/PowerPC/PPCInstrFormats.td

[... 1,037 lines not shown ...]	class XX3Form<bits<6> opcode, bits<8> xo, dag OOL, dag IOL, string asmstr,
let Inst{11-15} = XA{4-0};		let Inst{11-15} = XA{4-0};
let Inst{16-20} = XB{4-0};		let Inst{16-20} = XB{4-0};
let Inst{21-28} = xo;		let Inst{21-28} = xo;
let Inst{29} = XA{5};		let Inst{29} = XA{5};
let Inst{30} = XB{5};		let Inst{30} = XB{5};
let Inst{31} = XT{5};		let Inst{31} = XT{5};
}		}

		class XX3Form_SetZero<bits<6> opcode, bits<8> xo, dag OOL, dag IOL, string asmstr,
		InstrItinClass itin, list<dag> pattern>
		: XX3Form<opcode, xo, OOL, IOL, asmstr, itin, pattern> {
		let XB = XT;
		let XA = XT;
		}

class XX3Form_1<bits<6> opcode, bits<8> xo, dag OOL, dag IOL, string asmstr,		class XX3Form_1<bits<6> opcode, bits<8> xo, dag OOL, dag IOL, string asmstr,
InstrItinClass itin, list<dag> pattern>		InstrItinClass itin, list<dag> pattern>
: I<opcode, OOL, IOL, asmstr, itin> {		: I<opcode, OOL, IOL, asmstr, itin> {
bits<3> CR;		bits<3> CR;
bits<6> XA;		bits<6> XA;
bits<6> XB;		bits<6> XB;

let Pattern = pattern;		let Pattern = pattern;
[... 880 lines not shown ...]
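What `XX3Form_SetZero` does to the encoding (tying XA and XB to XT) can be sketched as a standalone encoder. The function names are hypothetical. The bit positions for XA, XB, xo and the three high register bits come from the `XX3Form` lines shown above; the placement of the primary opcode in bits 0-5 and of XT{4-0} in bits 6-10 is not visible in this excerpt and is assumed from the Power ISA XX3 instruction layout. Bit 0 is the most significant bit of the 32-bit word.

```cpp
#include <cassert>
#include <cstdint>

// Shift `value` so that its least significant bit lands on big-endian bit
// `lastBit` of a 32-bit instruction word (bit 0 is the most significant bit).
constexpr std::uint32_t field(std::uint32_t value, unsigned lastBit) {
  return value << (31 - lastBit);
}

// Encode an XX3-form instruction from its 6-bit VSX register numbers.
constexpr std::uint32_t encodeXX3(std::uint32_t opcode, std::uint32_t xo,
                                  std::uint32_t xt, std::uint32_t xa,
                                  std::uint32_t xb) {
  return field(opcode & 0x3F, 5)      // Inst{0-5}   = opcode (assumed)
         | field(xt & 0x1F, 10)       // Inst{6-10}  = XT{4-0} (assumed)
         | field(xa & 0x1F, 15)       // Inst{11-15} = XA{4-0}
         | field(xb & 0x1F, 20)       // Inst{16-20} = XB{4-0}
         | field(xo & 0xFF, 28)       // Inst{21-28} = xo
         | field((xa >> 5) & 1, 29)   // Inst{29}    = XA{5}
         | field((xb >> 5) & 1, 30)   // Inst{30}    = XB{5}
         | field((xt >> 5) & 1, 31);  // Inst{31}    = XT{5}
}

// Model of XX3Form_SetZero with opcode 60 and xo 154 (xxlxor): both source
// fields are forced to the destination, so the result is always zero.
constexpr std::uint32_t encodeXXLXORSetZero(std::uint32_t xt) {
  return encodeXX3(60, 154, xt, xt, xt);
}
```

Under these assumptions `encodeXXLXORSetZero(0)` yields `0xF00004D0`. Tying both sources to the destination is what lets the pattern take no input operands: the instruction selector only has to pick an output register.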

lib/Target/PowerPC/PPCInstrInfo.td

[... 585 lines not shown ...]	def s17imm : Operand<i32> {
// This operand type is used for addis/lis to allow the assembler parser		// This operand type is used for addis/lis to allow the assembler parser
// to accept immediates in the range -65536..65535 for compatibility with		// to accept immediates in the range -65536..65535 for compatibility with
// the GNU assembler. The operand is treated as 16-bit otherwise.		// the GNU assembler. The operand is treated as 16-bit otherwise.
let PrintMethod = "printS16ImmOperand";		let PrintMethod = "printS16ImmOperand";
let EncoderMethod = "getImm16Encoding";		let EncoderMethod = "getImm16Encoding";
let ParserMatchClass = PPCS17ImmAsmOperand;		let ParserMatchClass = PPCS17ImmAsmOperand;
let DecoderMethod = "decodeSImmOperand<16>";		let DecoderMethod = "decodeSImmOperand<16>";
}		}

		def fpimm0 : PatLeaf<(fpimm), [{ return N->isExactlyValue(+0.0); }]>;

def PPCDirectBrAsmOperand : AsmOperandClass {		def PPCDirectBrAsmOperand : AsmOperandClass {
let Name = "DirectBr"; let PredicateMethod = "isDirectBr";		let Name = "DirectBr"; let PredicateMethod = "isDirectBr";
let RenderMethod = "addBranchTargetOperands";		let RenderMethod = "addBranchTargetOperands";
}		}
def directbrtarget : Operand<OtherVT> {		def directbrtarget : Operand<OtherVT> {
let PrintMethod = "printBranchOperand";		let PrintMethod = "printBranchOperand";
let EncoderMethod = "getDirectBrEncoding";		let EncoderMethod = "getDirectBrEncoding";
let ParserMatchClass = PPCDirectBrAsmOperand;		let ParserMatchClass = PPCDirectBrAsmOperand;
[... 3,645 lines not shown ...]

lib/Target/PowerPC/PPCInstrVSX.td

[... 756 lines not shown ...]	def XXLORf: XX3Form<60, 146,
(outs vsfrc:$XT), (ins vsfrc:$XA, vsfrc:$XB),		(outs vsfrc:$XT), (ins vsfrc:$XA, vsfrc:$XB),
"xxlor $XT, $XA, $XB", IIC_VecGeneral, []>;		"xxlor $XT, $XA, $XB", IIC_VecGeneral, []>;
def XXLXOR : XX3Form<60, 154,		def XXLXOR : XX3Form<60, 154,
(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),		(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),
"xxlxor $XT, $XA, $XB", IIC_VecGeneral,		"xxlxor $XT, $XA, $XB", IIC_VecGeneral,
[(set v4i32:$XT, (xor v4i32:$XA, v4i32:$XB))]>;		[(set v4i32:$XT, (xor v4i32:$XA, v4i32:$XB))]>;
} // isCommutable		} // isCommutable

		let isCodeGenOnly = 1 in {
		def XXLXORdpz : XX3Form_SetZero<60, 154,
		(outs vsfrc:$XT), (ins),
		"xxlxor $XT, $XT, $XT", IIC_VecGeneral,
		[(set f64:$XT, (fpimm0))]>;
		def XXLXORspz : XX3Form_SetZero<60, 154,
		(outs vssrc:$XT), (ins),
		"xxlxor $XT, $XT, $XT", IIC_VecGeneral,
		[(set f32:$XT, (fpimm0))]>;
		}

// Permutation Instructions		// Permutation Instructions
def XXMRGHW : XX3Form<60, 18,		def XXMRGHW : XX3Form<60, 18,
(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),		(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),
"xxmrghw $XT, $XA, $XB", IIC_VecPerm, []>;		"xxmrghw $XT, $XA, $XB", IIC_VecPerm, []>;
def XXMRGLW : XX3Form<60, 50,		def XXMRGLW : XX3Form<60, 50,
(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),		(outs vsrc:$XT), (ins vsrc:$XA, vsrc:$XB),
"xxmrglw $XT, $XA, $XB", IIC_VecPerm, []>;		"xxmrglw $XT, $XA, $XB", IIC_VecPerm, []>;

[... 1,513 lines not shown ...]

test/CodeGen/PowerPC/crbits.ll

	; RUN: llc -verify-machineinstrs -mcpu=pwr7 < %s | FileCheck %s			; RUN: llc -verify-machineinstrs -mcpu=pwr7 < %s | FileCheck %s
	target datalayout = "E-m:e-i64:64-n32:64"			target datalayout = "E-m:e-i64:64-n32:64"
	target triple = "powerpc64-unknown-linux-gnu"			target triple = "powerpc64-unknown-linux-gnu"

	; Function Attrs: nounwind readnone			; Function Attrs: nounwind readnone
	define zeroext i1 @test1(float %v1, float %v2) #0 {			define zeroext i1 @test1(float %v1, float %v2) #0 {
	entry:			entry:
	%cmp = fcmp oge float %v1, %v2			%cmp = fcmp oge float %v1, %v2
	%cmp2 = fcmp ole float %v2, 0.000000e+00			%cmp2 = fcmp ole float %v2, 0.000000e+00
	%and5 = and i1 %cmp, %cmp2			%and5 = and i1 %cmp, %cmp2
	ret i1 %and5			ret i1 %and5

	; CHECK-LABEL: @test1			; CHECK-LABEL: @test1
	; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2			; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2
	; CHECK-DAG: li [[REG1:[0-9]+]], 1			; CHECK-DAG: li [[REG1:[0-9]+]], 1
	; CHECK-DAG: lfs [[REG2:[0-9]+]],			; CHECK-DAG: xxlxor [[REG2:[0-9]+]], [[REG2]], [[REG2]]
	; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]			; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]
	; CHECK: crnor			; CHECK: crnor
	; CHECK: crnor			; CHECK: crnor
	; CHECK: crnand [[REG4:[0-9]+]],			; CHECK: crnand [[REG4:[0-9]+]],
	; CHECK: isel 3, 0, [[REG1]], [[REG4]]			; CHECK: isel 3, 0, [[REG1]], [[REG4]]
	; CHECK: blr			; CHECK: blr
	}			}

	; Function Attrs: nounwind readnone			; Function Attrs: nounwind readnone
	define zeroext i1 @test2(float %v1, float %v2) #0 {			define zeroext i1 @test2(float %v1, float %v2) #0 {
	entry:			entry:
	%cmp = fcmp oge float %v1, %v2			%cmp = fcmp oge float %v1, %v2
	%cmp2 = fcmp ole float %v2, 0.000000e+00			%cmp2 = fcmp ole float %v2, 0.000000e+00
	%xor5 = xor i1 %cmp, %cmp2			%xor5 = xor i1 %cmp, %cmp2
	ret i1 %xor5			ret i1 %xor5

	; CHECK-LABEL: @test2			; CHECK-LABEL: @test2
	; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2			; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2
	; CHECK-DAG: li [[REG1:[0-9]+]], 1			; CHECK-DAG: li [[REG1:[0-9]+]], 1
	; CHECK-DAG: lfs [[REG2:[0-9]+]],			; CHECK-DAG: xxlxor [[REG2:[0-9]+]], [[REG2]], [[REG2]]
	; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]			; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]
	; CHECK: crnor			; CHECK: crnor
	; CHECK: crnor			; CHECK: crnor
	; CHECK: creqv [[REG4:[0-9]+]],			; CHECK: creqv [[REG4:[0-9]+]],
	; CHECK: isel 3, 0, [[REG1]], [[REG4]]			; CHECK: isel 3, 0, [[REG1]], [[REG4]]
	; CHECK: blr			; CHECK: blr
	}			}

	; Function Attrs: nounwind readnone			; Function Attrs: nounwind readnone
	define zeroext i1 @test3(float %v1, float %v2, i32 signext %x) #0 {			define zeroext i1 @test3(float %v1, float %v2, i32 signext %x) #0 {
	entry:			entry:
	%cmp = fcmp oge float %v1, %v2			%cmp = fcmp oge float %v1, %v2
	%cmp2 = fcmp ole float %v2, 0.000000e+00			%cmp2 = fcmp ole float %v2, 0.000000e+00
	%cmp4 = icmp ne i32 %x, -2			%cmp4 = icmp ne i32 %x, -2
	%and7 = and i1 %cmp2, %cmp4			%and7 = and i1 %cmp2, %cmp4
	%xor8 = xor i1 %cmp, %and7			%xor8 = xor i1 %cmp, %and7
	ret i1 %xor8			ret i1 %xor8

	; CHECK-LABEL: @test3			; CHECK-LABEL: @test3
	; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2			; CHECK-DAG: fcmpu {{[0-9]+}}, 1, 2
	; CHECK-DAG: li [[REG1:[0-9]+]], 1			; CHECK-DAG: li [[REG1:[0-9]+]], 1
	; CHECK-DAG: lfs [[REG2:[0-9]+]],			; CHECK-DAG: xxlxor [[REG2:[0-9]+]], [[REG2]], [[REG2]]
	; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]			; CHECK-DAG: fcmpu {{[0-9]+}}, 2, [[REG2]]
	; CHECK: crnor			; CHECK: crnor
	; CHECK: crnor			; CHECK: crnor
	; CHECK: crandc			; CHECK: crandc
	; CHECK: creqv [[REG4:[0-9]+]],			; CHECK: creqv [[REG4:[0-9]+]],
	; CHECK: isel 3, 0, [[REG1]], [[REG4]]			; CHECK: isel 3, 0, [[REG1]], [[REG4]]
	; CHECK: blr			; CHECK: blr
	}			}
	[... 126 lines not shown ...]

test/CodeGen/PowerPC/fast-isel-fcmp-nan.ll

; RUN: llc -mtriple powerpc64le-unknown-linux-gnu -fast-isel -O0 < %s | FileCheck %s		; RUN: llc -mtriple powerpc64le-unknown-linux-gnu -fast-isel -O0 < %s | FileCheck %s

define i1 @TestULT(double %t0) {		define i1 @TestULT(double %t0) {
; CHECK-LABEL: TestULT:		; CHECK-LABEL: TestULT:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp ult double %t0, 0.000000e+00		%t1 = fcmp ult double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

[... 30 lines not shown ...]	bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

define i1 @TestUEQ(double %t0) {		define i1 @TestUEQ(double %t0) {
; CHECK-LABEL: TestUEQ:		; CHECK-LABEL: TestUEQ:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp ueq double %t0, 0.000000e+00		%t1 = fcmp ueq double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

define i1 @TestUGT(double %t0) {		define i1 @TestUGT(double %t0) {
; CHECK-LABEL: TestUGT:		; CHECK-LABEL: TestUGT:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp ugt double %t0, 0.000000e+00		%t1 = fcmp ugt double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

[... 30 lines not shown ...]	bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

define i1 @TestOLE(double %t0) {		define i1 @TestOLE(double %t0) {
; CHECK-LABEL: TestOLE:		; CHECK-LABEL: TestOLE:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp ole double %t0, 0.000000e+00		%t1 = fcmp ole double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

define i1 @TestONE(double %t0) {		define i1 @TestONE(double %t0) {
; CHECK-LABEL: TestONE:		; CHECK-LABEL: TestONE:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp one double %t0, 0.000000e+00		%t1 = fcmp one double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

[... 30 lines not shown ...]	bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

define i1 @TestOGE(double %t0) {		define i1 @TestOGE(double %t0) {
; CHECK-LABEL: TestOGE:		; CHECK-LABEL: TestOGE:
; CHECK: mcrf		; CHECK: xscmpudp
; CHECK: blr		; CHECK: blr
entry:		entry:
%t1 = fcmp oge double %t0, 0.000000e+00		%t1 = fcmp oge double %t0, 0.000000e+00
br i1 %t1, label %good, label %bad		br i1 %t1, label %good, label %bad

bad:		bad:
ret i1 false		ret i1 false

good:		good:
ret i1 true		ret i1 true
}		}

test/CodeGen/PowerPC/pzero-fp-xored.ll

				; RUN: llc -mtriple=powerpc-unknown-linux-gnu -mattr=+vsx < %s | FileCheck %s
				; RUN: llc -mtriple=powerpc-unknown-linux-gnu -mattr=-vsx < %s | FileCheck %s --check-prefix=CHECK-NOT

				define signext i32 @t1(float %x) local_unnamed_addr #0 {
				entry:
				%cmp = fcmp ogt float %x, 0.000000e+00
				%tmp = select i1 %cmp, i32 43, i32 11
				ret i32 %tmp

				; CHECK-LABEL: t1:
				; CHECK: xxlxor [[REG1:[0-9]+]], [[REG1]], [[REG1]]
				; CHECK: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG1]]
				; CHECK: blr
				; CHECK-NOT: lfs [[REG1:[0-9]+]]
				; CHECK-NOT: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG1]]
				; CHECK-NOT: blr
				}

				define signext i32 @t2(double %x) local_unnamed_addr #0 {
				entry:
				%cmp = fcmp ogt double %x, 0.000000e+00
				%tmp = select i1 %cmp, i32 43, i32 11
				ret i32 %tmp

				; CHECK-LABEL: t2:
				; CHECK: xxlxor [[REG2:[0-9]+]], [[REG2]], [[REG2]]
				; CHECK: xscmpudp {{[0-9]+}}, {{[0-9]+}}, [[REG2]]
				; CHECK: blr
				; CHECK-NOT: lfs [[REG2:[0-9]+]]
				; CHECK-NOT: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG2]]
				; CHECK-NOT: blr
				}

				define signext i32 @t3(ppc_fp128 %x) local_unnamed_addr #0 {
				entry:
				%cmp = fcmp ogt ppc_fp128 %x, 0xM00000000000000000000000000000000
				%tmp = select i1 %cmp, i32 43, i32 11
				ret i32 %tmp

				; CHECK-LABEL: t3:
				; CHECK: xxlxor [[REG3:[0-9]+]], [[REG3]], [[REG3]]
				; CHECK: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG3]]
				; CHECK: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG3]]
				; CHECK: blr
				; CHECK-NOT: lfs [[REG3:[0-9]+]]
				; CHECK-NOT: fcmpu {{[0-9]+}}, {{[0-9]+}}, [[REG3]]
				; CHECK-NOT: blr
				}

[PPC] Generate positive FP zero using xor insn instead of loading from constant area (Closed, Public)
