This is an archive of the discontinued LLVM Phabricator instance.

Differential D103986

[PowerPC] Floating Point Builtins for XL Compat.
Closed, Public

Authored by quinnp on Jun 9 2021, 12:44 PM.

Details

Reviewers

nemanjai
stefanp
amyk
NeHuang

Group Reviewers

Restricted Project

Commits

rGe002d251dd34: [PowerPC] Floating Point Builtins for XL Compat.

Summary

This patch is part of a series of patches that provide
builtins for compatibility with the XL compiler.
This patch adds builtins related to floating-point
operations.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

quinnp created this revision. Jun 9 2021, 12:44 PM

Herald added subscribers: shchenz, kbarton, hiraditya, nemanjai. Jun 9 2021, 12:44 PM

quinnp requested review of this revision. Jun 9 2021, 12:44 PM

Herald added projects: Restricted Project, Restricted Project. Jun 9 2021, 12:44 PM

Herald added subscribers: llvm-commits, cfe-commits.

quinnp edited the summary of this revision. Jun 9 2021, 12:54 PM

quinnp added reviewers: Restricted Project, nemanjai, stefanp.

Harbormaster completed remote builds in B108476: Diff 350978. Jun 9 2021, 1:33 PM

quinnp edited the summary of this revision. Jun 10 2021, 8:31 AM

quinnp edited the summary of this revision.

Changed the target architecture of tests to follow the convention: BE-pwr7 LE-pwr8.

Harbormaster completed remote builds in B109374: Diff 352240. Jun 15 2021, 4:52 PM

qiucf added a subscriber: qiucf. Jun 16 2021, 11:51 PM

qiucf added inline comments.

clang/test/CodeGen/builtins-ppc-xlcompat-sync.c
2 ↗	(On Diff #352240)	Why change `pwr7` to `pwr8`?
llvm/lib/Target/PowerPC/PPCInstrInfo.td
3341	Seems we didn't exploit `XXSEL` in this case. But for sqrt/rsqrt, PPC has VSX and non-VSX versions of them:
frsqrte - xsrsqrtedp
frsqrtes - xsrsqrtesp
fsqrt - xssqrtdp
fsqrts - xssqrtsp
So do we need to add `frsqrte` here and `xsrsqrtesp`/`xssqrtsp` in the VSX part?
llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll
10	You can use the `update_llc_test_checks.py` script to automatically update this file. Also, add a non-VSX test?

qiucf mentioned this in D104386: [PowerPC][Builtins] Added a number of builtins for compatibility with XL. Jun 16 2021, 11:55 PM

Addressing some comments and updating tests

Harbormaster completed remote builds in B110280: Diff 353475. Jun 21 2021, 3:08 PM

Fixing the 32-bit AIX run line in the testcases.

Harbormaster completed remote builds in B110453: Diff 353716. Jun 22 2021, 12:11 PM

Added non-VSX implementation of builtins and non-VSX backend tests.

Harbormaster completed remote builds in B110999: Diff 354498. Jun 25 2021, 8:47 AM

Overall looks good. Some nits as below.

clang/test/CodeGen/builtins-ppc-xlcompat-fp.c
10	You can define three extern variables for all the builtins: extern double a; extern float b; extern float c; You can auto-update the test case with utils/update_cc_test_checks.py
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4986	You can delete the blank line, and it would be better to add comments for the operation below.
4988	Please use `Ops` as the variable name.

NeHuang added inline comments. Jun 30 2021, 1:29 PM

llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll
19	You can remove `#0`, `#1` and `#2`.

Addressing review comments.

Fixing a typo and the format of a test.

Harbormaster completed remote builds in B112266: Diff 356272. Jul 2 2021, 3:09 PM

quinnp added inline comments. Jul 6 2021, 10:45 AM

clang/test/CodeGen/builtins-ppc-xlcompat-sync.c
2 ↗	(On Diff #352240)	This was to match the testing convention for XL compatibility builtins of targeting pwr7 for big endian and pwr8 for little endian. However, I will not be changing the `sync` tests in this patch; I have reverted those changes and will put them into a different patch.

amyk added a subscriber: amyk. Jul 8 2021, 7:39 PM

amyk added inline comments.

clang/test/CodeGen/builtins-ppc-xlcompat-fp.c
4	nit: indentation on every second run line. We usually do two spaces more than the previous line. So I think it would look like:
20	Do the `entry` lines cause an issue when we don't have asserts?
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1564	Minor indentation nit.
1567	Minor indentation nit to make it aligned with above.
llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll
3	Minor indentation nit:

Addressing some review comments.

Harbormaster completed remote builds in B113205: Diff 357516. Jul 9 2021, 9:48 AM

quinnp added a parent revision: D105834: [PowerPC] Semachecking for XL compat builtin icbt. Jul 12 2021, 12:26 PM

quinnp removed a parent revision: D105834: [PowerPC] Semachecking for XL compat builtin icbt.

nemanjai added inline comments. Jul 13 2021, 7:13 AM

clang/test/CodeGen/builtins-ppc-xlcompat-fp.c
6	AFAICT, nothing changes with Power8 so you might as well just have run lines for Power7 (AIX) and Power8 (LE).
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1564	I don't think this indentation is desired. This makes it look like the first type on line 1529 belongs in the first list rather than the second list it belongs to. I suggest the following:
def int_ppc_fsel :
  GCCBuiltin<"__builtin_ppc_fsel">,
  Intrinsic<[llvm_double_ty], [llvm_double_ty,
                               llvm_double_ty, llvm_double_ty], [IntrNoMem]>;
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4986	I agree that this warrants a comment. However I don't think this is really what Victor had in mind. In general if your comment boils down to "this is what we are doing" and simply describes the code, it is not very useful. What your comment should address is "why are we doing this here and this way". To me as a reader, it is not at all clear why we manually select this to `PPC::FSELS` here. All the other intrinsics are matched with patterns in the `.td` file but this one is matched specifically here. Why?
llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll
3	Again, this is excessive. There should be run lines for Power8, Power8-32-bit, Power8-no-vsx, Power7. We don't really need a cross-product of triples and CPUs.

Requesting changes to remove it from the queue until comments are addressed or answered.

This revision now requires changes to proceed. Jul 13 2021, 7:16 AM

quinnp edited the summary of this revision. Jul 16 2021, 6:28 AM

Addressing some review comments.

Adding a better comment for the handling of the ppc_fsels intrinsic.

quinnp marked an inline comment as done. Jul 16 2021, 1:49 PM

Fixing a run line in a test.

Harbormaster completed remote builds in B114587: Diff 359441. Jul 16 2021, 2:41 PM

LGTM aside from a very minor nit.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4986	s/ppc_fels intrinsic/PPC::FSELS instruction since we aren't emitting an intrinsic, we are consuming it and emitting the instruction (well a `MachineSDNode`, but it represents an instruction).

This revision is now accepted and ready to land. Jul 18 2021, 2:53 PM

My comments were addressed and I agree with the new indentation - LGTM.

LGTM

Addressing review comment.

quinnp marked an inline comment as done. Jul 19 2021, 7:22 AM

Harbormaster completed remote builds in B114855: Diff 359783. Jul 19 2021, 8:37 AM

Please rebase to ToT.

Rebasing patch. Moving macro definitions from a header file to a src file.

Harbormaster completed remote builds in B115100: Diff 360128. Jul 20 2021, 8:42 AM

Rebasing to ToT

Harbormaster completed remote builds in B115180: Diff 360261. Jul 20 2021, 3:51 PM

Rebasing to ToT.

Harbormaster completed remote builds in B115281: Diff 360408. Jul 21 2021, 6:29 AM

This revision was landed with ongoing or failed builds. Jul 21 2021, 6:33 AM

Closed by commit rGe002d251dd34: [PowerPC] Floating Point Builtins for XL Compat. (authored by quinnp).

This revision was automatically updated to reflect the committed changes.

quinnp added a commit: rGe002d251dd34: [PowerPC] Floating Point Builtins for XL Compat.

You missed a REQUIRES for the llvm test, I added one in: https://reviews.llvm.org/rG2404834c206a8930b0c420d94f4941b31c355de5

So if you see Arm-AArch64 quick bot failures, that was the reason.

In D103986#2893177, @DavidSpickett wrote:

You missed a REQUIRES for the llvm test, I added one in: https://reviews.llvm.org/rG2404834c206a8930b0c420d94f4941b31c355de5

So if you see Arm-AArch64 quick bot failures, that was the reason.

Right, sorry about that. Thank you, I will also move the test to llvm/test/CodeGen/PowerPC

Revision Contents

Path	Size
clang/include/clang/Basic/BuiltinsPPC.def	15 lines
clang/lib/Basic/Targets/PPC.cpp	15 lines
clang/lib/CodeGen/CGBuiltin.cpp	35 lines
clang/test/CodeGen/builtins-ppc-xlcompat-fp.c	314 lines
llvm/include/llvm/IR/IntrinsicsPowerPC.td	11 lines
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp	12 lines
llvm/lib/Target/PowerPC/PPCInstrInfo.td	4 lines
llvm/lib/Target/PowerPC/PPCInstrVSX.td	2 lines
llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll	101 lines

Diff 360425

clang/include/clang/Basic/BuiltinsPPC.def

  BUILTIN(__builtin_ppc_iospace_sync, "v", "")
  BUILTIN(__builtin_ppc_dcbfl, "vvC*", "")
  BUILTIN(__builtin_ppc_dcbflp, "vvC*", "")
  BUILTIN(__builtin_ppc_dcbst, "vvC*", "")
  BUILTIN(__builtin_ppc_dcbt, "vv*", "")
  BUILTIN(__builtin_ppc_dcbtst, "vv*", "")
  BUILTIN(__builtin_ppc_dcbz, "vv*", "")
  BUILTIN(__builtin_ppc_icbt, "vv*", "")
+ BUILTIN(__builtin_ppc_fric, "dd", "")
+ BUILTIN(__builtin_ppc_frim, "dd", "")
+ BUILTIN(__builtin_ppc_frims, "ff", "")
+ BUILTIN(__builtin_ppc_frin, "dd", "")
+ BUILTIN(__builtin_ppc_frins, "ff", "")
+ BUILTIN(__builtin_ppc_frip, "dd", "")
+ BUILTIN(__builtin_ppc_frips, "ff", "")
+ BUILTIN(__builtin_ppc_friz, "dd", "")
+ BUILTIN(__builtin_ppc_frizs, "ff", "")
+ BUILTIN(__builtin_ppc_fsel, "dddd", "")
+ BUILTIN(__builtin_ppc_fsels, "ffff", "")
+ BUILTIN(__builtin_ppc_frsqrte, "dd", "")
+ BUILTIN(__builtin_ppc_frsqrtes, "ff", "")
+ BUILTIN(__builtin_ppc_fsqrt, "dd", "")
+ BUILTIN(__builtin_ppc_fsqrts, "ff", "")
  BUILTIN(__builtin_ppc_compare_and_swap, "iiDii", "")
  BUILTIN(__builtin_ppc_compare_and_swaplp, "iLiDLiLi", "")
  BUILTIN(__builtin_ppc_fetch_and_add, "UiUiD*Ui", "")
  BUILTIN(__builtin_ppc_fetch_and_addlp, "ULiULiD*ULi", "")
  BUILTIN(__builtin_ppc_fetch_and_and, "UiUiD*Ui", "")
  BUILTIN(__builtin_ppc_fetch_and_andlp, "ULiULiD*ULi", "")
  BUILTIN(__builtin_ppc_fetch_and_or, "UiUiD*Ui", "")
  BUILTIN(__builtin_ppc_fetch_and_orlp, "ULiULiD*ULi", "")
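The second argument of each `BUILTIN` entry encodes the prototype, with the return type first. As a reading aid, here is a minimal sketch of a decoder that handles only the two letters used by the entries added in this revision (`d` for `double`, `f` for `float`). `decodeBuiltinType` is a hypothetical helper written for this illustration, not a Clang API, and it deliberately ignores the modifiers (`C`, `*`, `D`, `U`, `L`) that appear in the neighbouring entries.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Clang): decodes a builtin type string made
// only of 'd' (double) and 'f' (float). The first letter is the return type;
// the remaining letters are the parameter types.
std::string decodeBuiltinType(const std::string &Encoded) {
  auto TypeName = [](char C) -> std::string {
    switch (C) {
    case 'd':
      return "double";
    case 'f':
      return "float";
    default:
      throw std::invalid_argument("letter not handled by this sketch");
    }
  };
  if (Encoded.empty())
    throw std::invalid_argument("empty type string");

  std::string Result = TypeName(Encoded[0]) + "(";
  for (std::size_t I = 1; I < Encoded.size(); ++I) {
    if (I > 1)
      Result += ", ";
    Result += TypeName(Encoded[I]);
  }
  return Result + ")";
}
```

Read this way, `"dddd"` for `__builtin_ppc_fsel` is a function taking three `double` values and returning `double`, and `"ff"` for `__builtin_ppc_frims` takes and returns a `float`.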

clang/lib/Basic/Targets/PPC.cpp

  static void defineXLCompatMacros(MacroBuilder &Builder) {
    [...]
    Builder.defineMacro("__setrnd", "__builtin_setrnd");
    Builder.defineMacro("__dcbtstt", "__builtin_ppc_dcbtstt");
    Builder.defineMacro("__dcbtt", "__builtin_ppc_dcbtt");
    Builder.defineMacro("__mftbu", "__builtin_ppc_mftbu");
    Builder.defineMacro("__mfmsr", "__builtin_ppc_mfmsr");
    Builder.defineMacro("__mtmsr", "__builtin_ppc_mtmsr");
    Builder.defineMacro("__mfspr", "__builtin_ppc_mfspr");
    Builder.defineMacro("__mtspr", "__builtin_ppc_mtspr");
+   Builder.defineMacro("__fric", "__builtin_ppc_fric");
+   Builder.defineMacro("__frim", "__builtin_ppc_frim");
+   Builder.defineMacro("__frims", "__builtin_ppc_frims");
+   Builder.defineMacro("__frin", "__builtin_ppc_frin");
+   Builder.defineMacro("__frins", "__builtin_ppc_frins");
+   Builder.defineMacro("__frip", "__builtin_ppc_frip");
+   Builder.defineMacro("__frips", "__builtin_ppc_frips");
+   Builder.defineMacro("__friz", "__builtin_ppc_friz");
+   Builder.defineMacro("__frizs", "__builtin_ppc_frizs");
+   Builder.defineMacro("__fsel", "__builtin_ppc_fsel");
+   Builder.defineMacro("__fsels", "__builtin_ppc_fsels");
+   Builder.defineMacro("__frsqrte", "__builtin_ppc_frsqrte");
+   Builder.defineMacro("__frsqrtes", "__builtin_ppc_frsqrtes");
+   Builder.defineMacro("__fsqrt", "__builtin_ppc_fsqrt");
+   Builder.defineMacro("__fsqrts", "__builtin_ppc_fsqrts");
  }

  /// PPCTargetInfo::getTargetDefines - Return a set of the PowerPC-specific
  /// #defines that are not tied to a specific subtarget.
  void PPCTargetInfo::getTargetDefines(const LangOptions &Opts,
                                       MacroBuilder &Builder) const {

    defineXLCompatMacros(Builder);

clang/lib/CodeGen/CGBuiltin.cpp

  #include "clang/Basic/BuiltinsPPC.def"
  [...]
    case PPC::BI__builtin_ppc_swdiv_nochk:
    case PPC::BI__builtin_ppc_swdivs_nochk: {
      FastMathFlags FMF = Builder.getFastMathFlags();
      Builder.getFastMathFlags().setFast();
      Value *FDiv = Builder.CreateFDiv(Ops[0], Ops[1], "swdiv_nochk");
      Builder.getFastMathFlags() &= (FMF);
      return FDiv;
    }
+   case PPC::BI__builtin_ppc_fric:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::rint,
+                            Intrinsic::experimental_constrained_rint))
+         .getScalarVal();
+   case PPC::BI__builtin_ppc_frim:
+   case PPC::BI__builtin_ppc_frims:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::floor,
+                            Intrinsic::experimental_constrained_floor))
+         .getScalarVal();
+   case PPC::BI__builtin_ppc_frin:
+   case PPC::BI__builtin_ppc_frins:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::round,
+                            Intrinsic::experimental_constrained_round))
+         .getScalarVal();
+   case PPC::BI__builtin_ppc_frip:
+   case PPC::BI__builtin_ppc_frips:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::ceil,
+                            Intrinsic::experimental_constrained_ceil))
+         .getScalarVal();
+   case PPC::BI__builtin_ppc_friz:
+   case PPC::BI__builtin_ppc_frizs:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::trunc,
+                            Intrinsic::experimental_constrained_trunc))
+         .getScalarVal();
+   case PPC::BI__builtin_ppc_fsqrt:
+   case PPC::BI__builtin_ppc_fsqrts:
+     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
+                            *this, E, Intrinsic::sqrt,
+                            Intrinsic::experimental_constrained_sqrt))
+         .getScalarVal();
    }
  }

  namespace {
  // If \p E is not null pointer, insert address space cast to match return
  // type of \p E if necessary.
  Value *EmitAMDGPUDispatchPtr(CodeGenFunction &CGF,
                               const CallExpr *E = nullptr) {
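The cases added to CGBuiltin.cpp lower the rounding and square-root builtins to generic LLVM intrinsics (`rint`, `floor`, `round`, `ceil`, `trunc`, `sqrt`) rather than to PowerPC-specific ones. As a rough, portable illustration of the resulting semantics, the sketch below pairs each builtin with the `<cmath>` function that corresponds to that intrinsic. The `emu_*` names are hypothetical stand-ins chosen for this example. The code does not call the real builtins, so it also compiles on non-PowerPC hosts, and it assumes the default (non-constrained) floating-point environment.

```cpp
#include <cmath>

// Hypothetical portable stand-ins; each mirrors the generic intrinsic that
// the corresponding builtin is lowered to in the CGBuiltin.cpp change.
inline double emu_fric(double X) { return std::rint(X); }   // rint: uses the current rounding mode
inline double emu_frim(double X) { return std::floor(X); }  // floor: toward -infinity
inline double emu_frin(double X) { return std::round(X); }  // round: nearest, ties away from zero
inline double emu_frip(double X) { return std::ceil(X); }   // ceil: toward +infinity
inline double emu_friz(double X) { return std::trunc(X); }  // trunc: toward zero
inline double emu_fsqrt(double X) { return std::sqrt(X); }  // sqrt

// The single-precision variants (frims, frins, frips, frizs, fsqrts) share
// the same case labels in the diff and differ only in operand type.
inline float emu_frims(float X) { return std::floor(X); }
inline float emu_fsqrts(float X) { return std::sqrt(X); }
```

Note the difference between `emu_fric` and `emu_frin` on a tie: under the default round-to-nearest-even mode `std::rint(2.5)` gives `2.0`, while `std::round(2.5)` gives `3.0`.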

clang/test/CodeGen/builtins-ppc-xlcompat-fp.c

This file was added.

// REQUIRES: powerpc-registered-target

// RUN: %clang_cc1 -triple powerpc64le-unknown-unknown \

// RUN: -emit-llvm %s -o - -target-cpu pwr8 | FileCheck %s

// RUN: %clang_cc1 -triple powerpc64-unknown-aix \

amyk (inline comment, marked done): nit: indentation on every second run line. We usually do two spaces more than the previous line. So I think it would look like:

// RUN: %clang_cc1 -triple powerpc64-unknown-unknown \
- // RUN: -emit-llvm %s -o - -target-cpu pwr7 | FileCheck %s
+ // RUN:   -emit-llvm %s -o - -target-cpu pwr7 | FileCheck %s
// RUN: %clang_cc1 -triple powerpc64le-unknown-unknown \

// RUN: -emit-llvm %s -o - -target-cpu pwr7 | FileCheck %s

// RUN: %clang_cc1 -triple powerpc-unknown-aix \

nemanjai (inline comment, marked done): AFAICT, nothing changes with Power8 so you might as well just have run lines for Power7 (AIX) and Power8 (LE).

// RUN: -emit-llvm %s -o - -target-cpu pwr7 | FileCheck %s

extern double a;

extern double b;

NeHuang (inline comment, marked done): You can define three extern variables for all the builtins.

extern double a;
extern float b;
extern float c;

You can auto-update the test case with utils/update_cc_test_checks.py

extern double c;

extern float d;

extern float e;

extern float f;

// CHECK-LABEL: @test_fric(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.rint.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

amyk (inline comment, marked done): Do the `entry` lines cause an issue when we don't have asserts?

double test_fric() {

return __fric(a);

}

// CHECK-LABEL: @test_frim(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.floor.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_frim() {

return __frim(a);

}

// CHECK-LABEL: @test_frims(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.floor.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_frims() {

return __frims(d);

}

// CHECK-LABEL: @test_frin(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.round.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_frin() {

return __frin(a);

}

// CHECK-LABEL: @test_frins(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.round.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_frins() {

return __frins(d);

}

// CHECK-LABEL: @test_frip(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.ceil.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_frip() {

return __frip(a);

}

// CHECK-LABEL: @test_frips(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.ceil.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_frips() {

return __frips(d);

}

// CHECK-LABEL: @test_friz(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.trunc.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_friz() {

return __friz(a);

}

// CHECK-LABEL: @test_frizs(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.trunc.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_frizs() {

return __frizs(d);

}

// CHECK-LABEL: @test_fsel(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @b, align 8

// CHECK-NEXT: [[TMP2:%.*]] = load double, double* @c, align 8

// CHECK-NEXT: [[TMP3:%.*]] = call double @llvm.ppc.fsel(double [[TMP0]], double [[TMP1]], double [[TMP2]])

// CHECK-NEXT: ret double [[TMP3]]

double test_fsel() {

return __fsel(a, b, c);

}

// CHECK-LABEL: @test_fsels(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @e, align 4

// CHECK-NEXT: [[TMP2:%.*]] = load float, float* @f, align 4

// CHECK-NEXT: [[TMP3:%.*]] = call float @llvm.ppc.fsels(float [[TMP0]], float [[TMP1]], float [[TMP2]])

// CHECK-NEXT: ret float [[TMP3]]

float test_fsels() {

return __fsels(d, e, f);

}

// CHECK-LABEL: @test_frsqrte(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = call double @llvm.ppc.frsqrte(double [[TMP0]])

// CHECK-NEXT: ret double [[TMP1]]

double test_frsqrte() {

return __frsqrte(a);

}

// CHECK-LABEL: @test_frsqrtes(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = call float @llvm.ppc.frsqrtes(float [[TMP0]])

// CHECK-NEXT: ret float [[TMP1]]

float test_frsqrtes() {

return __frsqrtes(d);

}

// CHECK-LABEL: @test_fsqrt(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.sqrt.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_fsqrt() {

return __fsqrt(a);

}

// CHECK-LABEL: @test_fsqrts(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.sqrt.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_fsqrts() {

return __fsqrts(d);

}

// CHECK-LABEL: @test_builtin_ppc_fric(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.rint.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_fric() {

return __builtin_ppc_fric(a);

}

// CHECK-LABEL: @test_builtin_ppc_frim(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.floor.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_frim() {

return __builtin_ppc_frim(a);

}

// CHECK-LABEL: @test_builtin_ppc_frims(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.floor.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_builtin_ppc_frims() {

return __builtin_ppc_frims(d);

}

// CHECK-LABEL: @test_builtin_ppc_frin(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.round.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_frin() {

return __builtin_ppc_frin(a);

}

// CHECK-LABEL: @test_builtin_ppc_frins(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.round.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_builtin_ppc_frins() {

return __builtin_ppc_frins(d);

}

// CHECK-LABEL: @test_builtin_ppc_frip(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.ceil.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_frip() {

return __builtin_ppc_frip(a);

}

// CHECK-LABEL: @test_builtin_ppc_frips(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.ceil.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_builtin_ppc_frips() {

return __builtin_ppc_frips(d);

}

// CHECK-LABEL: @test_builtin_ppc_friz(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.trunc.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_friz() {

return __builtin_ppc_friz(a);

}

// CHECK-LABEL: @test_builtin_ppc_frizs(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.trunc.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_builtin_ppc_frizs() {

return __builtin_ppc_frizs(d);

}

// CHECK-LABEL: @test_builtin_ppc_fsel(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @b, align 8

// CHECK-NEXT: [[TMP2:%.*]] = load double, double* @c, align 8

// CHECK-NEXT: [[TMP3:%.*]] = call double @llvm.ppc.fsel(double [[TMP0]], double [[TMP1]], double [[TMP2]])

// CHECK-NEXT: ret double [[TMP3]]

double test_builtin_ppc_fsel() {

return __builtin_ppc_fsel(a, b, c);

}

// CHECK-LABEL: @test_builtin_ppc_fsels(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @e, align 4

// CHECK-NEXT: [[TMP2:%.*]] = load float, float* @f, align 4

// CHECK-NEXT: [[TMP3:%.*]] = call float @llvm.ppc.fsels(float [[TMP0]], float [[TMP1]], float [[TMP2]])

// CHECK-NEXT: ret float [[TMP3]]

float test_builtin_ppc_fsels() {

return __builtin_ppc_fsels(d, e, f);

}

// CHECK-LABEL: @test_builtin_ppc_frsqrte(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = call double @llvm.ppc.frsqrte(double [[TMP0]])

// CHECK-NEXT: ret double [[TMP1]]

double test_builtin_ppc_frsqrte() {

return __builtin_ppc_frsqrte(a);

}

// CHECK-LABEL: @test_builtin_ppc_frsqrtes(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = call float @llvm.ppc.frsqrtes(float [[TMP0]])

// CHECK-NEXT: ret float [[TMP1]]

float test_builtin_ppc_frsqrtes() {

return __builtin_ppc_frsqrtes(d);

}

// CHECK-LABEL: @test_builtin_ppc_fsqrt(

// CHECK: [[TMP0:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP1:%.*]] = load double, double* @a, align 8

// CHECK-NEXT: [[TMP2:%.*]] = call double @llvm.sqrt.f64(double [[TMP1]])

// CHECK-NEXT: ret double [[TMP2]]

double test_builtin_ppc_fsqrt() {

return __builtin_ppc_fsqrt(a);

}

// CHECK-LABEL: @test_builtin_ppc_fsqrts(

// CHECK: [[TMP0:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP1:%.*]] = load float, float* @d, align 4

// CHECK-NEXT: [[TMP2:%.*]] = call float @llvm.sqrt.f32(float [[TMP1]])

// CHECK-NEXT: ret float [[TMP2]]

float test_builtin_ppc_fsqrts() {

return __builtin_ppc_fsqrts(d);

}

llvm/include/llvm/IR/IntrinsicsPowerPC.td

let TargetPrefix = "ppc" in {

// eieio instruction

def int_ppc_eieio : GCCBuiltin<"__builtin_ppc_eieio">,

Intrinsic<[],[],[]>;

def int_ppc_iospace_eieio : GCCBuiltin<"__builtin_ppc_iospace_eieio">,

Intrinsic<[],[],[]>;

def int_ppc_stdcx : GCCBuiltin<"__builtin_ppc_stdcx">,

Intrinsic<[llvm_i32_ty], [llvm_ptr_ty, llvm_i64_ty],

[IntrWriteMem]>;

def int_ppc_stwcx : GCCBuiltin<"__builtin_ppc_stwcx">,

amyk (inline comment, marked done): Minor indentation nit.

Intrinsic<[llvm_double_ty], [llvm_double_ty,
- llvm_double_ty, llvm_double_ty], [IntrNoMem]>;
+ llvm_double_ty, llvm_double_ty], [IntrNoMem]>;
def int_ppc_fsels : GCCBuiltin<"__builtin_ppc_fsels">,

nemanjai (inline comment, marked done): I don't think this indentation is desired. This makes it look like the first type on line 1529 belongs in the first list rather than the second list it belongs to. I suggest the following:

def int_ppc_fsel :
  GCCBuiltin<"__builtin_ppc_fsel">,
  Intrinsic<[llvm_double_ty], [llvm_double_ty,
                               llvm_double_ty, llvm_double_ty], [IntrNoMem]>;

Intrinsic<[llvm_i32_ty], [llvm_ptr_ty, llvm_i32_ty],

[IntrWriteMem]>;

def int_ppc_sthcx

amyk (inline comment, marked done): Minor indentation nit to make it aligned with above.

Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty,
- llvm_float_ty], [IntrNoMem]>;
+ llvm_float_ty], [IntrNoMem]>;
def int_ppc_frsqrte : GCCBuiltin<"__builtin_ppc_frsqrte">,

: Intrinsic<[llvm_i32_ty], [ llvm_ptr_ty, llvm_i32_ty ], [IntrWriteMem]>;

def int_ppc_dcbtstt : GCCBuiltin<"__builtin_ppc_dcbtstt">,

Intrinsic<[], [llvm_ptr_ty],

[IntrArgMemOnly, NoCapture<ArgIndex<0>>]>;

def int_ppc_dcbtt : GCCBuiltin<"__builtin_ppc_dcbtt">,

Intrinsic<[], [llvm_ptr_ty],

[IntrArgMemOnly, NoCapture<ArgIndex<0>>]>;

def int_ppc_mftbu : GCCBuiltin<"__builtin_ppc_mftbu">,

[122 lines not shown]

def int_ppc_fnmsubs

[llvm_float_ty, llvm_float_ty, llvm_float_ty],

[IntrNoMem]>;

def int_ppc_fre

: GCCBuiltin<"__builtin_ppc_fre">,

Intrinsic <[llvm_double_ty], [llvm_double_ty], [IntrNoMem]>;

def int_ppc_fres

: GCCBuiltin<"__builtin_ppc_fres">,

Intrinsic <[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;

def int_ppc_fsel : GCCBuiltin<"__builtin_ppc_fsel">,

Intrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty,

llvm_double_ty], [IntrNoMem]>;

def int_ppc_fsels : GCCBuiltin<"__builtin_ppc_fsels">,

Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty,

llvm_float_ty], [IntrNoMem]>;

def int_ppc_frsqrte : GCCBuiltin<"__builtin_ppc_frsqrte">,

Intrinsic<[llvm_double_ty], [llvm_double_ty], [IntrNoMem]>;

def int_ppc_frsqrtes : GCCBuiltin<"__builtin_ppc_frsqrtes">,

Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;

}

//===----------------------------------------------------------------------===//

// PowerPC Atomic Intrinsic Definitions.

let TargetPrefix = "ppc" in {

class AtomicRMW128Intrinsic

: Intrinsic<[llvm_i64_ty, llvm_i64_ty],

[llvm_ptr_ty, llvm_i64_ty, llvm_i64_ty],


llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 4,977 Lines • ▼ Show 20 Lines
void PPCDAGToDAGISel::Select(SDNode *N) {
case ISD::Constant:
if (N->getValueType(0) == MVT::i64) {
ReplaceNode(N, selectI64Imm(CurDAG, N));
return;
}
break;

case ISD::INTRINSIC_WO_CHAIN: {
+ // We emit the PPC::FSELS instruction here because of type conflicts with
NeHuang: You can delete the blank line, and it would be better to add comments for the operation below.
nemanjai: I agree that this warrants a comment. However, I don't think this is really what Victor had in mind. In general, if your comment boils down to "this is what we are doing" and simply describes the code, it is not very useful. What your comment should address is "why are we doing this here and this way". To me as a reader, it is not at all clear why we manually select this to `PPC::FSELS` here. All the other intrinsics are matched with patterns in the `.td` file, but this one is matched specifically here. Why?
nemanjai: s/ppc_fels intrinsic/PPC::FSELS instruction/, since we aren't emitting an intrinsic; we are consuming it and emitting the instruction (well, a `MachineSDNode`, but it represents an instruction).
+ // the comparison operand. The FSELS instruction is defined to use an 8-byte
+ // comparison like the FSELD version. The fsels intrinsic takes a 4-byte
NeHuang: Please use `Ops` as the variable name.
+ // value for the comparison. When selecting through a .td file, a type
+ // error is raised. Must check this first so we never break on the
+ // !Subtarget->isISA3_1() check.
+ if (N->getConstantOperandVal(0) == Intrinsic::ppc_fsels) {
+ SDValue Ops[] = {N->getOperand(1), N->getOperand(2), N->getOperand(3)};
+ CurDAG->SelectNodeTo(N, PPC::FSELS, MVT::f32, Ops);
+ return;
+ }

if (!Subtarget->isISA3_1())
break;
unsigned Opcode = 0;
switch (N->getConstantOperandVal(0)) {
default:
break;
case Intrinsic::ppc_altivec_vstribr_p:
Opcode = PPC::VSTRIBR_rec;
▲ Show 20 Lines • Show All 2,267 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrInfo.td

Show First 20 Lines • Show All 3,332 Lines • ▼ Show 20 Lines
defm FSUBS : AForm_2r<59, 20,
"fsubs", "$FRT, $FRA, $FRB", IIC_FPGeneral,
[(set f32:$FRT, (any_fsub f32:$FRA, f32:$FRB))]>;
}
}

let hasSideEffects = 0 in {
let PPC970_Unit = 1 in { // FXU Operations.
let isSelect = 1 in
def ISEL : AForm_4<31, 15,
qiucf: It seems we didn't exploit `XXSEL` in this case. But for sqrt/rsqrt, PPC has both VSX and non-VSX versions:
frsqrte - xsrsqrtedp
frsqrtes - xsrsqrtesp
fsqrt - xssqrtdp
fsqrts - xssqrtsp
So do we need to add `frsqrte` here and `xsrsqrtesp`/`xssqrtsp` in the VSX part?
(outs gprc:$rT), (ins gprc_nor0:$rA, gprc:$rB, crbitrc:$cond),
"isel $rT, $rA, $rB, $cond", IIC_IntISEL,
[]>;
}

let PPC970_Unit = 1 in { // FXU Operations.
// M-Form instructions. rotate and mask instructions.
//
▲ Show 20 Lines • Show All 1,054 Lines • ▼ Show 20 Lines
def ANDI_rec_1_GT_BIT8 : PPCCustomInserterPseudo<(outs crbitrc:$dst), (ins g8rc:$in),
"#ANDI_rec_1_GT_BIT8",
[(set i1:$dst, (trunc i64:$in))]>;

def : Pat<(i1 (not (trunc i32:$in))),
(ANDI_rec_1_EQ_BIT $in)>;
def : Pat<(i1 (not (trunc i64:$in))),
(ANDI_rec_1_EQ_BIT8 $in)>;

+ def : Pat<(int_ppc_fsel f8rc:$FRA, f8rc:$FRC, f8rc:$FRB), (FSELD $FRA, $FRC, $FRB)>;
+ def : Pat<(int_ppc_frsqrte f8rc:$frB), (FRSQRTE $frB)>;
+ def : Pat<(int_ppc_frsqrtes f4rc:$frB), (FRSQRTES $frB)>;

//===----------------------------------------------------------------------===//
// PowerPC Instructions used for assembler/disassembler only
//

// FIXME: For B=0 or B > 8, the registers following RT are used.
// WARNING: Do not add patterns for this instruction without fixing this.
def LSWI : XForm_base_r3xo_memOp<31, 597, (outs gprc:$RT),
(ins gprc:$A, u5imm:$B),
▲ Show 20 Lines • Show All 1,084 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrVSX.td

Show First 20 Lines • Show All 2,851 Lines • ▼ Show 20 Lines
def : Pat<(v2i64 (PPCvcmp_rec v2i64:$vA, v2i64:$vB, 199)),
(VCMPGTUB_rec DblwdCmp.MRGEQ, (v2i64 (XXLXORz)))>;
} // AddedComplexity = 0

// XL Compat builtins.
def : Pat<(int_ppc_fmsub f64:$A, f64:$B, f64:$C), (XSMSUBMDP $A, $B, $C)>;
def : Pat<(int_ppc_fnmsub f64:$A, f64:$B, f64:$C), (XSNMSUBMDP $A, $B, $C)>;
def : Pat<(int_ppc_fnmadd f64:$A, f64:$B, f64:$C), (XSNMADDMDP $A, $B, $C)>;
def : Pat<(int_ppc_fre f64:$A), (XSREDP $A)>;
+ def : Pat<(int_ppc_frsqrte vsfrc:$XB), (XSRSQRTEDP $XB)>;
} // HasVSX

// Any big endian VSX subtarget.
let Predicates = [HasVSX, IsBigEndian] in {
def : Pat<(v2f64 (scalar_to_vector f64:$A)),
(v2f64 (SUBREG_TO_REG (i64 1), $A, sub_64))>;

def : Pat<(f64 (extractelt v2f64:$S, 0)),
▲ Show 20 Lines • Show All 394 Lines • ▼ Show 20 Lines
def : Pat<(i32 (int_ppc_extract_exp f64:$A)),
(EXTRACT_SUBREG (XSXEXPDP (COPY_TO_REGCLASS $A, VSFRC)), sub_32)>;
def : Pat<(int_ppc_extract_sig f64:$A),
(XSXSIGDP (COPY_TO_REGCLASS $A, VSFRC))>;
def : Pat<(f64 (int_ppc_insert_exp f64:$A, i64:$B)),
(COPY_TO_REGCLASS (XSIEXPDP (COPY_TO_REGCLASS $A, G8RC), $B), F8RC)>;

def : Pat<(int_ppc_stfiw ForceXForm:$dst, f64:$XT),
(STXSIWX f64:$XT, ForceXForm:$dst)>;
+ def : Pat<(int_ppc_frsqrtes vssrc:$XB), (XSRSQRTESP $XB)>;
} // HasVSX, HasP8Vector

// Any big endian Power8 VSX subtarget.
let Predicates = [HasVSX, HasP8Vector, IsBigEndian] in {
def : Pat<DWToSPExtractConv.El0SS1,
(f32 (XSCVSXDSP (COPY_TO_REGCLASS $S1, VSFRC)))>;
def : Pat<DWToSPExtractConv.El1SS1,
(f32 (XSCVSXDSP (COPY_TO_REGCLASS (XXPERMDI $S1, $S1, 2), VSFRC)))>;
▲ Show 20 Lines • Show All 1,738 Lines • Show Last 20 Lines
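
The VSX and non-VSX patterns for the reciprocal square root estimate intrinsics can be summarized with a small model. This is a sketch, assuming the predicates shown in the two `.td` files (`HasVSX` for `XSRSQRTEDP`, `HasVSX` plus `HasP8Vector` for `XSRSQRTESP`) and assuming the VSX pattern is preferred whenever its predicates hold, which is what the `CHECK` lines in the `.ll` test suggest. `Features`, `Pwr7`, `Pwr8`, `Pwr8NoVSX`, and `rsqrtEstimateMnemonic` are hypothetical names and are not part of LLVM.

```cpp
#include <string>

// Hypothetical feature summary; the field names mirror the HasVSX and
// HasP8Vector predicates, but this struct is not an LLVM type.
struct Features {
  bool HasVSX;
  bool HasP8Vector;
};

// Configurations loosely corresponding to the RUN lines of the .ll test.
const Features Pwr7{true, false};        // -mcpu=pwr7: VSX without P8Vector
const Features Pwr8{true, true};         // -mcpu=pwr8
const Features Pwr8NoVSX{false, false};  // -mcpu=pwr8 -mattr=-vsx

// Returns the mnemonic the patterns are expected to select for
// llvm.ppc.frsqrte (IsSingle == false) or llvm.ppc.frsqrtes (IsSingle == true).
std::string rsqrtEstimateMnemonic(bool IsSingle, const Features &F) {
  if (!IsSingle)
    return F.HasVSX ? "xsrsqrtedp" : "frsqrte";
  return (F.HasVSX && F.HasP8Vector) ? "xsrsqrtesp" : "frsqrtes";
}
```

Under these assumptions, `rsqrtEstimateMnemonic(true, Pwr7)` yields `frsqrtes`, which agrees with the `CHECK-PWR7` line for `test_frsqrtes`.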

llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py

; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-unknown \

; RUN: -mcpu=pwr7 < %s | FileCheck %s --check-prefix=CHECK-PWR7

amyk: Minor indentation nit:

; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-unknown \

- ; RUN: -mcpu=pwr7 < %s | FileCheck %s --check-prefix=CHECK-PWR7

+ ; RUN: -mcpu=pwr7 < %s | FileCheck %s --check-prefix=CHECK-PWR7

; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-unknown \

nemanjai: Again, this is excessive. There should be run lines for Power8, Power8-32-bit, Power8-no-vsx, Power7. We don't really need a cross-product of triples and CPUs.

; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-unknown \

; RUN: -mcpu=pwr8 < %s | FileCheck %s --check-prefix=CHECK-PWR8

; RUN: llc -verify-machineinstrs -mtriple=powerpc-unknown-aix \

; RUN: -mcpu=pwr8 < %s | FileCheck %s --check-prefix=CHECK-PWR8

; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-unknown \

; RUN: -mattr=-vsx -mcpu=pwr8 < %s | FileCheck %s --check-prefix=CHECK-NOVSX

qiucf: You can use the `update_llc_test` script to automatically update this file. And add a non-VSX test?

define dso_local double @test_fsel(double %a, double %b, double %c) local_unnamed_addr {

; CHECK-PWR7-LABEL: test_fsel:

; CHECK-PWR7: # %bb.0: # %entry

; CHECK-PWR7-NEXT: fsel 1, 1, 2, 3

; CHECK-PWR7-NEXT: blr

;

; CHECK-PWR8-LABEL: test_fsel:

; CHECK-PWR8: # %bb.0: # %entry

; CHECK-PWR8-NEXT: fsel 1, 1, 2, 3

NeHuang: You can remove `#0`, `#1` and `#2`.

; CHECK-PWR8-NEXT: blr

;

; CHECK-NOVSX-LABEL: test_fsel:

; CHECK-NOVSX: # %bb.0: # %entry

; CHECK-NOVSX-NEXT: fsel 1, 1, 2, 3

; CHECK-NOVSX-NEXT: blr

entry:

%0 = tail call double @llvm.ppc.fsel(double %a, double %b, double %c)

ret double %0

}

declare double @llvm.ppc.fsel(double, double, double)

define dso_local float @test_fsels(float %a, float %b, float %c) local_unnamed_addr {

; CHECK-PWR7-LABEL: test_fsels:

; CHECK-PWR7: # %bb.0: # %entry

; CHECK-PWR7-NEXT: fsel 1, 1, 2, 3

; CHECK-PWR7-NEXT: blr

;

; CHECK-PWR8-LABEL: test_fsels:

; CHECK-PWR8: # %bb.0: # %entry

; CHECK-PWR8-NEXT: fsel 1, 1, 2, 3

; CHECK-PWR8-NEXT: blr

;

; CHECK-NOVSX-LABEL: test_fsels:

; CHECK-NOVSX: # %bb.0: # %entry

; CHECK-NOVSX-NEXT: fsel 1, 1, 2, 3

; CHECK-NOVSX-NEXT: blr

entry:

%0 = tail call float @llvm.ppc.fsels(float %a, float %b, float %c)

ret float %0

}

declare float @llvm.ppc.fsels(float, float, float)

define dso_local double @test_frsqrte(double %a) local_unnamed_addr {

; CHECK-PWR7-LABEL: test_frsqrte:

; CHECK-PWR7: # %bb.0: # %entry

; CHECK-PWR7-NEXT: xsrsqrtedp 1, 1

; CHECK-PWR7-NEXT: blr

;

; CHECK-PWR8-LABEL: test_frsqrte:

; CHECK-PWR8: # %bb.0: # %entry

; CHECK-PWR8-NEXT: xsrsqrtedp 1, 1

; CHECK-PWR8-NEXT: blr

;

; CHECK-NOVSX-LABEL: test_frsqrte:

; CHECK-NOVSX: # %bb.0: # %entry

; CHECK-NOVSX-NEXT: frsqrte 1, 1

; CHECK-NOVSX-NEXT: blr

entry:

%0 = tail call double @llvm.ppc.frsqrte(double %a)

ret double %0

}

declare double @llvm.ppc.frsqrte(double)

define dso_local float @test_frsqrtes(float %a) local_unnamed_addr {

; CHECK-PWR7-LABEL: test_frsqrtes:

; CHECK-PWR7: # %bb.0: # %entry

; CHECK-PWR7-NEXT: frsqrtes 1, 1

; CHECK-PWR7-NEXT: blr

;

; CHECK-PWR8-LABEL: test_frsqrtes:

; CHECK-PWR8: # %bb.0: # %entry

; CHECK-PWR8-NEXT: xsrsqrtesp 1, 1

; CHECK-PWR8-NEXT: blr

;

; CHECK-NOVSX-LABEL: test_frsqrtes:

; CHECK-NOVSX: # %bb.0: # %entry

; CHECK-NOVSX-NEXT: frsqrtes 1, 1

; CHECK-NOVSX-NEXT: blr

entry:

%0 = tail call float @llvm.ppc.frsqrtes(float %a)

ret float %0

}

declare float @llvm.ppc.frsqrtes(float)

Revision Contents

Diff 360425

clang/include/clang/Basic/BuiltinsPPC.def

clang/lib/Basic/Targets/PPC.cpp

clang/lib/CodeGen/CGBuiltin.cpp

clang/test/CodeGen/builtins-ppc-xlcompat-fp.c

llvm/include/llvm/IR/IntrinsicsPowerPC.td

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp

llvm/lib/Target/PowerPC/PPCInstrInfo.td

llvm/lib/Target/PowerPC/PPCInstrVSX.td

llvm/test/CodeGen/builtins-ppc-xlcompat-fp.ll
