This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn
Closed · Public

Authored by kzhuravl on Feb 16 2018, 2:46 PM.


Details

Reviewers

t-tye
b-sumner
scchan

Summary

Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.)
- This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601)
Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables
Add HasFastFMA, HasFastFMAF to GPUInfo tables
Add missing tests
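
For illustration, user code might consume the new macros roughly as follows. This is a minimal sketch and not taken from the patch: `mulAdd` is a hypothetical helper, and the only assumption is that the compiler predefines `FP_FAST_FMAF` on targets where single-precision FMA is fast.

```cpp
#include <cmath>

// Hypothetical helper (not part of the patch): pick the fused form only when
// the target advertises fast single-precision FMA via FP_FAST_FMAF.
inline float mulAdd(float A, float B, float C) {
#ifdef FP_FAST_FMAF
  return std::fma(A, B, C); // fused multiply-add, rounded once
#else
  return A * B + C;         // written as a separate multiply and add
#endif
}
```

With this patch, the first branch would be expected on targets such as gfx600 or gfx900, and the second on targets such as gfx601.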

Diff Detail

Event Timeline

kzhuravl created this revision. Feb 16 2018, 2:46 PM

Herald added subscribers: tpr, dstuttard, yaxunl and 2 others. Feb 16 2018, 2:46 PM

t-tye added a subscriber: b-sumner. Feb 16 2018, 2:53 PM

t-tye added inline comments.

lib/Basic/Targets/AMDGPU.cpp
357–360	Do all amdgcn targets have fast FMA? @b-sumner can you clarify?

t-tye added a reviewer: b-sumner. Feb 16 2018, 2:54 PM

b-sumner added inline comments. Feb 16 2018, 3:18 PM

lib/Basic/Targets/AMDGPU.cpp
357–360	No. All targets that support double precision should report FAST_FMA. Only targets with full rate v_fma_f32 should report FAST_FMAF
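
The rule in the comment above can be sketched as follows. This is only an illustration under stated assumptions: `TargetCaps` and `fastFMAMacros` are hypothetical names that do not appear in clang, and the two fields stand in for "supports double precision" and "has full rate v_fma_f32".

```cpp
#include <string>
#include <vector>

// Hypothetical capability record; the field names are illustrative only.
struct TargetCaps {
  bool HasFP64;        // target supports double precision
  bool FullRateFMAF32; // v_fma_f32 runs at full rate
};

// Returns the fast-FMA macros a target would report under the rule above.
inline std::vector<std::string> fastFMAMacros(const TargetCaps &Caps) {
  std::vector<std::string> Macros;
  if (Caps.HasFP64)
    Macros.push_back("FP_FAST_FMA");
  if (Caps.FullRateFMAF32)
    Macros.push_back("FP_FAST_FMAF");
  return Macros;
}
```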

t-tye added inline comments. Feb 16 2018, 3:24 PM

lib/Basic/Targets/AMDGPU.cpp
357–360	It is unfortunate that clang does not have access to the processor features defined in the td files which gives the settings for each target.

t-tye added a reviewer: scchan. Feb 16 2018, 3:27 PM

t-tye requested changes to this revision. Feb 16 2018, 4:31 PM

t-tye added inline comments.

lib/Basic/Targets/AMDGPU.cpp
357–360	Now that the compiler knows the target it seems the clang options that specify fast_fma et al should be removed and the runtimes changed to not set them. The implementation of when fast fma[f] is present should match the amdgcn td files which have all gfx9 and some pre-gfx9 targets supporting fast fmaf (the ones that have full rate double precision).
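
One way to picture a per-target lookup is sketched below. It is a simplified, hypothetical stand-in for the real tables: `GPUEntry`, `SampleEntries`, and `hasFastFMAF` are made-up names, only a handful of processors are listed, and the fast fmaf values are copied from the AMDGCN table in Diff 135997 of this revision.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical, trimmed-down table row; only a few names are listed.
struct GPUEntry {
  const char *Name;
  const char *CanonicalName;
  bool HasFastFMAF;
};

// Values copied from the AMDGCN table in this revision's diff.
static const GPUEntry SampleEntries[] = {
    {"gfx600", "gfx600", true},  {"tahiti", "gfx600", true},
    {"gfx601", "gfx601", false}, {"hawaii", "gfx701", true},
    {"gfx803", "gfx803", false}, {"gfx900", "gfx900", true},
};

// Returns true only for known names whose row marks fast fmaf.
inline bool hasFastFMAF(const std::string &Name) {
  const GPUEntry *Result =
      std::find_if(std::begin(SampleEntries), std::end(SampleEntries),
                   [&Name](const GPUEntry &GPU) { return Name == GPU.Name; });
  return Result != std::end(SampleEntries) && Result->HasFastFMAF;
}
```

Unknown names fall through to `false` here; the real code returns an `InvalidGPU` entry instead.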

This revision now requires changes to proceed. Feb 16 2018, 4:31 PM

Address review feedback

t-tye requested changes to this revision. Feb 23 2018, 3:03 PM

t-tye added inline comments.

lib/Basic/Targets/AMDGPU.cpp
159	What does this mean? Has it now been addressed by this patch?
275–276	To be consistent, should this be: if (isAMDGCN(getTriple()))? Similar comment elsewhere.
282	This was incorrect in the old code. Only full rate FP64 gcn targets have fast FMAF.
lib/Basic/Targets/AMDGPU.h
88–91	Would it be better to position these at the beginning/end of the respective enumerations so it is more obvious that they must be updated when adding a new target?
99–103	Suggest reordering into logical groups of FP32 and FP64: bool HasFMAF; bool HasFastFMAF; bool HasLDEXPF; bool HasFP64; bool HasFastFMA;
109	Suggest adding a comment here which is a header for the columns to make it easier to check if the settings are right. For example: // Name Canonical Kind HasFMAF HasFP64 HasLDEXPF HasFastFMA HasFastFMAF
140	Same comment as above. Also, I think the fast_fma and fast_fmaf columns are reversed. All gcn has fast_fma, it is fast_fmaf that varies.
151	gfx702 should have true for fast fmaf. Does the TD file need correcting too?
332	I found the original operand order easier to read:-)
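
The FIRST/LAST sentinels discussed in the 88–91 comment can be sketched in isolation as follows. This is a shortened, hypothetical version of the enumeration (most processors are omitted), and `isR600Kind` / `isAMDGCNKind` are illustrative helpers that do not exist in the patch.

```cpp
#include <cstdint>

// Shortened, hypothetical copy of the enumeration; most kinds are omitted.
enum GPUKind : uint32_t {
  GK_NONE = 0,

  // R600-based processors.
  GK_R600,
  GK_CAYMAN,
  GK_TURKS,
  GK_R600_FIRST = GK_R600,
  GK_R600_LAST = GK_TURKS,

  // AMDGCN-based processors.
  GK_GFX600,
  GK_GFX601,
  GK_GFX900,
  GK_GFX902,
  GK_AMDGCN_FIRST = GK_GFX600,
  GK_AMDGCN_LAST = GK_GFX902,
};

// Range checks stay correct as long as the sentinels are updated whenever a
// new first or last processor is added to a group.
constexpr bool isR600Kind(GPUKind K) {
  return K >= GK_R600_FIRST && K <= GK_R600_LAST;
}
constexpr bool isAMDGCNKind(GPUKind K) {
  return K >= GK_AMDGCN_FIRST && K <= GK_AMDGCN_LAST;
}
```

Because a sentinel is an alias for an existing enumerator, the enumerator that follows it continues counting from the aliased value, so `GK_GFX600` still comes directly after `GK_TURKS`.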

This revision now requires changes to proceed. Feb 23 2018, 3:03 PM

b-sumner added inline comments. Feb 23 2018, 3:10 PM

lib/Basic/Targets/AMDGPU.cpp
349–353	I'm not sure why this is here. No languages we support have this AFAIK. We should probably add a comment that this is deprecated and remove it in a year or so.

kzhuravl added inline comments. Feb 23 2018, 3:39 PM

lib/Basic/Targets/AMDGPU.cpp
159	This class has a member called GPU. We are using the CPU that is passed as an argument. "Has it now been addressed by this patch?" No.
lib/Basic/Targets/AMDGPU.h
140	fast_fmaf is the last column. I think those are in the correct order.
151	Not according to the TD files in our BE: https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/AMDGPU.td#L545
332	I was trying to match the style above. But I can change it back.

t-tye added inline comments. Feb 23 2018, 4:58 PM

lib/Basic/Targets/AMDGPU.h
140	Agreed.
151	I believe the TD file is incorrect. gfx701 and 702 should both have fast fmaf.
332	Leave it in your new order which keeps it consistent as you point out. (For me they are all backwards:-) )

nhaehnle removed a subscriber: nhaehnle. Feb 25 2018, 7:36 AM

Address review feedback.

kzhuravl mentioned this in D43790: AMDGPU: Add fast fmaf feature to gfx702. Feb 26 2018, 3:11 PM

b-sumner added inline comments. Feb 26 2018, 3:57 PM

lib/Basic/Targets/AMDGPU.h
103	I guess the HasFastFMA is for simplicity? It is a synonym for HasFP64.
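
The "synonym" observation can be checked mechanically. The sketch below is an assumption-laden illustration: it uses a hypothetical five-row excerpt (`SampleGPUs`) with values copied from the tables in Diff 135997, and `fastFMAMatchesFP64` is a made-up helper, not something the patch adds.

```cpp
#include <cstddef>

// Same field order as the GPUInfo struct in the diff, minus the canonical
// name and kind, which are not needed for this check.
struct GPUInfo {
  const char *Name;
  bool HasFMAF;
  bool HasFastFMAF;
  bool HasLDEXPF;
  bool HasFP64;
  bool HasFastFMA;
};

// Hypothetical excerpt; values copied from the tables in this revision.
static const GPUInfo SampleGPUs[] = {
    // Name      FMAF   FastFMAF LDEXPF FP64   FastFMA
    {"r600",    false, false,   false, false, false},
    {"cypress", true,  false,   false, false, false},
    {"gfx600",  true,  true,    true,  true,  true},
    {"gfx601",  true,  false,   true,  true,  true},
    {"gfx900",  true,  true,    true,  true,  true},
};

// True if HasFastFMA equals HasFP64 in every listed row.
inline bool fastFMAMatchesFP64() {
  for (std::size_t I = 0; I != sizeof(SampleGPUs) / sizeof(SampleGPUs[0]); ++I)
    if (SampleGPUs[I].HasFastFMA != SampleGPUs[I].HasFP64)
      return false;
  return true;
}
```

Keeping a separate column costs one extra boolean per row but lets the two properties diverge later without changing the struct.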

LGTM except for comment on deprecated macros.

lib/Basic/Targets/AMDGPU.cpp
346–347	@b-sumner which ones were you recommending deprecating? I would think FP64 and FMAF would be ones that need to be kept? I thought it was HAS_FAST_FMA that is not specified by OpenCL.

This revision is now accepted and ready to land. Feb 26 2018, 9:58 PM

b-sumner added inline comments. Feb 27 2018, 5:41 AM

lib/Basic/Targets/AMDGPU.cpp
346–347	Sorry, I was referring to all of the __HAS_* macros.

rL326254

Revision Contents

Path	Size
lib/Basic/Targets/AMDGPU.h	218 lines
lib/Basic/Targets/AMDGPU.cpp	124 lines
test/Driver/amdgpu-macros.cl	222 lines
test/Misc/target-invalid-cpu-note.c	2 lines

Diff 135997

lib/Basic/Targets/AMDGPU.h

(41 lines not shown)	AddrSpace(bool IsGenericZero_ = false) {
Global = 1;		Global = 1;
Local = 3;		Local = 3;
Constant = 2;		Constant = 2;
Private = 0;		Private = 0;
}		}
}		}
};		};

/// \brief The GPU profiles supported by the AMDGPU target.		/// \brief GPU kinds supported by the AMDGPU target.
enum GPUKind {		enum GPUKind : uint32_t {
GK_NONE,		// Not specified processor.
		GK_NONE = 0,

		// R600-based processors.
GK_R600,		GK_R600,
GK_R600_DOUBLE_OPS,		GK_R630,
GK_R700,		GK_RS880,
GK_R700_DOUBLE_OPS,		GK_RV670,
GK_EVERGREEN,		GK_RV710,
GK_EVERGREEN_DOUBLE_OPS,		GK_RV730,
GK_NORTHERN_ISLANDS,		GK_RV770,
		GK_CEDAR,
		GK_CYPRESS,
		GK_JUNIPER,
		GK_REDWOOD,
		GK_SUMO,
		GK_BARTS,
		GK_CAICOS,
GK_CAYMAN,		GK_CAYMAN,
GK_GFX6,		GK_TURKS,
GK_GFX7,
GK_GFX8,		GK_R600_FIRST = GK_R600,
GK_GFX9		GK_R600_LAST = GK_TURKS,

		// AMDGCN-based processors.
		GK_GFX600,
		GK_GFX601,
		GK_GFX700,
		GK_GFX701,
		GK_GFX702,
		GK_GFX703,
		GK_GFX704,
		GK_GFX801,
		GK_GFX802,
		GK_GFX803,
		GK_GFX810,
		GK_GFX900,
		GK_GFX902,

		GK_AMDGCN_FIRST = GK_GFX600,
		GK_AMDGCN_LAST = GK_GFX902,
};		};

struct GPUInfo {		struct GPUInfo {
llvm::StringLiteral Name;		llvm::StringLiteral Name;
llvm::StringLiteral CanonicalName;		llvm::StringLiteral CanonicalName;
AMDGPUTargetInfo::GPUKind Kind;		AMDGPUTargetInfo::GPUKind Kind;
		bool HasFMAF;
		bool HasFastFMAF;
		bool HasLDEXPF;
		bool HasFP64;
		bool HasFastFMA;
};		};

GPUInfo GPU;		static constexpr GPUInfo InvalidGPU =
		{{""}, {""}, GK_NONE, false, false, false, false, false};
static constexpr GPUInfo InvalidGPU = {{""}, {""}, GK_NONE};		static constexpr GPUInfo R600GPUs[26] = {
static constexpr GPUInfo R600Names[26] = {		// Name Canonical Kind Has Has Has Has Has
{{"r600"}, {"r600"}, GK_R600},		// Name FMAF Fast LDEXPF FP64 Fast
{{"rv630"}, {"r600"}, GK_R600},		// FMAF FMA
{{"rv635"}, {"r600"}, GK_R600},		{{"r600"}, {"r600"}, GK_R600, false, false, false, false, false},
{{"r630"}, {"r630"}, GK_R600},		{{"rv630"}, {"r600"}, GK_R600, false, false, false, false, false},
{{"rs780"}, {"rs880"}, GK_R600},		{{"rv635"}, {"r600"}, GK_R600, false, false, false, false, false},
{{"rs880"}, {"rs880"}, GK_R600},		{{"r630"}, {"r630"}, GK_R630, false, false, false, false, false},
{{"rv610"}, {"rs880"}, GK_R600},		{{"rs780"}, {"rs880"}, GK_RS880, false, false, false, false, false},
{{"rv620"}, {"rs880"}, GK_R600},		{{"rs880"}, {"rs880"}, GK_RS880, false, false, false, false, false},
{{"rv670"}, {"rv670"}, GK_R600_DOUBLE_OPS},		{{"rv610"}, {"rs880"}, GK_RS880, false, false, false, false, false},
{{"rv710"}, {"rv710"}, GK_R700},		{{"rv620"}, {"rs880"}, GK_RS880, false, false, false, false, false},
{{"rv730"}, {"rv730"}, GK_R700},		{{"rv670"}, {"rv670"}, GK_RV670, false, false, false, false, false},
{{"rv740"}, {"rv770"}, GK_R700_DOUBLE_OPS},		{{"rv710"}, {"rv710"}, GK_RV710, false, false, false, false, false},
{{"rv770"}, {"rv770"}, GK_R700_DOUBLE_OPS},		{{"rv730"}, {"rv730"}, GK_RV730, false, false, false, false, false},
{{"cedar"}, {"cedar"}, GK_EVERGREEN},		{{"rv740"}, {"rv770"}, GK_RV770, false, false, false, false, false},
{{"palm"}, {"cedar"}, GK_EVERGREEN},		{{"rv770"}, {"rv770"}, GK_RV770, false, false, false, false, false},
{{"cypress"}, {"cypress"}, GK_EVERGREEN_DOUBLE_OPS},		{{"cedar"}, {"cedar"}, GK_CEDAR, false, false, false, false, false},
{{"hemlock"}, {"cypress"}, GK_EVERGREEN_DOUBLE_OPS},		{{"palm"}, {"cedar"}, GK_CEDAR, false, false, false, false, false},
{{"juniper"}, {"juniper"}, GK_EVERGREEN},		{{"cypress"}, {"cypress"}, GK_CYPRESS, true, false, false, false, false},
{{"redwood"}, {"redwood"}, GK_EVERGREEN},		{{"hemlock"}, {"cypress"}, GK_CYPRESS, true, false, false, false, false},
{{"sumo"}, {"sumo"}, GK_EVERGREEN},		{{"juniper"}, {"juniper"}, GK_JUNIPER, false, false, false, false, false},
{{"sumo2"}, {"sumo"}, GK_EVERGREEN},		{{"redwood"}, {"redwood"}, GK_REDWOOD, false, false, false, false, false},
{{"barts"}, {"barts"}, GK_NORTHERN_ISLANDS},		{{"sumo"}, {"sumo"}, GK_SUMO, false, false, false, false, false},
{{"caicos"}, {"caicos"}, GK_NORTHERN_ISLANDS},		{{"sumo2"}, {"sumo"}, GK_SUMO, false, false, false, false, false},
{{"turks"}, {"turks"}, GK_NORTHERN_ISLANDS},		{{"barts"}, {"barts"}, GK_BARTS, false, false, false, false, false},
{{"aruba"}, {"cayman"}, GK_CAYMAN},		{{"caicos"}, {"caicos"}, GK_BARTS, false, false, false, false, false},
{{"cayman"}, {"cayman"}, GK_CAYMAN},		{{"aruba"}, {"cayman"}, GK_CAYMAN, true, false, false, false, false},
		{{"cayman"}, {"cayman"}, GK_CAYMAN, true, false, false, false, false},
		{{"turks"}, {"turks"}, GK_TURKS, false, false, false, false, false},
};		};
static constexpr GPUInfo AMDGCNNames[30] = {		static constexpr GPUInfo AMDGCNGPUs[30] = {
{{"gfx600"}, {"gfx600"}, GK_GFX6},		// Name Canonical Kind Has Has Has Has Has
{{"tahiti"}, {"gfx600"}, GK_GFX6},		// Name FMAF Fast LDEXPF FP64 Fast
{{"gfx601"}, {"gfx601"}, GK_GFX6},		// FMAF FMA
{{"hainan"}, {"gfx601"}, GK_GFX6},		{{"gfx600"}, {"gfx600"}, GK_GFX600, true, true, true, true, true},
{{"oland"}, {"gfx601"}, GK_GFX6},		{{"tahiti"}, {"gfx600"}, GK_GFX600, true, true, true, true, true},
{{"pitcairn"}, {"gfx601"}, GK_GFX6},		{{"gfx601"}, {"gfx601"}, GK_GFX601, true, false, true, true, true},
{{"verde"}, {"gfx601"}, GK_GFX6},		{{"hainan"}, {"gfx601"}, GK_GFX601, true, false, true, true, true},
{{"gfx700"}, {"gfx700"}, GK_GFX7},		{{"oland"}, {"gfx601"}, GK_GFX601, true, false, true, true, true},
{{"kaveri"}, {"gfx700"}, GK_GFX7},		{{"pitcairn"}, {"gfx601"}, GK_GFX601, true, false, true, true, true},
{{"gfx701"}, {"gfx701"}, GK_GFX7},		{{"verde"}, {"gfx601"}, GK_GFX601, true, false, true, true, true},
{{"hawaii"}, {"gfx701"}, GK_GFX7},		{{"gfx700"}, {"gfx700"}, GK_GFX700, true, false, true, true, true},
{{"gfx702"}, {"gfx702"}, GK_GFX7},		{{"kaveri"}, {"gfx700"}, GK_GFX700, true, false, true, true, true},
{{"gfx703"}, {"gfx703"}, GK_GFX7},		{{"gfx701"}, {"gfx701"}, GK_GFX701, true, true, true, true, true},
{{"kabini"}, {"gfx703"}, GK_GFX7},		{{"hawaii"}, {"gfx701"}, GK_GFX701, true, true, true, true, true},
{{"mullins"}, {"gfx703"}, GK_GFX7},		{{"gfx702"}, {"gfx702"}, GK_GFX702, true, true, true, true, true},
{{"gfx704"}, {"gfx704"}, GK_GFX7},		{{"gfx703"}, {"gfx703"}, GK_GFX703, true, false, true, true, true},
{{"bonaire"}, {"gfx704"}, GK_GFX7},		{{"kabini"}, {"gfx703"}, GK_GFX703, true, false, true, true, true},
{{"gfx801"}, {"gfx801"}, GK_GFX8},		{{"mullins"}, {"gfx703"}, GK_GFX703, true, false, true, true, true},
{{"carrizo"}, {"gfx801"}, GK_GFX8},		{{"gfx704"}, {"gfx704"}, GK_GFX704, true, false, true, true, true},
{{"gfx802"}, {"gfx802"}, GK_GFX8},		{{"bonaire"}, {"gfx704"}, GK_GFX704, true, false, true, true, true},
{{"iceland"}, {"gfx802"}, GK_GFX8},		{{"gfx801"}, {"gfx801"}, GK_GFX801, true, true, true, true, true},
{{"tonga"}, {"gfx802"}, GK_GFX8},		{{"carrizo"}, {"gfx801"}, GK_GFX801, true, true, true, true, true},
{{"gfx803"}, {"gfx803"}, GK_GFX8},		{{"gfx802"}, {"gfx802"}, GK_GFX802, true, false, true, true, true},
{{"fiji"}, {"gfx803"}, GK_GFX8},		{{"iceland"}, {"gfx802"}, GK_GFX802, true, false, true, true, true},
{{"polaris10"}, {"gfx803"}, GK_GFX8},		{{"tonga"}, {"gfx802"}, GK_GFX802, true, false, true, true, true},
{{"polaris11"}, {"gfx803"}, GK_GFX8},		{{"gfx803"}, {"gfx803"}, GK_GFX803, true, false, true, true, true},
{{"gfx810"}, {"gfx810"}, GK_GFX8},		{{"fiji"}, {"gfx803"}, GK_GFX803, true, false, true, true, true},
{{"stoney"}, {"gfx810"}, GK_GFX8},		{{"polaris10"}, {"gfx803"}, GK_GFX803, true, false, true, true, true},
{{"gfx900"}, {"gfx900"}, GK_GFX9},		{{"polaris11"}, {"gfx803"}, GK_GFX803, true, false, true, true, true},
{{"gfx902"}, {"gfx902"}, GK_GFX9},		{{"gfx810"}, {"gfx810"}, GK_GFX810, true, false, true, true, true},
		{{"stoney"}, {"gfx810"}, GK_GFX810, true, false, true, true, true},
		{{"gfx900"}, {"gfx900"}, GK_GFX900, true, true, true, true, true},
		{{"gfx902"}, {"gfx902"}, GK_GFX900, true, true, true, true, true},
};		};

bool hasFP64 : 1;		static GPUInfo parseR600Name(StringRef Name);
bool hasFMAF : 1;
bool hasLDEXPF : 1;
const AddrSpace AS;

static bool hasFullSpeedFMAF32(StringRef GPUName) {		static GPUInfo parseAMDGCNName(StringRef Name);
return parseAMDGCNName(GPUName).Kind >= GK_GFX9;
}		GPUInfo parseGPUName(StringRef Name) const;

		const AddrSpace AS;
		GPUInfo GPU;

static bool isAMDGCN(const llvm::Triple &TT) {		static bool isAMDGCN(const llvm::Triple &TT) {
return TT.getArch() == llvm::Triple::amdgcn;		return TT.getArch() == llvm::Triple::amdgcn;
}		}

static bool isGenericZero(const llvm::Triple &TT) { return true; }		static bool isGenericZero(const llvm::Triple &TT) { return true; }

public:		public:
AMDGPUTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts);		AMDGPUTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts);

void setAddressSpaceMap(bool DefaultIsPrivate);		void setAddressSpaceMap(bool DefaultIsPrivate);

void adjust(LangOptions &Opts) override;		void adjust(LangOptions &Opts) override;

uint64_t getPointerWidthV(unsigned AddrSpace) const override {		uint64_t getPointerWidthV(unsigned AddrSpace) const override {
if (GPU.Kind <= GK_CAYMAN)		if (GPU.Kind <= GK_R600_LAST)
return 32;		return 32;
		if (AddrSpace == AS.Private || AddrSpace == AS.Local)
if (AddrSpace == AS.Private || AddrSpace == AS.Local) {
return 32;		return 32;
}
return 64;		return 64;
}		}

uint64_t getPointerAlignV(unsigned AddrSpace) const override {		uint64_t getPointerAlignV(unsigned AddrSpace) const override {
return getPointerWidthV(AddrSpace);		return getPointerWidthV(AddrSpace);
}		}

uint64_t getMaxPointerWidth() const override {		uint64_t getMaxPointerWidth() const override {
(99 lines not shown)	public:

void getTargetDefines(const LangOptions &Opts,		void getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const override;		MacroBuilder &Builder) const override;

BuiltinVaListKind getBuiltinVaListKind() const override {		BuiltinVaListKind getBuiltinVaListKind() const override {
return TargetInfo::CharPtrBuiltinVaList;		return TargetInfo::CharPtrBuiltinVaList;
}		}

static GPUInfo parseR600Name(StringRef Name);

static GPUInfo parseAMDGCNName(StringRef Name);

bool isValidCPUName(StringRef Name) const override {		bool isValidCPUName(StringRef Name) const override {
if (getTriple().getArch() == llvm::Triple::amdgcn)		if (getTriple().getArch() == llvm::Triple::amdgcn)
return GK_NONE != parseAMDGCNName(Name).Kind;		return GK_NONE != parseAMDGCNName(Name).Kind;
else		else
return GK_NONE != parseR600Name(Name).Kind;		return GK_NONE != parseR600Name(Name).Kind;
}		}

void fillValidCPUList(SmallVectorImpl<StringRef> &Values) const override;		void fillValidCPUList(SmallVectorImpl<StringRef> &Values) const override;

bool setCPU(const std::string &Name) override {		bool setCPU(const std::string &Name) override {
if (getTriple().getArch() == llvm::Triple::amdgcn)		if (getTriple().getArch() == llvm::Triple::amdgcn)
GPU = parseAMDGCNName(Name);		GPU = parseAMDGCNName(Name);
else		else
GPU = parseR600Name(Name);		GPU = parseR600Name(Name);

return GPU.Kind != GK_NONE;		return GK_NONE != GPU.Kind;
}		}

void setSupportedOpenCLOpts() override {		void setSupportedOpenCLOpts() override {
auto &Opts = getSupportedOpenCLOpts();		auto &Opts = getSupportedOpenCLOpts();
Opts.support("cl_clang_storage_class_specifiers");		Opts.support("cl_clang_storage_class_specifiers");
Opts.support("cl_khr_icd");		Opts.support("cl_khr_icd");

if (hasFP64)		if (GPU.HasFP64)
Opts.support("cl_khr_fp64");		Opts.support("cl_khr_fp64");
if (GPU.Kind >= GK_EVERGREEN) {		if (GPU.Kind >= GK_CEDAR) {
Opts.support("cl_khr_byte_addressable_store");		Opts.support("cl_khr_byte_addressable_store");
Opts.support("cl_khr_global_int32_base_atomics");		Opts.support("cl_khr_global_int32_base_atomics");
Opts.support("cl_khr_global_int32_extended_atomics");		Opts.support("cl_khr_global_int32_extended_atomics");
Opts.support("cl_khr_local_int32_base_atomics");		Opts.support("cl_khr_local_int32_base_atomics");
Opts.support("cl_khr_local_int32_extended_atomics");		Opts.support("cl_khr_local_int32_extended_atomics");
}		}
if (GPU.Kind >= GK_GFX6) {		if (GPU.Kind >= GK_AMDGCN_FIRST) {
Opts.support("cl_khr_fp16");		Opts.support("cl_khr_fp16");
Opts.support("cl_khr_int64_base_atomics");		Opts.support("cl_khr_int64_base_atomics");
Opts.support("cl_khr_int64_extended_atomics");		Opts.support("cl_khr_int64_extended_atomics");
Opts.support("cl_khr_mipmap_image");		Opts.support("cl_khr_mipmap_image");
Opts.support("cl_khr_subgroups");		Opts.support("cl_khr_subgroups");
Opts.support("cl_khr_3d_image_writes");		Opts.support("cl_khr_3d_image_writes");
Opts.support("cl_amd_media_ops");		Opts.support("cl_amd_media_ops");
Opts.support("cl_amd_media_ops2");		Opts.support("cl_amd_media_ops2");
(66 lines not shown)

lib/Basic/Targets/AMDGPU.cpp

(150 lines not shown)
ArrayRef<const char *> AMDGPUTargetInfo::getGCCRegNames() const {		ArrayRef<const char *> AMDGPUTargetInfo::getGCCRegNames() const {
return llvm::makeArrayRef(GCCRegNames);		return llvm::makeArrayRef(GCCRegNames);
}		}

bool AMDGPUTargetInfo::initFeatureMap(		bool AMDGPUTargetInfo::initFeatureMap(
llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags, StringRef CPU,		llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags, StringRef CPU,
const std::vector<std::string> &FeatureVec) const {		const std::vector<std::string> &FeatureVec) const {

// XXX - What does the member GPU mean if device name string passed here?		// XXX - What does the member GPU mean if device name string passed here?
if (getTriple().getArch() == llvm::Triple::amdgcn) {		if (isAMDGCN(getTriple())) {
if (CPU.empty())		if (CPU.empty())
CPU = "tahiti";		CPU = "gfx600";

switch (parseAMDGCNName(CPU).Kind) {		switch (parseAMDGCNName(CPU).Kind) {
case GK_GFX6:		case GK_GFX902:
case GK_GFX7:		case GK_GFX900:
break;

case GK_GFX9:
Features["gfx9-insts"] = true;		Features["gfx9-insts"] = true;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case GK_GFX8:		case GK_GFX810:
Features["s-memrealtime"] = true;		case GK_GFX803:
		case GK_GFX802:
		case GK_GFX801:
Features["16-bit-insts"] = true;		Features["16-bit-insts"] = true;
Features["dpp"] = true;		Features["dpp"] = true;
		Features["s-memrealtime"] = true;
		break;
		case GK_GFX704:
		case GK_GFX703:
		case GK_GFX702:
		case GK_GFX701:
		case GK_GFX700:
		case GK_GFX601:
		case GK_GFX600:
break;		break;

case GK_NONE:		case GK_NONE:
return false;		return false;
default:		default:
llvm_unreachable("unhandled subtarget");		llvm_unreachable("Unhandled GPU!");
}		}
} else {		} else {
if (CPU.empty())		if (CPU.empty())
CPU = "r600";		CPU = "r600";

switch (parseR600Name(CPU).Kind) {		switch (parseR600Name(CPU).Kind) {
case GK_R600:
case GK_R700:
case GK_EVERGREEN:
case GK_NORTHERN_ISLANDS:
break;
case GK_R600_DOUBLE_OPS:
case GK_R700_DOUBLE_OPS:
case GK_EVERGREEN_DOUBLE_OPS:
case GK_CAYMAN:		case GK_CAYMAN:
		case GK_CYPRESS:
		case GK_RV770:
		case GK_RV670:
// TODO: Add fp64 when implemented.		// TODO: Add fp64 when implemented.
break;		break;
case GK_NONE:		case GK_TURKS:
return false;		case GK_CAICOS:
		case GK_BARTS:
		case GK_SUMO:
		case GK_REDWOOD:
		case GK_JUNIPER:
		case GK_CEDAR:
		case GK_RV730:
		case GK_RV710:
		case GK_RS880:
		case GK_R630:
		case GK_R600:
		break;
default:		default:
llvm_unreachable("unhandled subtarget");		llvm_unreachable("Unhandled GPU!");
}		}
}		}

return TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec);		return TargetInfo::initFeatureMap(Features, Diags, CPU, FeatureVec);
}		}

void AMDGPUTargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts,		void AMDGPUTargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts,
TargetOptions &TargetOpts) const {		TargetOptions &TargetOpts) const {
bool hasFP32Denormals = false;		bool hasFP32Denormals = false;
bool hasFP64Denormals = false;		bool hasFP64Denormals = false;
		GPUInfo CGOptsGPU = parseGPUName(TargetOpts.CPU);
for (auto &I : TargetOpts.FeaturesAsWritten) {		for (auto &I : TargetOpts.FeaturesAsWritten) {
if (I == "+fp32-denormals" || I == "-fp32-denormals")		if (I == "+fp32-denormals" || I == "-fp32-denormals")
hasFP32Denormals = true;		hasFP32Denormals = true;
if (I == "+fp64-fp16-denormals" || I == "-fp64-fp16-denormals")		if (I == "+fp64-fp16-denormals" || I == "-fp64-fp16-denormals")
hasFP64Denormals = true;		hasFP64Denormals = true;
}		}
if (!hasFP32Denormals)		if (!hasFP32Denormals)
TargetOpts.Features.push_back(		TargetOpts.Features.push_back(
(Twine(hasFullSpeedFMAF32(TargetOpts.CPU) && !CGOpts.FlushDenorm		(Twine(CGOptsGPU.HasFastFMAF && !CGOpts.FlushDenorm
? '+'		? '+'
: '-') +		: '-') +
Twine("fp32-denormals"))		Twine("fp32-denormals"))
.str());		.str());
// Always do not flush fp64 or fp16 denorms.		// Always do not flush fp64 or fp16 denorms.
if (!hasFP64Denormals && hasFP64)		if (!hasFP64Denormals && CGOptsGPU.HasFP64)
TargetOpts.Features.push_back("+fp64-fp16-denormals");		TargetOpts.Features.push_back("+fp64-fp16-denormals");
}		}

constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::InvalidGPU;		constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::InvalidGPU;
constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::R600Names[];		constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::R600GPUs[];
constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::AMDGCNNames[];		constexpr AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::AMDGCNGPUs[];

AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::parseR600Name(StringRef Name) {		AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::parseR600Name(StringRef Name) {
const auto *Result = llvm::find_if(		const auto *Result = llvm::find_if(
R600Names, [Name](const GPUInfo &GPU) { return GPU.Name == Name; });		R600GPUs, [Name](const GPUInfo &GPU) { return GPU.Name == Name; });

if (Result == std::end(R600Names))		if (Result == std::end(R600GPUs))
return InvalidGPU;		return InvalidGPU;
return *Result;		return *Result;
}		}

AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::parseAMDGCNName(StringRef Name) {		AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::parseAMDGCNName(StringRef Name) {
const auto *Result =		const auto *Result = llvm::find_if(
llvm::find_if(AMDGCNNames, [Name](const GPUInfo &GPU) {		AMDGCNGPUs, [Name](const GPUInfo &GPU) { return GPU.Name == Name; });
return GPU.Name == Name;
});

if (Result == std::end(AMDGCNNames))		if (Result == std::end(AMDGCNGPUs))
return InvalidGPU;		return InvalidGPU;
return *Result;		return *Result;
}		}

		AMDGPUTargetInfo::GPUInfo AMDGPUTargetInfo::parseGPUName(StringRef Name) const {
		if (isAMDGCN(getTriple()))
		return parseAMDGCNName(Name);
		else
		return parseR600Name(Name);
		}

void AMDGPUTargetInfo::fillValidCPUList(		void AMDGPUTargetInfo::fillValidCPUList(
SmallVectorImpl<StringRef> &Values) const {		SmallVectorImpl<StringRef> &Values) const {
if (getTriple().getArch() == llvm::Triple::amdgcn)		if (isAMDGCN(getTriple()))
llvm::for_each(AMDGCNNames, [&Values](const GPUInfo &GPU) {		llvm::for_each(AMDGCNGPUs, [&Values](const GPUInfo &GPU) {
Values.emplace_back(GPU.Name);});		Values.emplace_back(GPU.Name);});
else		else
llvm::for_each(R600Names, [&Values](const GPUInfo &GPU) {		llvm::for_each(R600GPUs, [&Values](const GPUInfo &GPU) {
Values.emplace_back(GPU.Name);});		Values.emplace_back(GPU.Name);});
}		}

void AMDGPUTargetInfo::setAddressSpaceMap(bool DefaultIsPrivate) {		void AMDGPUTargetInfo::setAddressSpaceMap(bool DefaultIsPrivate) {
if (isGenericZero(getTriple())) {		if (isGenericZero(getTriple())) {
AddrSpaceMap = DefaultIsPrivate ? &AMDGPUGenIsZeroDefIsPrivMap		AddrSpaceMap = DefaultIsPrivate ? &AMDGPUGenIsZeroDefIsPrivMap
: &AMDGPUGenIsZeroDefIsGenMap;		: &AMDGPUGenIsZeroDefIsGenMap;
} else {		} else {
AddrSpaceMap = DefaultIsPrivate ? &AMDGPUPrivIsZeroDefIsPrivMap		AddrSpaceMap = DefaultIsPrivate ? &AMDGPUPrivIsZeroDefIsPrivMap
: &AMDGPUPrivIsZeroDefIsGenMap;		: &AMDGPUPrivIsZeroDefIsGenMap;
}		}
}		}

AMDGPUTargetInfo::AMDGPUTargetInfo(const llvm::Triple &Triple,		AMDGPUTargetInfo::AMDGPUTargetInfo(const llvm::Triple &Triple,
const TargetOptions &Opts)		const TargetOptions &Opts)
: TargetInfo(Triple),		: TargetInfo(Triple), AS(isGenericZero(Triple)),
GPU(isAMDGCN(Triple) ? AMDGCNNames[0] : parseR600Name(Opts.CPU)),		GPU(isAMDGCN(Triple) ? AMDGCNGPUs[0] : parseR600Name(Opts.CPU)) {
hasFP64(false), hasFMAF(false), hasLDEXPF(false),
AS(isGenericZero(Triple)) {
if (getTriple().getArch() == llvm::Triple::amdgcn) {
hasFP64 = true;
hasFMAF = true;
hasLDEXPF = true;
}
if (getTriple().getArch() == llvm::Triple::r600) {
if (GPU.Kind == GK_EVERGREEN_DOUBLE_OPS || GPU.Kind == GK_CAYMAN) {
hasFMAF = true;
}
}
auto IsGenericZero = isGenericZero(Triple);		auto IsGenericZero = isGenericZero(Triple);
resetDataLayout(getTriple().getArch() == llvm::Triple::amdgcn		resetDataLayout(isAMDGCN(getTriple())
? (IsGenericZero ? DataLayoutStringSIGenericIsZero		? (IsGenericZero ? DataLayoutStringSIGenericIsZero
: DataLayoutStringSIPrivateIsZero)		: DataLayoutStringSIPrivateIsZero)
: DataLayoutStringR600);		: DataLayoutStringR600);
assert(DataLayout->getAllocaAddrSpace() == AS.Private);		assert(DataLayout->getAllocaAddrSpace() == AS.Private);

setAddressSpaceMap(Triple.getOS() == llvm::Triple::Mesa3D ||		setAddressSpaceMap(Triple.getOS() == llvm::Triple::Mesa3D ||
Triple.getEnvironment() == llvm::Triple::OpenCL ||		Triple.getEnvironment() == llvm::Triple::OpenCL ||
Triple.getEnvironmentName() == "amdgizcl" ||		Triple.getEnvironmentName() == "amdgizcl" ||
(22 lines not shown)	return llvm::makeArrayRef(BuiltinInfo, clang::AMDGPU::LastTSBuiltin -
Builtin::FirstTSBuiltin);		Builtin::FirstTSBuiltin);
}		}

void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,		void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const {		MacroBuilder &Builder) const {
Builder.defineMacro("__AMD__");		Builder.defineMacro("__AMD__");
Builder.defineMacro("__AMDGPU__");		Builder.defineMacro("__AMDGPU__");

if (getTriple().getArch() == llvm::Triple::amdgcn)		if (isAMDGCN(getTriple()))
Builder.defineMacro("__AMDGCN__");		Builder.defineMacro("__AMDGCN__");
else		else
Builder.defineMacro("__R600__");		Builder.defineMacro("__R600__");

if (GPU.Kind != GK_NONE)		if (GPU.Kind != GK_NONE)
Builder.defineMacro(Twine("__") + Twine(GPU.CanonicalName) + Twine("__"));		Builder.defineMacro(Twine("__") + Twine(GPU.CanonicalName) + Twine("__"));

if (hasFMAF)		// TODO: __HAS_FMAF__, __HAS_LDEXPF__, __HAS_FP64__ are deprecated and will be
		// removed in the near future.
		if (GPU.HasFMAF)
Builder.defineMacro("__HAS_FMAF__");		Builder.defineMacro("__HAS_FMAF__");
if (hasLDEXPF)		if (GPU.HasFastFMAF)
		Builder.defineMacro("FP_FAST_FMAF");
		if (GPU.HasLDEXPF)
Builder.defineMacro("__HAS_LDEXPF__");		Builder.defineMacro("__HAS_LDEXPF__");
		b-sumner: I'm not sure why this is here. No languages we support have this AFAIK. We should probably add a comment that this is deprecated and remove it in a year or so.
if (hasFP64)		if (GPU.HasFP64)
Builder.defineMacro("__HAS_FP64__");		Builder.defineMacro("__HAS_FP64__");
		if (GPU.HasFastFMA)
		Builder.defineMacro("FP_FAST_FMA");
}		}
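
For context on how the new macros are meant to be consumed: in C and OpenCL C, FP_FAST_FMAF signals that fmaf is roughly as fast as a separate multiply and add. The following is a minimal sketch of client code keyed on that macro. The helper names mulAdd and horner are hypothetical and are not part of this patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the patch): computes a*b + c, using the
// fused operation only when the compiler advertises it as fast.
inline float mulAdd(float a, float b, float c) {
#ifdef FP_FAST_FMAF
  return std::fmaf(a, b, c); // fused: one rounding step
#else
  return a * b + c;          // unfused: multiply, then add
#endif
}

// Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1),
// built on mulAdd so it picks up the fast path automatically.
inline float horner(const float *c, int n, float x) {
  assert(n >= 0 && "coefficient count must be non-negative");
  float acc = 0.0f;
  for (int i = n - 1; i >= 0; --i)
    acc = mulAdd(acc, x, c[i]);
  return acc;
}
```

With this patch, the fused branch would be selected on targets such as gfx600 or gfx900 and the unfused branch on targets such as gfx601, per the test expectations in test/Driver/amdgpu-macros.cl.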

test/Driver/amdgpu-macros.cl

	Show All 22 Lines
	// RUN: %clang -E -dM -target r600 -mcpu=cypress %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CYPRESS %s			// RUN: %clang -E -dM -target r600 -mcpu=cypress %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CYPRESS %s
	// RUN: %clang -E -dM -target r600 -mcpu=hemlock %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CYPRESS %s			// RUN: %clang -E -dM -target r600 -mcpu=hemlock %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CYPRESS %s
	// RUN: %clang -E -dM -target r600 -mcpu=juniper %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,JUNIPER %s			// RUN: %clang -E -dM -target r600 -mcpu=juniper %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,JUNIPER %s
	// RUN: %clang -E -dM -target r600 -mcpu=redwood %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,REDWOOD %s			// RUN: %clang -E -dM -target r600 -mcpu=redwood %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,REDWOOD %s
	// RUN: %clang -E -dM -target r600 -mcpu=sumo %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,SUMO %s			// RUN: %clang -E -dM -target r600 -mcpu=sumo %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,SUMO %s
	// RUN: %clang -E -dM -target r600 -mcpu=sumo2 %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,SUMO %s			// RUN: %clang -E -dM -target r600 -mcpu=sumo2 %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,SUMO %s
	// RUN: %clang -E -dM -target r600 -mcpu=barts %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,BARTS %s			// RUN: %clang -E -dM -target r600 -mcpu=barts %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,BARTS %s
	// RUN: %clang -E -dM -target r600 -mcpu=caicos %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAICOS %s			// RUN: %clang -E -dM -target r600 -mcpu=caicos %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAICOS %s
	// RUN: %clang -E -dM -target r600 -mcpu=turks %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,TURKS %s
	// RUN: %clang -E -dM -target r600 -mcpu=aruba %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAYMAN %s			// RUN: %clang -E -dM -target r600 -mcpu=aruba %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAYMAN %s
	// RUN: %clang -E -dM -target r600 -mcpu=cayman %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAYMAN %s			// RUN: %clang -E -dM -target r600 -mcpu=cayman %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,CAYMAN %s
				// RUN: %clang -E -dM -target r600 -mcpu=turks %s 2>&1 \| FileCheck --check-prefixes=ARCH-R600,TURKS %s

				// R600-NOT: #define FP_FAST_FMA 1
				// R630-NOT: #define FP_FAST_FMA 1
				// RS880-NOT: #define FP_FAST_FMA 1
				// RV670-NOT: #define FP_FAST_FMA 1
				// RV710-NOT: #define FP_FAST_FMA 1
				// RV730-NOT: #define FP_FAST_FMA 1
				// RV770-NOT: #define FP_FAST_FMA 1
				// CEDAR-NOT: #define FP_FAST_FMA 1
				// CYPRESS-NOT: #define FP_FAST_FMA 1
				// JUNIPER-NOT: #define FP_FAST_FMA 1
				// REDWOOD-NOT: #define FP_FAST_FMA 1
				// SUMO-NOT: #define FP_FAST_FMA 1
				// BARTS-NOT: #define FP_FAST_FMA 1
				// CAICOS-NOT: #define FP_FAST_FMA 1
				// CAYMAN-NOT: #define FP_FAST_FMA 1
				// TURKS-NOT: #define FP_FAST_FMA 1

				// R600-NOT: #define FP_FAST_FMAF 1
				// R630-NOT: #define FP_FAST_FMAF 1
				// RS880-NOT: #define FP_FAST_FMAF 1
				// RV670-NOT: #define FP_FAST_FMAF 1
				// RV710-NOT: #define FP_FAST_FMAF 1
				// RV730-NOT: #define FP_FAST_FMAF 1
				// RV770-NOT: #define FP_FAST_FMAF 1
				// CEDAR-NOT: #define FP_FAST_FMAF 1
				// CYPRESS-NOT: #define FP_FAST_FMAF 1
				// JUNIPER-NOT: #define FP_FAST_FMAF 1
				// REDWOOD-NOT: #define FP_FAST_FMAF 1
				// SUMO-NOT: #define FP_FAST_FMAF 1
				// BARTS-NOT: #define FP_FAST_FMAF 1
				// CAICOS-NOT: #define FP_FAST_FMAF 1
				// CAYMAN-NOT: #define FP_FAST_FMAF 1
				// TURKS-NOT: #define FP_FAST_FMAF 1

	// ARCH-R600-DAG: #define __AMD__ 1
	// ARCH-R600-DAG: #define __AMDGPU__ 1			// ARCH-R600-DAG: #define __AMDGPU__ 1
				// ARCH-R600-DAG: #define __AMD__ 1

				// R600-NOT: #define __HAS_FMAF__ 1
				// R630-NOT: #define __HAS_FMAF__ 1
				// RS880-NOT: #define __HAS_FMAF__ 1
				// RV670-NOT: #define __HAS_FMAF__ 1
				// RV710-NOT: #define __HAS_FMAF__ 1
				// RV730-NOT: #define __HAS_FMAF__ 1
				// RV770-NOT: #define __HAS_FMAF__ 1
				// CEDAR-NOT: #define __HAS_FMAF__ 1
				// CYPRESS-DAG: #define __HAS_FMAF__ 1
				// JUNIPER-NOT: #define __HAS_FMAF__ 1
				// REDWOOD-NOT: #define __HAS_FMAF__ 1
				// SUMO-NOT: #define __HAS_FMAF__ 1
				// BARTS-NOT: #define __HAS_FMAF__ 1
				// CAICOS-NOT: #define __HAS_FMAF__ 1
				// CAYMAN-DAG: #define __HAS_FMAF__ 1
				// TURKS-NOT: #define __HAS_FMAF__ 1

				// R600-NOT: #define __HAS_FP64__ 1
				// R630-NOT: #define __HAS_FP64__ 1
				// RS880-NOT: #define __HAS_FP64__ 1
				// RV670-NOT: #define __HAS_FP64__ 1
				// RV710-NOT: #define __HAS_FP64__ 1
				// RV730-NOT: #define __HAS_FP64__ 1
				// RV770-NOT: #define __HAS_FP64__ 1
				// CEDAR-NOT: #define __HAS_FP64__ 1
				// CYPRESS-NOT: #define __HAS_FP64__ 1
				// JUNIPER-NOT: #define __HAS_FP64__ 1
				// REDWOOD-NOT: #define __HAS_FP64__ 1
				// SUMO-NOT: #define __HAS_FP64__ 1
				// BARTS-NOT: #define __HAS_FP64__ 1
				// CAICOS-NOT: #define __HAS_FP64__ 1
				// CAYMAN-NOT: #define __HAS_FP64__ 1
				// TURKS-NOT: #define __HAS_FP64__ 1

				// R600-NOT: #define __HAS_LDEXPF__ 1
				// R630-NOT: #define __HAS_LDEXPF__ 1
				// RS880-NOT: #define __HAS_LDEXPF__ 1
				// RV670-NOT: #define __HAS_LDEXPF__ 1
				// RV710-NOT: #define __HAS_LDEXPF__ 1
				// RV730-NOT: #define __HAS_LDEXPF__ 1
				// RV770-NOT: #define __HAS_LDEXPF__ 1
				// CEDAR-NOT: #define __HAS_LDEXPF__ 1
				// CYPRESS-NOT: #define __HAS_LDEXPF__ 1
				// JUNIPER-NOT: #define __HAS_LDEXPF__ 1
				// REDWOOD-NOT: #define __HAS_LDEXPF__ 1
				// SUMO-NOT: #define __HAS_LDEXPF__ 1
				// BARTS-NOT: #define __HAS_LDEXPF__ 1
				// CAICOS-NOT: #define __HAS_LDEXPF__ 1
				// CAYMAN-NOT: #define __HAS_LDEXPF__ 1
				// TURKS-NOT: #define __HAS_LDEXPF__ 1

	// ARCH-R600-DAG: #define __R600__ 1			// ARCH-R600-DAG: #define __R600__ 1

	// R600: #define __r600__ 1			// R600-DAG: #define __r600__ 1
	// R630: #define __r630__ 1			// R630-DAG: #define __r630__ 1
	// RS880: #define __rs880__ 1			// RS880-DAG: #define __rs880__ 1
	// RV670: #define __rv670__ 1			// RV670-DAG: #define __rv670__ 1
	// RV710: #define __rv710__ 1			// RV710-DAG: #define __rv710__ 1
	// RV730: #define __rv730__ 1			// RV730-DAG: #define __rv730__ 1
	// RV770: #define __rv770__ 1			// RV770-DAG: #define __rv770__ 1
	// CEDAR: #define __cedar__ 1			// CEDAR-DAG: #define __cedar__ 1
	// CYPRESS: #define __cypress__ 1			// CYPRESS-DAG: #define __cypress__ 1
	// JUNIPER: #define __juniper__ 1			// JUNIPER-DAG: #define __juniper__ 1
	// REDWOOD: #define __redwood__ 1			// REDWOOD-DAG: #define __redwood__ 1
	// SUMO: #define __sumo__ 1			// SUMO-DAG: #define __sumo__ 1
	// BARTS: #define __barts__ 1			// BARTS-DAG: #define __barts__ 1
	// CAICOS: #define __caicos__ 1			// CAICOS-DAG: #define __caicos__ 1
	// TURKS: #define __turks__ 1			// CAYMAN-DAG: #define __cayman__ 1
	// CAYMAN: #define __cayman__ 1			// TURKS-DAG: #define __turks__ 1

	//			//
	// AMDGCN-based processors.			// AMDGCN-based processors.
	//			//

	// RUN: %clang -E -dM -target amdgcn -mcpu=gfx600 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX600 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=gfx600 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX600 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=tahiti %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX600 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=tahiti %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX600 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=gfx601 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX601 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=gfx601 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX601 %s
	Show All 20 Lines
	// RUN: %clang -E -dM -target amdgcn -mcpu=fiji %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=fiji %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=polaris10 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=polaris10 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=polaris11 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=polaris11 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX803 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=gfx810 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX810 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=gfx810 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX810 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=stoney %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX810 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=stoney %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX810 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=gfx900 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX900 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=gfx900 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX900 %s
	// RUN: %clang -E -dM -target amdgcn -mcpu=gfx902 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX902 %s			// RUN: %clang -E -dM -target amdgcn -mcpu=gfx902 %s 2>&1 \| FileCheck --check-prefixes=ARCH-GCN,GFX902 %s

	// ARCH-GCN-DAG: #define __AMD__ 1			// GFX600-DAG: #define FP_FAST_FMA 1
	// ARCH-GCN-DAG: #define __AMDGPU__ 1			// GFX601-DAG: #define FP_FAST_FMA 1
				// GFX700-DAG: #define FP_FAST_FMA 1
				// GFX701-DAG: #define FP_FAST_FMA 1
				// GFX702-DAG: #define FP_FAST_FMA 1
				// GFX703-DAG: #define FP_FAST_FMA 1
				// GFX704-DAG: #define FP_FAST_FMA 1
				// GFX801-DAG: #define FP_FAST_FMA 1
				// GFX802-DAG: #define FP_FAST_FMA 1
				// GFX803-DAG: #define FP_FAST_FMA 1
				// GFX810-DAG: #define FP_FAST_FMA 1
				// GFX900-DAG: #define FP_FAST_FMA 1
				// GFX902-DAG: #define FP_FAST_FMA 1

				// GFX600-DAG: #define FP_FAST_FMAF 1
				// GFX601-NOT: #define FP_FAST_FMAF 1
				// GFX700-NOT: #define FP_FAST_FMAF 1
				// GFX701-DAG: #define FP_FAST_FMAF 1
				// GFX702-DAG: #define FP_FAST_FMAF 1
				// GFX703-NOT: #define FP_FAST_FMAF 1
				// GFX704-NOT: #define FP_FAST_FMAF 1
				// GFX801-DAG: #define FP_FAST_FMAF 1
				// GFX802-NOT: #define FP_FAST_FMAF 1
				// GFX803-NOT: #define FP_FAST_FMAF 1
				// GFX810-NOT: #define FP_FAST_FMAF 1
				// GFX900-DAG: #define FP_FAST_FMAF 1
				// GFX902-DAG: #define FP_FAST_FMAF 1

	// ARCH-GCN-DAG: #define __AMDGCN__ 1			// ARCH-GCN-DAG: #define __AMDGCN__ 1
				// ARCH-GCN-DAG: #define __AMDGPU__ 1
				// ARCH-GCN-DAG: #define __AMD__ 1

				// GFX600-DAG: #define __HAS_FMAF__ 1
				// GFX601-DAG: #define __HAS_FMAF__ 1
				// GFX700-DAG: #define __HAS_FMAF__ 1
				// GFX701-DAG: #define __HAS_FMAF__ 1
				// GFX702-DAG: #define __HAS_FMAF__ 1
				// GFX703-DAG: #define __HAS_FMAF__ 1
				// GFX704-DAG: #define __HAS_FMAF__ 1
				// GFX801-DAG: #define __HAS_FMAF__ 1
				// GFX802-DAG: #define __HAS_FMAF__ 1
				// GFX803-DAG: #define __HAS_FMAF__ 1
				// GFX810-DAG: #define __HAS_FMAF__ 1
				// GFX900-DAG: #define __HAS_FMAF__ 1
				// GFX902-DAG: #define __HAS_FMAF__ 1

				// GFX600-DAG: #define __HAS_FP64__ 1
				// GFX601-DAG: #define __HAS_FP64__ 1
				// GFX700-DAG: #define __HAS_FP64__ 1
				// GFX701-DAG: #define __HAS_FP64__ 1
				// GFX702-DAG: #define __HAS_FP64__ 1
				// GFX703-DAG: #define __HAS_FP64__ 1
				// GFX704-DAG: #define __HAS_FP64__ 1
				// GFX801-DAG: #define __HAS_FP64__ 1
				// GFX802-DAG: #define __HAS_FP64__ 1
				// GFX803-DAG: #define __HAS_FP64__ 1
				// GFX810-DAG: #define __HAS_FP64__ 1
				// GFX900-DAG: #define __HAS_FP64__ 1
				// GFX902-DAG: #define __HAS_FP64__ 1

				// GFX600-DAG: #define __HAS_LDEXPF__ 1
				// GFX601-DAG: #define __HAS_LDEXPF__ 1
				// GFX700-DAG: #define __HAS_LDEXPF__ 1
				// GFX701-DAG: #define __HAS_LDEXPF__ 1
				// GFX702-DAG: #define __HAS_LDEXPF__ 1
				// GFX703-DAG: #define __HAS_LDEXPF__ 1
				// GFX704-DAG: #define __HAS_LDEXPF__ 1
				// GFX801-DAG: #define __HAS_LDEXPF__ 1
				// GFX802-DAG: #define __HAS_LDEXPF__ 1
				// GFX803-DAG: #define __HAS_LDEXPF__ 1
				// GFX810-DAG: #define __HAS_LDEXPF__ 1
				// GFX900-DAG: #define __HAS_LDEXPF__ 1
				// GFX902-DAG: #define __HAS_LDEXPF__ 1

	// GFX600: #define __gfx600__ 1			// GFX600-DAG: #define __gfx600__ 1
	// GFX601: #define __gfx601__ 1			// GFX601-DAG: #define __gfx601__ 1
	// GFX700: #define __gfx700__ 1			// GFX700-DAG: #define __gfx700__ 1
	// GFX701: #define __gfx701__ 1			// GFX701-DAG: #define __gfx701__ 1
	// GFX702: #define __gfx702__ 1			// GFX702-DAG: #define __gfx702__ 1
	// GFX703: #define __gfx703__ 1			// GFX703-DAG: #define __gfx703__ 1
	// GFX704: #define __gfx704__ 1			// GFX704-DAG: #define __gfx704__ 1
	// GFX801: #define __gfx801__ 1			// GFX801-DAG: #define __gfx801__ 1
	// GFX802: #define __gfx802__ 1			// GFX802-DAG: #define __gfx802__ 1
	// GFX803: #define __gfx803__ 1			// GFX803-DAG: #define __gfx803__ 1
	// GFX810: #define __gfx810__ 1			// GFX810-DAG: #define __gfx810__ 1
	// GFX900: #define __gfx900__ 1			// GFX900-DAG: #define __gfx900__ 1
	// GFX902: #define __gfx902__ 1			// GFX902-DAG: #define __gfx902__ 1
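
The FP_FAST_FMA and FP_FAST_FMAF expectations for the amdgcn targets above can be read as a per-GPU feature table. The sketch below models that table in isolation. GpuRow, Rows, findRow, and fastFmaMacros are hypothetical names used for illustration; the real GPUInfo layout lives in lib/Basic/Targets/AMDGPU.h, which is not shown in this section. The flag values are transcribed from the GFX*-DAG and GFX*-NOT check lines.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPUInfo row; only the two flags exercised by
// the FP_FAST_* checks are modeled.
struct GpuRow {
  const char *Name;
  bool HasFastFMA;  // FP_FAST_FMA expected
  bool HasFastFMAF; // FP_FAST_FMAF expected
};

// Values transcribed from the test expectations in amdgpu-macros.cl.
static const GpuRow Rows[] = {
    {"gfx600", true, true},  {"gfx601", true, false}, {"gfx700", true, false},
    {"gfx701", true, true},  {"gfx702", true, true},  {"gfx703", true, false},
    {"gfx704", true, false}, {"gfx801", true, true},  {"gfx802", true, false},
    {"gfx803", true, false}, {"gfx810", true, false}, {"gfx900", true, true},
    {"gfx902", true, true},
};

// Linear lookup by canonical name; returns nullptr for unknown names.
static const GpuRow *findRow(const std::string &Name) {
  for (const GpuRow &R : Rows)
    if (Name == R.Name)
      return &R;
  return nullptr;
}

// Lists the FP_FAST_* macros in the order getTargetDefines emits them.
static std::vector<std::string> fastFmaMacros(const std::string &Cpu) {
  std::vector<std::string> Out;
  if (const GpuRow *R = findRow(Cpu)) {
    if (R->HasFastFMAF)
      Out.push_back("FP_FAST_FMAF");
    if (R->HasFastFMA)
      Out.push_back("FP_FAST_FMA");
  }
  return Out;
}
```

Keeping the flags in one table, rather than deriving them from the GPU generation, is what lets gfx600 and gfx601 differ on FP_FAST_FMAF even though both are GFX6 parts.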

test/Misc/target-invalid-cpu-note.c

	Show All 36 Lines
	// NVPTX: note: valid target CPU values are: sm_20, sm_21, sm_30, sm_32, sm_35,			// NVPTX: note: valid target CPU values are: sm_20, sm_21, sm_30, sm_32, sm_35,
	// NVPTX-SAME: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72			// NVPTX-SAME: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72

	// RUN: not %clang_cc1 -triple r600--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix R600			// RUN: not %clang_cc1 -triple r600--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix R600
	// R600: error: unknown target CPU 'not-a-cpu'			// R600: error: unknown target CPU 'not-a-cpu'
	// R600: note: valid target CPU values are: r600, rv630, rv635, r630, rs780,			// R600: note: valid target CPU values are: r600, rv630, rv635, r630, rs780,
	// R600-SAME: rs880, rv610, rv620, rv670, rv710, rv730, rv740, rv770, cedar,			// R600-SAME: rs880, rv610, rv620, rv670, rv710, rv730, rv740, rv770, cedar,
	// R600-SAME: palm, cypress, hemlock, juniper, redwood, sumo, sumo2, barts,			// R600-SAME: palm, cypress, hemlock, juniper, redwood, sumo, sumo2, barts,
	// R600-SAME: caicos, turks, aruba, cayman			// R600-SAME: caicos, aruba, cayman, turks


	// RUN: not %clang_cc1 -triple amdgcn--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AMDGCN			// RUN: not %clang_cc1 -triple amdgcn--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AMDGCN
	// AMDGCN: error: unknown target CPU 'not-a-cpu'			// AMDGCN: error: unknown target CPU 'not-a-cpu'
	// AMDGCN: note: valid target CPU values are: gfx600, tahiti, gfx601, hainan,			// AMDGCN: note: valid target CPU values are: gfx600, tahiti, gfx601, hainan,
	// AMDGCN-SAME: oland, pitcairn, verde, gfx700, kaveri, gfx701, hawaii, gfx702,			// AMDGCN-SAME: oland, pitcairn, verde, gfx700, kaveri, gfx701, hawaii, gfx702,
	// AMDGCN-SAME: gfx703, kabini, mullins, gfx704, bonaire, gfx801, carrizo,			// AMDGCN-SAME: gfx703, kabini, mullins, gfx704, bonaire, gfx801, carrizo,
	// AMDGCN-SAME: gfx802, iceland, tonga, gfx803, fiji, polaris10, polaris11,			// AMDGCN-SAME: gfx802, iceland, tonga, gfx803, fiji, polaris10, polaris11,
	▲ Show 20 Lines • Show All 109 Lines • Show Last 20 Lines

Revision Contents

Diff 135997

lib/Basic/Targets/AMDGPU.h

lib/Basic/Targets/AMDGPU.cpp

test/Driver/amdgpu-macros.cl

test/Misc/target-invalid-cpu-note.c
