This is an archive of the discontinued LLVM Phabricator instance.

llvm/docs/AMDGPUUsage.rst
389	Is it dGPU or APU? Every other entry with `TBA` also has a `TODO::` message
llvm/lib/Target/AMDGPU/AMDGPU.td
468	What is this new encoding? It doesn't seem to be used for anything.
llvm/lib/Target/AMDGPU/GCNSubtarget.h
879	Stray whitespace on this line.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
1453	Stray whitespace.
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	This test surely should not pass for gfx1012, since it does not have these instructions. And with your patch as written it should fail for gfx1013 too, since they are predicated on HasGFX10_BEncoding. @rampitec any idea what is wrong here? Apparently the backend will happily generate image_bvh_intersect_ray instructions even for gfx900!
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
3	Likewise.

You need to replace HasGFX10_BEncoding with HasGFX10_AEncoding in the BVH and IMAGE_MSAA_LOAD_X instructions. You also need to update the llvm.amdgcn.image.msaa.load.x.ll test to include gfx1013.

llvm/lib/Target/AMDGPU/AMDGPU.td
1106	gfx1030 should now include FeatureGFX10_AEncoding as well. 10_B is an extension above 10_A.
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	Indeed. MIMG_IntersectRay has this: let SubtargetPredicate = HasGFX10_BEncoding, AssemblerPredicate = HasGFX10_BEncoding, but apparently SubtargetPredicate did not work. It needs to be fixed. gfx1012 does not have it, gfx1013 does though. That is what the GFX10_A encoding is about: 10_B has to be replaced with 10_A in BVH and MSAA load.
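
To make the relationship between the two encodings concrete, here is a minimal standalone sketch of the feature sets as the comments above describe them. It is not the TableGen from AMDGPU.td; `ToyFeature`, `featuresFor` and `supportsBVH` are hypothetical names, and the per-processor sets are assumptions taken from this thread (gfx1012 has neither encoding, gfx1013 has GFX10_A, gfx1030 lists both).

```cpp
#include <set>
#include <string>

// Hypothetical, simplified model of the two subtarget features discussed in
// the review. The real definitions live in AMDGPU.td.
enum class ToyFeature { GFX10_AEncoding, GFX10_BEncoding };

// Assumed feature sets, taken from the review comments only.
inline std::set<ToyFeature> featuresFor(const std::string &Cpu) {
  if (Cpu == "gfx1013")
    return {ToyFeature::GFX10_AEncoding};
  if (Cpu == "gfx1030")
    return {ToyFeature::GFX10_AEncoding, ToyFeature::GFX10_BEncoding};
  return {}; // gfx1012, gfx900, and anything else in this toy model
}

// After the requested change, BVH and IMAGE_MSAA_LOAD_X are predicated on
// GFX10_A rather than GFX10_B.
inline bool supportsBVH(const std::string &Cpu) {
  return featuresFor(Cpu).count(ToyFeature::GFX10_AEncoding) != 0;
}
```

With the predicate on GFX10_A, the query is true for gfx1013 and gfx1030 and false for gfx1012 in this model, which is the behaviour the reviewers ask the tests to reflect.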

rampitec added inline comments. Jun 4 2021, 12:27 PM

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	Image lowering and selection is not really done like everything else. For BVH it just lowers intrinsic to opcode. I think the easiest fix is to add to SIISelLowering.cpp where we lower Intrinsic::amdgcn_image_bvh_intersect_ray something like this: if (!Subtarget->hasGFX10_AEncoding()) report_fatal_error( "requested image instruction is not supported on this GPU");
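
The guard suggested above can be sketched in isolation as follows. This is a toy model rather than code from SIISelLowering.cpp: `ToySubtarget` and `lowerIntersectRay` are hypothetical, and `std::runtime_error` stands in for `report_fatal_error`, which would abort the real compiler instead of throwing.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the subtarget; only the one query is modeled.
struct ToySubtarget {
  bool HasGFX10_AEncoding;
  bool hasGFX10_AEncoding() const { return HasGFX10_AEncoding; }
};

// Models the suggested check: refuse to lower the intrinsic to an opcode when
// the subtarget lacks the encoding. The exception is an assumption standing in
// for report_fatal_error.
inline std::string lowerIntersectRay(const ToySubtarget &ST) {
  if (!ST.hasGFX10_AEncoding())
    throw std::runtime_error(
        "requested image instruction is not supported on this GPU");
  return "image_bvh_intersect_ray";
}
```

The point of the sketch is only the ordering: the subtarget check happens before any opcode is chosen, so an unsupported target never reaches instruction emission.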

Addressed review comments. Updated the patch to use the new AEncoding target feature
correctly. Added code to report an error for the image intersect intrinsics for
unsupported targets.

bcahoon marked 8 inline comments as done. Jun 5 2021, 1:22 PM

bcahoon added inline comments.

llvm/docs/AMDGPUUsage.rst
389	It is APU.
llvm/lib/Target/AMDGPU/AMDGPU.td
468	Fixed this. The BVH raytracing instructions use the encoding.
1106	I had added FeatureGFX10_AEncoding as an Implies feature for FeatureGFX10_BEncoding in the previous patch. But, I've changed the patch so that it's no longer an Implies feature.
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.

Harbormaster completed remote builds in B107826: Diff 350075. Jun 5 2021, 1:55 PM

rampitec added inline comments. Jun 7 2021, 10:11 AM

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4690	80 chars per line.
4693	Just return false like in other places.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345	return emitRemovedIntrinsicError();
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	not --crash llc ...
4	> I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.
	Diagnostics are a good thing, but we still have to fail the compilation.
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll
3	not --crash llc

bcahoon marked 4 inline comments as done. Jun 7 2021, 11:31 AM

bcahoon added inline comments.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	The diagnostic is marked as an error, so the compilation fails in that llc returns a non-zero return code. This mechanism is used in other places in the back-end to report similar types of errors. The alternative, if I understand correctly, is that a crash occurs with an error message that indicates that the bug is in LLVM (rather than the input source file).
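
The difference between the two failure modes can be illustrated with a small standalone sketch. It does not use the LLVM diagnostic API; `ToyDiagnostics` and `compileAll` are hypothetical, and the only behaviour modeled is the one described above: an error diagnostic is recorded, processing continues, and the tool still exits with a non-zero code.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature of a diagnostic sink; not the LLVMContext API.
struct ToyDiagnostics {
  std::vector<std::string> Errors;
  void reportError(const std::string &Msg) { Errors.push_back(Msg); }
  // Mirrors the claim in the comment above: any error diagnostic makes the
  // tool return a non-zero exit code.
  int exitCode() const { return Errors.empty() ? 0 : 1; }
};

// Processes each intrinsic name. An unsupported one is reported as an error
// and skipped instead of aborting. Returns how many were lowered.
inline int compileAll(const std::vector<std::string> &Intrinsics,
                      bool HasGFX10_AEncoding, ToyDiagnostics &Diags) {
  int Lowered = 0;
  for (const std::string &Name : Intrinsics) {
    if (Name == "image_bvh_intersect_ray" && !HasGFX10_AEncoding) {
      Diags.reportError("unsupported intrinsic: " + Name);
      continue;
    }
    ++Lowered;
  }
  return Lowered;
}
```

In this model every unsupported use is reported, whereas a crash-style failure stops at the first one. Which of the two the backend should prefer is exactly what the reviewers go on to debate.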

rampitec added inline comments. Jun 7 2021, 11:56 AM

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	We do not seem to be consistent here and return either undef or SDValue(), but as far as I can see we never continue selecting code the way it is done here in SIISelLowering, and we always return false from the AMDGPUInstructionSelector.

Addressed review comments

bcahoon marked an inline comment as done. Jun 7 2021, 3:00 PM

bcahoon added inline comments.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345	I've changed this to return. Thanks for catching that. But it returns an UNDEF value instead of SDValue() so that it doesn't crash. I can change the behavior if that's preferred.
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
4	I've left the patch so that it doesn't crash. But, let me know if you think we should return false and crash, and I'll make that change.

rampitec added inline comments. Jun 7 2021, 3:24 PM

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4693	Just return false. I see that is like this in the whole file.

Harbormaster completed remote builds in B108076: Diff 350423. Jun 7 2021, 3:45 PM

Changed legalizer to return false for raytracing intrinsics that are not supported by the subtarget. I changed both GlobalISel and regular ISel to work similarly. A crash occurs with a message that the intrinsic cannot be legalized, and only the first instance is reported.

bcahoon added inline comments. Jun 7 2021, 4:51 PM

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4693	Changed this to false, and also changed SIISelLowering to return SDValue() so that both fail in a similar way.

rampitec added inline comments. Jun 7 2021, 5:13 PM

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4694	You can just omit undef and erase.

Harbormaster completed remote builds in B108098: Diff 350448. Jun 7 2021, 5:39 PM

Addressed review comment.

bcahoon marked an inline comment as done. Jun 7 2021, 7:06 PM

Harbormaster completed remote builds in B108115: Diff 350468. Jun 7 2021, 7:50 PM

rampitec accepted this revision. Jun 7 2021, 11:43 PM

This revision is now accepted and ready to land. Jun 7 2021, 11:43 PM

foad added inline comments. Jun 8 2021, 3:12 AM

llvm/lib/Target/AMDGPU/AMDGPU.td
471	I realise you're just following the precedent set by GFX10_B, but is this terminology actually used in any documentation anywhere? And if not could we describe it a little better here?
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345	Personally I would follow all the existing precedents and "return emitRemovedIntrinsicError(...)". I don't see any value in deliberately trying to make the compiler crash harder.

LGTM anyway, with or without any action on my last couple of comments.

This revision was landed with ongoing or failed builds. Jun 8 2021, 9:56 AM

Closed by commit rGea10a86984ea: [AMDGPU] Add gfx1013 target (authored by bcahoon).

This revision was automatically updated to reflect the committed changes.

bcahoon added a commit: rGea10a86984ea: [AMDGPU] Add gfx1013 target.

bcahoon marked an inline comment as done. Jun 8 2021, 10:21 AM

bcahoon added inline comments.

llvm/lib/Target/AMDGPU/AMDGPU.td
471	I changed the description to be specific w.r.t what the target feature enables.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7345	The handling of diagnostic errors is a little inconsistent in ISelLowering. Sometimes SDValue() is returned and other times it's Undef. I have it return SDValue() so that the failure mode is consistent with how GlobalISel handles these intrinsics. It's probably worth a discussion to decide how best to handle diagnostic errors. I'm happy to submit a follow-on patch as needed. As an aside, return emitRemovedIntrinsicError() isn't enough because the intrinsic has both a return value and a chain edge. So, something like return DAG.getMergeValues({emitRemovedIntrinsicError(), Op.getValue(0)}, DL) is needed.

bcahoon added a reverting change: rG211e584fa2a4: Revert "[AMDGPU] Add gfx1013 target". Jun 8 2021, 1:42 PM

foad added inline comments. Jun 9 2021, 1:24 AM

llvm/lib/Target/AMDGPU/AMDGPU.td
471	Thank you. I think that is much more useful.

Revision Contents

Path	Size
clang/include/clang/Basic/Cuda.h	1 line
clang/lib/Basic/Cuda.cpp	1 line
clang/lib/Basic/Targets/AMDGPU.cpp	1 line
clang/lib/Basic/Targets/NVPTX.cpp	1 line
clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp	2 lines
clang/test/CodeGenOpenCL/amdgpu-features.cl	2 lines
clang/test/Driver/amdgpu-macros.cl	1 line
clang/test/Driver/amdgpu-mcpu.cl	2 lines
clang/test/Misc/target-invalid-cpu-note.c	2 lines
llvm/docs/AMDGPUUsage.rst	8 lines
llvm/include/llvm/BinaryFormat/ELF.h	3 lines
llvm/include/llvm/Support/TargetParser.h	1 line
llvm/lib/Object/ELFObjectFile.cpp	2 lines
llvm/lib/ObjectYAML/ELFYAML.cpp	1 line
llvm/lib/Support/TargetParser.cpp	2 lines
llvm/lib/Target/AMDGPU/AMDGPU.td	27 lines
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp	9 lines
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp	1 line
llvm/lib/Target/AMDGPU/GCNProcessors.td	4 lines
llvm/lib/Target/AMDGPU/GCNSubtarget.h	5 lines
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp	2 lines
llvm/lib/Target/AMDGPU/MIMGInstructions.td	6 lines
llvm/lib/Target/AMDGPU/SIISelLowering.cpp	5 lines
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h	1 line
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp	4 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll	3 lines
llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll	11 lines
llvm/test/CodeGen/AMDGPU/elf-header-flags-mach.ll	2 lines
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll	3 lines
llvm/test/MC/AMDGPU/dl-insts-err.s	8 lines
llvm/test/MC/AMDGPU/gfx10_unsupported.s	1 line
llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml	7 lines
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll	5 lines
llvm/test/tools/llvm-readobj/ELF/amdgpu-elf-headers.test	9 lines
llvm/tools/llvm-readobj/ELFDumper.cpp	2 lines
openmp/libomptarget/plugins/amdgpu/impl/get_elf_mach_gfx_name.cpp	2 lines

Diff 350640

clang/include/clang/Basic/Cuda.h

@@ enum class CudaArch {
  GFX906,
  GFX908,
  GFX909,
  GFX90a,
  GFX90c,
  GFX1010,
  GFX1011,
  GFX1012,
+ GFX1013,
  GFX1030,
  GFX1031,
  GFX1032,
  GFX1033,
  GFX1034,
  LAST,
  };

clang/lib/Basic/Cuda.cpp

@@ static const CudaArchToStringMap arch_names[] = {
  GFX(906), // gfx906
  GFX(908), // gfx908
  GFX(909), // gfx909
  GFX(90a), // gfx90a
  GFX(90c), // gfx90c
  GFX(1010), // gfx1010
  GFX(1011), // gfx1011
  GFX(1012), // gfx1012
+ GFX(1013), // gfx1013
  GFX(1030), // gfx1030
  GFX(1031), // gfx1031
  GFX(1032), // gfx1032
  GFX(1033), // gfx1033
  GFX(1034), // gfx1034
  // clang-format on
  };
  #undef SM

clang/lib/Basic/Targets/AMDGPU.cpp

@@ if (isAMDGCN(getTriple())) {
  case GK_GFX1012:
  case GK_GFX1011:
    Features["dot1-insts"] = true;
    Features["dot2-insts"] = true;
    Features["dot5-insts"] = true;
    Features["dot6-insts"] = true;
    Features["dot7-insts"] = true;
    LLVM_FALLTHROUGH;
+ case GK_GFX1013:
  case GK_GFX1010:
    Features["dl-insts"] = true;
    Features["ci-insts"] = true;
    Features["flat-address-space"] = true;
    Features["16-bit-insts"] = true;
    Features["dpp"] = true;
    Features["gfx8-insts"] = true;
    Features["gfx9-insts"] = true;
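
The position of the new GK_GFX1013 label in this switch determines its default feature set: it sits below the LLVM_FALLTHROUGH, next to GK_GFX1010, so it does not pick up the dot-product features that gfx1011 and gfx1012 get. A reduced standalone sketch of that control flow follows; `ToyGPUKind` and `defaultFeatures` are hypothetical, and only a few of the feature names from the hunk are reproduced.

```cpp
#include <set>
#include <string>

// Hypothetical subset of the GPU kinds in the hunk above.
enum class ToyGPUKind { GFX1010, GFX1011, GFX1012, GFX1013 };

inline std::set<std::string> defaultFeatures(ToyGPUKind Kind) {
  std::set<std::string> Features;
  switch (Kind) {
  case ToyGPUKind::GFX1012:
  case ToyGPUKind::GFX1011:
    Features.insert("dot1-insts");
    Features.insert("dot2-insts");
    [[fallthrough]]; // plays the role of LLVM_FALLTHROUGH
  // GFX1013 enters below the dot-insts block, so it skips those features.
  case ToyGPUKind::GFX1013:
  case ToyGPUKind::GFX1010:
    Features.insert("dl-insts");
    Features.insert("ci-insts");
    break;
  }
  return Features;
}
```

In this model gfx1013 and gfx1010 end up with identical sets, which is consistent with the GFX1013 and GFX1010 check lines in clang/test/CodeGenOpenCL/amdgpu-features.cl listing the same target features.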

clang/lib/Basic/Targets/NVPTX.cpp

@@ std::string CUDAArchCode = [this] {
  case CudaArch::GFX906:
  case CudaArch::GFX908:
  case CudaArch::GFX909:
  case CudaArch::GFX90a:
  case CudaArch::GFX90c:
  case CudaArch::GFX1010:
  case CudaArch::GFX1011:
  case CudaArch::GFX1012:
+ case CudaArch::GFX1013:
  case CudaArch::GFX1030:
  case CudaArch::GFX1031:
  case CudaArch::GFX1032:
  case CudaArch::GFX1033:
  case CudaArch::GFX1034:
  case CudaArch::LAST:
    break;
  case CudaArch::UNUSED:

clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp

@@ if (Clause->getClauseKind() == OMPC_unified_shared_memory) {
  case CudaArch::GFX906:
  case CudaArch::GFX908:
  case CudaArch::GFX909:
  case CudaArch::GFX90a:
  case CudaArch::GFX90c:
  case CudaArch::GFX1010:
  case CudaArch::GFX1011:
  case CudaArch::GFX1012:
+ case CudaArch::GFX1013:
  case CudaArch::GFX1030:
  case CudaArch::GFX1031:
  case CudaArch::GFX1032:
  case CudaArch::GFX1033:
  case CudaArch::GFX1034:
  case CudaArch::UNUSED:
  case CudaArch::UNKNOWN:
    break;
@@ static std::pair<unsigned, unsigned> getSMsBlocksPerSM(CodeGenModule &CGM) {
  case CudaArch::GFX906:
  case CudaArch::GFX908:
  case CudaArch::GFX909:
  case CudaArch::GFX90a:
  case CudaArch::GFX90c:
  case CudaArch::GFX1010:
  case CudaArch::GFX1011:
  case CudaArch::GFX1012:
+ case CudaArch::GFX1013:
  case CudaArch::GFX1030:
  case CudaArch::GFX1031:
  case CudaArch::GFX1032:
  case CudaArch::GFX1033:
  case CudaArch::GFX1034:
  case CudaArch::UNUSED:
  case CudaArch::UNKNOWN:
    break;

clang/test/CodeGenOpenCL/amdgpu-features.cl

@@
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx906 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX906 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx908 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX908 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx909 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX909 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx90a -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX90A %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx90c -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX90C %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1010 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1010 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1011 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1011 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1012 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1012 %s
+ // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1013 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1013 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1030 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1030 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1031 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1031 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1032 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1032 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1033 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1033 %s
  // RUN: %clang_cc1 -triple amdgcn -target-cpu gfx1034 -S -emit-llvm -o - %s | FileCheck --check-prefix=GFX1034 %s

  // GFX600: "target-features"="+s-memtime-inst"
  // GFX601: "target-features"="+s-memtime-inst"
@@
  // GFX906: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot7-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX908: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+mai-insts,+s-memrealtime,+s-memtime-inst"
  // GFX909: "target-features"="+16-bit-insts,+ci-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX90A: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+gfx90a-insts,+mai-insts,+s-memrealtime,+s-memtime-inst"
  // GFX90C: "target-features"="+16-bit-insts,+ci-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1010: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dpp,+flat-address-space,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1011: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1012: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
+ // GFX1013: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dpp,+flat-address-space,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1030: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-3-insts,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1031: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-3-insts,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1032: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-3-insts,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1033: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-3-insts,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"
  // GFX1034: "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx10-3-insts,+gfx10-insts,+gfx8-insts,+gfx9-insts,+s-memrealtime,+s-memtime-inst"

  kernel void test() {}

clang/test/Driver/amdgpu-macros.cl

@@
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx906 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx906
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx908 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx908
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx909 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx909
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx90a %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx90a
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx90c %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=64 -DCPU=gfx90c
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1010 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1010
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1011 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1011
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1012 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1012
+ // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1013 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1013
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1030 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1030
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1031 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1031
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1032 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1032
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1033 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1033
  // RUN: %clang -E -dM -target amdgcn -mcpu=gfx1034 %s 2>&1 | FileCheck --check-prefixes=ARCH-GCN,FAST_FMAF %s -DWAVEFRONT_SIZE=32 -DCPU=gfx1034

  // ARCH-GCN-DAG: #define FP_FAST_FMA 1

clang/test/Driver/amdgpu-mcpu.cl

@@
  // RUN: %clang -### -target amdgcn -mcpu=gfx906 %s 2>&1 | FileCheck --check-prefix=GFX906 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx908 %s 2>&1 | FileCheck --check-prefix=GFX908 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx909 %s 2>&1 | FileCheck --check-prefix=GFX909 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx90a %s 2>&1 | FileCheck --check-prefix=GFX90A %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx90c %s 2>&1 | FileCheck --check-prefix=GFX90C %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1010 %s 2>&1 | FileCheck --check-prefix=GFX1010 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1011 %s 2>&1 | FileCheck --check-prefix=GFX1011 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1012 %s 2>&1 | FileCheck --check-prefix=GFX1012 %s
+ // RUN: %clang -### -target amdgcn -mcpu=gfx1013 %s 2>&1 | FileCheck --check-prefix=GFX1013 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1030 %s 2>&1 | FileCheck --check-prefix=GFX1030 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1031 %s 2>&1 | FileCheck --check-prefix=GFX1031 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1032 %s 2>&1 | FileCheck --check-prefix=GFX1032 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1033 %s 2>&1 | FileCheck --check-prefix=GFX1033 %s
  // RUN: %clang -### -target amdgcn -mcpu=gfx1034 %s 2>&1 | FileCheck --check-prefix=GFX1034 %s

  // GCNDEFAULT-NOT: -target-cpu
  // GFX600: "-target-cpu" "gfx600"
@@
  // GFX906: "-target-cpu" "gfx906"
  // GFX908: "-target-cpu" "gfx908"
  // GFX909: "-target-cpu" "gfx909"
  // GFX90A: "-target-cpu" "gfx90a"
  // GFX90C: "-target-cpu" "gfx90c"
  // GFX1010: "-target-cpu" "gfx1010"
  // GFX1011: "-target-cpu" "gfx1011"
  // GFX1012: "-target-cpu" "gfx1012"
+ // GFX1013: "-target-cpu" "gfx1013"
  // GFX1030: "-target-cpu" "gfx1030"
  // GFX1031: "-target-cpu" "gfx1031"
  // GFX1032: "-target-cpu" "gfx1032"
  // GFX1033: "-target-cpu" "gfx1033"
  // GFX1034: "-target-cpu" "gfx1034"

clang/test/Misc/target-invalid-cpu-note.c


	// RUN: not %clang_cc1 -triple amdgcn--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AMDGCN			// RUN: not %clang_cc1 -triple amdgcn--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AMDGCN
	// AMDGCN: error: unknown target CPU 'not-a-cpu'			// AMDGCN: error: unknown target CPU 'not-a-cpu'
	// AMDGCN: note: valid target CPU values are: gfx600, tahiti, gfx601, pitcairn, verde,			// AMDGCN: note: valid target CPU values are: gfx600, tahiti, gfx601, pitcairn, verde,
	// AMDGCN-SAME: gfx602, hainan, oland, gfx700, kaveri, gfx701, hawaii, gfx702,			// AMDGCN-SAME: gfx602, hainan, oland, gfx700, kaveri, gfx701, hawaii, gfx702,
	// AMDGCN-SAME: gfx703, kabini, mullins, gfx704, bonaire, gfx705, gfx801, carrizo,			// AMDGCN-SAME: gfx703, kabini, mullins, gfx704, bonaire, gfx705, gfx801, carrizo,
	// AMDGCN-SAME: gfx802, iceland, tonga, gfx803, fiji, polaris10, polaris11,			// AMDGCN-SAME: gfx802, iceland, tonga, gfx803, fiji, polaris10, polaris11,
	// AMDGCN-SAME: gfx805, tongapro, gfx810, stoney, gfx900, gfx902, gfx904, gfx906,			// AMDGCN-SAME: gfx805, tongapro, gfx810, stoney, gfx900, gfx902, gfx904, gfx906,
	// AMDGCN-SAME: gfx908, gfx909, gfx90a, gfx90c, gfx1010, gfx1011, gfx1012, gfx1030, gfx1031,			// AMDGCN-SAME: gfx908, gfx909, gfx90a, gfx90c, gfx1010, gfx1011, gfx1012, gfx1013, gfx1030, gfx1031,
	// AMDGCN-SAME: gfx1032, gfx1033, gfx1034			// AMDGCN-SAME: gfx1032, gfx1033, gfx1034

	// RUN: not %clang_cc1 -triple wasm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix WEBASM			// RUN: not %clang_cc1 -triple wasm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix WEBASM
	// WEBASM: error: unknown target CPU 'not-a-cpu'			// WEBASM: error: unknown target CPU 'not-a-cpu'
	// WEBASM: note: valid target CPU values are: mvp, bleeding-edge, generic			// WEBASM: note: valid target CPU values are: mvp, bleeding-edge, generic

	// RUN: not %clang_cc1 -triple systemz--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix SYSTEMZ			// RUN: not %clang_cc1 -triple systemz--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix SYSTEMZ
	// SYSTEMZ: error: unknown target CPU 'not-a-cpu'			// SYSTEMZ: error: unknown target CPU 'not-a-cpu'

llvm/docs/AMDGPUUsage.rst


``gfx1010`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 5700
- Radeon Pro 5600M		- Radeon Pro 5600M
``gfx1011`` ``amdgcn`` dGPU - cumode - rocm-amdhsa - Radeon Pro V520		``gfx1011`` ``amdgcn`` dGPU - cumode - rocm-amdhsa - Radeon Pro V520
- wavefrontsize64 - Absolute - pal-amdhsa		- wavefrontsize64 - Absolute - pal-amdhsa
- xnack flat - pal-amdpal		- xnack flat - pal-amdpal
scratch		scratch
``gfx1012`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 5500		``gfx1012`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 5500
- wavefrontsize64 flat - pal-amdhsa - Radeon RX 5500 XT		- wavefrontsize64 flat - pal-amdhsa - Radeon RX 5500 XT
- xnack scratch - pal-amdpal		- xnack scratch - pal-amdpal
		``gfx1013`` ``amdgcn`` APU - cumode - Absolute - rocm-amdhsa TBA
		foad: Is it dGPU or APU? Every other entry with `TBA` also has a `TODO::` message.
		bcahoon: It is APU.
		- wavefrontsize64 flat - pal-amdhsa
		- xnack scratch - pal-amdpal .. TODO::

		Add product
		names.

GCN GFX10 (RDNA 2) [AMD-GCN-GFX10-RDNA2]_		GCN GFX10 (RDNA 2) [AMD-GCN-GFX10-RDNA2]_
-----------------------------------------------------------------------------------------------------------------------		-----------------------------------------------------------------------------------------------------------------------
``gfx1030`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 6800		``gfx1030`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 6800
- wavefrontsize64 flat - pal-amdhsa - Radeon RX 6800 XT		- wavefrontsize64 flat - pal-amdhsa - Radeon RX 6800 XT
scratch - pal-amdpal - Radeon RX 6900 XT		scratch - pal-amdpal - Radeon RX 6900 XT
``gfx1031`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 6700 XT		``gfx1031`` ``amdgcn`` dGPU - cumode - Absolute - rocm-amdhsa - Radeon RX 6700 XT
- wavefrontsize64 flat - pal-amdhsa		- wavefrontsize64 flat - pal-amdhsa
scratch - pal-amdpal		scratch - pal-amdpal
[...]	.. table:: AMDGPU ``EF_AMDGPU_MACH`` Values
``EF_AMDGPU_MACH_AMDGCN_GFX602`` 0x03a ``gfx602``		``EF_AMDGPU_MACH_AMDGCN_GFX602`` 0x03a ``gfx602``
``EF_AMDGPU_MACH_AMDGCN_GFX705`` 0x03b ``gfx705``		``EF_AMDGPU_MACH_AMDGCN_GFX705`` 0x03b ``gfx705``
``EF_AMDGPU_MACH_AMDGCN_GFX805`` 0x03c ``gfx805``		``EF_AMDGPU_MACH_AMDGCN_GFX805`` 0x03c ``gfx805``
reserved 0x03d Reserved.		reserved 0x03d Reserved.
``EF_AMDGPU_MACH_AMDGCN_GFX1034`` 0x03e ``gfx1034``		``EF_AMDGPU_MACH_AMDGCN_GFX1034`` 0x03e ``gfx1034``
``EF_AMDGPU_MACH_AMDGCN_GFX90A`` 0x03f ``gfx90a``		``EF_AMDGPU_MACH_AMDGCN_GFX90A`` 0x03f ``gfx90a``
reserved 0x040 Reserved.		reserved 0x040 Reserved.
reserved 0x041 Reserved.		reserved 0x041 Reserved.
		``EF_AMDGPU_MACH_AMDGCN_GFX1013`` 0x042 ``gfx1013``
==================================== ========== =============================		==================================== ========== =============================

Sections		Sections
--------		--------

An AMDGPU target ELF code object has the standard ELF sections which include:		An AMDGPU target ELF code object has the standard ELF sections which include:

.. table:: AMDGPU ELF Sections		.. table:: AMDGPU ELF Sections

llvm/include/llvm/BinaryFormat/ELF.h

enum : unsigned {
EF_AMDGPU_MACH_AMDGCN_GFX602 = 0x03a,		EF_AMDGPU_MACH_AMDGCN_GFX602 = 0x03a,
EF_AMDGPU_MACH_AMDGCN_GFX705 = 0x03b,		EF_AMDGPU_MACH_AMDGCN_GFX705 = 0x03b,
EF_AMDGPU_MACH_AMDGCN_GFX805 = 0x03c,		EF_AMDGPU_MACH_AMDGCN_GFX805 = 0x03c,
EF_AMDGPU_MACH_AMDGCN_RESERVED_0X3D = 0x03d,		EF_AMDGPU_MACH_AMDGCN_RESERVED_0X3D = 0x03d,
EF_AMDGPU_MACH_AMDGCN_GFX1034 = 0x03e,		EF_AMDGPU_MACH_AMDGCN_GFX1034 = 0x03e,
EF_AMDGPU_MACH_AMDGCN_GFX90A = 0x03f,		EF_AMDGPU_MACH_AMDGCN_GFX90A = 0x03f,
EF_AMDGPU_MACH_AMDGCN_RESERVED_0X40 = 0x040,		EF_AMDGPU_MACH_AMDGCN_RESERVED_0X40 = 0x040,
EF_AMDGPU_MACH_AMDGCN_RESERVED_0X41 = 0x041,		EF_AMDGPU_MACH_AMDGCN_RESERVED_0X41 = 0x041,
		EF_AMDGPU_MACH_AMDGCN_GFX1013 = 0x042,

// First/last AMDGCN-based processors.		// First/last AMDGCN-based processors.
EF_AMDGPU_MACH_AMDGCN_FIRST = EF_AMDGPU_MACH_AMDGCN_GFX600,		EF_AMDGPU_MACH_AMDGCN_FIRST = EF_AMDGPU_MACH_AMDGCN_GFX600,
EF_AMDGPU_MACH_AMDGCN_LAST = EF_AMDGPU_MACH_AMDGCN_GFX90A,		EF_AMDGPU_MACH_AMDGCN_LAST = EF_AMDGPU_MACH_AMDGCN_GFX1013,

// Indicates if the "xnack" target feature is enabled for all code contained		// Indicates if the "xnack" target feature is enabled for all code contained
// in the object.		// in the object.
//		//
// Only valid for ELFOSABI_AMDGPU_HSA and ELFABIVERSION_AMDGPU_HSA_V2.		// Only valid for ELFOSABI_AMDGPU_HSA and ELFABIVERSION_AMDGPU_HSA_V2.
EF_AMDGPU_FEATURE_XNACK_V2 = 0x01,		EF_AMDGPU_FEATURE_XNACK_V2 = 0x01,
// Indicates if the trap handler is enabled for all code contained		// Indicates if the trap handler is enabled for all code contained
// in the object.		// in the object.
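
The change to `EF_AMDGPU_MACH_AMDGCN_LAST` matters for any code that treats FIRST/LAST as an inclusive range. A minimal sketch, assuming a consumer validates a machine value by comparing it against the upper bound (the helper `atOrBelowLast` and the `MACH_*` constant names are hypothetical; the numeric values are copied from the enum above):

```cpp
#include <cassert>

// Values copied from the enum in the diff.
constexpr unsigned MACH_GFX1034 = 0x03e;
constexpr unsigned MACH_GFX90A = 0x03f;
constexpr unsigned MACH_GFX1013 = 0x042;

// Hypothetical helper: is Mach at or below the given inclusive upper bound?
constexpr bool atOrBelowLast(unsigned Mach, unsigned Last) {
  return Mach <= Last;
}

// With the old bound (GFX90A) the new gfx1013 value would fall outside.
static_assert(!atOrBelowLast(MACH_GFX1013, MACH_GFX90A),
              "gfx1013 is above the old LAST value");
// With the updated bound (GFX1013) both old and new values are covered.
static_assert(atOrBelowLast(MACH_GFX1013, MACH_GFX1013),
              "gfx1013 is covered by the new LAST value");
static_assert(atOrBelowLast(MACH_GFX1034, MACH_GFX1013),
              "gfx1034 is still covered");
```

The checks run at compile time, so the snippet only needs to compile to confirm the ordering.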

llvm/include/llvm/Support/TargetParser.h

enum GPUKind : uint32_t {
GK_GFX908 = 64,		GK_GFX908 = 64,
GK_GFX909 = 65,		GK_GFX909 = 65,
GK_GFX90A = 66,		GK_GFX90A = 66,
GK_GFX90C = 67,		GK_GFX90C = 67,

GK_GFX1010 = 71,		GK_GFX1010 = 71,
GK_GFX1011 = 72,		GK_GFX1011 = 72,
GK_GFX1012 = 73,		GK_GFX1012 = 73,
		GK_GFX1013 = 74,
GK_GFX1030 = 75,		GK_GFX1030 = 75,
GK_GFX1031 = 76,		GK_GFX1031 = 76,
GK_GFX1032 = 77,		GK_GFX1032 = 77,
GK_GFX1033 = 78,		GK_GFX1033 = 78,
GK_GFX1034 = 79,		GK_GFX1034 = 79,

GK_AMDGCN_FIRST = GK_GFX600,		GK_AMDGCN_FIRST = GK_GFX600,
GK_AMDGCN_LAST = GK_GFX1034,		GK_AMDGCN_LAST = GK_GFX1034,

llvm/lib/Object/ELFObjectFile.cpp

StringRef ELFObjectFileBase::getAMDGPUCPUName() const {

// AMDGCN GFX10.		// AMDGCN GFX10.
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010:
return "gfx1010";		return "gfx1010";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011:
return "gfx1011";		return "gfx1011";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012:
return "gfx1012";		return "gfx1012";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1013:
		return "gfx1013";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030:
return "gfx1030";		return "gfx1030";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031:
return "gfx1031";		return "gfx1031";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032:
return "gfx1032";		return "gfx1032";
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033:		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033:
return "gfx1033";		return "gfx1033";

llvm/lib/ObjectYAML/ELFYAML.cpp

case ELF::EM_AMDGPU:
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX906, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX906, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX908, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX908, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX909, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX909, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX90A, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX90A, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX90C, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX90C, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1010, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1010, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1011, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1011, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1012, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1012, EF_AMDGPU_MACH);
		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1013, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1030, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1030, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1031, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1031, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1032, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1032, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1033, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1033, EF_AMDGPU_MACH);
BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1034, EF_AMDGPU_MACH);		BCaseMask(EF_AMDGPU_MACH_AMDGCN_GFX1034, EF_AMDGPU_MACH);
switch (Object->Header.ABIVersion) {		switch (Object->Header.ABIVersion) {
default:		default:
// ELFOSABI_AMDGPU_PAL, ELFOSABI_AMDGPU_MESA3D support *_V3 flags.		// ELFOSABI_AMDGPU_PAL, ELFOSABI_AMDGPU_MESA3D support *_V3 flags.

llvm/lib/Support/TargetParser.cpp

constexpr GPUInfo AMDGCNGPUs[] = {
{{"gfx906"}, {"gfx906"}, GK_GFX906, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},		{{"gfx906"}, {"gfx906"}, GK_GFX906, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},
{{"gfx908"}, {"gfx908"}, GK_GFX908, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},		{{"gfx908"}, {"gfx908"}, GK_GFX908, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},
{{"gfx909"}, {"gfx909"}, GK_GFX909, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK},		{{"gfx909"}, {"gfx909"}, GK_GFX909, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK},
{{"gfx90a"}, {"gfx90a"}, GK_GFX90A, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},		{{"gfx90a"}, {"gfx90a"}, GK_GFX90A, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK\|FEATURE_SRAMECC},
{{"gfx90c"}, {"gfx90c"}, GK_GFX90C, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK},		{{"gfx90c"}, {"gfx90c"}, GK_GFX90C, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_XNACK},
{{"gfx1010"}, {"gfx1010"}, GK_GFX1010, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},		{{"gfx1010"}, {"gfx1010"}, GK_GFX1010, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},
{{"gfx1011"}, {"gfx1011"}, GK_GFX1011, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},		{{"gfx1011"}, {"gfx1011"}, GK_GFX1011, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},
{{"gfx1012"}, {"gfx1012"}, GK_GFX1012, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},		{{"gfx1012"}, {"gfx1012"}, GK_GFX1012, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},
		{{"gfx1013"}, {"gfx1013"}, GK_GFX1013, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32\|FEATURE_XNACK},
{{"gfx1030"}, {"gfx1030"}, GK_GFX1030, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},		{{"gfx1030"}, {"gfx1030"}, GK_GFX1030, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},
{{"gfx1031"}, {"gfx1031"}, GK_GFX1031, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},		{{"gfx1031"}, {"gfx1031"}, GK_GFX1031, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},
{{"gfx1032"}, {"gfx1032"}, GK_GFX1032, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},		{{"gfx1032"}, {"gfx1032"}, GK_GFX1032, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},
{{"gfx1033"}, {"gfx1033"}, GK_GFX1033, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},		{{"gfx1033"}, {"gfx1033"}, GK_GFX1033, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},
{{"gfx1034"}, {"gfx1034"}, GK_GFX1034, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},		{{"gfx1034"}, {"gfx1034"}, GK_GFX1034, FEATURE_FAST_FMA_F32\|FEATURE_FAST_DENORMAL_F32\|FEATURE_WAVE32},
};		};

const GPUInfo *getArchEntry(AMDGPU::GPUKind AK, ArrayRef<GPUInfo> Table) {		const GPUInfo *getArchEntry(AMDGPU::GPUKind AK, ArrayRef<GPUInfo> Table) {
[...]	AMDGPU::IsaVersion AMDGPU::getIsaVersion(StringRef GPU) {
case GK_GFX906: return {9, 0, 6};		case GK_GFX906: return {9, 0, 6};
case GK_GFX908: return {9, 0, 8};		case GK_GFX908: return {9, 0, 8};
case GK_GFX909: return {9, 0, 9};		case GK_GFX909: return {9, 0, 9};
case GK_GFX90A: return {9, 0, 10};		case GK_GFX90A: return {9, 0, 10};
case GK_GFX90C: return {9, 0, 12};		case GK_GFX90C: return {9, 0, 12};
case GK_GFX1010: return {10, 1, 0};		case GK_GFX1010: return {10, 1, 0};
case GK_GFX1011: return {10, 1, 1};		case GK_GFX1011: return {10, 1, 1};
case GK_GFX1012: return {10, 1, 2};		case GK_GFX1012: return {10, 1, 2};
		case GK_GFX1013: return {10, 1, 3};
case GK_GFX1030: return {10, 3, 0};		case GK_GFX1030: return {10, 3, 0};
case GK_GFX1031: return {10, 3, 1};		case GK_GFX1031: return {10, 3, 1};
case GK_GFX1032: return {10, 3, 2};		case GK_GFX1032: return {10, 3, 2};
case GK_GFX1033: return {10, 3, 3};		case GK_GFX1033: return {10, 3, 3};
case GK_GFX1034: return {10, 3, 4};		case GK_GFX1034: return {10, 3, 4};
default: return {0, 0, 0};		default: return {0, 0, 0};
}		}
}		}

llvm/lib/Target/AMDGPU/AMDGPU.td

>;		>;

def FeatureExtendedImageInsts : SubtargetFeature<"extended-image-insts",		def FeatureExtendedImageInsts : SubtargetFeature<"extended-image-insts",
"HasExtendedImageInsts",		"HasExtendedImageInsts",
"true",		"true",
"Support mips != 0, lod != 0, gather4, and get_lod"		"Support mips != 0, lod != 0, gather4, and get_lod"
>;		>;

		def FeatureGFX10_AEncoding : SubtargetFeature<"gfx10_a-encoding",
		foad: What is this new encoding? It doesn't seem to be used for anything.
		bcahoon: Fixed this. The BVH raytracing instructions use the encoding.
		"GFX10_AEncoding",
		"true",
		"Has BVH ray tracing instructions"
		foad: I realise you're just following the precedent set by GFX10_B, but is this terminology actually used in any documentation anywhere? And if not, could we describe it a little better here?
		bcahoon: I changed the description to be specific w.r.t. what the target feature enables.
		foad: Thank you. I think that is much more useful.
		>;

def FeatureGFX10_BEncoding : SubtargetFeature<"gfx10_b-encoding",		def FeatureGFX10_BEncoding : SubtargetFeature<"gfx10_b-encoding",
"GFX10_BEncoding",		"GFX10_BEncoding",
"true",		"true",
"Encoding format GFX10_B"		"Encoding format GFX10_B"
>;		>;

def FeatureIntClamp : SubtargetFeature<"int-clamp-insts",		def FeatureIntClamp : SubtargetFeature<"int-clamp-insts",
"HasIntClamp",		"HasIntClamp",
[...]	!listconcat(FeatureGroup.GFX10_1_Bugs,
FeatureScalarAtomics,		FeatureScalarAtomics,
FeatureScalarFlatScratchInsts,		FeatureScalarFlatScratchInsts,
FeatureGetWaveIdInst,		FeatureGetWaveIdInst,
FeatureMadMacF32Insts,		FeatureMadMacF32Insts,
FeatureDsSrc2Insts,		FeatureDsSrc2Insts,
FeatureLdsMisalignedBug,		FeatureLdsMisalignedBug,
FeatureSupportsXNACK])>;		FeatureSupportsXNACK])>;

		def FeatureISAVersion10_1_3 : FeatureSet<
		!listconcat(FeatureGroup.GFX10_1_Bugs,
		[FeatureGFX10,
		FeatureGFX10_AEncoding,
		FeatureLDSBankCount32,
		FeatureDLInsts,
		FeatureNSAEncoding,
		FeatureWavefrontSize32,
		FeatureScalarStores,
		FeatureScalarAtomics,
		FeatureScalarFlatScratchInsts,
		FeatureGetWaveIdInst,
		FeatureMadMacF32Insts,
		FeatureDsSrc2Insts,
		FeatureLdsMisalignedBug,
		FeatureSupportsXNACK])>;

def FeatureISAVersion10_3_0 : FeatureSet<		def FeatureISAVersion10_3_0 : FeatureSet<
[FeatureGFX10,		[FeatureGFX10,
		FeatureGFX10_AEncoding,
FeatureGFX10_BEncoding,		FeatureGFX10_BEncoding,
		rampitec: gfx1030 should now include FeatureGFX10_AEncoding as well. 10_B is an extension above 10_A.
		bcahoon: I had added FeatureGFX10_AEncoding as an Implies feature for FeatureGFX10_BEncoding in the previous patch. But I've changed the patch so that it's no longer an Implies feature.
FeatureGFX10_3Insts,		FeatureGFX10_3Insts,
FeatureLDSBankCount32,		FeatureLDSBankCount32,
FeatureDLInsts,		FeatureDLInsts,
FeatureDot1Insts,		FeatureDot1Insts,
FeatureDot2Insts,		FeatureDot2Insts,
FeatureDot5Insts,		FeatureDot5Insts,
FeatureDot6Insts,		FeatureDot6Insts,
FeatureDot7Insts,		FeatureDot7Insts,
[...]
def HasScalarFlatScratchInsts : Predicate<"Subtarget->hasScalarFlatScratchInsts()">,		def HasScalarFlatScratchInsts : Predicate<"Subtarget->hasScalarFlatScratchInsts()">,
AssemblerPredicate<(all_of FeatureScalarFlatScratchInsts)>;		AssemblerPredicate<(all_of FeatureScalarFlatScratchInsts)>;
def HasD16LoadStore : Predicate<"Subtarget->hasD16LoadStore()">,		def HasD16LoadStore : Predicate<"Subtarget->hasD16LoadStore()">,
AssemblerPredicate<(all_of FeatureGFX9Insts)>;		AssemblerPredicate<(all_of FeatureGFX9Insts)>;

def HasFlatScratchSTMode : Predicate<"Subtarget->hasFlatScratchSTMode()">,		def HasFlatScratchSTMode : Predicate<"Subtarget->hasFlatScratchSTMode()">,
AssemblerPredicate<(any_of FeatureGFX10_3Insts)>;		AssemblerPredicate<(any_of FeatureGFX10_3Insts)>;

		def HasGFX10_AEncoding : Predicate<"Subtarget->hasGFX10_AEncoding()">,
		AssemblerPredicate<(all_of FeatureGFX10_AEncoding)>;

def HasGFX10_BEncoding : Predicate<"Subtarget->hasGFX10_BEncoding()">,		def HasGFX10_BEncoding : Predicate<"Subtarget->hasGFX10_BEncoding()">,
AssemblerPredicate<(all_of FeatureGFX10_BEncoding)>;		AssemblerPredicate<(all_of FeatureGFX10_BEncoding)>;

def HasUnpackedD16VMem : Predicate<"Subtarget->hasUnpackedD16VMem()">,		def HasUnpackedD16VMem : Predicate<"Subtarget->hasUnpackedD16VMem()">,
AssemblerPredicate<(all_of FeatureUnpackedD16VMem)>;		AssemblerPredicate<(all_of FeatureUnpackedD16VMem)>;
def HasPackedD16VMem : Predicate<"!Subtarget->hasUnpackedD16VMem()">,		def HasPackedD16VMem : Predicate<"!Subtarget->hasUnpackedD16VMem()">,
AssemblerPredicate<(all_of (not FeatureUnpackedD16VMem))>;		AssemblerPredicate<(all_of (not FeatureUnpackedD16VMem))>;
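
The relationship between the two encoding features and the processors discussed in this review can be modeled in a few lines. This is only a sketch under stated assumptions: the `Cpu` enum, the bit-flag representation, and `featuresFor` are hypothetical stand-ins, not the TableGen-generated code. The feature assignments follow the patch as shown: gfx1012 has neither feature, gfx1013 has GFX10_A, and gfx1030 lists both explicitly rather than relying on an implied feature.

```cpp
#include <cassert>

// Hypothetical stand-ins for the processors discussed in the review.
enum class Cpu { Gfx1012, Gfx1013, Gfx1030 };

// Hypothetical bit flags for the two subtarget features.
enum Feature : unsigned {
  GFX10_AEncoding = 1u << 0, // "Has BVH ray tracing instructions"
  GFX10_BEncoding = 1u << 1, // "Encoding format GFX10_B"
};

// Feature bits per processor, following the feature lists in the patch.
constexpr unsigned featuresFor(Cpu C) {
  return C == Cpu::Gfx1013   ? GFX10_AEncoding
         : C == Cpu::Gfx1030 ? (GFX10_AEncoding | GFX10_BEncoding)
                             : 0u; // gfx1012: neither encoding
}

constexpr bool hasGFX10_AEncoding(Cpu C) {
  return (featuresFor(C) & GFX10_AEncoding) != 0;
}

constexpr bool hasGFX10_BEncoding(Cpu C) {
  return (featuresFor(C) & GFX10_BEncoding) != 0;
}

// BVH instructions are predicated on GFX10_A, so gfx1013 and gfx1030
// qualify while gfx1012 does not.
static_assert(!hasGFX10_AEncoding(Cpu::Gfx1012), "gfx1012 lacks BVH");
static_assert(hasGFX10_AEncoding(Cpu::Gfx1013), "gfx1013 has BVH");
static_assert(hasGFX10_AEncoding(Cpu::Gfx1030), "gfx1030 has BVH");
static_assert(!hasGFX10_BEncoding(Cpu::Gfx1013), "gfx1013 lacks 10_B");
```

Listing both features for gfx1030 keeps each predicate a plain `all_of` check on a single feature bit.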


llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp

bool AMDGPULegalizerInfo::legalizeBVHIntrinsic(MachineInstr &MI,
Register DstReg = MI.getOperand(0).getReg();		Register DstReg = MI.getOperand(0).getReg();
Register NodePtr = MI.getOperand(2).getReg();		Register NodePtr = MI.getOperand(2).getReg();
Register RayExtent = MI.getOperand(3).getReg();		Register RayExtent = MI.getOperand(3).getReg();
Register RayOrigin = MI.getOperand(4).getReg();		Register RayOrigin = MI.getOperand(4).getReg();
Register RayDir = MI.getOperand(5).getReg();		Register RayDir = MI.getOperand(5).getReg();
Register RayInvDir = MI.getOperand(6).getReg();		Register RayInvDir = MI.getOperand(6).getReg();
Register TDescr = MI.getOperand(7).getReg();		Register TDescr = MI.getOperand(7).getReg();

		if (!ST.hasGFX10_AEncoding()) {
		DiagnosticInfoUnsupported BadIntrin(B.getMF().getFunction(),
		rampitec: 80 chars per line.
		"intrinsic not supported on subtarget",
		MI.getDebugLoc());
		B.getMF().getFunction().getContext().diagnose(BadIntrin);
		rampitec: Just return false like in other places.
		rampitec: Just return false. I see that is like this in the whole file.
		bcahoon: Changed this to false, and also changed SIISelLowering to return SDValue so that both fail in a similar way.
		MI.eraseFromParent();
		rampitec: You can just omit undef and erase.
		return false;
		}

bool IsA16 = MRI.getType(RayDir).getElementType().getSizeInBits() == 16;		bool IsA16 = MRI.getType(RayDir).getElementType().getSizeInBits() == 16;
bool Is64 = MRI.getType(NodePtr).getSizeInBits() == 64;		bool Is64 = MRI.getType(NodePtr).getSizeInBits() == 64;
unsigned Opcode = IsA16 ? Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_a16_nsa		unsigned Opcode = IsA16 ? Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_a16_nsa
: AMDGPU::IMAGE_BVH_INTERSECT_RAY_a16_nsa		: AMDGPU::IMAGE_BVH_INTERSECT_RAY_a16_nsa
: Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_nsa		: Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_nsa
: AMDGPU::IMAGE_BVH_INTERSECT_RAY_nsa;		: AMDGPU::IMAGE_BVH_INTERSECT_RAY_nsa;

SmallVector<Register, 12> Ops;		SmallVector<Register, 12> Ops;
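
The control flow the reviewers settled on (fail legalization by returning false when the subtarget lacks the feature, otherwise pick one of four opcodes) can be sketched outside LLVM. `MiniSubtarget`, `selectBVHOpcode`, and `legalizeBVH` are hypothetical names for illustration; opcodes are returned as strings rather than `AMDGPU::` enumerators, and no diagnostics are modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the subtarget query used by the legalizer.
struct MiniSubtarget {
  bool GFX10_AEncoding;
};

// Mirrors the nested conditional in the diff: A16 operands and 64-bit node
// pointers each select a different opcode variant.
inline std::string selectBVHOpcode(bool IsA16, bool Is64) {
  return IsA16 ? (Is64 ? "IMAGE_BVH64_INTERSECT_RAY_a16_nsa"
                       : "IMAGE_BVH_INTERSECT_RAY_a16_nsa")
               : (Is64 ? "IMAGE_BVH64_INTERSECT_RAY_nsa"
                       : "IMAGE_BVH_INTERSECT_RAY_nsa");
}

// Returns false when the feature is missing, leaving *Opcode untouched;
// otherwise writes the selected opcode name (if requested) and returns true.
inline bool legalizeBVH(const MiniSubtarget &ST, bool IsA16, bool Is64,
                        std::string *Opcode = nullptr) {
  if (!ST.GFX10_AEncoding)
    return false;
  if (Opcode)
    *Opcode = selectBVHOpcode(IsA16, Is64);
  return true;
}
```

For example, `legalizeBVH(MiniSubtarget{false}, true, true)` returns false without selecting an opcode, while the same call with `MiniSubtarget{true}` succeeds.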

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp

GCNSubtarget::GCNSubtarget(const Triple &TT, StringRef GPU, StringRef FS,
HasDPP8(false),		HasDPP8(false),
Has64BitDPP(false),		Has64BitDPP(false),
HasPackedFP32Ops(false),		HasPackedFP32Ops(false),
HasExtendedImageInsts(false),		HasExtendedImageInsts(false),
HasR128A16(false),		HasR128A16(false),
HasGFX10A16(false),		HasGFX10A16(false),
HasG16(false),		HasG16(false),
HasNSAEncoding(false),		HasNSAEncoding(false),
		GFX10_AEncoding(false),
GFX10_BEncoding(false),		GFX10_BEncoding(false),
HasDLInsts(false),		HasDLInsts(false),
HasDot1Insts(false),		HasDot1Insts(false),
HasDot2Insts(false),		HasDot2Insts(false),
HasDot3Insts(false),		HasDot3Insts(false),
HasDot4Insts(false),		HasDot4Insts(false),
HasDot5Insts(false),		HasDot5Insts(false),
HasDot6Insts(false),		HasDot6Insts(false),

llvm/lib/Target/AMDGPU/GCNProcessors.td

	def : ProcessorModel<"gfx1011", GFX10SpeedModel,			def : ProcessorModel<"gfx1011", GFX10SpeedModel,
	FeatureISAVersion10_1_1.Features			FeatureISAVersion10_1_1.Features
	>;			>;

	def : ProcessorModel<"gfx1012", GFX10SpeedModel,			def : ProcessorModel<"gfx1012", GFX10SpeedModel,
	FeatureISAVersion10_1_2.Features			FeatureISAVersion10_1_2.Features
	>;			>;

				def : ProcessorModel<"gfx1013", GFX10SpeedModel,
				FeatureISAVersion10_1_3.Features
				>;

	def : ProcessorModel<"gfx1030", GFX10SpeedModel,			def : ProcessorModel<"gfx1030", GFX10SpeedModel,
	FeatureISAVersion10_3_0.Features			FeatureISAVersion10_3_0.Features
	>;			>;

	def : ProcessorModel<"gfx1031", GFX10SpeedModel,			def : ProcessorModel<"gfx1031", GFX10SpeedModel,
	FeatureISAVersion10_3_0.Features			FeatureISAVersion10_3_0.Features
	>;			>;


llvm/lib/Target/AMDGPU/GCNSubtarget.h

protected:
bool HasDPP8;		bool HasDPP8;
bool Has64BitDPP;		bool Has64BitDPP;
bool HasPackedFP32Ops;		bool HasPackedFP32Ops;
bool HasExtendedImageInsts;		bool HasExtendedImageInsts;
bool HasR128A16;		bool HasR128A16;
bool HasGFX10A16;		bool HasGFX10A16;
bool HasG16;		bool HasG16;
bool HasNSAEncoding;		bool HasNSAEncoding;
		bool GFX10_AEncoding;
bool GFX10_BEncoding;		bool GFX10_BEncoding;
bool HasDLInsts;		bool HasDLInsts;
bool HasDot1Insts;		bool HasDot1Insts;
bool HasDot2Insts;		bool HasDot2Insts;
bool HasDot3Insts;		bool HasDot3Insts;
bool HasDot4Insts;		bool HasDot4Insts;
bool HasDot5Insts;		bool HasDot5Insts;
bool HasDot6Insts;		bool HasDot6Insts;
[...]	public:
}		}

bool hasImageStoreD16Bug() const { return HasImageStoreD16Bug; }		bool hasImageStoreD16Bug() const { return HasImageStoreD16Bug; }

bool hasImageGather4D16Bug() const { return HasImageGather4D16Bug; }		bool hasImageGather4D16Bug() const { return HasImageGather4D16Bug; }

bool hasNSAEncoding() const { return HasNSAEncoding; }		bool hasNSAEncoding() const { return HasNSAEncoding; }

		bool hasGFX10_AEncoding() const {
		return GFX10_AEncoding;
		}

		foad: Stray whitespace on this line.
bool hasGFX10_BEncoding() const {		bool hasGFX10_BEncoding() const {
return GFX10_BEncoding;		return GFX10_BEncoding;
}		}

bool hasGFX10_3Insts() const {		bool hasGFX10_3Insts() const {
return GFX10_3Insts;		return GFX10_3Insts;
}		}


llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp

StringRef AMDGPUTargetStreamer::getArchNameFromElfMach(unsigned ElfMach) {
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX906: AK = GK_GFX906; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX906: AK = GK_GFX906; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX908: AK = GK_GFX908; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX908: AK = GK_GFX908; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX909: AK = GK_GFX909; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX909: AK = GK_GFX909; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX90A: AK = GK_GFX90A; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX90A: AK = GK_GFX90A; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX90C: AK = GK_GFX90C; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX90C: AK = GK_GFX90C; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010: AK = GK_GFX1010; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010: AK = GK_GFX1010; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011: AK = GK_GFX1011; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011: AK = GK_GFX1011; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012: AK = GK_GFX1012; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012: AK = GK_GFX1012; break;
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1013: AK = GK_GFX1013; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030: AK = GK_GFX1030; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030: AK = GK_GFX1030; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031: AK = GK_GFX1031; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031: AK = GK_GFX1031; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032: AK = GK_GFX1032; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032: AK = GK_GFX1032; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033: AK = GK_GFX1033; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033: AK = GK_GFX1033; break;
case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1034: AK = GK_GFX1034; break;		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1034: AK = GK_GFX1034; break;
case ELF::EF_AMDGPU_MACH_NONE: AK = GK_NONE; break;		case ELF::EF_AMDGPU_MACH_NONE: AK = GK_NONE; break;
}		}

[...]	unsigned AMDGPUTargetStreamer::getElfMach(StringRef GPU) {
case GK_GFX906: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX906;		case GK_GFX906: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX906;
case GK_GFX908: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX908;		case GK_GFX908: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX908;
case GK_GFX909: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX909;		case GK_GFX909: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX909;
case GK_GFX90A: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX90A;		case GK_GFX90A: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX90A;
case GK_GFX90C: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX90C;		case GK_GFX90C: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX90C;
case GK_GFX1010: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010;		case GK_GFX1010: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010;
case GK_GFX1011: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011;		case GK_GFX1011: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011;
case GK_GFX1012: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012;		case GK_GFX1012: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012;
		case GK_GFX1013: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1013;
case GK_GFX1030: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030;		case GK_GFX1030: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030;
case GK_GFX1031: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031;		case GK_GFX1031: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1031;
case GK_GFX1032: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032;		case GK_GFX1032: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1032;
case GK_GFX1033: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033;		case GK_GFX1033: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1033;
case GK_GFX1034: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1034;		case GK_GFX1034: return ELF::EF_AMDGPU_MACH_AMDGCN_GFX1034;
case GK_NONE: return ELF::EF_AMDGPU_MACH_NONE;		case GK_NONE: return ELF::EF_AMDGPU_MACH_NONE;
}		}

▲ Show 20 Lines • Show All 697 Lines • Show Last 20 Lines
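
The two switches above are inverse mappings, so a new GPU such as gfx1013 needs a case in each of them. A minimal sketch of a round-trip check follows. `ToyKind`, `ToyMach`, and the helper names are hypothetical stand-ins, and the numeric values are placeholders, not the real `EF_AMDGPU_MACH_*` constants.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the GK_* kinds; only three are modelled.
enum ToyKind { TK_NONE, TK_GFX1012, TK_GFX1013 };

// Placeholder flag values, NOT the real EF_AMDGPU_MACH_* constants.
enum ToyMach : uint32_t { TM_NONE = 0, TM_GFX1012 = 1, TM_GFX1013 = 2 };

// Forward mapping, written in the style of getElfMach().
uint32_t toyElfMach(ToyKind K) {
  switch (K) {
  case TK_GFX1012: return TM_GFX1012;
  case TK_GFX1013: return TM_GFX1013;
  case TK_NONE:    return TM_NONE;
  }
  return TM_NONE;
}

// Reverse mapping. Falling back to TK_NONE for unknown flags is an
// assumption of this sketch, not something shown in the diff.
ToyKind toyKindFromMach(uint32_t Mach) {
  switch (Mach) {
  case TM_GFX1012: return TK_GFX1012;
  case TM_GFX1013: return TK_GFX1013;
  default:         return TK_NONE;
  }
}

// True when mapping a kind to its flag and back yields the same kind.
bool toyRoundTrips(ToyKind K) { return toyKindFromMach(toyElfMach(K)) == K; }
```

If a case were added to only one of the two functions, `toyRoundTrips` would return false for that kind, which is the kind of mismatch a test like elf-header-flags-mach.ll guards against for the forward direction.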

llvm/lib/Target/AMDGPU/MIMGInstructions.td

Show First 20 Lines • Show All 880 Lines • ▼ Show 20 Lines	class MIMG_IntersectRay_nsa_gfx10<mimgopc op, string opcode, int num_addrs, bit A16>
let InOperandList = !con(nsah.AddrIns,		let InOperandList = !con(nsah.AddrIns,
(ins SReg_128:$srsrc),		(ins SReg_128:$srsrc),
!if(A16, (ins GFX10A16:$a16), (ins)));		!if(A16, (ins GFX10A16:$a16), (ins)));
let AsmString = opcode#" $vdata, "#nsah.AddrAsm#", $srsrc"#!if(A16, "$a16", "");		let AsmString = opcode#" $vdata, "#nsah.AddrAsm#", $srsrc"#!if(A16, "$a16", "");
}		}

multiclass MIMG_IntersectRay<mimgopc op, string opcode, int num_addrs, bit A16> {		multiclass MIMG_IntersectRay<mimgopc op, string opcode, int num_addrs, bit A16> {
def "" : MIMGBaseOpcode;		def "" : MIMGBaseOpcode;
let SubtargetPredicate = HasGFX10_BEncoding,		let SubtargetPredicate = HasGFX10_AEncoding,
AssemblerPredicate = HasGFX10_BEncoding,		AssemblerPredicate = HasGFX10_AEncoding,
AsmMatchConverter = !if(A16, "cvtIntersectRay", ""),		AsmMatchConverter = !if(A16, "cvtIntersectRay", ""),
dmask = 0xf,		dmask = 0xf,
unorm = 1,		unorm = 1,
d16 = 0,		d16 = 0,
cpol = 0,		cpol = 0,
tfe = 0,		tfe = 0,
lwe = 0,		lwe = 0,
r128 = 1,		r128 = 1,
▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines
defm IMAGE_SAMPLE_CD_O_G16 : MIMG_Sampler <mimgopc<0xec>, AMDGPUSample_cd_o, 0, 1>;		defm IMAGE_SAMPLE_CD_O_G16 : MIMG_Sampler <mimgopc<0xec>, AMDGPUSample_cd_o, 0, 1>;
defm IMAGE_SAMPLE_CD_CL_O_G16 : MIMG_Sampler <mimgopc<0xed>, AMDGPUSample_cd_cl_o, 0, 1>;		defm IMAGE_SAMPLE_CD_CL_O_G16 : MIMG_Sampler <mimgopc<0xed>, AMDGPUSample_cd_cl_o, 0, 1>;
defm IMAGE_SAMPLE_C_CD_O_G16 : MIMG_Sampler <mimgopc<0xee>, AMDGPUSample_c_cd_o, 0, 1>;		defm IMAGE_SAMPLE_C_CD_O_G16 : MIMG_Sampler <mimgopc<0xee>, AMDGPUSample_c_cd_o, 0, 1>;
defm IMAGE_SAMPLE_C_CD_CL_O_G16 : MIMG_Sampler <mimgopc<0xef>, AMDGPUSample_c_cd_cl_o, 0, 1>;		defm IMAGE_SAMPLE_C_CD_CL_O_G16 : MIMG_Sampler <mimgopc<0xef>, AMDGPUSample_c_cd_cl_o, 0, 1>;
} // End OtherPredicates = [HasExtendedImageInsts]		} // End OtherPredicates = [HasExtendedImageInsts]
//def IMAGE_RSRC256 : MIMG_NoPattern_RSRC256 <"image_rsrc256", 0x0000007e>;		//def IMAGE_RSRC256 : MIMG_NoPattern_RSRC256 <"image_rsrc256", 0x0000007e>;
//def IMAGE_SAMPLER : MIMG_NoPattern_ <"image_sampler", 0x0000007f>;		//def IMAGE_SAMPLER : MIMG_NoPattern_ <"image_sampler", 0x0000007f>;

let SubtargetPredicate = HasGFX10_BEncoding in		let SubtargetPredicate = HasGFX10_AEncoding in
defm IMAGE_MSAA_LOAD_X : MIMG_NoSampler <mimgopc<0x80>, "image_msaa_load", 1, 0, 0, 1>;		defm IMAGE_MSAA_LOAD_X : MIMG_NoSampler <mimgopc<0x80>, "image_msaa_load", 1, 0, 0, 1>;

defm IMAGE_BVH_INTERSECT_RAY : MIMG_IntersectRay<mimgopc<0xe6>, "image_bvh_intersect_ray", 11, 0>;		defm IMAGE_BVH_INTERSECT_RAY : MIMG_IntersectRay<mimgopc<0xe6>, "image_bvh_intersect_ray", 11, 0>;
defm IMAGE_BVH_INTERSECT_RAY_a16 : MIMG_IntersectRay<mimgopc<0xe6>, "image_bvh_intersect_ray", 8, 1>;		defm IMAGE_BVH_INTERSECT_RAY_a16 : MIMG_IntersectRay<mimgopc<0xe6>, "image_bvh_intersect_ray", 8, 1>;
defm IMAGE_BVH64_INTERSECT_RAY : MIMG_IntersectRay<mimgopc<0xe7>, "image_bvh64_intersect_ray", 12, 0>;		defm IMAGE_BVH64_INTERSECT_RAY : MIMG_IntersectRay<mimgopc<0xe7>, "image_bvh64_intersect_ray", 12, 0>;
defm IMAGE_BVH64_INTERSECT_RAY_a16 : MIMG_IntersectRay<mimgopc<0xe7>, "image_bvh64_intersect_ray", 9, 1>;		defm IMAGE_BVH64_INTERSECT_RAY_a16 : MIMG_IntersectRay<mimgopc<0xe7>, "image_bvh64_intersect_ray", 9, 1>;

/******** ========================================= ********/		/******** ========================================= ********/
▲ Show 20 Lines • Show All 88 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/SIISelLowering.cpp


Show First 20 Lines • Show All 7,335 Lines • ▼ Show 20 Lines	case Intrinsic::amdgcn_image_bvh_intersect_ray: {
SDValue RayInvDir = M->getOperand(6);		SDValue RayInvDir = M->getOperand(6);
SDValue TDescr = M->getOperand(7);		SDValue TDescr = M->getOperand(7);

assert(NodePtr.getValueType() == MVT::i32 ||		assert(NodePtr.getValueType() == MVT::i32 ||
NodePtr.getValueType() == MVT::i64);		NodePtr.getValueType() == MVT::i64);
assert(RayDir.getValueType() == MVT::v4f16 ||		assert(RayDir.getValueType() == MVT::v4f16 ||
RayDir.getValueType() == MVT::v4f32);		RayDir.getValueType() == MVT::v4f32);

		if (!Subtarget->hasGFX10_AEncoding()) {
		emitRemovedIntrinsicError(DAG, DL, Op.getValueType());
		rampitec: return emitRemovedIntrinsicError();
		bcahoon (author): I've changed this to return. Thanks for catching that. But it returns an UNDEF value instead of SDValue() so that it doesn't crash. I can change the behavior if that's preferred.
		foad: Personally I would follow all the existing precedents and "return emitRemovedIntrinsicError(...)". I don't see any value in deliberately trying to make the compiler crash harder.
		bcahoon (author): The handling of diagnostic errors is a little inconsistent in ISelLowering. Sometimes SDValue() is returned and other times it's Undef. I have it return SDValue() so that the failure mode is consistent with how GlobalISel handles these intrinsics. It's probably worth a discussion to decide how best to handle diagnostic errors. I'm happy to submit a follow-on patch as needed. As an aside, return emitRemovedIntrinsicError() isn't enough because the intrinsic has both a return value and a chain edge. So something like return DAG.getMergeValues({emitRemovedIntrinsicError(), Op.getValue(0)}, DL) is needed.
		return SDValue();
		}

bool IsA16 = RayDir.getValueType().getVectorElementType() == MVT::f16;		bool IsA16 = RayDir.getValueType().getVectorElementType() == MVT::f16;
bool Is64 = NodePtr.getValueType() == MVT::i64;		bool Is64 = NodePtr.getValueType() == MVT::i64;
unsigned Opcode = IsA16 ? Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_a16_nsa		unsigned Opcode = IsA16 ? Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_a16_nsa
: AMDGPU::IMAGE_BVH_INTERSECT_RAY_a16_nsa		: AMDGPU::IMAGE_BVH_INTERSECT_RAY_a16_nsa
: Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_nsa		: Is64 ? AMDGPU::IMAGE_BVH64_INTERSECT_RAY_nsa
: AMDGPU::IMAGE_BVH_INTERSECT_RAY_nsa;		: AMDGPU::IMAGE_BVH_INTERSECT_RAY_nsa;

SmallVector<SDValue, 16> Ops;		SmallVector<SDValue, 16> Ops;
▲ Show 20 Lines • Show All 4,974 Lines • Show Last 20 Lines
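
The thread above settles on checking the subtarget first, recording a diagnostic, and returning without selecting an opcode. A minimal, self-contained sketch of that control flow follows. `ToySubtarget`, `ToyDiagnostics`, and `toyLowerIntersectRay` are hypothetical names for illustration only; the real code works on `SDValue` and `SelectionDAG` and is not reproduced here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the subtarget query used in the patch.
struct ToySubtarget {
  bool GFX10_AEncoding = false;
  bool hasGFX10_AEncoding() const { return GFX10_AEncoding; }
};

// Hypothetical diagnostic sink; a non-empty list means compilation failed.
struct ToyDiagnostics {
  std::vector<std::string> Errors;
};

// Returns false after recording an error when the subtarget lacks the
// encoding; otherwise writes the chosen opcode name and returns true.
bool toyLowerIntersectRay(const ToySubtarget &ST, bool IsA16, bool Is64,
                          ToyDiagnostics &Diags, std::string &Opcode) {
  if (!ST.hasGFX10_AEncoding()) {
    Diags.Errors.push_back("intrinsic not supported on subtarget");
    return false; // Stop here; do not continue selecting code.
  }
  // Same four-way choice as the nested conditional in the diff.
  Opcode = Is64 ? "IMAGE_BVH64_INTERSECT_RAY" : "IMAGE_BVH_INTERSECT_RAY";
  if (IsA16)
    Opcode += "_a16";
  Opcode += "_nsa";
  return true;
}
```

The sketch reports the failure through the diagnostic list rather than aborting, which mirrors the author's stated goal that the error should point at the input rather than look like a compiler bug.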

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h

	Show First 20 Lines • Show All 731 Lines • ▼ Show 20 Lines
	bool isSI(const MCSubtargetInfo &STI);			bool isSI(const MCSubtargetInfo &STI);
	bool isCI(const MCSubtargetInfo &STI);			bool isCI(const MCSubtargetInfo &STI);
	bool isVI(const MCSubtargetInfo &STI);			bool isVI(const MCSubtargetInfo &STI);
	bool isGFX9(const MCSubtargetInfo &STI);			bool isGFX9(const MCSubtargetInfo &STI);
	bool isGFX9Plus(const MCSubtargetInfo &STI);			bool isGFX9Plus(const MCSubtargetInfo &STI);
	bool isGFX10(const MCSubtargetInfo &STI);			bool isGFX10(const MCSubtargetInfo &STI);
	bool isGFX10Plus(const MCSubtargetInfo &STI);			bool isGFX10Plus(const MCSubtargetInfo &STI);
	bool isGCN3Encoding(const MCSubtargetInfo &STI);			bool isGCN3Encoding(const MCSubtargetInfo &STI);
				bool isGFX10_AEncoding(const MCSubtargetInfo &STI);
	bool isGFX10_BEncoding(const MCSubtargetInfo &STI);			bool isGFX10_BEncoding(const MCSubtargetInfo &STI);
	bool hasGFX10_3Insts(const MCSubtargetInfo &STI);			bool hasGFX10_3Insts(const MCSubtargetInfo &STI);
	bool isGFX90A(const MCSubtargetInfo &STI);			bool isGFX90A(const MCSubtargetInfo &STI);
	bool hasArchitectedFlatScratch(const MCSubtargetInfo &STI);			bool hasArchitectedFlatScratch(const MCSubtargetInfo &STI);

	/// Is Reg - scalar register			/// Is Reg - scalar register
	bool isSGPR(unsigned Reg, const MCRegisterInfo* TRI);			bool isSGPR(unsigned Reg, const MCRegisterInfo* TRI);

	▲ Show 20 Lines • Show All 261 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp

	Show First 20 Lines • Show All 1,441 Lines • ▼ Show 20 Lines
	}			}

	bool isGFX10Plus(const MCSubtargetInfo &STI) { return isGFX10(STI); }			bool isGFX10Plus(const MCSubtargetInfo &STI) { return isGFX10(STI); }

	bool isGCN3Encoding(const MCSubtargetInfo &STI) {			bool isGCN3Encoding(const MCSubtargetInfo &STI) {
	return STI.getFeatureBits()[AMDGPU::FeatureGCN3Encoding];			return STI.getFeatureBits()[AMDGPU::FeatureGCN3Encoding];
	}			}

				bool isGFX10_AEncoding(const MCSubtargetInfo &STI) {
				return STI.getFeatureBits()[AMDGPU::FeatureGFX10_AEncoding];
				}

				foad: Stray whitespace.
	bool isGFX10_BEncoding(const MCSubtargetInfo &STI) {			bool isGFX10_BEncoding(const MCSubtargetInfo &STI) {
	return STI.getFeatureBits()[AMDGPU::FeatureGFX10_BEncoding];			return STI.getFeatureBits()[AMDGPU::FeatureGFX10_BEncoding];
	}			}

	bool hasGFX10_3Insts(const MCSubtargetInfo &STI) {			bool hasGFX10_3Insts(const MCSubtargetInfo &STI) {
	return STI.getFeatureBits()[AMDGPU::FeatureGFX10_3Insts];			return STI.getFeatureBits()[AMDGPU::FeatureGFX10_3Insts];
	}			}

	▲ Show 20 Lines • Show All 560 Lines • Show Last 20 Lines
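
The review notes that 10_B is an extension of 10_A, so a GPU with the B encoding should also report the A encoding. A minimal sketch of that relationship, using `std::bitset` in place of the real feature bits, follows. `ToyFeature`, `toyFeaturesFor`, and the other names are hypothetical, and only three GPUs are modelled, based on what the reviewers state about gfx1012, gfx1013, and gfx1030.

```cpp
#include <bitset>
#include <string>

// Hypothetical feature indices; the real ones are generated from AMDGPU.td.
enum ToyFeature { ToyGFX10_AEncoding, ToyGFX10_BEncoding, ToyNumFeatures };

typedef std::bitset<ToyNumFeatures> ToyFeatureBits;

// Feature sets for three GPUs, as described in the review comments:
// gfx1012 has neither encoding, gfx1013 has A only, gfx1030 has A and B.
ToyFeatureBits toyFeaturesFor(const std::string &CPU) {
  ToyFeatureBits Bits;
  if (CPU == "gfx1013" || CPU == "gfx1030")
    Bits.set(ToyGFX10_AEncoding);
  if (CPU == "gfx1030")
    Bits.set(ToyGFX10_BEncoding);
  return Bits;
}

// Same shape as the helpers in the diff: a single feature-bit lookup.
bool toyIsGFX10_AEncoding(const ToyFeatureBits &Bits) {
  return Bits[ToyGFX10_AEncoding];
}

bool toyIsGFX10_BEncoding(const ToyFeatureBits &Bits) {
  return Bits[ToyGFX10_BEncoding];
}
```

With this layout, instructions predicated on the A encoding (BVH and MSAA load in the patch) are available on both gfx1013 and gfx1030, while B-only instructions remain restricted to gfx1030.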

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1030 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s			; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1030 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
				; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1013 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
				; RUN: not --crash llc -global-isel -march=amdgcn -mcpu=gfx1012 -verify-machineinstrs < %s -o /dev/null 2>&1 | FileCheck -check-prefix=ERR %s
				foad: This test surely should not pass for gfx1012, since it does not have these instructions. And with your patch as written it should fail for gfx1013 too, since they are predicated on HasGFX10_BEncoding. @rampitec any idea what is wrong here? Apparently the backend will happily generate image_bvh_intersect_ray instructions even for gfx900!
				rampitec: Indeed. MIMG_IntersectRay has this: `let SubtargetPredicate = HasGFX10_BEncoding, AssemblerPredicate = HasGFX10_BEncoding`, but apparently SubtargetPredicate did not work. It needs to be fixed. gfx1012 does not have it, gfx1013 does though. That is what GFX10_A encoding is about; 10_B has to be replaced with 10_A in BVH and MSAA load.
				rampitec: Image lowering and selection is not really done like everything else. For BVH it just lowers intrinsic to opcode. I think the easiest fix is to add to SIISelLowering.cpp, where we lower Intrinsic::amdgcn_image_bvh_intersect_ray, something like this: `if (!Subtarget->hasGFX10_AEncoding()) report_fatal_error("requested image instruction is not supported on this GPU");`
				bcahoon (author): I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.
				rampitec: > I ended up using emitRemovedIntrinsicError, which uses DiagnosticInfoUnsupported. This way the failure isn't a crash dump.
				Diagnostics is a good thing, but we still have to fail the compilation.
				bcahoon (author): The diagnostic is marked as an error, so the compilation fails in that llc returns a non-zero return code. This mechanism is used in other places in the back-end to report similar types of errors. The alternative, if I understand correctly, is that a crash occurs with an error message that indicates that the bug is in LLVM (rather than in the input source file).
				rampitec: We do not seem to be consistent here and return either undef or SDValue(), but as far as I can see we never continue selecting code though, like here in SIISelLowering and always return false from the AMDGPUInstructionSelector.
				bcahoon (author): I've left the patch so that it doesn't crash. But let me know if you think we should return false and crash, and I'll make that change.
				rampitec: not --crash llc ...

	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(uint node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(uint node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(uint node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(uint node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(ulong node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(ulong node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(ulong node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(ulong node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)

	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(i32, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(i32, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(i64, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(i64, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(i64, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(i64, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)

	define amdgpu_ps <4 x float> @image_bvh_intersect_ray(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x float> %ray_dir, <4 x float> %ray_inv_dir, <4 x i32> inreg %tdescr) {			define amdgpu_ps <4 x float> @image_bvh_intersect_ray(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x float> %ray_dir, <4 x float> %ray_inv_dir, <4 x i32> inreg %tdescr) {
	; GCN-LABEL: image_bvh_intersect_ray:			; GCN-LABEL: image_bvh_intersect_ray:
	; GCN: ; %bb.0:			; GCN: ; %bb.0:
	; GCN-NEXT: image_bvh_intersect_ray v[0:3], [v0, v1, v2, v3, v4, v6, v7, v8, v10, v11, v12], s[0:3]			; GCN-NEXT: image_bvh_intersect_ray v[0:3], [v0, v1, v2, v3, v4, v6, v7, v8, v10, v11, v12], s[0:3]
	; GCN-NEXT: s_waitcnt vmcnt(0)			; GCN-NEXT: s_waitcnt vmcnt(0)
	; GCN-NEXT: ; return to shader part epilog			; GCN-NEXT: ; return to shader part epilog
				; ERR: in function image_bvh_intersect_ray{{.*}}intrinsic not supported on subtarget
	%v = call <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x float> %ray_dir, <4 x float> %ray_inv_dir, <4 x i32> %tdescr)			%v = call <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x float> %ray_dir, <4 x float> %ray_inv_dir, <4 x i32> %tdescr)
	%r = bitcast <4 x i32> %v to <4 x float>			%r = bitcast <4 x i32> %v to <4 x float>
	ret <4 x float> %r			ret <4 x float> %r
	}			}

	define amdgpu_ps <4 x float> @image_bvh_intersect_ray_a16(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x half> %ray_dir, <4 x half> %ray_inv_dir, <4 x i32> inreg %tdescr) {			define amdgpu_ps <4 x float> @image_bvh_intersect_ray_a16(i32 %node_ptr, float %ray_extent, <4 x float> %ray_origin, <4 x half> %ray_dir, <4 x half> %ray_inv_dir, <4 x i32> inreg %tdescr) {
	; GCN-LABEL: image_bvh_intersect_ray_a16:			; GCN-LABEL: image_bvh_intersect_ray_a16:
	; GCN: ; %bb.0:			; GCN: ; %bb.0:
	▲ Show 20 Lines • Show All 185 Lines • Show Last 20 Lines

llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll

	Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1010 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1010-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1010 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1010-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1010 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1010-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1010 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1010-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 < %s \| FileCheck --check-prefixes=V3-GFX1011-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 < %s \| FileCheck --check-prefixes=V3-GFX1011-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1011-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1011-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1011-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1011 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1011-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 < %s \| FileCheck --check-prefixes=V3-GFX1012-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 < %s \| FileCheck --check-prefixes=V3-GFX1012-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1012-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1012-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1012-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1012 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1012-XNACK %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1013 < %s \| FileCheck --check-prefixes=V3-GFX1013-XNACK %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1013 -mattr=-xnack < %s \| FileCheck --check-prefixes=V3-GFX1013-NOXNACK %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1013 -mattr=+xnack < %s \| FileCheck --check-prefixes=V3-GFX1013-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1030 < %s \| FileCheck --check-prefixes=V3-GFX1030 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1030 < %s \| FileCheck --check-prefixes=V3-GFX1030 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1031 < %s \| FileCheck --check-prefixes=V3-GFX1031 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1031 < %s \| FileCheck --check-prefixes=V3-GFX1031 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1032 < %s \| FileCheck --check-prefixes=V3-GFX1032 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1032 < %s \| FileCheck --check-prefixes=V3-GFX1032 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1033 < %s \| FileCheck --check-prefixes=V3-GFX1033 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1033 < %s \| FileCheck --check-prefixes=V3-GFX1033 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1034 < %s \| FileCheck --check-prefixes=V3-GFX1034 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa --amdhsa-code-object-version=3 -mcpu=gfx1034 < %s \| FileCheck --check-prefixes=V3-GFX1034 %s

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 < %s \| FileCheck --check-prefixes=GFX600 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 < %s \| FileCheck --check-prefixes=GFX600 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti < %s \| FileCheck --check-prefixes=GFX600 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti < %s \| FileCheck --check-prefixes=GFX600 %s
	▲ Show 20 Lines • Show All 72 Lines • ▼ Show 20 Lines
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1010-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1010-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1010-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1010-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 < %s \| FileCheck --check-prefixes=GFX1011 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 < %s \| FileCheck --check-prefixes=GFX1011 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1011-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1011-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1011-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1011-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 < %s \| FileCheck --check-prefixes=GFX1012 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 < %s \| FileCheck --check-prefixes=GFX1012 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1012-NOXNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1012-NOXNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1012-XNACK %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1012-XNACK %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1013 < %s \| FileCheck --check-prefixes=GFX1013 %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1013 -mattr=-xnack < %s \| FileCheck --check-prefixes=GFX1013-NOXNACK %s
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1013 -mattr=+xnack < %s \| FileCheck --check-prefixes=GFX1013-XNACK %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 < %s \| FileCheck --check-prefixes=GFX1030 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 < %s \| FileCheck --check-prefixes=GFX1030 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1031 < %s \| FileCheck --check-prefixes=GFX1031 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1031 < %s \| FileCheck --check-prefixes=GFX1031 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1032 < %s \| FileCheck --check-prefixes=GFX1032 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1032 < %s \| FileCheck --check-prefixes=GFX1032 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1033 < %s \| FileCheck --check-prefixes=GFX1033 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1033 < %s \| FileCheck --check-prefixes=GFX1033 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1034 < %s \| FileCheck --check-prefixes=GFX1034 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1034 < %s \| FileCheck --check-prefixes=GFX1034 %s

	; V3-GFX600: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"			; V3-GFX600: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"
	; V3-GFX601: .amdgcn_target "amdgcn-amd-amdhsa--gfx601"			; V3-GFX601: .amdgcn_target "amdgcn-amd-amdhsa--gfx601"
	Show All 30 Lines
	; V3-GFX90C-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx90c"			; V3-GFX90C-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx90c"
	; V3-GFX90C-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx90c+xnack"			; V3-GFX90C-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx90c+xnack"
	; V3-GFX1010-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010"			; V3-GFX1010-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010"
	; V3-GFX1010-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010+xnack"			; V3-GFX1010-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010+xnack"
	; V3-GFX1011-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011"			; V3-GFX1011-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011"
	; V3-GFX1011-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011+xnack"			; V3-GFX1011-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011+xnack"
	; V3-GFX1012-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012"			; V3-GFX1012-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012"
	; V3-GFX1012-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012+xnack"			; V3-GFX1012-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012+xnack"
				; V3-GFX1013-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1013"
				; V3-GFX1013-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1013+xnack"
	; V3-GFX1030: .amdgcn_target "amdgcn-amd-amdhsa--gfx1030"			; V3-GFX1030: .amdgcn_target "amdgcn-amd-amdhsa--gfx1030"
	; V3-GFX1031: .amdgcn_target "amdgcn-amd-amdhsa--gfx1031"			; V3-GFX1031: .amdgcn_target "amdgcn-amd-amdhsa--gfx1031"
	; V3-GFX1032: .amdgcn_target "amdgcn-amd-amdhsa--gfx1032"			; V3-GFX1032: .amdgcn_target "amdgcn-amd-amdhsa--gfx1032"
	; V3-GFX1033: .amdgcn_target "amdgcn-amd-amdhsa--gfx1033"			; V3-GFX1033: .amdgcn_target "amdgcn-amd-amdhsa--gfx1033"
	; V3-GFX1034: .amdgcn_target "amdgcn-amd-amdhsa--gfx1034"			; V3-GFX1034: .amdgcn_target "amdgcn-amd-amdhsa--gfx1034"

	; GFX600: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"			; GFX600: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"
	; GFX601: .amdgcn_target "amdgcn-amd-amdhsa--gfx601"			; GFX601: .amdgcn_target "amdgcn-amd-amdhsa--gfx601"
	▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines
	; GFX1010-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010:xnack-"			; GFX1010-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010:xnack-"
	; GFX1010-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010:xnack+"			; GFX1010-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1010:xnack+"
	; GFX1011: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011"			; GFX1011: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011"
	; GFX1011-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011:xnack-"			; GFX1011-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011:xnack-"
	; GFX1011-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011:xnack+"			; GFX1011-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1011:xnack+"
	; GFX1012: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012"			; GFX1012: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012"
	; GFX1012-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012:xnack-"			; GFX1012-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012:xnack-"
	; GFX1012-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012:xnack+"			; GFX1012-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1012:xnack+"
				; GFX1013: .amdgcn_target "amdgcn-amd-amdhsa--gfx1013"
				; GFX1013-NOXNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1013:xnack-"
				; GFX1013-XNACK: .amdgcn_target "amdgcn-amd-amdhsa--gfx1013:xnack+"
	; GFX1030: .amdgcn_target "amdgcn-amd-amdhsa--gfx1030"			; GFX1030: .amdgcn_target "amdgcn-amd-amdhsa--gfx1030"
	; GFX1031: .amdgcn_target "amdgcn-amd-amdhsa--gfx1031"			; GFX1031: .amdgcn_target "amdgcn-amd-amdhsa--gfx1031"
	; GFX1032: .amdgcn_target "amdgcn-amd-amdhsa--gfx1032"			; GFX1032: .amdgcn_target "amdgcn-amd-amdhsa--gfx1032"
	; GFX1033: .amdgcn_target "amdgcn-amd-amdhsa--gfx1033"			; GFX1033: .amdgcn_target "amdgcn-amd-amdhsa--gfx1033"
	; GFX1034: .amdgcn_target "amdgcn-amd-amdhsa--gfx1034"			; GFX1034: .amdgcn_target "amdgcn-amd-amdhsa--gfx1034"

	define amdgpu_kernel void @directive_amdgcn_target() {			define amdgpu_kernel void @directive_amdgcn_target() {
	ret void			ret void
	}			}

llvm/test/CodeGen/AMDGPU/elf-header-flags-mach.ll

	Show First 20 Lines • Show All 51 Lines • ▼ Show 20 Lines
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx906 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX906 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx906 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX906 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx908 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX908 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx908 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX908 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx909 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX909 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx909 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX909 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx90a < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX90A %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx90a < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX90A %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx90c < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX90C %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx90c < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX90C %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1010 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1010 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1010 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1010 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1011 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1011 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1011 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1011 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1012 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1012 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1012 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1012 %s
				; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1013 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1013 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1030 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1030 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1030 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1030 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1031 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1031 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1031 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1031 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1032 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1032 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1032 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1032 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1033 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1033 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1033 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1033 %s
	; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1034 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1034 %s			; RUN: llc -filetype=obj -march=amdgcn -mcpu=gfx1034 < %s \| llvm-readobj -file-headers - \| FileCheck --check-prefixes=ALL,ARCH-GCN,GFX1034 %s

	; FIXME: With the default attributes the eflags are not accurate for			; FIXME: With the default attributes the eflags are not accurate for
	; xnack and sramecc. Subsequent Target-ID patches will address this.			; xnack and sramecc. Subsequent Target-ID patches will address this.
	...
	; GFX906: EF_AMDGPU_MACH_AMDGCN_GFX906 (0x2F)			; GFX906: EF_AMDGPU_MACH_AMDGCN_GFX906 (0x2F)
	; GFX908: EF_AMDGPU_MACH_AMDGCN_GFX908 (0x30)			; GFX908: EF_AMDGPU_MACH_AMDGCN_GFX908 (0x30)
	; GFX909: EF_AMDGPU_MACH_AMDGCN_GFX909 (0x31)			; GFX909: EF_AMDGPU_MACH_AMDGCN_GFX909 (0x31)
	; GFX90A: EF_AMDGPU_MACH_AMDGCN_GFX90A (0x3F)			; GFX90A: EF_AMDGPU_MACH_AMDGCN_GFX90A (0x3F)
	; GFX90C: EF_AMDGPU_MACH_AMDGCN_GFX90C (0x32)			; GFX90C: EF_AMDGPU_MACH_AMDGCN_GFX90C (0x32)
	; GFX1010: EF_AMDGPU_MACH_AMDGCN_GFX1010 (0x33)			; GFX1010: EF_AMDGPU_MACH_AMDGCN_GFX1010 (0x33)
	; GFX1011: EF_AMDGPU_MACH_AMDGCN_GFX1011 (0x34)			; GFX1011: EF_AMDGPU_MACH_AMDGCN_GFX1011 (0x34)
	; GFX1012: EF_AMDGPU_MACH_AMDGCN_GFX1012 (0x35)			; GFX1012: EF_AMDGPU_MACH_AMDGCN_GFX1012 (0x35)
				; GFX1013: EF_AMDGPU_MACH_AMDGCN_GFX1013 (0x42)
	; GFX1030: EF_AMDGPU_MACH_AMDGCN_GFX1030 (0x36)			; GFX1030: EF_AMDGPU_MACH_AMDGCN_GFX1030 (0x36)
	; GFX1031: EF_AMDGPU_MACH_AMDGCN_GFX1031 (0x37)			; GFX1031: EF_AMDGPU_MACH_AMDGCN_GFX1031 (0x37)
	; GFX1032: EF_AMDGPU_MACH_AMDGCN_GFX1032 (0x38)			; GFX1032: EF_AMDGPU_MACH_AMDGCN_GFX1032 (0x38)
	; GFX1033: EF_AMDGPU_MACH_AMDGCN_GFX1033 (0x39)			; GFX1033: EF_AMDGPU_MACH_AMDGCN_GFX1033 (0x39)
	; GFX1034: EF_AMDGPU_MACH_AMDGCN_GFX1034 (0x3E)			; GFX1034: EF_AMDGPU_MACH_AMDGCN_GFX1034 (0x3E)
	; ALL: ]			; ALL: ]

	define amdgpu_kernel void @elf_header() {			define amdgpu_kernel void @elf_header() {
	ret void			ret void
	}			}

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll

	; RUN: llc -march=amdgcn -mcpu=gfx1030 -verify-machineinstrs < %s \| FileCheck -check-prefix=GCN %s			; RUN: llc -march=amdgcn -mcpu=gfx1030 -verify-machineinstrs < %s \| FileCheck -check-prefix=GCN %s
				; RUN: llc -march=amdgcn -mcpu=gfx1013 -verify-machineinstrs < %s \| FileCheck -check-prefix=GCN %s
				; RUN: not --crash llc -march=amdgcn -mcpu=gfx1012 -verify-machineinstrs < %s 2>&1 \| FileCheck -check-prefix=ERR %s
				foad: Likewise.
				rampitec: not --crash llc

	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(uint node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(uint node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(uint node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(uint node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(ulong node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(ulong node_ptr, float ray_extent, float4 ray_origin, float4 ray_dir, float4 ray_inv_dir, uint4 texture_descr)
	; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(ulong node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)			; uint4 llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(ulong node_ptr, float ray_extent, float4 ray_origin, half4 ray_dir, half4 ray_inv_dir, uint4 texture_descr)

	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f32(i32, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(i32, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i32.v4f16(i32, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(i64, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f32(i64, float, <4 x float>, <4 x float>, <4 x float>, <4 x i32>)
	declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(i64, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)			declare <4 x i32> @llvm.amdgcn.image.bvh.intersect.ray.i64.v4f16(i64, float, <4 x float>, <4 x half>, <4 x half>, <4 x i32>)

	; GCN-LABEL: {{^}}image_bvh_intersect_ray:			; GCN-LABEL: {{^}}image_bvh_intersect_ray:
	; GCN: image_bvh_intersect_ray v[0:3], v[0:15], s[0:3]{{$}}			; GCN: image_bvh_intersect_ray v[0:3], v[0:15], s[0:3]{{$}}
				; ERR: in function image_bvh_intersect_ray{{.*}}intrinsic not supported on subtarget
	; Arguments are flattened to represent the actual VGPR_A layout, so we have no			; Arguments are flattened to represent the actual VGPR_A layout, so we have no
	; extra moves in the generated kernel.			; extra moves in the generated kernel.
	define amdgpu_ps <4 x float> @image_bvh_intersect_ray(i32 %node_ptr, float %ray_extent, float %ray_origin_x, float %ray_origin_y, float %ray_origin_z, float %ray_dir_x, float %ray_dir_y, float %ray_dir_z, float %ray_inv_dir_x, float %ray_inv_dir_y, float %ray_inv_dir_z, <4 x i32> inreg %tdescr) {			define amdgpu_ps <4 x float> @image_bvh_intersect_ray(i32 %node_ptr, float %ray_extent, float %ray_origin_x, float %ray_origin_y, float %ray_origin_z, float %ray_dir_x, float %ray_dir_y, float %ray_dir_z, float %ray_inv_dir_x, float %ray_inv_dir_y, float %ray_inv_dir_z, <4 x i32> inreg %tdescr) {
	main_body:			main_body:
	%ray_origin0 = insertelement <4 x float> undef, float %ray_origin_x, i32 0			%ray_origin0 = insertelement <4 x float> undef, float %ray_origin_x, i32 0
	%ray_origin1 = insertelement <4 x float> %ray_origin0, float %ray_origin_y, i32 1			%ray_origin1 = insertelement <4 x float> %ray_origin0, float %ray_origin_y, i32 1
	%ray_origin = insertelement <4 x float> %ray_origin1, float %ray_origin_z, i32 2			%ray_origin = insertelement <4 x float> %ray_origin1, float %ray_origin_z, i32 2
	%ray_dir0 = insertelement <4 x float> undef, float %ray_dir_x, i32 0			%ray_dir0 = insertelement <4 x float> undef, float %ray_dir_x, i32 0
	...
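
The three RUN lines above expect gfx1030 and gfx1013 to select `image_bvh_intersect_ray` and gfx1012 to fail with "intrinsic not supported on subtarget". A minimal, self-contained sketch of that gating is shown below. It is not the code from SIISelLowering.cpp: `MockSubtarget`, `makeSubtarget`, and `lowerIntersectRay` are hypothetical names, the feature table only covers the three CPUs exercised by the test, and a C++ exception stands in for whatever diagnostic the backend actually emits.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a subtarget's feature bits.
// This is NOT the LLVM GCNSubtarget API.
struct MockSubtarget {
  std::string Cpu;
  bool HasGFX10AEncoding;
};

// Assumed feature table, following the review discussion: gfx1013 and
// gfx1030 have the BVH instructions, gfx1012 (and anything else) does not.
inline MockSubtarget makeSubtarget(const std::string &Cpu) {
  bool HasA = (Cpu == "gfx1013" || Cpu == "gfx1030");
  return MockSubtarget{Cpu, HasA};
}

// Returns the mnemonic the intrinsic would lower to, or throws when the
// subtarget lacks the encoding. The exception is an illustrative substitute
// for a compiler diagnostic.
inline std::string lowerIntersectRay(const MockSubtarget &ST) {
  if (!ST.HasGFX10AEncoding)
    throw std::runtime_error("intrinsic not supported on subtarget");
  return "image_bvh_intersect_ray";
}
```

Checking the feature at lowering time, rather than relying only on the instruction's predicate, is what lets the gfx1012 RUN line observe an error instead of silently emitted code.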

llvm/test/MC/AMDGPU/dl-insts-err.s

	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx800 %s 2>&1 \| FileCheck %s			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx800 %s 2>&1 \| FileCheck %s
	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 %s 2>&1 \| FileCheck %s			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx900 %s 2>&1 \| FileCheck %s
	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx906 %s 2>&1 \| FileCheck %s --check-prefix=GFX906-GFX908			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx906 %s 2>&1 \| FileCheck %s --check-prefix=GFX906-GFX908
	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx908 %s 2>&1 \| FileCheck %s --check-prefix=GFX906-GFX908			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx908 %s 2>&1 \| FileCheck %s --check-prefix=GFX906-GFX908
				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1013 %s 2>&1 \| FileCheck %s --check-prefix=GFX1013

	//			//
	// Test unsupported GPUs.			// Test unsupported GPUs.
	//			//

	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
	v_fmac_f32 v0, v1, v2			v_fmac_f32 v0, v1, v2
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
	v_xnor_b32 v0, v1, v2			v_xnor_b32 v0, v1, v2
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot2_f32_f16 v0, v1, v2, v3			v_dot2_f32_f16 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot2_i32_i16 v0, v1, v2, v3			v_dot2_i32_i16 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot2_u32_u16 v0, v1, v2, v3			v_dot2_u32_u16 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot4_i32_i8 v0, v1, v2, v3			v_dot4_i32_i8 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot4_u32_u8 v0, v1, v2, v3			v_dot4_u32_u8 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot8_i32_i4 v0, v1, v2, v3			v_dot8_i32_i4 v0, v1, v2, v3
	// CHECK: error: instruction not supported on this GPU			// CHECK: error: instruction not supported on this GPU
				// GFX1013: error: instruction not supported on this GPU
	v_dot8_u32_u4 v0, v1, v2, v3			v_dot8_u32_u4 v0, v1, v2, v3

	//			//
	// Test invalid operands.			// Test invalid operands.
	//			//

	// GFX906-GFX908: error: invalid operand for instruction			// GFX906-GFX908: error: invalid operand for instruction
	v_dot2_f32_f16 v0, v1, v2, v3 op_sel			v_dot2_f32_f16 v0, v1, v2, v3 op_sel
	...

llvm/test/MC/AMDGPU/gfx10_unsupported.s

	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 %s 2>&1 \| FileCheck --implicit-check-not=error: %s			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -mattr=+wavefrontsize32,-wavefrontsize64 %s 2>&1 \| FileCheck --implicit-check-not=error: %s
	// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 %s 2>&1 \| FileCheck --implicit-check-not=error: %s			// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 %s 2>&1 \| FileCheck --implicit-check-not=error: %s
				// RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1013 -mattr=-wavefrontsize32,+wavefrontsize64 %s 2>&1 \| FileCheck --implicit-check-not=error: %s

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Unsupported instructions.			// Unsupported instructions.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	buffer_atomic_add_f32 v255, off, s[8:11], s3 offset:4095			buffer_atomic_add_f32 v255, off, s[8:11], s3 offset:4095
	// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU			// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU

	...

llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml

	...
	# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1011/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1011			# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1011/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1011
	# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1011 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1011 %s			# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1011 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1011 %s
	# RUN: obj2yaml %t.o.AMDGCN_GFX1011 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1011 %s			# RUN: obj2yaml %t.o.AMDGCN_GFX1011 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1011 %s

	# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1012/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1012			# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1012/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1012
	# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1012 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1012 %s			# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1012 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1012 %s
	# RUN: obj2yaml %t.o.AMDGCN_GFX1012 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1012 %s			# RUN: obj2yaml %t.o.AMDGCN_GFX1012 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1012 %s

				# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1013/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1013
				# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1013 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1013 %s
				# RUN: obj2yaml %t.o.AMDGCN_GFX1013 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1013 %s

	# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1030/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1030			# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1030/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1030
	# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1030 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1030 %s			# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1030 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1030 %s
	# RUN: obj2yaml %t.o.AMDGCN_GFX1030 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1030 %s			# RUN: obj2yaml %t.o.AMDGCN_GFX1030 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1030 %s

	# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1031/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1031			# RUN: sed -e 's/<BITS>/64/' -e 's/<MACH>/AMDGCN_GFX1031/' %s \| yaml2obj -o %t.o.AMDGCN_GFX1031
	# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1031 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1031 %s			# RUN: llvm-readobj -S --file-headers %t.o.AMDGCN_GFX1031 \| FileCheck --check-prefixes=ELF-AMDGCN-ALL,ELF-AMDGCN-GFX1031 %s
	# RUN: obj2yaml %t.o.AMDGCN_GFX1031 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1031 %s			# RUN: obj2yaml %t.o.AMDGCN_GFX1031 \| FileCheck --check-prefixes=YAML-AMDGCN-ALL,YAML-AMDGCN-GFX1031 %s

	...
	# YAML-AMDGCN-GFX1010: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1010 ]			# YAML-AMDGCN-GFX1010: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1010 ]

	# ELF-AMDGCN-GFX1011: EF_AMDGPU_MACH_AMDGCN_GFX1011 (0x34)			# ELF-AMDGCN-GFX1011: EF_AMDGPU_MACH_AMDGCN_GFX1011 (0x34)
	# YAML-AMDGCN-GFX1011: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1011 ]			# YAML-AMDGCN-GFX1011: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1011 ]

	# ELF-AMDGCN-GFX1012: EF_AMDGPU_MACH_AMDGCN_GFX1012 (0x35)			# ELF-AMDGCN-GFX1012: EF_AMDGPU_MACH_AMDGCN_GFX1012 (0x35)
	# YAML-AMDGCN-GFX1012: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1012 ]			# YAML-AMDGCN-GFX1012: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1012 ]

				# ELF-AMDGCN-GFX1013: EF_AMDGPU_MACH_AMDGCN_GFX1013 (0x42)
				# YAML-AMDGCN-GFX1013: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1013 ]

	# ELF-AMDGCN-GFX1030: EF_AMDGPU_MACH_AMDGCN_GFX1030 (0x36)			# ELF-AMDGCN-GFX1030: EF_AMDGPU_MACH_AMDGCN_GFX1030 (0x36)
	# YAML-AMDGCN-GFX1030: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1030 ]			# YAML-AMDGCN-GFX1030: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1030 ]

	# ELF-AMDGCN-GFX1031: EF_AMDGPU_MACH_AMDGCN_GFX1031 (0x37)			# ELF-AMDGCN-GFX1031: EF_AMDGPU_MACH_AMDGCN_GFX1031 (0x37)
	# YAML-AMDGCN-GFX1031: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1031 ]			# YAML-AMDGCN-GFX1031: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1031 ]

	# ELF-AMDGCN-GFX1032: EF_AMDGPU_MACH_AMDGCN_GFX1032 (0x38)			# ELF-AMDGCN-GFX1032: EF_AMDGPU_MACH_AMDGCN_GFX1032 (0x38)
	# YAML-AMDGCN-GFX1032: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1032 ]			# YAML-AMDGCN-GFX1032: Flags: [ EF_AMDGPU_MACH_AMDGCN_GFX1032 ]
	...

llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll

	...
	; RUN: llvm-objdump -D %t.o > %t-detect.txt			; RUN: llvm-objdump -D %t.o > %t-detect.txt
	; RUN: diff %t-specify.txt %t-detect.txt			; RUN: diff %t-specify.txt %t-detect.txt

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -filetype=obj -O0 -o %t.o %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -filetype=obj -O0 -o %t.o %s
	; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1030 %t.o > %t-specify.txt			; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1030 %t.o > %t-specify.txt
	; RUN: llvm-objdump -D %t.o > %t-detect.txt			; RUN: llvm-objdump -D %t.o > %t-detect.txt
	; RUN: diff %t-specify.txt %t-detect.txt			; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1013 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1013 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -filetype=obj -O0 -o %t.o %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -filetype=obj -O0 -o %t.o %s
	; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1012 %t.o > %t-specify.txt			; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1012 %t.o > %t-specify.txt
	; RUN: llvm-objdump -D %t.o > %t-detect.txt			; RUN: llvm-objdump -D %t.o > %t-detect.txt
	; RUN: diff %t-specify.txt %t-detect.txt			; RUN: diff %t-specify.txt %t-detect.txt

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -filetype=obj -O0 -o %t.o %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -filetype=obj -O0 -o %t.o %s
	; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1011 %t.o > %t-specify.txt			; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1011 %t.o > %t-specify.txt
	; RUN: llvm-objdump -D %t.o > %t-detect.txt			; RUN: llvm-objdump -D %t.o > %t-detect.txt
	...

llvm/test/tools/llvm-readobj/ELF/amdgpu-elf-headers.test

	...
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=0 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=0 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35

	# RUN: yaml2obj %s -o %t -DABI_VERSION=1 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012			# RUN: yaml2obj %s -o %t -DABI_VERSION=1 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=1 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=1 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35

	# RUN: yaml2obj %s -o %t -DABI_VERSION=2 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012			# RUN: yaml2obj %s -o %t -DABI_VERSION=2 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=2 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=2 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1012 -DFLAG_VALUE=0x35

				# RUN: yaml2obj %s -o %t -DABI_VERSION=0 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013
				# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=0 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013 -DFLAG_VALUE=0x42

				# RUN: yaml2obj %s -o %t -DABI_VERSION=1 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013
				# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=1 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013 -DFLAG_VALUE=0x42

				# RUN: yaml2obj %s -o %t -DABI_VERSION=2 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013
				# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=2 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1013 -DFLAG_VALUE=0x42

	# RUN: yaml2obj %s -o %t -DABI_VERSION=0 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030			# RUN: yaml2obj %s -o %t -DABI_VERSION=0 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=0 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=0 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36

	# RUN: yaml2obj %s -o %t -DABI_VERSION=1 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030			# RUN: yaml2obj %s -o %t -DABI_VERSION=1 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=1 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=1 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36

	# RUN: yaml2obj %s -o %t -DABI_VERSION=2 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030			# RUN: yaml2obj %s -o %t -DABI_VERSION=2 -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030
	# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=2 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36			# RUN: llvm-readobj -h %t \| FileCheck %s --check-prefixes=ALL,KNOWN-ABI-VERSION,SINGLE-FLAG --match-full-lines -DABI_VERSION=2 -DFILE=%t -DFLAG_NAME=EF_AMDGPU_MACH_AMDGCN_GFX1030 -DFLAG_VALUE=0x36
	...

llvm/tools/llvm-readobj/ELFDumper.cpp

static const EnumEntry<unsigned> ElfHeaderAMDGPUFlagsABIVersion3[] = {
...
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX906),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX906),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX908),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX908),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX909),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX909),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90A),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90A),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90C),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90C),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1010),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1010),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1011),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1011),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1012),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1012),
		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1013),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1032),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1032),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1033),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1033),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1034),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1034),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_V3),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_V3),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_SRAMECC_V3)		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_SRAMECC_V3)
};		};
...
static const EnumEntry<unsigned> ElfHeaderAMDGPUFlagsABIVersion4[] = {
...
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX906),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX906),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX908),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX908),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX909),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX909),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90A),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90A),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90C),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX90C),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1010),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1010),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1011),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1011),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1012),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1012),
		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1013),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1032),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1032),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1033),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1033),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1034),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1034),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_ANY_V4),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_ANY_V4),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_OFF_V4),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_OFF_V4),
LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_ON_V4),		LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_FEATURE_XNACK_ON_V4),
...

openmp/libomptarget/plugins/amdgpu/impl/get_elf_mach_gfx_name.cpp

const char *get_elf_mach_gfx_name(uint32_t EFlags) {
...
case EF_AMDGPU_MACH_AMDGCN_GFX90C:		case EF_AMDGPU_MACH_AMDGCN_GFX90C:
return "gfx90c";		return "gfx90c";
case EF_AMDGPU_MACH_AMDGCN_GFX1010:		case EF_AMDGPU_MACH_AMDGCN_GFX1010:
return "gfx1010";		return "gfx1010";
case EF_AMDGPU_MACH_AMDGCN_GFX1011:		case EF_AMDGPU_MACH_AMDGCN_GFX1011:
return "gfx1011";		return "gfx1011";
case EF_AMDGPU_MACH_AMDGCN_GFX1012:		case EF_AMDGPU_MACH_AMDGCN_GFX1012:
return "gfx1012";		return "gfx1012";
		case EF_AMDGPU_MACH_AMDGCN_GFX1013:
		return "gfx1013";
case EF_AMDGPU_MACH_AMDGCN_GFX1030:		case EF_AMDGPU_MACH_AMDGCN_GFX1030:
return "gfx1030";		return "gfx1030";
case EF_AMDGPU_MACH_AMDGCN_GFX1031:		case EF_AMDGPU_MACH_AMDGCN_GFX1031:
return "gfx1031";		return "gfx1031";
case EF_AMDGPU_MACH_AMDGCN_GFX1032:		case EF_AMDGPU_MACH_AMDGCN_GFX1032:
return "gfx1032";		return "gfx1032";
case EF_AMDGPU_MACH_AMDGCN_GFX1033:		case EF_AMDGPU_MACH_AMDGCN_GFX1033:
return "gfx1033";		return "gfx1033";
case EF_AMDGPU_MACH_AMDGCN_GFX1034:		case EF_AMDGPU_MACH_AMDGCN_GFX1034:
return "gfx1034";		return "gfx1034";
default:		default:
return "--unknown gfx";		return "--unknown gfx";
}		}
}		}
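
The lookup above can be tried outside the OpenMP plugin with a small standalone sketch. The numeric machine values are the ones the tests in this revision check for (for example `0x42` for gfx1013). The name `gfxNameFromEFlags` and the constant `MachMask` are hypothetical. Masking with `0x0ff` before the switch is an assumption about how the machine field is separated from the feature bits in `e_flags`, and only the gfx10 entries are reproduced.

```cpp
#include <cstdint>
#include <string>

// Assumed mask for the machine field of e_flags (illustrative constant,
// believed to correspond to EF_AMDGPU_MACH in llvm/BinaryFormat/ELF.h).
constexpr uint32_t MachMask = 0x0ff;

// Maps the machine field of e_flags to a processor name. The values are
// taken from the FileCheck lines in this revision; only gfx10 is covered.
inline std::string gfxNameFromEFlags(uint32_t EFlags) {
  switch (EFlags & MachMask) {
  case 0x33: return "gfx1010";
  case 0x34: return "gfx1011";
  case 0x35: return "gfx1012";
  case 0x42: return "gfx1013"; // new in this revision
  case 0x36: return "gfx1030";
  case 0x37: return "gfx1031";
  case 0x38: return "gfx1032";
  case 0x39: return "gfx1033";
  case 0x3E: return "gfx1034";
  default:   return "--unknown gfx";
  }
}
```

Note that gfx1013 takes the next free value rather than one adjacent to gfx1012, so a lookup keyed on numeric ranges would misclassify it; an explicit case per machine avoids that.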

[AMDGPU] Add gfx1013 target (Closed, Public)

Revision Contents

Diff 350640

clang/include/clang/Basic/Cuda.h

clang/lib/Basic/Cuda.cpp

clang/lib/Basic/Targets/AMDGPU.cpp

clang/lib/Basic/Targets/NVPTX.cpp

clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp

clang/test/CodeGenOpenCL/amdgpu-features.cl

clang/test/Driver/amdgpu-macros.cl

clang/test/Driver/amdgpu-mcpu.cl

clang/test/Misc/target-invalid-cpu-note.c

llvm/docs/AMDGPUUsage.rst

llvm/include/llvm/BinaryFormat/ELF.h

llvm/include/llvm/Support/TargetParser.h

llvm/lib/Object/ELFObjectFile.cpp

llvm/lib/ObjectYAML/ELFYAML.cpp

llvm/lib/Support/TargetParser.cpp

llvm/lib/Target/AMDGPU/AMDGPU.td

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp

llvm/lib/Target/AMDGPU/GCNProcessors.td

llvm/lib/Target/AMDGPU/GCNSubtarget.h

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp

llvm/lib/Target/AMDGPU/MIMGInstructions.td

llvm/lib/Target/AMDGPU/SIISelLowering.cpp

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll

llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll

llvm/test/CodeGen/AMDGPU/elf-header-flags-mach.ll

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.intersect_ray.ll

llvm/test/MC/AMDGPU/dl-insts-err.s

llvm/test/MC/AMDGPU/gfx10_unsupported.s

llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml

llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll

llvm/test/tools/llvm-readobj/ELF/amdgpu-elf-headers.test

llvm/tools/llvm-readobj/ELFDumper.cpp

openmp/libomptarget/plugins/amdgpu/impl/get_elf_mach_gfx_name.cpp
