This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU - Add diagnostic for compiling modules with AMD HSA OS type and GFX 6 arch
Abandoned · Public

Authored by pvellien on Nov 25 2020, 10:17 AM.

Details

Reviewers

rampitec
arsenm
sameerds
t-tye

Summary

Bail out of compiling modules for GFX6 with the AMD HSA OS type, as HSA is not supported on SI ASICs. Currently the gfx6 + HSA setup crashes during ISel for global loads/stores due to the lack of FLAT instructions. This patch adds a check that reports an error when modules are compiled with -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 and exits the compilation pipeline.

Diff Detail

Event Timeline

pvellien created this revision. Nov 25 2020, 10:17 AM

Herald added projects: Restricted Project, Restricted Project. Nov 25 2020, 10:17 AM

Herald added subscribers: llvm-commits, cfe-commits, kerbowa and 9 others.

pvellien requested review of this revision. Nov 25 2020, 10:17 AM

Herald added a subscriber: wdng. Nov 25 2020, 10:17 AM

You need to add a new test for this new error.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
134	"do not support". I would also drop "(SI)" from the message. Maybe even better just "GFX6 does not support AMD HSA".
307	Please keep original formatting.
llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll
1	You probably just need to change triple for these targets, not just drop them from the test.
llvm/test/CodeGen/AMDGPU/lower-kernargs-si-mesa.ll
3	There are no such checks?

Harbormaster completed remote builds in B80118: Diff 307650. Nov 25 2020, 10:50 AM

Updated with Stanislav's comments.

t-tye requested changes to this revision. Nov 27 2020, 9:43 AM

t-tye added inline comments.

llvm/docs/AMDGPUUsage.rst
2109–2112	This is not the right place for mentioning this. The Processor table would likely be a better place. It should be in terms of supporting the amdhsa ABI.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
62–72	I am not clear what this function is doing. It seems to be returning a generation unrelated to the actual target generation to satisfy the one place it is called. If the target is not SEA_ISLANDS it seems incorrect to be returning SEA_ISLANDS. If this function is doing something special for the one place it is called, perhaps it should be expanded there?
134	Make the message include the full target triple text so the user understands how to resolve the issue. For example:
	The target triple %s is not supported: the processor %s does not support the amdhsa OS
	Do the r600 targets also produce a similar error message?
	Is this really the right test? My understanding is that the issue is not that gfx60x does not support the amdhsa OS, but that it does not use the FLAT address space. My understanding is that the current problem is that FLAT instructions are being used for GLOBAL address space accesses. The use of FLAT instructions for the global address space was introduced after gfx60x was initially supported on amdhsa. Originally, BUFFER instructions that use an SRD with a 0 base and marked as addr64 were used for GLOBAL address space accesses. This was changed to use FLAT instructions because later targets dropped the SRD addr64 support. I suspect it is that change that broke gfx60x, as there were no tests to catch it.
	So the real fix seems to be to find that change and make the code still use BUFFER instructions for gfx60x and FLAT instructions for gfx70x+. The tests can then be updated to test gfx60x for amdhsa but omit the FLAT address space tests. The error would then indicate that gfx60x does not support the FLAT address space (and that is not conditional on the OS).
	The documentation in AMDGPUUsage can state in the Address Space section that gfx60x does not support the FLAT address space. The Processor table can add a column for processor characteristics and mention that the gfx60x targets do not support the FLAT address space.
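The message format suggested in the comment above can be sketched as follows. This is a minimal standalone illustration, not code from the patch: the helper name `unsupportedTripleMessage` is hypothetical, and plain `std::string` concatenation stands in for whatever LLVM diagnostic facility a real change would use.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LLVM): fills the two %s placeholders of
// the message proposed in the review with the triple and processor names.
std::string unsupportedTripleMessage(const std::string &Triple,
                                     const std::string &Processor) {
  assert(!Triple.empty() && !Processor.empty() && "both names are required");
  return "The target triple " + Triple +
         " is not supported: the processor " + Processor +
         " does not support the amdhsa OS";
}
```

Calling it with `"amdgcn-amd-amdhsa"` and `"gfx600"` yields a message that names both the triple and the processor, which is the point of the suggestion: the user can see which half of the combination to change.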

This revision now requires changes to proceed. Nov 27 2020, 9:43 AM

pvellien added inline comments. Nov 27 2020, 10:40 AM

llvm/docs/AMDGPUUsage.rst
2109–2112	Thanks for your feedback, I will update this.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
134	Previously, in the internal review process, it was mentioned that gfx60x does not support HSA, and it was agreed to add a diagnostic reporting that GFX6 does not support the HSA OS type. @rampitec mentioned that SI ASICs cannot support HSA because we are not able to map memory on SI as HSA requires, so the user will just get weird runtime failures. But based on your comment it seems we have to use MUBUF instructions for the -mtriple=amdgcn-amd-amdhsa -mcpu=gfx60x combination and FLAT instructions for -mtriple=amdgcn-amd-amdhsa -mcpu=gfx70x+. Is my understanding correct? If the compiler emits MUBUF instructions for global address space accesses, is it still required to produce the error message?

t-tye added inline comments. Nov 27 2020, 11:36 AM

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
134	In the early days of implementing HSA I believe we were bringing up on gfx6. It could not support all HSA features, but it did function with the parts it could support. So I was suggesting we restore the code to support what it did originally. That would mean using MUBUF for the GLOBAL address space like it used to do (is that code still present?). The compiler can then report errors for the features it cannot support, which in this case is it cannot support instruction selection of the GENERIC address space on gfx6. If you could find the commit that switched to using FLAT instructions to access the GLOBAL address space that will likely provide the necessary information to decide the best thing to do for this issue.
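The selection rule described in this comment can be modelled in a few lines. This is a simplified sketch under stated assumptions, not LLVM code: the enums and the function `selectMemInst` are hypothetical stand-ins, only two address spaces are modelled, and "FLAT for everything on gfx70x+" follows the proposal in the discussion rather than the backend's actual feature-dependent defaults.

```cpp
#include <cassert>

// Simplified stand-ins for this sketch only; they are not LLVM's own enums.
enum class Generation { Gfx6, Gfx7Plus };
enum class AddressSpace { Global, Generic };
enum class MemInst { Buffer, Flat, Unsupported };

// Picks the instruction family for a memory access, following the rule from
// the discussion: gfx6 has no FLAT instructions, so GLOBAL accesses use
// BUFFER (MUBUF) and GENERIC accesses cannot be selected at all.
MemInst selectMemInst(Generation Gen, AddressSpace AS) {
  if (Gen != Generation::Gfx6)
    return MemInst::Flat;
  if (AS == AddressSpace::Generic)
    return MemInst::Unsupported; // a compiler would report an error here
  assert(AS == AddressSpace::Global && "only two address spaces are modelled");
  return MemInst::Buffer;
}
```

The design point is that the error attaches to the unsupported feature (GENERIC accesses on gfx6) rather than to the OS in the triple, so gfx6 modules that only touch the GLOBAL address space would still compile.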

pvellien added inline comments. Nov 27 2020, 11:50 AM

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
134	The code to select MUBUF instructions for the global address space is still present. In fact, my first patch for this issue generated MUBUF instructions instead of reporting an error, but it was rejected because SI ASICs do not support HSA. The patch that switched to using FLAT instructions for global is https://reviews.llvm.org/D15543. So is the new approach to turn off the FlatForGlobal flag for the -mtriple=amdgcn-amd-amdhsa -mcpu=gfx60x combination and generate MUBUF instructions instead? It would be very helpful to know what the expectation is :) or whether we should wait for a comment from @rampitec. Btw, thanks a lot for your feedback.

rampitec added inline comments. Dec 7 2020, 1:01 PM

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
62–72	It is probably better to just print an error if amdhsa is requested but no compatible -mcpu is specified (e.g. when we have generic/tahiti etc).
134	We can try to use MUBUF for global, but we certainly cannot support HSA and/or generic pointers on SI.
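The alternative proposed for lines 62–72 (reject amdhsa when no compatible -mcpu is given) could look roughly like the sketch below. It is a standalone illustration with hypothetical function names, and it works on plain strings instead of LLVM's `Triple` and subtarget classes. The CPU list is taken from the gfx600/gfx601/gfx602 RUN lines of directive-amdgcn-target.ll in this revision, plus "generic" as mentioned in the comment; it is an assumption that this list is complete.

```cpp
#include <cassert>
#include <string>

// Returns true for CPU names that this review treats as GFX6, plus
// "generic". Hypothetical helper for illustration only.
bool isGfx6OrGenericCpu(const std::string &Cpu) {
  static const char *const Names[] = {"generic", "gfx600", "tahiti",
                                      "gfx601",  "pitcairn", "verde",
                                      "gfx602",  "hainan",   "oland"};
  for (const char *Name : Names)
    if (Cpu == Name)
      return true;
  return false;
}

// True when the OS component of the triple is amdhsa but the requested CPU
// cannot support it, i.e. the case where an error would be printed.
// Example: assert(shouldRejectAmdHsa("amdhsa", "tahiti"));
bool shouldRejectAmdHsa(const std::string &OS, const std::string &Cpu) {
  return OS == "amdhsa" && isGfx6OrGenericCpu(Cpu);
}
```

Compared with returning SEA_ISLANDS for a "generic" CPU, a check of this shape keeps the reported generation truthful and moves the policy decision into a single, explicit diagnostic.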

This change is wrong; a different patch has landed in LLVM to handle global address space accesses on gfx60x for the HSA OS. So closing it.

Revision Contents

Path	Size
clang/test/CodeGenOpenCL/amdgpu-attrs.cl	2 lines
llvm/docs/AMDGPUUsage.rst	3 lines
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp	18 lines
llvm/test/Analysis/DivergenceAnalysis/AMDGPU/inline-asm.ll	2 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-and.mir	2 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-or.mir	2 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-xor.mir	2 lines
llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll	22 lines
llvm/test/CodeGen/AMDGPU/flat-error-unsupported-gpu-hsa.ll	1 line
llvm/test/CodeGen/AMDGPU/gfx6-amdhsa-noflat.ll	9 lines
llvm/test/CodeGen/AMDGPU/lower-kernargs-si-mesa.ll	15 lines
llvm/test/CodeGen/AMDGPU/lower-kernargs.ll	19 lines

Diff 308015

clang/test/CodeGenOpenCL/amdgpu-attrs.cl

	// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu tahiti -O0 -emit-llvm -o - %s | FileCheck %s			// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu kaveri -O0 -emit-llvm -o - %s | FileCheck %s
	// RUN: %clang_cc1 -triple amdgcn-- -target-cpu tahiti -O0 -emit-llvm -o - %s | FileCheck %s -check-prefix=NONAMDHSA			// RUN: %clang_cc1 -triple amdgcn-- -target-cpu tahiti -O0 -emit-llvm -o - %s | FileCheck %s -check-prefix=NONAMDHSA
	// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O0 -emit-llvm -verify -o - %s | FileCheck -check-prefix=X86 %s			// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O0 -emit-llvm -verify -o - %s | FileCheck -check-prefix=X86 %s

	__attribute__((amdgpu_flat_work_group_size(0, 0))) // expected-no-diagnostics			__attribute__((amdgpu_flat_work_group_size(0, 0))) // expected-no-diagnostics
	kernel void flat_work_group_size_0_0() {}			kernel void flat_work_group_size_0_0() {}
	__attribute__((amdgpu_waves_per_eu(0))) // expected-no-diagnostics			__attribute__((amdgpu_waves_per_eu(0))) // expected-no-diagnostics
	kernel void waves_per_eu_0() {}			kernel void waves_per_eu_0() {}
	__attribute__((amdgpu_waves_per_eu(0, 0))) // expected-no-diagnostics			__attribute__((amdgpu_waves_per_eu(0, 0))) // expected-no-diagnostics
	[185 lines not shown]

llvm/docs/AMDGPUUsage.rst

[2,100 lines not shown]	Where:

- ``<Target Features>`` is a list of the enabled Target Features		- ``<Target Features>`` is a list of the enabled Target Features
(see :ref:`amdgpu-target-features`), each prefixed by a plus, that		(see :ref:`amdgpu-target-features`), each prefixed by a plus, that
apply to Processor. The list must be in the same order as listed		apply to Processor. The list must be in the same order as listed
in the table :ref:`amdgpu-target-feature-table`. Note that *Target		in the table :ref:`amdgpu-target-feature-table`. Note that *Target
Features* must be included in the list if they are enabled even if		Features* must be included in the list if they are enabled even if
that is the default for Processor.		that is the default for Processor.

		Caution:
		AMD HSA Os is not supported in Southern Islands (GFX6) ASICs.

For example:		For example:

``"amdgcn-amd-amdhsa--gfx902+xnack"``		``"amdgcn-amd-amdhsa--gfx902+xnack"``

.. _amdgpu-amdhsa-code-object-metadata:		.. _amdgpu-amdhsa-code-object-metadata:

Code Object Metadata		Code Object Metadata
~~~~~~~~~~~~~~~~~~~~		~~~~~~~~~~~~~~~~~~~~

[6,899 lines not shown]

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp

[53 lines not shown]	static cl::opt<bool> EnableFlatScratch(
"amdgpu-enable-flat-scratch",		"amdgpu-enable-flat-scratch",
cl::desc("Use flat scratch instructions"),		cl::desc("Use flat scratch instructions"),
cl::init(false));		cl::init(false));

static cl::opt<bool> UseAA("amdgpu-use-aa-in-codegen",		static cl::opt<bool> UseAA("amdgpu-use-aa-in-codegen",
cl::desc("Enable the use of AA during codegen."),		cl::desc("Enable the use of AA during codegen."),
cl::init(true));		cl::init(true));

		static AMDGPUSubtarget::Generation initializeGen(const Triple &TT,
		StringRef GPU) {
		if (GPU.contains("generic")) {
		return TT.getOS() == Triple::AMDHSA
		? AMDGPUSubtarget::Generation::SEA_ISLANDS
		: AMDGPUSubtarget::Generation::SOUTHERN_ISLANDS;
		} else {
		return AMDGPUSubtarget::Generation::SOUTHERN_ISLANDS;
		}
		}

GCNSubtarget::~GCNSubtarget() = default;		GCNSubtarget::~GCNSubtarget() = default;

R600Subtarget &		R600Subtarget &
R600Subtarget::initializeSubtargetDependencies(const Triple &TT,		R600Subtarget::initializeSubtargetDependencies(const Triple &TT,
StringRef GPU, StringRef FS) {		StringRef GPU, StringRef FS) {
SmallString<256> FullFS("+promote-alloca,");		SmallString<256> FullFS("+promote-alloca,");
FullFS += FS;		FullFS += FS;
ParseSubtargetFeatures(GPU, /*TuneCPU*/ GPU, FullFS);		ParseSubtargetFeatures(GPU, /*TuneCPU*/ GPU, FullFS);
[43 lines not shown]	GCNSubtarget::initializeSubtargetDependencies(const Triple &TT,

// Unless +-flat-for-global is specified, turn on FlatForGlobal for all OS-es		// Unless +-flat-for-global is specified, turn on FlatForGlobal for all OS-es
// on VI and newer hardware to avoid assertion failures due to missing ADDR64		// on VI and newer hardware to avoid assertion failures due to missing ADDR64
// variants of MUBUF instructions.		// variants of MUBUF instructions.
if (!hasAddr64() && !FS.contains("flat-for-global")) {		if (!hasAddr64() && !FS.contains("flat-for-global")) {
FlatForGlobal = true;		FlatForGlobal = true;
}		}

		// bail out from compilation for HSA OS type in GFX6
		if (isAmdHsaOS() && getGeneration() == AMDGPUSubtarget::SOUTHERN_ISLANDS) {
		report_fatal_error("GFX6 do not support AMD HSA");
		}

// Set defaults if needed.		// Set defaults if needed.
if (MaxPrivateElementSize == 0)		if (MaxPrivateElementSize == 0)
MaxPrivateElementSize = 4;		MaxPrivateElementSize = 4;

if (LDSBankCount == 0)		if (LDSBankCount == 0)
LDSBankCount = 32;		LDSBankCount = 32;

if (TT.getArch() == Triple::amdgcn) {		if (TT.getArch() == Triple::amdgcn) {
[48 lines not shown]	AMDGPUSubtarget::AMDGPUSubtarget(const Triple &TT) :
WavefrontSizeLog2(0)		WavefrontSizeLog2(0)
{ }		{ }

GCNSubtarget::GCNSubtarget(const Triple &TT, StringRef GPU, StringRef FS,		GCNSubtarget::GCNSubtarget(const Triple &TT, StringRef GPU, StringRef FS,
const GCNTargetMachine &TM) :		const GCNTargetMachine &TM) :
AMDGPUGenSubtargetInfo(TT, GPU, /*TuneCPU*/ GPU, FS),		AMDGPUGenSubtargetInfo(TT, GPU, /*TuneCPU*/ GPU, FS),
AMDGPUSubtarget(TT),		AMDGPUSubtarget(TT),
TargetTriple(TT),		TargetTriple(TT),
Gen(TT.getOS() == Triple::AMDHSA ? SEA_ISLANDS : SOUTHERN_ISLANDS),		Gen(initializeGen(TT, GPU)),
InstrItins(getInstrItineraryForCPU(GPU)),		InstrItins(getInstrItineraryForCPU(GPU)),
LDSBankCount(0),		LDSBankCount(0),
MaxPrivateElementSize(0),		MaxPrivateElementSize(0),

FastFMAF32(false),		FastFMAF32(false),
FastDenormalF32(false),		FastDenormalF32(false),
HalfRate64Ops(false),		HalfRate64Ops(false),

[89 lines not shown]	GCNSubtarget::GCNSubtarget(const Triple &TT, StringRef GPU, StringRef FS,
HasImageGather4D16Bug(false),		HasImageGather4D16Bug(false),

FeatureDisable(false),		FeatureDisable(false),
InstrInfo(initializeSubtargetDependencies(TT, GPU, FS)),		InstrInfo(initializeSubtargetDependencies(TT, GPU, FS)),
TLInfo(TM, *this),		TLInfo(TM, *this),
FrameLowering(TargetFrameLowering::StackGrowsUp, getStackAlignment(), 0) {		FrameLowering(TargetFrameLowering::StackGrowsUp, getStackAlignment(), 0) {
MaxWavesPerEU = AMDGPU::IsaInfo::getMaxWavesPerEU(this);		MaxWavesPerEU = AMDGPU::IsaInfo::getMaxWavesPerEU(this);
CallLoweringInfo.reset(new AMDGPUCallLowering(*getTargetLowering()));		CallLoweringInfo.reset(new AMDGPUCallLowering(*getTargetLowering()));
InlineAsmLoweringInfo.reset(new InlineAsmLowering(getTargetLowering()));		InlineAsmLoweringInfo.reset(new InlineAsmLowering(getTargetLowering()));
Legalizer.reset(new AMDGPULegalizerInfo(*this, TM));		Legalizer.reset(new AMDGPULegalizerInfo(*this, TM));
RegBankInfo.reset(new AMDGPURegisterBankInfo(*this));		RegBankInfo.reset(new AMDGPURegisterBankInfo(*this));
InstSelector.reset(new AMDGPUInstructionSelector(		InstSelector.reset(new AMDGPUInstructionSelector(
this, static_cast<AMDGPURegisterBankInfo *>(RegBankInfo.get()), TM));		this, static_cast<AMDGPURegisterBankInfo *>(RegBankInfo.get()), TM));
}		}

bool GCNSubtarget::enableFlatScratch() const {		bool GCNSubtarget::enableFlatScratch() const {
return EnableFlatScratch && hasFlatScratchInsts();		return EnableFlatScratch && hasFlatScratchInsts();
[642 lines not shown]

llvm/test/Analysis/DivergenceAnalysis/AMDGPU/inline-asm.ll

	; RUN: opt -mtriple=amdgcn-unknown-amdhsa -mcpu=tahiti -analyze -divergence -use-gpu-divergence-analysis %s | FileCheck %s			; RUN: opt -mtriple=amdgcn-unknown- -mcpu=tahiti -analyze -divergence -use-gpu-divergence-analysis %s | FileCheck %s
	; RUN: opt -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx908 -analyze -divergence -use-gpu-divergence-analysis %s | FileCheck %s			; RUN: opt -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx908 -analyze -divergence -use-gpu-divergence-analysis %s | FileCheck %s
	; Make sure nothing crashes on targets with or without AGPRs			; Make sure nothing crashes on targets with or without AGPRs

	; CHECK: Printing analysis 'Legacy Divergence Analysis' for function 'inline_asm_1_sgpr_virtreg_output':			; CHECK: Printing analysis 'Legacy Divergence Analysis' for function 'inline_asm_1_sgpr_virtreg_output':
	; CHECK-NOT: DIVERGENT			; CHECK-NOT: DIVERGENT
	define i32 @inline_asm_1_sgpr_virtreg_output() {			define i32 @inline_asm_1_sgpr_virtreg_output() {
	%sgpr = call i32 asm "s_mov_b32 $0, 0", "=s"()			%sgpr = call i32 asm "s_mov_b32 $0, 0", "=s"()
	ret i32 %sgpr			ret i32 %sgpr
	[99 lines not shown]

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-and.mir

	# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd- -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s

	---			---

	name: and_s1_vcc_vcc_vcc			name: and_s1_vcc_vcc_vcc
	legalized: true			legalized: true
	regBankSelected: true			regBankSelected: true
	[526 lines not shown]

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-or.mir

	# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd- -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s

	---			---

	name: or_s1_vcc_vcc_vcc			name: or_s1_vcc_vcc_vcc
	legalized: true			legalized: true
	regBankSelected: true			regBankSelected: true
	[497 lines not shown]

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-xor.mir

	# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd- -mcpu=tahiti -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -run-pass=instruction-select -verify-machineinstrs -global-isel-abort=0 -o - %s | FileCheck -check-prefix=WAVE64 %s
	# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s			# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr="+wavefrontsize32" -run-pass=instruction-select -global-isel-abort=0 -verify-machineinstrs -o - %s | FileCheck -check-prefix=WAVE32 %s

	---			---

	name: xor_s1_vcc_vcc_vcc			name: xor_s1_vcc_vcc_vcc
	legalized: true			legalized: true
	regBankSelected: true			regBankSelected: true
	[498 lines not shown]

llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 < %s | FileCheck --check-prefixes=GFX600 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=gfx600 < %s | FileCheck --check-prefixes=GFX600 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=tahiti < %s | FileCheck --check-prefixes=GFX600 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=tahiti < %s | FileCheck --check-prefixes=GFX600 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx601 < %s | FileCheck --check-prefixes=GFX601 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=gfx601 < %s | FileCheck --check-prefixes=GFX601 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=pitcairn < %s | FileCheck --check-prefixes=GFX601 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=pitcairn < %s | FileCheck --check-prefixes=GFX601 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=verde < %s | FileCheck --check-prefixes=GFX601 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=verde < %s | FileCheck --check-prefixes=GFX601 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx602 < %s | FileCheck --check-prefixes=GFX602 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=gfx602 < %s | FileCheck --check-prefixes=GFX602 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=hainan < %s | FileCheck --check-prefixes=GFX602 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=hainan < %s | FileCheck --check-prefixes=GFX602 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=oland < %s | FileCheck --check-prefixes=GFX602 %s			; RUN: llc -mtriple=amdgcn-amd- -mcpu=oland < %s | FileCheck --check-prefixes=GFX602 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 < %s | FileCheck --check-prefixes=GFX700 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 < %s | FileCheck --check-prefixes=GFX700 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kaveri < %s | FileCheck --check-prefixes=GFX700 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kaveri < %s | FileCheck --check-prefixes=GFX700 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx701 < %s | FileCheck --check-prefixes=GFX701 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx701 < %s | FileCheck --check-prefixes=GFX701 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=hawaii < %s | FileCheck --check-prefixes=GFX701 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=hawaii < %s | FileCheck --check-prefixes=GFX701 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx702 < %s | FileCheck --check-prefixes=GFX702 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx702 < %s | FileCheck --check-prefixes=GFX702 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx703 < %s | FileCheck --check-prefixes=GFX703 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx703 < %s | FileCheck --check-prefixes=GFX703 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kabini < %s | FileCheck --check-prefixes=GFX703 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kabini < %s | FileCheck --check-prefixes=GFX703 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=mullins < %s | FileCheck --check-prefixes=GFX703 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=mullins < %s | FileCheck --check-prefixes=GFX703 %s
	[22 lines not shown]
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx902 -mattr=-xnack < %s | FileCheck --check-prefixes=NO-XNACK-GFX902 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx902 -mattr=-xnack < %s | FileCheck --check-prefixes=NO-XNACK-GFX902 %s

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx904 -mattr=+sram-ecc < %s | FileCheck --check-prefixes=SRAM-ECC-GFX904 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx904 -mattr=+sram-ecc < %s | FileCheck --check-prefixes=SRAM-ECC-GFX904 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -mattr=+sram-ecc < %s | FileCheck --check-prefixes=SRAM-ECC-GFX906 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -mattr=+sram-ecc < %s | FileCheck --check-prefixes=SRAM-ECC-GFX906 %s

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx904 -mattr=+sram-ecc,+xnack < %s | FileCheck --check-prefixes=SRAM-ECC-XNACK-GFX904 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx904 -mattr=+sram-ecc,+xnack < %s | FileCheck --check-prefixes=SRAM-ECC-XNACK-GFX904 %s
	; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -mattr=+sram-ecc,+xnack < %s | FileCheck --check-prefixes=SRAM-ECC-XNACK-GFX906 %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -mattr=+sram-ecc,+xnack < %s | FileCheck --check-prefixes=SRAM-ECC-XNACK-GFX906 %s

	; GFX600: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"			; GFX600: .amd_amdgpu_isa "amdgcn-amd-unknown--gfx600"
	; GFX601: .amdgcn_target "amdgcn-amd-amdhsa--gfx601"			; GFX601: .amd_amdgpu_isa "amdgcn-amd-unknown--gfx601"
	; GFX602: .amdgcn_target "amdgcn-amd-amdhsa--gfx602"			; GFX602: .amd_amdgpu_isa "amdgcn-amd-unknown--gfx602"
	; GFX700: .amdgcn_target "amdgcn-amd-amdhsa--gfx700"			; GFX700: .amdgcn_target "amdgcn-amd-amdhsa--gfx700"
	; GFX701: .amdgcn_target "amdgcn-amd-amdhsa--gfx701"			; GFX701: .amdgcn_target "amdgcn-amd-amdhsa--gfx701"
	; GFX702: .amdgcn_target "amdgcn-amd-amdhsa--gfx702"			; GFX702: .amdgcn_target "amdgcn-amd-amdhsa--gfx702"
	; GFX703: .amdgcn_target "amdgcn-amd-amdhsa--gfx703"			; GFX703: .amdgcn_target "amdgcn-amd-amdhsa--gfx703"
	; GFX704: .amdgcn_target "amdgcn-amd-amdhsa--gfx704"			; GFX704: .amdgcn_target "amdgcn-amd-amdhsa--gfx704"
	; GFX705: .amdgcn_target "amdgcn-amd-amdhsa--gfx705"			; GFX705: .amdgcn_target "amdgcn-amd-amdhsa--gfx705"
	; GFX801: .amdgcn_target "amdgcn-amd-amdhsa--gfx801+xnack"			; GFX801: .amdgcn_target "amdgcn-amd-amdhsa--gfx801+xnack"
	; GFX802: .amdgcn_target "amdgcn-amd-amdhsa--gfx802"			; GFX802: .amdgcn_target "amdgcn-amd-amdhsa--gfx802"
	[... 20 lines not shown ...]

llvm/test/CodeGen/AMDGPU/flat-error-unsupported-gpu-hsa.ll

	; RUN: not --crash llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 -filetype=obj -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERROR %s
	; RUN: not --crash llc -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx600 -filetype=obj -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERROR %s			; RUN: not --crash llc -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx600 -filetype=obj -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERROR %s

	; RUN: llc -mtriple=amdgcn-amd-amdhsa -o - %s | FileCheck -check-prefix=HSA-DEFAULT %s			; RUN: llc -mtriple=amdgcn-amd-amdhsa -o - %s | FileCheck -check-prefix=HSA-DEFAULT %s
	; RUN: not --crash llc -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx600 -filetype=obj -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERROR %s			; RUN: not --crash llc -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx600 -filetype=obj -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERROR %s

	; Flat instructions should not select if the target device doesn't			; Flat instructions should not select if the target device doesn't
	; support them. The default device should be able to select for HSA.			; support them. The default device should be able to select for HSA.

	; ERROR: LLVM ERROR: Cannot select: {{0x[0-9,a-f]+|t[0-9]+}}: i32,ch = load<(volatile load 4 from %ir.flat.ptr.load)>			; ERROR: LLVM ERROR: Cannot select: {{0x[0-9,a-f]+|t[0-9]+}}: i32,ch = load<(volatile load 4 from %ir.flat.ptr.load)>
	; HSA-DEFAULT: flat_load_dword			; HSA-DEFAULT: flat_load_dword
	define amdgpu_kernel void @load_flat_i32(i32* %flat.ptr) {			define amdgpu_kernel void @load_flat_i32(i32* %flat.ptr) {
	%load = load volatile i32, i32* %flat.ptr, align 4			%load = load volatile i32, i32* %flat.ptr, align 4
	ret void			ret void
	}			}

llvm/test/CodeGen/AMDGPU/gfx6-amdhsa-noflat.ll

This file was added.

				; RUN: not --crash llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck -check-prefix=ERR %s
				; Report error for gfx6 and amdhsa
				; ERR: LLVM ERROR: GFX6 do not support AMD HSA

				define void @f(i32 addrspace(1)* %out) {
				store i32 0, i32 addrspace(1)* %out
				ret void
				}
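
For readers following the test above, the condition it exercises can be sketched in isolation. This is a minimal, self-contained illustration and not the code from AMDGPUSubtarget.cpp: the enums `Generation` and `OSType` and the function `checkHsaSupport` are hypothetical names introduced here, and the message string is copied from the ERR line of the test. It assumes a C++17 compiler for `std::optional`.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the subtarget state. These names are
// illustrative only and are not part of the LLVM API.
enum class Generation { GFX6, GFX7, GFX8, GFX9 };
enum class OSType { AMDHSA, Mesa3D, Unknown };

// Returns the diagnostic text when the generation/OS pair is rejected,
// or std::nullopt when compilation may proceed.
std::optional<std::string> checkHsaSupport(Generation Gen, OSType OS) {
  // Per the patch summary, GFX6 lacks FLAT instructions, so the
  // GFX6 + amdhsa combination is reported as an error up front.
  if (Gen == Generation::GFX6 && OS == OSType::AMDHSA)
    return std::string("GFX6 do not support AMD HSA");
  return std::nullopt;
}
```

In this sketch only the GFX6 + amdhsa pair yields a message; GFX6 with another OS type, or a later generation with amdhsa, yields no message and would continue through the pipeline.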

llvm/test/CodeGen/AMDGPU/lower-kernargs-si-mesa.ll

This file was added.

				; RUN: opt -mtriple=amdgcn-- -S -o - -amdgpu-lower-kernel-arguments %s | FileCheck -check-prefix=MESA %s

				target datalayout = "A5"
				rampitec (inline comment): There are no such checks?

				define amdgpu_kernel void @kern_lds_ptr_si(i32 addrspace(3)* %lds) #0 {
				; MESA-LABEL: @kern_lds_ptr_si(
				; MESA-NEXT: [[KERN_LDS_PTR_SI_KERNARG_SEGMENT:%.*]] = call nonnull align 16 dereferenceable(44) i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
				; MESA-NEXT: store i32 0, i32 addrspace(3)* [[LDS:%.*]], align 4
				; MESA-NEXT: ret void
				;
				store i32 0, i32 addrspace(3)* %lds, align 4
				ret void
				}

				attributes #0 = { nounwind "target-cpu"="tahiti" }

llvm/test/CodeGen/AMDGPU/lower-kernargs.ll

	[... 524 lines not shown ...]
	; MESA-NEXT: [[LDS_LOAD:%.*]] = load i32 addrspace(3)*, i32 addrspace(3)* addrspace(4)* [[LDS_KERNARG_OFFSET_CAST]], align 4, !invariant.load !0			; MESA-NEXT: [[LDS_LOAD:%.*]] = load i32 addrspace(3)*, i32 addrspace(3)* addrspace(4)* [[LDS_KERNARG_OFFSET_CAST]], align 4, !invariant.load !0
	; MESA-NEXT: store i32 0, i32 addrspace(3)* [[LDS_LOAD]], align 4			; MESA-NEXT: store i32 0, i32 addrspace(3)* [[LDS_LOAD]], align 4
	; MESA-NEXT: ret void			; MESA-NEXT: ret void
	;			;
	store i32 0, i32 addrspace(3)* %lds, align 4			store i32 0, i32 addrspace(3)* %lds, align 4
	ret void			ret void
	}			}

	define amdgpu_kernel void @kern_lds_ptr_si(i32 addrspace(3)* %lds) #2 {
	; HSA-LABEL: @kern_lds_ptr_si(
	; HSA-NEXT: [[KERN_LDS_PTR_SI_KERNARG_SEGMENT:%.*]] = call nonnull align 16 dereferenceable(8) i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
	; HSA-NEXT: [[LDS_KERNARG_OFFSET:%.*]] = getelementptr inbounds i8, i8 addrspace(4)* [[KERN_LDS_PTR_SI_KERNARG_SEGMENT]], i64 0
	; HSA-NEXT: [[LDS_KERNARG_OFFSET_CAST:%.*]] = bitcast i8 addrspace(4)* [[LDS_KERNARG_OFFSET]] to i32 addrspace(3)* addrspace(4)*
	; HSA-NEXT: [[LDS_LOAD:%.*]] = load i32 addrspace(3)*, i32 addrspace(3)* addrspace(4)* [[LDS_KERNARG_OFFSET_CAST]], align 16, !invariant.load !0
	; HSA-NEXT: store i32 0, i32 addrspace(3)* [[LDS_LOAD]], align 4
	; HSA-NEXT: ret void
	;
	; MESA-LABEL: @kern_lds_ptr_si(
	; MESA-NEXT: [[KERN_LDS_PTR_SI_KERNARG_SEGMENT:%.*]] = call nonnull align 16 dereferenceable(44) i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
	; MESA-NEXT: store i32 0, i32 addrspace(3)* [[LDS:%.*]], align 4
	; MESA-NEXT: ret void
	;
	store i32 0, i32 addrspace(3)* %lds, align 4
	ret void
	}

	define amdgpu_kernel void @kern_realign_i8_i8(i8 %arg0, i8 %arg1) #0 {			define amdgpu_kernel void @kern_realign_i8_i8(i8 %arg0, i8 %arg1) #0 {
	; HSA-LABEL: @kern_realign_i8_i8(			; HSA-LABEL: @kern_realign_i8_i8(
	; HSA-NEXT: [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT:%.*]] = call nonnull align 16 dereferenceable(4) i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()			; HSA-NEXT: [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT:%.*]] = call nonnull align 16 dereferenceable(4) i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
	; HSA-NEXT: [[ARG0_KERNARG_OFFSET_ALIGN_DOWN:%.*]] = getelementptr inbounds i8, i8 addrspace(4)* [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT]], i64 0			; HSA-NEXT: [[ARG0_KERNARG_OFFSET_ALIGN_DOWN:%.*]] = getelementptr inbounds i8, i8 addrspace(4)* [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT]], i64 0
	; HSA-NEXT: [[ARG0_KERNARG_OFFSET_ALIGN_DOWN_CAST:%.*]] = bitcast i8 addrspace(4)* [[ARG0_KERNARG_OFFSET_ALIGN_DOWN]] to i32 addrspace(4)*			; HSA-NEXT: [[ARG0_KERNARG_OFFSET_ALIGN_DOWN_CAST:%.*]] = bitcast i8 addrspace(4)* [[ARG0_KERNARG_OFFSET_ALIGN_DOWN]] to i32 addrspace(4)*
	; HSA-NEXT: [[TMP1:%.*]] = load i32, i32 addrspace(4)* [[ARG0_KERNARG_OFFSET_ALIGN_DOWN_CAST]], align 16, !invariant.load !0			; HSA-NEXT: [[TMP1:%.*]] = load i32, i32 addrspace(4)* [[ARG0_KERNARG_OFFSET_ALIGN_DOWN_CAST]], align 16, !invariant.load !0
	; HSA-NEXT: [[TMP2:%.*]] = trunc i32 [[TMP1]] to i8			; HSA-NEXT: [[TMP2:%.*]] = trunc i32 [[TMP1]] to i8
	; HSA-NEXT: [[ARG1_KERNARG_OFFSET_ALIGN_DOWN:%.*]] = getelementptr inbounds i8, i8 addrspace(4)* [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT]], i64 0			; HSA-NEXT: [[ARG1_KERNARG_OFFSET_ALIGN_DOWN:%.*]] = getelementptr inbounds i8, i8 addrspace(4)* [[KERN_REALIGN_I8_I8_KERNARG_SEGMENT]], i64 0
	[... 1,350 lines not shown ...]
	;			;
	%in = load i32, i32 addrspace(4)* %in.byref			%in = load i32, i32 addrspace(4)* %in.byref
	store i32 %in, i32 addrspace(1)* undef, align 4			store i32 %in, i32 addrspace(1)* undef, align 4
	ret void			ret void
	}			}

	attributes #0 = { nounwind "target-cpu"="kaveri" }			attributes #0 = { nounwind "target-cpu"="kaveri" }
	attributes #1 = { nounwind "target-cpu"="kaveri" "amdgpu-implicitarg-num-bytes"="40" }			attributes #1 = { nounwind "target-cpu"="kaveri" "amdgpu-implicitarg-num-bytes"="40" }
	attributes #2 = { nounwind "target-cpu"="tahiti" }

	; GCN: 0 = !{}			; GCN: 0 = !{}
	; GCN: !1 = !{i64 42}			; GCN: !1 = !{i64 42}
	; GCN: !2 = !{i64 128}			; GCN: !2 = !{i64 128}
	; GCN: !3 = !{i64 1024}			; GCN: !3 = !{i64 1024}

Revision Contents

Diff 308015

clang/test/CodeGenOpenCL/amdgpu-attrs.cl

llvm/docs/AMDGPUUsage.rst

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp

llvm/test/Analysis/DivergenceAnalysis/AMDGPU/inline-asm.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-and.mir

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-or.mir

llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-xor.mir

llvm/test/CodeGen/AMDGPU/directive-amdgcn-target.ll

llvm/test/CodeGen/AMDGPU/flat-error-unsupported-gpu-hsa.ll

llvm/test/CodeGen/AMDGPU/gfx6-amdhsa-noflat.ll

llvm/test/CodeGen/AMDGPU/lower-kernargs-si-mesa.ll

llvm/test/CodeGen/AMDGPU/lower-kernargs.ll