Commits

rGd7e03a5bd9f8: AMDGPU: Export workitem builtins
rC275030: AMDGPU: Export workitem builtins
rL275030: AMDGPU: Export workitem builtins

Diff Detail

Repository: rL LLVM

Event Timeline

jvesely updated this revision to Diff 57391. May 16 2016, 1:33 PM

jvesely retitled this revision to AMDGPU: Export target workitem related builtins.

jvesely updated this object.

jvesely added a reviewer: tstellarAMD.

jvesely set the repository for this revision to rL LLVM.

jvesely added subscribers: arsenm, llvm-commits.

Herald added a subscriber: kzhuravl. May 16 2016, 1:33 PM

jvesely added parent revisions: D20297: AMDGPU/SI: Add implicitarg.ptr intrinsic., D20298: AMDGPU/R600: Add get_global_offset_{x,y,z} intrinsic. May 16 2016, 1:33 PM

Commit message mentions LANGBUILTIN, but it is not used.

We don't need builtins for the ones that just read off an implicit offset from the kern arg pointer.
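To illustrate the point: once a kernel has the implicit-argument pointer, each such value is an ordinary load at a fixed byte offset, so no dedicated builtin is required. The sketch below models this on the host in C++. The offsets, the buffer layout, and the helper names `readImplicitArg` and `fillFakeImplicitArgs` are hypothetical; they are not taken from this patch or from the AMDGPU ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical byte offsets into the implicit-argument block.
// The real layout is defined by the runtime/ABI, not by this sketch.
constexpr std::size_t kGlobalOffsetX = 0;
constexpr std::size_t kGlobalOffsetY = 4;
constexpr std::size_t kGlobalOffsetZ = 8;

// Read a 32-bit value located `offset` bytes past `base`.
// memcpy is used to avoid alignment and strict-aliasing problems.
inline std::uint32_t readImplicitArg(const unsigned char *base,
                                     std::size_t offset) {
  std::uint32_t value;
  std::memcpy(&value, base + offset, sizeof(value));
  return value;
}

// Fill a fake implicit-argument block so the reader can be exercised
// without a GPU. `buf` must hold at least 12 bytes.
inline void fillFakeImplicitArgs(unsigned char *buf, std::uint32_t x,
                                 std::uint32_t y, std::uint32_t z) {
  std::memcpy(buf + kGlobalOffsetX, &x, sizeof(x));
  std::memcpy(buf + kGlobalOffsetY, &y, sizeof(y));
  std::memcpy(buf + kGlobalOffsetZ, &z, sizeof(z));
}
```

On the device, `base` would be the pointer returned by the implicitarg.ptr builtin; only that one builtin needs to be exported.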

I also had a partial patch for this that added custom codegen which added !range metadata for the maximum group sizes on the call sites. Do you want to try adding that?

arsenm added inline comments. May 16 2016, 1:46 PM

include/clang/Basic/BuiltinsAMDGPU.def
16–18	I would move these after the amdgcn and fix this to be r600-NI

use LANGBUILTIN
drop amdgcn intrinsics that are just reading off the parameter vector

add kernarg segment intrinsic

kernarg segment ptr returns a constant address space pointer

arsenm added inline comments. May 20 2016, 10:35 AM

include/clang/Basic/BuiltinsAMDGPU.def
88	These shouldn't be lang builtin, I just meant the commit message said so
test/CodeGenOpenCL/builtins-amdgcn.cl
295	range metadata check?

revert back to BUILTIN

jvesely marked an inline comment as done. May 22 2016, 4:47 PM

jvesely added inline comments.

include/clang/Basic/BuiltinsAMDGPU.def
88	sorry, it was kind of hard to parse, I mentioned LANGBUILTIN as a possible alternative. another alternative is to have clc specific builtins, e.g __builtin_clc_get_work_dim() __builtin_clc_get_global_offset(int) ... and implement them for every target
test/CodeGenOpenCL/builtins-amdgcn.cl
295	range metadata is not generated. if I understand the code correctly, we'd need to drop __builtin names from include/llvm/IR/IntrinsicsAMDGPU.td and emit each intrinsic manually in lib/CodeGen/CGBuiltin.cpp with added range metadata. I can take a look but I think it should be a separate patch

LGTM.

This revision is now accepted and ready to land. May 24 2016, 5:41 PM

I'll add range metadata to this patch. It should be easier to do it right away than to synch changes with llvm to avoid test failures.

jvesely added a parent revision: D20691: AMDGPU: Remove gcc builtin names from workitem intrinsics. May 26 2016, 11:36 AM

emit intrinsics manually and add range metadata.
not sure where to get authoritative source on global size/wg_id limits.
clover reports 256, but that seems bogus.

This revision is now accepted and ready to land. May 26 2016, 12:52 PM

it'd be nice if the accepted state reset after updating the diff...

I've been using 2048. That is the theoretical limit on SI. The limit might have decreased to 1024 on newer hardware

AMD OpenCL has only ever exposed 256 though. Supposedly there was no benefit to larger groups, and I think there might be some hardware bugs at 1024

In D20299#441400, @arsenm wrote:

AMD OpenCL has only ever exposed 256 though. Supposedly there was no benefit to larger groups, and I think there might be some hardware bugs at 1024

OK, let's stick with 256

only local size and wi id are range restricted

ids are [0,256), size is [1,257)
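The bracket notation above matches LLVM's `!range` metadata, which describes a half-open interval: the lower bound is included and the upper bound is excluded. A minimal C++ sketch of that convention, using the 256 limit discussed at this point in the review (the revision later settles on 1024); the type and constant names are illustrative only:

```cpp
#include <cassert>
#include <cstdint>

// Half-open interval [lo, hi): lo is included, hi is excluded.
struct HalfOpenRange {
  std::uint32_t lo;
  std::uint32_t hi;
  constexpr bool contains(std::uint32_t v) const {
    return v >= lo && v < hi;
  }
};

// With a maximum work-group size of 256 per dimension:
constexpr HalfOpenRange kLocalIdRange{0, 256};   // valid ids are 0..255
constexpr HalfOpenRange kLocalSizeRange{1, 257}; // valid sizes are 1..256
```

The size range starts at 1 because a work-group always has at least one work-item in each dimension, while ids start at 0.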

LGTM.

This revision is now accepted and ready to land. May 27 2016, 4:54 PM

drop "segment" from the implicitarg.ptr builtin

fixup test after renaming implicitarg builtin

jvesely added a parent revision: D21622: AMDGPU/R600: Add implicitarg.ptr intrinsic. Jun 25 2016, 6:42 PM

switch r600 to implicitarg.ptr

jvesely requested a review of this revision. Jul 1 2016, 2:01 PM

jvesely edited edge metadata.

tstellarAMD added inline comments. Jul 4 2016, 5:29 PM

lib/CodeGen/CGBuiltin.cpp
7648–7661	Are you sure 256 is the upper bound for these?

jvesely added inline comments. Jul 4 2016, 8:09 PM

lib/CodeGen/CGBuiltin.cpp
7648–7661	I'm pretty sure it's not. There was a short discussion earlier in this revision. OpenGL requires at least 1024x1024x64 (1024 total) for compute shaders, so I'd say hw supports at least those sizes. EG/CM ISA specs don't say. SI/CI/VI ISA specs say 1024. Mesa exposes either 256 or 2048. Larger sets can be faked using GDS, but since there is no lower bound in OpenCL it'd be nice to have (efficient) hw limits here
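The OpenGL figures quoted above combine two kinds of limit: a per-dimension maximum (1024x1024x64) and a maximum on the total number of work-items (1024). A work-group size is valid only if it satisfies both. The following C++ sketch shows such a check; `WorkGroupLimits`, `kOpenGLMinimum`, and `fitsLimits` are hypothetical names introduced for illustration, and the numbers are the ones cited in the comment, not values read from any driver.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct WorkGroupLimits {
  std::array<std::uint32_t, 3> maxPerDim; // limits for x, y, z individually
  std::uint32_t maxTotal;                 // limit for the product x * y * z
};

// Minimum guarantees quoted in the review for OpenGL compute shaders.
constexpr WorkGroupLimits kOpenGLMinimum{{{1024, 1024, 64}}, 1024};

// Returns true if `size` respects both the per-dimension limits and the
// limit on the total number of work-items.
inline bool fitsLimits(const std::array<std::uint32_t, 3> &size,
                       const WorkGroupLimits &limits) {
  std::uint64_t total = 1;
  for (std::size_t i = 0; i < 3; ++i) {
    if (size[i] == 0 || size[i] > limits.maxPerDim[i])
      return false;
    total *= size[i];
    if (total > limits.maxTotal)
      return false; // bail out early; also keeps `total` from overflowing
  }
  return true;
}
```

Because the total is capped at 1024, no single work-item id can reach 1024 under these limits, whichever dimension carries the work-items.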

tstellarAMD added inline comments. Jul 5 2016, 4:55 AM

lib/CodeGen/CGBuiltin.cpp
7648–7661	I think we should use 1024 here. Using a number that is too low will generate incorrect code. It's also probably safe to assume that 1024 is the limit for EG/CM too. A follow-up improvement might be to check for OpenCL-related function attributes, like reqd_work_group_size, and use that to emit a smaller range.
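To make the risk concrete: an optimizer may fold a comparison when the range proves its result, so an upper bound that is too small lets it delete checks that can actually fire at run time. The C++ sketch below models that folding and the suggested follow-up of narrowing the range from a required work-group size. It is a simplified illustration, not LLVM's implementation: `IdRange`, `Fold`, `foldLessThan`, and `workitemIdRange` are hypothetical names, and a `reqdSize` of 0 is used here to mean that no size was specified for that dimension.

```cpp
#include <cassert>
#include <cstdint>

// Half-open range [lo, hi) of values a work-item id may take.
struct IdRange {
  std::uint32_t lo;
  std::uint32_t hi;
};

enum class Fold { AlwaysTrue, AlwaysFalse, Unknown };

// Try to decide `id < bound` for every id in a non-empty range `r`.
inline Fold foldLessThan(IdRange r, std::uint32_t bound) {
  if (r.hi <= bound)
    return Fold::AlwaysTrue;  // every id < hi <= bound
  if (r.lo >= bound)
    return Fold::AlwaysFalse; // every id >= lo >= bound
  return Fold::Unknown;       // the range straddles the bound
}

// Choose the id range for one dimension: use the required work-group size
// when it is known and tighter than the hardware limit, else the limit.
inline IdRange workitemIdRange(std::uint32_t reqdSize,
                               std::uint32_t hwLimit = 1024) {
  std::uint32_t hi = hwLimit;
  if (reqdSize != 0 && reqdSize < hwLimit)
    hi = reqdSize;
  return IdRange{0, hi};
}
```

With the range `[0, 256)`, `foldLessThan` reports that `id < 256` is always true, so a guard such as `if (id < 256)` could be removed; if the hardware then runs a work-group of 512, ids 256..511 take the wrong path. With `[0, 1024)` the same comparison stays `Unknown` and the guard survives.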

change limit to 1024

LGTM.

This revision is now accepted and ready to land. Jul 8 2016, 5:51 PM

Closed by commit rL275030: AMDGPU: Export workitem builtins (authored by jvesely). Jul 10 2016, 3:45 PM

This revision was automatically updated to reflect the committed changes.

Diff 62767

include/clang/Basic/BuiltinsAMDGPU.def

	//==- BuiltinsAMDGPU.def - AMDGPU Builtin function database ------- C++ --==//			//==- BuiltinsAMDGPU.def - AMDGPU Builtin function database ------- C++ --==//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file defines the AMDGPU-specific builtin function database. Users of			// This file defines the AMDGPU-specific builtin function database. Users of
	// this file must define the BUILTIN macro to make use of this information.			// this file must define the BUILTIN macro to make use of this information.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// The format of this database matches clang/Basic/Builtins.def.			// The format of this database matches clang/Basic/Builtins.def.

	#if defined(BUILTIN) && !defined(TARGET_BUILTIN)			#if defined(BUILTIN) && !defined(TARGET_BUILTIN)
	# define TARGET_BUILTIN(ID, TYPE, ATTRS, FEATURE) BUILTIN(ID, TYPE, ATTRS)			# define TARGET_BUILTIN(ID, TYPE, ATTRS, FEATURE) BUILTIN(ID, TYPE, ATTRS)
	#endif			#endif
				//===----------------------------------------------------------------------===//
				// SI+ only builtins.
				//===----------------------------------------------------------------------===//

				BUILTIN(__builtin_amdgcn_kernarg_segment_ptr, "Uc*2", "nc")
				BUILTIN(__builtin_amdgcn_implicitarg_ptr, "Uc*2", "nc")

				BUILTIN(__builtin_amdgcn_workgroup_id_x, "Ui", "nc")
				BUILTIN(__builtin_amdgcn_workgroup_id_y, "Ui", "nc")
				BUILTIN(__builtin_amdgcn_workgroup_id_z, "Ui", "nc")

				BUILTIN(__builtin_amdgcn_workitem_id_x, "Ui", "nc")
				BUILTIN(__builtin_amdgcn_workitem_id_y, "Ui", "nc")
				BUILTIN(__builtin_amdgcn_workitem_id_z, "Ui", "nc")

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Instruction builtins.			// Instruction builtins.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	BUILTIN(__builtin_amdgcn_s_barrier, "v", "n")			BUILTIN(__builtin_amdgcn_s_barrier, "v", "n")
	BUILTIN(__builtin_amdgcn_div_scale, "dddbb*", "n")			BUILTIN(__builtin_amdgcn_div_scale, "dddbb*", "n")
	BUILTIN(__builtin_amdgcn_div_scalef, "fffbb*", "n")			BUILTIN(__builtin_amdgcn_div_scalef, "fffbb*", "n")
	BUILTIN(__builtin_amdgcn_div_fmas, "ddddb", "nc")			BUILTIN(__builtin_amdgcn_div_fmas, "ddddb", "nc")
	Show All 35 Lines
	TARGET_BUILTIN(__builtin_amdgcn_s_memrealtime, "LUi", "n", "s-memrealtime")			TARGET_BUILTIN(__builtin_amdgcn_s_memrealtime, "LUi", "n", "s-memrealtime")

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Special builtins.			// Special builtins.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	BUILTIN(__builtin_amdgcn_read_exec, "LUi", "nc")			BUILTIN(__builtin_amdgcn_read_exec, "LUi", "nc")

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
				// R600-NI only builtins.
				//===----------------------------------------------------------------------===//

				BUILTIN(__builtin_r600_implicitarg_ptr, "Uc*7", "nc")

				BUILTIN(__builtin_r600_read_tgid_x, "Ui", "nc")
				BUILTIN(__builtin_r600_read_tgid_y, "Ui", "nc")
				BUILTIN(__builtin_r600_read_tgid_z, "Ui", "nc")

				BUILTIN(__builtin_r600_read_tidig_x, "Ui", "nc")
				BUILTIN(__builtin_r600_read_tidig_y, "Ui", "nc")
				BUILTIN(__builtin_r600_read_tidig_z, "Ui", "nc")

				//===----------------------------------------------------------------------===//
	// Legacy names with amdgpu prefix			// Legacy names with amdgpu prefix
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	BUILTIN(__builtin_amdgpu_rsq, "dd", "nc")			BUILTIN(__builtin_amdgpu_rsq, "dd", "nc")
	BUILTIN(__builtin_amdgpu_rsqf, "ff", "nc")			BUILTIN(__builtin_amdgpu_rsqf, "ff", "nc")
	BUILTIN(__builtin_amdgpu_ldexp, "ddi", "nc")			BUILTIN(__builtin_amdgpu_ldexp, "ddi", "nc")
	BUILTIN(__builtin_amdgpu_ldexpf, "ffi", "nc")			BUILTIN(__builtin_amdgpu_ldexpf, "ffi", "nc")

	#undef BUILTIN			#undef BUILTIN
	#undef TARGET_BUILTIN			#undef TARGET_BUILTIN

lib/CodeGen/CGBuiltin.cpp


Show All 20 Lines
#include "clang/Basic/TargetBuiltins.h"		#include "clang/Basic/TargetBuiltins.h"
#include "clang/Basic/TargetInfo.h"		#include "clang/Basic/TargetInfo.h"
#include "clang/CodeGen/CGFunctionInfo.h"		#include "clang/CodeGen/CGFunctionInfo.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/IR/CallSite.h"		#include "llvm/IR/CallSite.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/InlineAsm.h"		#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
		#include "llvm/IR/MDBuilder.h"
#include <sstream>		#include <sstream>

using namespace clang;		using namespace clang;
using namespace CodeGen;		using namespace CodeGen;
using namespace llvm;		using namespace llvm;

/// getBuiltinLibFunction - Given a builtin id for a function like		/// getBuiltinLibFunction - Given a builtin id for a function like
/// "__builtin_fabsf", return a Function* for "fabsf".		/// "__builtin_fabsf", return a Function* for "fabsf".
Show All 289 Lines
assert(X->getType() == Y->getType() &&		assert(X->getType() == Y->getType() &&
"arguments have the same integer width?)");		"arguments have the same integer width?)");

llvm::Value *Callee = CGF.CGM.getIntrinsic(IntrinsicID, X->getType());		llvm::Value *Callee = CGF.CGM.getIntrinsic(IntrinsicID, X->getType());
llvm::Value *Tmp = CGF.Builder.CreateCall(Callee, {X, Y});		llvm::Value *Tmp = CGF.Builder.CreateCall(Callee, {X, Y});
Carry = CGF.Builder.CreateExtractValue(Tmp, 1);		Carry = CGF.Builder.CreateExtractValue(Tmp, 1);
return CGF.Builder.CreateExtractValue(Tmp, 0);		return CGF.Builder.CreateExtractValue(Tmp, 0);
}		}

		static Value *emitRangedBuiltin(CodeGenFunction &CGF,
		                                unsigned IntrinsicID,
		                                int low, int high) {
		  llvm::MDBuilder MDHelper(CGF.getLLVMContext());
		  llvm::MDNode *RNode = MDHelper.createRange(APInt(32, low), APInt(32, high));
		  Value *F = CGF.CGM.getIntrinsic(IntrinsicID, {});
		  llvm::Instruction *Call = CGF.Builder.CreateCall(F);
		  Call->setMetadata(llvm::LLVMContext::MD_range, RNode);
		  return Call;
		}

namespace {		namespace {
struct WidthAndSignedness {		struct WidthAndSignedness {
unsigned Width;		unsigned Width;
bool Signed;		bool Signed;
};		};
}		}

static WidthAndSignedness		static WidthAndSignedness
Show All 7,284 Lines
case AMDGPU::BI__builtin_amdgpu_rsqf: {		case AMDGPU::BI__builtin_amdgpu_rsqf: {
return emitUnaryBuiltin(*this, E, Intrinsic::r600_rsq);		return emitUnaryBuiltin(*this, E, Intrinsic::r600_rsq);
}		}
case AMDGPU::BI__builtin_amdgpu_ldexp:		case AMDGPU::BI__builtin_amdgpu_ldexp:
case AMDGPU::BI__builtin_amdgpu_ldexpf: {		case AMDGPU::BI__builtin_amdgpu_ldexpf: {
if (getTarget().getTriple().getArch() == Triple::amdgcn)		if (getTarget().getTriple().getArch() == Triple::amdgcn)
return emitFPIntBuiltin(*this, E, Intrinsic::amdgcn_ldexp);		return emitFPIntBuiltin(*this, E, Intrinsic::amdgcn_ldexp);
return emitFPIntBuiltin(*this, E, Intrinsic::AMDGPU_ldexp);		return emitFPIntBuiltin(*this, E, Intrinsic::AMDGPU_ldexp);
}		}

		// amdgcn workitem
		case AMDGPU::BI__builtin_amdgcn_workitem_id_x:
		  return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_x, 0, 1024);
		case AMDGPU::BI__builtin_amdgcn_workitem_id_y:
		  return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_y, 0, 1024);
		case AMDGPU::BI__builtin_amdgcn_workitem_id_z:
		  return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_z, 0, 1024);

		// r600 workitem
		case AMDGPU::BI__builtin_r600_read_tidig_x:
		  return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_x, 0, 1024);
		case AMDGPU::BI__builtin_r600_read_tidig_y:
		  return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_y, 0, 1024);
		case AMDGPU::BI__builtin_r600_read_tidig_z:
		  return emitRangedBuiltin(*this, Intrinsic::r600_read_tidig_z, 0, 1024);
default:		default:
return nullptr;		return nullptr;
}		}
}		}

/// Handle a SystemZ function in which the final argument is a pointer		/// Handle a SystemZ function in which the final argument is a pointer
/// to an int that receives the post-instruction CC value. At the LLVM level		/// to an int that receives the post-instruction CC value. At the LLVM level
/// this is represented as a function that returns a {result, cc} pair.		/// this is represented as a function that returns a {result, cc} pair.
Show All 394 Lines

test/CodeGenOpenCL/builtins-amdgcn.cl

	Show First 20 Lines • Show All 285 Lines • ▼ Show 20 Lines

	// CHECK-LABEL: @test_legacy_ldexp_f64			// CHECK-LABEL: @test_legacy_ldexp_f64
	// CHECK: call double @llvm.amdgcn.ldexp.f64			// CHECK: call double @llvm.amdgcn.ldexp.f64
	void test_legacy_ldexp_f64(global double* out, double a, int b)			void test_legacy_ldexp_f64(global double* out, double a, int b)
	{			{
	*out = __builtin_amdgpu_ldexp(a, b);			*out = __builtin_amdgpu_ldexp(a, b);
	}			}

				// CHECK-LABEL: @test_kernarg_segment_ptr
				// CHECK: call i8 addrspace(2)* @llvm.amdgcn.kernarg.segment.ptr()
				void test_kernarg_segment_ptr(__attribute__((address_space(2))) unsigned char ** out)
				{
				  *out = __builtin_amdgcn_kernarg_segment_ptr();
				}

				// CHECK-LABEL: @test_implicitarg_ptr
				// CHECK: call i8 addrspace(2)* @llvm.amdgcn.implicitarg.ptr()
				void test_implicitarg_ptr(__attribute__((address_space(2))) unsigned char ** out)
				{
				  *out = __builtin_amdgcn_implicitarg_ptr();
				}

				// CHECK-LABEL: @test_get_group_id(
				// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.x()
				// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.y()
				// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.z()
				void test_get_group_id(int d, global int *out)
				{
				  switch (d) {
				  case 0: *out = __builtin_amdgcn_workgroup_id_x(); break;
				  case 1: *out = __builtin_amdgcn_workgroup_id_y(); break;
				  case 2: *out = __builtin_amdgcn_workgroup_id_z(); break;
				  default: *out = 0;
				  }
				}

				// CHECK-LABEL: @test_get_local_id(
				// CHECK: tail call i32 @llvm.amdgcn.workitem.id.x(), !range [[WI_RANGE:![0-9]*]]
				// CHECK: tail call i32 @llvm.amdgcn.workitem.id.y(), !range [[WI_RANGE]]
				// CHECK: tail call i32 @llvm.amdgcn.workitem.id.z(), !range [[WI_RANGE]]
				void test_get_local_id(int d, global int *out)
				{
				  switch (d) {
				  case 0: *out = __builtin_amdgcn_workitem_id_x(); break;
				  case 1: *out = __builtin_amdgcn_workitem_id_y(); break;
				  case 2: *out = __builtin_amdgcn_workitem_id_z(); break;
				  default: *out = 0;
				  }
				}

				// CHECK-DAG: [[WI_RANGE]] = !{i32 0, i32 1024}
	// CHECK-DAG: attributes #[[NOUNWIND_READONLY:[0-9]+]] = { nounwind readonly }			// CHECK-DAG: attributes #[[NOUNWIND_READONLY:[0-9]+]] = { nounwind readonly }
	// CHECK-DAG: attributes #[[READ_EXEC_ATTRS]] = { convergent }			// CHECK-DAG: attributes #[[READ_EXEC_ATTRS]] = { convergent }
	// CHECK: ![[EXEC]] = !{!"exec"}			// CHECK: ![[EXEC]] = !{!"exec"}

test/CodeGenOpenCL/builtins-r600.cl

	Show All 26 Lines
	#if cl_khr_fp64			#if cl_khr_fp64
	// XCHECK-LABEL: @test_legacy_ldexp_f64			// XCHECK-LABEL: @test_legacy_ldexp_f64
	// XCHECK: call double @llvm.AMDGPU.ldexp.f64			// XCHECK: call double @llvm.AMDGPU.ldexp.f64
	void test_legacy_ldexp_f64(global double* out, double a, int b)			void test_legacy_ldexp_f64(global double* out, double a, int b)
	{			{
	*out = __builtin_amdgpu_ldexp(a, b);			*out = __builtin_amdgpu_ldexp(a, b);
	}			}
	#endif			#endif

				// CHECK-LABEL: @test_implicitarg_ptr
				// CHECK: call i8 addrspace(7)* @llvm.r600.implicitarg.ptr()
				void test_implicitarg_ptr(__attribute__((address_space(7))) unsigned char ** out)
				{
				  *out = __builtin_r600_implicitarg_ptr();
				}

				// CHECK-LABEL: @test_get_group_id(
				// CHECK: tail call i32 @llvm.r600.read.tgid.x()
				// CHECK: tail call i32 @llvm.r600.read.tgid.y()
				// CHECK: tail call i32 @llvm.r600.read.tgid.z()
				void test_get_group_id(int d, global int *out)
				{
				  switch (d) {
				  case 0: *out = __builtin_r600_read_tgid_x(); break;
				  case 1: *out = __builtin_r600_read_tgid_y(); break;
				  case 2: *out = __builtin_r600_read_tgid_z(); break;
				  default: *out = 0;
				  }
				}

				// CHECK-LABEL: @test_get_local_id(
				// CHECK: tail call i32 @llvm.r600.read.tidig.x(), !range [[WI_RANGE:![0-9]*]]
				// CHECK: tail call i32 @llvm.r600.read.tidig.y(), !range [[WI_RANGE]]
				// CHECK: tail call i32 @llvm.r600.read.tidig.z(), !range [[WI_RANGE]]
				void test_get_local_id(int d, global int *out)
				{
				  switch (d) {
				  case 0: *out = __builtin_r600_read_tidig_x(); break;
				  case 1: *out = __builtin_r600_read_tidig_y(); break;
				  case 2: *out = __builtin_r600_read_tidig_z(); break;
				  default: *out = 0;
				  }
				}

				// CHECK-DAG: [[WI_RANGE]] = !{i32 0, i32 1024}

This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Export target workitem related builtins
ClosedPublic
