This is an archive of the discontinued LLVM Phabricator instance.

Differential D47154

Try to make builtin address space declarations not useless
Closed, Public

Authored by arsenm on May 21 2018, 12:05 PM.

Details

Reviewers

Anastasia
yaxunl
rjmccall
dfukalov
tra

Summary

The way address space declarations for builtins currently work
is nearly useless. The code assumes the address space used for
a builtin is a confusingly named "target address space" from user
code using __attribute__((address_space(N))) that matches
the builtin declaration. There's no way to use this to declare
a builtin that returns a language-specific address space.
The terminology used is highly confusing since it has nothing
to do with the address space selected by the target to use
for a language address space.

This feature is essentially unused as-is. AMDGPU and NVPTX
are the only in-tree targets attempting to use this. The AMDGPU
builtins certainly do not behave as intended (i.e. all of the
builtins returning pointers can never compile because the numbered
address space never matches the expected named address space).

The NVPTX builtins are missing tests for some, and the others
seem to rely on an implicit addrspacecast.

Change the used address space for builtins based on a target
hook to allow using a language address space for a builtin.
This allows the same builtin declaration to be used for multiple
languages with similarly purposed address spaces (e.g. the same
AMDGPU builtin can be used in OpenCL and CUDA even though the
constant address spaces are arbitrarily different).

This breaks the possibility of using arbitrary numbered
address spaces alongside the named address spaces for builtins.
If this is an issue we probably need to introduce another builtin
declaration character to distinguish language address spaces from
so-called "target address spaces".
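
The hook described above can be pictured with a small standalone model. This is only a sketch: `Lang`, `LangAS`, and `builtinAddressSpace` are hypothetical stand-ins rather than Clang's own types, and the numeric mapping simply mirrors the AMDGPU override shown later in this revision.

```cpp
#include <cassert>

// Hypothetical stand-ins for Clang's language modes and LangAS values.
enum class Lang { C, OpenCL, CUDA };
enum class LangAS { Default, Generic, Global, Local, Constant, Private, Numbered };

// Toy target hook: map the number written in a builtin description string
// (for example the 3 in "ff*3fIiIiIb") to a language address space.
constexpr LangAS builtinAddressSpace(Lang L, unsigned AS) {
  if (L == Lang::CUDA)
    return LangAS::Default; // CUDA/HIP pointers stay generic in the AST.
  if (L == Lang::OpenCL) {
    switch (AS) {
    case 0: return LangAS::Generic;
    case 1: return LangAS::Global;
    case 3: return LangAS::Local;
    case 4: return LangAS::Constant;
    case 5: return LangAS::Private;
    default: break;
    }
  }
  return LangAS::Numbered; // Anything else stays a plain numbered space.
}

// The same declaration yields a different AST address space per language.
static_assert(builtinAddressSpace(Lang::OpenCL, 4) == LangAS::Constant,
              "OpenCL sees the named constant address space");
static_assert(builtinAddressSpace(Lang::CUDA, 4) == LangAS::Default,
              "CUDA sees an unqualified pointer");
```

The point of the sketch is that one builtin declaration can serve several languages, with the per-language difference confined to the hook.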

Diff Detail

Event Timeline

arsenm created this revision. May 21 2018, 12:05 PM

Herald added subscribers: tpr, nhaehnle, wdng. May 21 2018, 12:05 PM

arsenm added a reviewer: dfukalov. May 21 2018, 12:17 PM

jlebar added a reviewer: tra. May 21 2018, 1:39 PM

Anastasia added inline comments. May 22 2018, 10:26 AM

include/clang/Basic/BuiltinsAMDGPU.def
49	Do you plan to add support for this later? If not, perhaps we should elaborate on what remains to be done.
include/clang/Basic/TargetInfo.h
1170	Can you add a comment please to explain what the function is for?
lib/AST/ASTContext.cpp
9355–9359	Could we check against LangAS::Default instead of removing this completely.
lib/CodeGen/CGBuiltin.cpp
3708	Would this be correct for OpenCL? Should we use `isAddressSpaceSupersetOf` helper instead? Would it also sort the issue with constant AS (at least for OpenCL)?
test/CodeGenOpenCL/numbered-address-space.cl
37	`__attribute__((address_space(N)))` is not an OpenCL feature and I think it's not specified in C either? But I think generally non matching address spaces don't compile in Clang. So it might be useful to disallow this?

CUDA does not expose explicit AS on the clang side. All pointers are treated as generic and we infer the specific address space only in LLVM.
__nvvm_atom_*_[sg]_* builtins should probably be removed as they are indeed useless without pointers with explicit AS and NVCC itself does not have such builtins either. Instead, we should convert the generic AS builtin to address-space specific instruction somewhere in LLVM.

Using __attribute__((address_space())) should probably produce an error during CUDA compilation.

nhaehnle removed a subscriber: nhaehnle. May 26 2018, 10:44 AM

bjope added a subscriber: bjope. May 31 2018, 1:32 PM

arsenm added inline comments. Jun 6 2018, 12:17 PM

include/clang/Basic/BuiltinsAMDGPU.def
49	I'm not sure. I don't know how to best enforce this
lib/AST/ASTContext.cpp
9355–9359	I don't think that really makes sense, since that would be leaving this the same. I don't really need it for this patch, but I fundamentally think specifying address space 0 is different from an unspecified address space. According to the description for builtins, if no address space is specified then any address space will be accepted. This is different from a builtin requiring address space 0.
lib/CodeGen/CGBuiltin.cpp
3708	The issue I mentioned for the other builtin is that it modifies the memory, and doesn't have to do with the casting. At this point the AddrSpaceCast has to be emitted. The checking if the cast is legal I guess would be in the SemaExpr part. I know at one point I was trying to use isAddressSpaceSupersetOf in rewriteBuiltinFunctionDecl, but there was some problem with that. I think it didn't make sense with the magic where the builtin without an address space is supposed to accept any address space or something along those lines.
test/CodeGenOpenCL/numbered-address-space.cl
37	I'm pretty sure it's a C extension. The way things seem to work now is address spaces are accepted anywhere and everywhere.
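
The distinction between "address space 0" and "no address space" comes down to whether any digits follow the `*` or `&` in the builtin type string. A minimal sketch of that check, assuming a hypothetical `ParsedAS` result type and `parseAddrSpace` helper (neither is part of Clang), with only `strtoul` taken from the patch:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical result type for this sketch: whether a number followed the
// '*' or '&' in a builtin type string, and what that number was.
struct ParsedAS {
  bool Specified;
  unsigned Value;
};

// Str points just past the '*' or '&'. strtoul leaves End == Str when it
// consumed no digits, which is how "b*" differs from "b*0".
ParsedAS parseAddrSpace(const char *Str) {
  char *End;
  unsigned long AS = std::strtoul(Str, &End, 10);
  return {End != Str, static_cast<unsigned>(AS)};
}
```

With the old `AddrSpace != 0` condition, the `"0"` and `""` cases were indistinguishable; checking only `End != Str` keeps them apart.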

Rebase and add comment

tra mentioned this in D47845: [CUDA] Removed unused __nvvm_* builtins with non-generic pointers. Jun 6 2018, 1:31 PM

Anastasia added inline comments. Jun 12 2018, 4:32 AM

include/clang/Basic/BuiltinsAMDGPU.def
49	The only way, I guess, is to list overloads with each address space explicitly (apart from `constant`)? Or maybe use the `generic` AS, although that will only work for CL2.0.
lib/AST/ASTContext.cpp
9355–9359	I thought `Default` AS was meant to be for the case no AS is specified but I guess it doesn't work the same in Builtins specification syntax.
lib/CodeGen/CGBuiltin.cpp
3708	Yes, I think Sema has to check it before indeed. I am not sure it works right with OpenCL rules though for the Builtin functions. Would it make sense to add a negative test for this then?

tra mentioned this in rC335168: [CUDA] Removed unused __nvvm_* builtins with non-generic pointers. Jun 20 2018, 1:38 PM

tra mentioned this in rL335168: [CUDA] Removed unused __nvvm_* builtins with non-generic pointers.

arsenm added inline comments. Jun 26 2018, 12:19 PM

lib/CodeGen/CGBuiltin.cpp
3708	I'm not sure what this test would look like. Do you mean a test that erroneously is accepted now?

Anastasia added inline comments. Jul 5 2018, 8:50 AM

lib/CodeGen/CGBuiltin.cpp
3708	Ok, so at this point you are trying to change generation of `bitcast` to `addrspacecast`, which makes sense to me. Do we still need a `bitcast` though? I think `addrspacecast` can be used to convert type and address space too: "The 'addrspacecast' instruction converts ptrval from pty in address space n to type pty2 in address space m." It would be nice to add proper Sema checking for `Builtins` for address space of pointers in OpenCL mode, but this might be more work.
test/CodeGenOpenCL/numbered-address-space.cl
37	Yes, the line below should give an error for OpenCL? generic int* generic_ptr = local_ptr;

Add sema test for numbered address spaces

arsenm added inline comments. Jul 9 2018, 3:34 AM

lib/CodeGen/CGBuiltin.cpp
3708	I think the canonical form is to use the bitcast for the type pointer conversion, and then separate the addrspacecast. I think instcombine splits these apart
test/CodeGenOpenCL/numbered-address-space.cl
37	This does not error. The wording of the spec seems to leave some interpretation for other address spaces. It whitelists the valid address spaces for implicit casts, and blacklists constant for implicit or explicit casts. My reading between the lines is that an explicit cast would be OK. I think this is a separate fix since this is independent from the builtins
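
As a rough illustration of keeping the two conversions separate, here is a toy model of the decision made in the CGBuiltin.cpp hunk of this revision, which emits an address space cast first and then a bitcast. The `PtrTy` struct and `castsNeeded` function are hypothetical and do not use the LLVM API; they only record which casts would be requested and in what order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy pointer type: pointee name plus numeric address space.
struct PtrTy {
  std::string Pointee;
  unsigned AddrSpace;
};

// Returns the casts needed to turn From into To, keeping the address space
// change and the pointee type change as two separate steps.
std::vector<std::string> castsNeeded(const PtrTy &From, const PtrTy &To) {
  std::vector<std::string> Casts;
  if (From.AddrSpace != To.AddrSpace)
    Casts.push_back("addrspacecast"); // Only the address space changes here.
  if (From.Pointee != To.Pointee)
    Casts.push_back("bitcast"); // Only the pointee type changes here.
  return Casts;
}
```

Whether a single `addrspacecast` would also be acceptable is the open question in the thread; the sketch only shows the two-step form.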

ping

LGTM from OpenCL side! Thanks!

test/SemaOpenCL/numbered-address-space.cl
10	Ideally this is not governed by any specification. Generic compiler support for such `addrspacecast` is not possible but vendor can implement custom support. I think we can leave it as is until we have better idea how this should be supported.

This revision is now accepted and ready to land. Jul 27 2018, 8:56 AM

yaxunl requested changes to this revision. Jul 27 2018, 9:17 AM

yaxunl added inline comments.

lib/Basic/Targets/AMDGPU.h
398	I am wondering how this would work for CUDA/HIP. Let's say a builtin is supposed to return a pointer to addrspace 4. Now in HIP this builtin is returning a pointer to addrspace 0. How would that work?

This revision now requires changes to proceed. Jul 27 2018, 9:17 AM

yaxunl added inline comments. Jul 27 2018, 9:43 AM

include/clang/Basic/TargetInfo.h
1178	I think this function is not needed. Although CUDA/HIP uses address spaces in codegen, it does not use named address spaces in sema and AST. Named address spaces are not in the AST types of CUDA/HIP, so there is no point in mapping a target address space back to a language address space for CUDA/HIP. CUDA/HIP should behave like other address-space-agnostic languages and always use getLangASFromTargetAS(AS).

yaxunl added inline comments. Jul 27 2018, 9:59 AM

test/CodeGenOpenCL/builtins-amdgcn.cl
3	Please remove this line since we no longer use opencl in triple.

In D47154#1108813, @tra wrote:

CUDA does not expose explicit AS on the clang side. All pointers are treated as generic and we infer the specific address space only in LLVM.
__nvvm_atom_*_[sg]_* builtins should probably be removed as they are indeed useless without pointers with explicit AS and NVCC itself does not have such builtins either. Instead, we should convert the generic AS builtin to address-space specific instruction somewhere in LLVM.

Using __attribute__((address_space())) should probably produce an error during CUDA compilation.

Sometimes we need to call functions defined in our device library, which is written in OpenCL. Some functions have pointer arguments in a non-zero address space. To declare these functions in CUDA/HIP we need to use __attribute__((address_space())). We use a C-style cast to cast pointers in CUDA/HIP to a non-zero address space and pass them to the functions. I think __attribute__((address_space())) is still needed for this situation.

arsenm added inline comments. Jul 29 2018, 2:13 AM

include/clang/Basic/TargetInfo.h
1178	This is necessary to insert the correct addrspacecast, otherwise the builtins CUDA test asserts. getLangASFromTargetAS returns the user specified addrspace

Remove old run line

LGTM. Thanks.

I missed the addr space casts you added to CodeGenFunction::EmitBuiltinExpr. With those casts it should work.

For other downstream address-space-agnostic languages (e.g. HCC), I guess they need to add similar hooks to use this feature.

This revision is now accepted and ready to land. Jul 30 2018, 4:36 AM

r338707

Revision Contents

Path	Size
include/clang/AST/ASTContext.h	2 lines
include/clang/Basic/BuiltinsAMDGPU.def	14 lines
include/clang/Basic/TargetInfo.h	12 lines
lib/AST/ASTContext.cpp	18 lines
lib/Basic/Targets/AMDGPU.h	21 lines
lib/CodeGen/CGBuiltin.cpp	62 lines
lib/Sema/SemaExpr.cpp	5 lines
test/CodeGenCUDA/builtins-amdgcn.cu	18 lines
test/CodeGenOpenCL/builtins-amdgcn.cl	66 lines
test/CodeGenOpenCL/numbered-address-space.cl	34 lines
test/SemaOpenCL/numbered-address-space.cl	31 lines

Diff 157885

include/clang/AST/ASTContext.h

[...] public:
    }

    unsigned getTargetAddressSpace(Qualifiers Q) const {
      return getTargetAddressSpace(Q.getAddressSpace());
    }

    unsigned getTargetAddressSpace(LangAS AS) const;

+   LangAS getLangASForBuiltinAddressSpace(unsigned AS) const;
+
    /// Get target-dependent integer value for null pointer which is used for
    /// constant folding.
    uint64_t getTargetNullPointerValue(QualType QT) const;

    bool addressSpaceMapManglingFor(LangAS AS) const {
      return AddrSpaceMapMangling || isTargetAddressSpace(AS);
    }

[...]

include/clang/Basic/BuiltinsAMDGPU.def

[...]

  #if defined(BUILTIN) && !defined(TARGET_BUILTIN)
  # define TARGET_BUILTIN(ID, TYPE, ATTRS, FEATURE) BUILTIN(ID, TYPE, ATTRS)
  #endif
  //===----------------------------------------------------------------------===//
  // SI+ only builtins.
  //===----------------------------------------------------------------------===//

- BUILTIN(__builtin_amdgcn_dispatch_ptr, "Uc*4", "nc")
- BUILTIN(__builtin_amdgcn_kernarg_segment_ptr, "Uc*4", "nc")
- BUILTIN(__builtin_amdgcn_implicitarg_ptr, "Uc*4", "nc")
+ BUILTIN(__builtin_amdgcn_dispatch_ptr, "v*4", "nc")
+ BUILTIN(__builtin_amdgcn_kernarg_segment_ptr, "v*4", "nc")
+ BUILTIN(__builtin_amdgcn_implicitarg_ptr, "v*4", "nc")

  BUILTIN(__builtin_amdgcn_workgroup_id_x, "Ui", "nc")
  BUILTIN(__builtin_amdgcn_workgroup_id_y, "Ui", "nc")
  BUILTIN(__builtin_amdgcn_workgroup_id_z, "Ui", "nc")

  BUILTIN(__builtin_amdgcn_workitem_id_x, "Ui", "nc")
  BUILTIN(__builtin_amdgcn_workitem_id_y, "Ui", "nc")
  BUILTIN(__builtin_amdgcn_workitem_id_z, "Ui", "nc")

  //===----------------------------------------------------------------------===//
  // Instruction builtins.
  //===----------------------------------------------------------------------===//
  BUILTIN(__builtin_amdgcn_s_getreg, "UiIi", "n")
  BUILTIN(__builtin_amdgcn_s_getpc, "LUi", "n")
  BUILTIN(__builtin_amdgcn_s_waitcnt, "vIi", "n")
  BUILTIN(__builtin_amdgcn_s_sendmsg, "vIiUi", "n")
  BUILTIN(__builtin_amdgcn_s_sendmsghalt, "vIiUi", "n")
  BUILTIN(__builtin_amdgcn_s_barrier, "v", "n")
  BUILTIN(__builtin_amdgcn_wave_barrier, "v", "n")
  BUILTIN(__builtin_amdgcn_s_dcache_inv, "v", "n")
  BUILTIN(__builtin_amdgcn_buffer_wbinvl1, "v", "n")
+
+ // FIXME: Need to disallow constant address space.
  BUILTIN(__builtin_amdgcn_div_scale, "dddbb*", "n")
  BUILTIN(__builtin_amdgcn_div_scalef, "fffbb*", "n")
  BUILTIN(__builtin_amdgcn_div_fmas, "ddddb", "nc")
  BUILTIN(__builtin_amdgcn_div_fmasf, "ffffb", "nc")
  BUILTIN(__builtin_amdgcn_div_fixup, "dddd", "nc")
  BUILTIN(__builtin_amdgcn_div_fixupf, "ffff", "nc")
  BUILTIN(__builtin_amdgcn_trig_preop, "ddi", "nc")
  BUILTIN(__builtin_amdgcn_trig_preopf, "ffi", "nc")
[...]
  BUILTIN(__builtin_amdgcn_fcmp, "LUiddIi", "nc")
  BUILTIN(__builtin_amdgcn_fcmpf, "LUiffIi", "nc")
  BUILTIN(__builtin_amdgcn_ds_swizzle, "iiIi", "nc")
  BUILTIN(__builtin_amdgcn_ds_permute, "iii", "nc")
  BUILTIN(__builtin_amdgcn_ds_bpermute, "iii", "nc")
  BUILTIN(__builtin_amdgcn_readfirstlane, "ii", "nc")
  BUILTIN(__builtin_amdgcn_readlane, "iii", "nc")
  BUILTIN(__builtin_amdgcn_fmed3f, "ffff", "nc")
- BUILTIN(__builtin_amdgcn_ds_faddf, "ff*fIiIiIb", "n")
- BUILTIN(__builtin_amdgcn_ds_fminf, "ff*fIiIiIb", "n")
- BUILTIN(__builtin_amdgcn_ds_fmaxf, "ff*fIiIiIb", "n")
+ BUILTIN(__builtin_amdgcn_ds_faddf, "ff*3fIiIiIb", "n")
+ BUILTIN(__builtin_amdgcn_ds_fminf, "ff*3fIiIiIb", "n")
+ BUILTIN(__builtin_amdgcn_ds_fmaxf, "ff*3fIiIiIb", "n")

  //===----------------------------------------------------------------------===//
  // VI+ only builtins.
  //===----------------------------------------------------------------------===//

  TARGET_BUILTIN(__builtin_amdgcn_div_fixuph, "hhhh", "nc", "16-bit-insts")
  TARGET_BUILTIN(__builtin_amdgcn_rcph, "hh", "nc", "16-bit-insts")
  TARGET_BUILTIN(__builtin_amdgcn_rsqh, "hh", "nc", "16-bit-insts")
[...]

include/clang/Basic/TargetInfo.h

[...] public:

    /// Return the section to use for C++ static initialization functions.
    virtual const char *getStaticInitSectionSpecifier() const {
      return nullptr;
    }

    const LangASMap &getAddressSpaceMap() const { return *AddrSpaceMap; }

+   /// Map from the address space field in builtin description strings to the
+   /// language address space.
+   virtual LangAS getOpenCLBuiltinAddressSpace(unsigned AS) const {
+     return getLangASFromTargetAS(AS);
+   }
+
+   /// Map from the address space field in builtin description strings to the
+   /// language address space.
+   virtual LangAS getCUDABuiltinAddressSpace(unsigned AS) const {
+     return getLangASFromTargetAS(AS);
+   }
+
    /// Return an AST address space which can be used opportunistically
    /// for constant global memory. It must be possible to convert pointers into
    /// this address space to LangAS::Default. If no such address space exists,
    /// this may return None, and such optimizations will be disabled.
    virtual llvm::Optional<LangAS> getConstantAddressSpace() const {
      return LangAS::Default;
    }

[...]

lib/AST/ASTContext.cpp

[...] while (!Done) {
      switch (char c = *Str++) {
      default: Done = true; --Str; break;
      case '*':
      case '&': {
        // Both pointers and references can have their pointee types
        // qualified with an address space.
        char *End;
        unsigned AddrSpace = strtoul(Str, &End, 10);
-       if (End != Str && AddrSpace != 0) {
-         Type = Context.getAddrSpaceQualType(Type,
-                                             getLangASFromTargetAS(AddrSpace));
+       if (End != Str) {
+         // Note AddrSpace == 0 is not the same as an unspecified address space.
+         Type = Context.getAddrSpaceQualType(
+           Type,
+           Context.getLangASForBuiltinAddressSpace(AddrSpace));
          Str = End;
        }
        if (c == '*')
          Type = Context.getPointerType(Type);
        else
          Type = Context.getLValueReferenceType(Type);
        break;
      }
[...] case BuiltinType::UShortFract:
      return SatUnsignedShortFractTy;
    case BuiltinType::UFract:
      return SatUnsignedFractTy;
    case BuiltinType::ULongFract:
      return SatUnsignedLongFractTy;
    }
  }

+ LangAS ASTContext::getLangASForBuiltinAddressSpace(unsigned AS) const {
+   if (LangOpts.OpenCL)
+     return getTargetInfo().getOpenCLBuiltinAddressSpace(AS);
+
+   if (LangOpts.CUDA)
+     return getTargetInfo().getCUDABuiltinAddressSpace(AS);
+
+   return getLangASFromTargetAS(AS);
+ }
+
  // Explicitly instantiate this in case a Redeclarable<T> is used from a TU that
  // doesn't include ASTContext.h
  template
  clang::LazyGenerationalUpdatePtr<
      const Decl *, Decl *, &ExternalASTSource::CompleteRedeclChain>::ValueType
  clang::LazyGenerationalUpdatePtr<
      const Decl *, Decl *, &ExternalASTSource::CompleteRedeclChain>::makeValue(
          const clang::ASTContext &Ctx, Decl *Value);
[...]

lib/Basic/Targets/AMDGPU.h

[...] LangAS getOpenCLTypeAddrSpace(OpenCLTypeKind TK) const override {
      case OCLTK_ReserveID:
        return LangAS::opencl_global;

      default:
        return TargetInfo::getOpenCLTypeAddrSpace(TK);
      }
    }

+   LangAS getOpenCLBuiltinAddressSpace(unsigned AS) const override {
+     switch (AS) {
+     case 0:
+       return LangAS::opencl_generic;
+     case 1:
+       return LangAS::opencl_global;
+     case 3:
+       return LangAS::opencl_local;
+     case 4:
+       return LangAS::opencl_constant;
+     case 5:
+       return LangAS::opencl_private;
+     default:
+       return getLangASFromTargetAS(AS);
+     }
+   }
+
+   LangAS getCUDABuiltinAddressSpace(unsigned AS) const override {
+     return LangAS::Default;
+   }
+
    llvm::Optional<LangAS> getConstantAddressSpace() const override {
      return getLangASFromTargetAS(Constant);
    }

    /// \returns Target specific vtbl ptr address space.
    unsigned getVtblPtrAddressSpace() const override {
      return static_cast<unsigned>(Constant);
    }
[...]

lib/CodeGen/CGBuiltin.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 3,697 Lines • ▼ Show 20 Lines	for (unsigned i = 0, e = E->getNumArgs(); i != e; ++i) {
(void)IsConst;		(void)IsConst;
ArgValue = llvm::ConstantInt::get(getLLVMContext(), Result);		ArgValue = llvm::ConstantInt::get(getLLVMContext(), Result);
}		}

// If the intrinsic arg type is different from the builtin arg type		// If the intrinsic arg type is different from the builtin arg type
// we need to do a bit cast.		// we need to do a bit cast.
llvm::Type *PTy = FTy->getParamType(i);		llvm::Type *PTy = FTy->getParamType(i);
if (PTy != ArgValue->getType()) {		if (PTy != ArgValue->getType()) {
		// XXX - vector of pointers?
		if (auto *PtrTy = dyn_cast<llvm::PointerType>(PTy)) {
		if (PtrTy->getAddressSpace() !=
		AnastasiaUnsubmitted Not Done Reply Inline Actions Would this be correct for OpenCL? Should we use `isAddressSpaceSupersetOf` helper instead? Would it also sort the issue with constant AS (at least for OpenCL)? Anastasia: Would this be correct for OpenCL? Should we use `isAddressSpaceSupersetOf` helper instead?
		arsenmAuthorUnsubmitted Not Done Reply Inline Actions The issue I mentioned for the other builtin is that it modifies the memory, and doesn't have to do with the casting. At this point the AddrSpaceCast has to be emitted. The checking if the cast is legal I guess would be in the SemaExpr part. I know at one point I was trying to use isAddressSpaceSupersetOf in rewriteBuiltinFunctionDecl, but there was some problem with that. I think it didn't make sense with the magic where the builtin without an address space is supposed to accept any address space or something along those lines. arsenm: The issue I mentioned for the other builtin is that it modifies the memory, and doesn't have to…
		AnastasiaUnsubmitted Not Done Reply Inline Actions Yes, I think Sema has to check it before indeed. I am not sure it works right with OpenCL rules though for the Builtin functions. Would it make sense to add a negative test for this then? Anastasia: Yes, I think Sema has to check it before indeed. I am not sure it works right with OpenCL rules…
		arsenmAuthorUnsubmitted Not Done Reply Inline Actions I'm not sure what this test would look like. Do you mean a test that erroneously is accepted now? arsenm: I'm not sure what this test would look like. Do you mean a test that erroneously is accepted…
		AnastasiaUnsubmitted Not Done Reply Inline Actions Ok, so at this point you are trying to change generation of `bitcast` to `addrspacecast` which makes sense to me. Do we still need a `bitcast` though? I think `addrspacecast` can be used to convert type and address space too: The ‘addrspacecast‘ instruction converts ptrval from pty in address space n to type pty2 in address space m. It would be nice to add proper Sema checking for `Builtins` for address space of pointers in OpenCL mode, but this might be more work. Anastasia: Ok, so at this point you are trying to change generation of `bitcast` to `addrspacecast` which…
		arsenmAuthorUnsubmitted Not Done Reply Inline Actions I think the canonical form is to use the bitcast for the type pointer conversion, and then separate the addrspacecast. I think instcombine splits these apart arsenm: I think the canonical form is to use the bitcast for the type pointer conversion, and then…
		ArgValue->getType()->getPointerAddressSpace()) {
		ArgValue = Builder.CreateAddrSpaceCast(
		ArgValue,
		ArgValue->getType()->getPointerTo(PtrTy->getAddressSpace()));
		}
		}

assert(PTy->canLosslesslyBitCastTo(FTy->getParamType(i)) &&		assert(PTy->canLosslesslyBitCastTo(FTy->getParamType(i)) &&
"Must be able to losslessly bit cast to param");		"Must be able to losslessly bit cast to param");
ArgValue = Builder.CreateBitCast(ArgValue, PTy);		ArgValue = Builder.CreateBitCast(ArgValue, PTy);
}		}

Args.push_back(ArgValue);		Args.push_back(ArgValue);
}		}

Value *V = Builder.CreateCall(F, Args);		Value *V = Builder.CreateCall(F, Args);
QualType BuiltinRetType = E->getType();		QualType BuiltinRetType = E->getType();

llvm::Type *RetTy = VoidTy;		llvm::Type *RetTy = VoidTy;
if (!BuiltinRetType->isVoidType())		if (!BuiltinRetType->isVoidType())
RetTy = ConvertType(BuiltinRetType);		RetTy = ConvertType(BuiltinRetType);

if (RetTy != V->getType()) {		if (RetTy != V->getType()) {
		// XXX - vector of pointers?
		if (auto *PtrTy = dyn_cast<llvm::PointerType>(RetTy)) {
		if (PtrTy->getAddressSpace() != V->getType()->getPointerAddressSpace()) {
		V = Builder.CreateAddrSpaceCast(
		V, V->getType()->getPointerTo(PtrTy->getAddressSpace()));
		}
		}

assert(V->getType()->canLosslesslyBitCastTo(RetTy) &&		assert(V->getType()->canLosslesslyBitCastTo(RetTy) &&
"Must be able to losslessly bit cast result type");		"Must be able to losslessly bit cast result type");
V = Builder.CreateBitCast(V, RetTy);		V = Builder.CreateBitCast(V, RetTy);
}		}

return RValue::get(V);		return RValue::get(V);
}		}
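
The cast ordering discussed in the inline comments above can be illustrated with a small standalone model. This is only a sketch and does not use the LLVM API: `PtrTy` and `planPointerCasts` are hypothetical names introduced here, and the model assumes the order shown in this hunk (address space first, pointee type second).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM pointer type: a pointee type name plus a
// numeric address space. Not part of the patch.
struct PtrTy {
  std::string Pointee;
  unsigned AddrSpace;
};

// Returns the cast steps needed to turn a pointer of type From into type To.
// Mirrors the order in the hunk above: change the address space first, then
// adjust the pointee type with a plain bitcast.
std::vector<std::string> planPointerCasts(const PtrTy &From, const PtrTy &To) {
  std::vector<std::string> Steps;
  PtrTy Cur = From;
  if (Cur.AddrSpace != To.AddrSpace) {
    Steps.push_back("addrspacecast");
    Cur.AddrSpace = To.AddrSpace;
  }
  if (Cur.Pointee != To.Pointee) {
    Steps.push_back("bitcast");
    Cur.Pointee = To.Pointee;
  }
  // After both steps the modeled value has exactly the requested type.
  assert(Cur.Pointee == To.Pointee && Cur.AddrSpace == To.AddrSpace);
  return Steps;
}
```

For example, converting an `i32` pointer in address space 42 to a `float` pointer in address space 3 yields both steps in this model, while identical source and destination types yield none.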

(7,304 lines not shown)	Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
case AMDGPU::BI__builtin_amdgcn_read_exec_hi: {		case AMDGPU::BI__builtin_amdgcn_read_exec_hi: {
StringRef RegName = BuiltinID == AMDGPU::BI__builtin_amdgcn_read_exec_lo ?		StringRef RegName = BuiltinID == AMDGPU::BI__builtin_amdgcn_read_exec_lo ?
"exec_lo" : "exec_hi";		"exec_lo" : "exec_hi";
CallInst *CI = cast<CallInst>(		CallInst *CI = cast<CallInst>(
EmitSpecialRegisterBuiltin(*this, E, Int32Ty, Int32Ty, true, RegName));		EmitSpecialRegisterBuiltin(*this, E, Int32Ty, Int32Ty, true, RegName));
CI->setConvergent();		CI->setConvergent();
return CI;		return CI;
}		}
case AMDGPU::BI__builtin_amdgcn_ds_faddf:
case AMDGPU::BI__builtin_amdgcn_ds_fminf:
case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: {
llvm::SmallVector<llvm::Value *, 5> Args;
for (unsigned I = 0; I != 5; ++I)
Args.push_back(EmitScalarExpr(E->getArg(I)));
const llvm::Type *PtrTy = Args[0]->getType();
// check pointer parameter
if (!PtrTy->isPointerTy() ||
E->getArg(0)
->getType()
->getPointeeType()
.getQualifiers()
.getAddressSpace() != LangAS::opencl_local ||
!PtrTy->getPointerElementType()->isFloatTy()) {
CGM.Error(E->getArg(0)->getLocStart(),
"parameter should have type \"local float*\"");
return nullptr;
}
// check float parameter
if (!Args[1]->getType()->isFloatTy()) {
CGM.Error(E->getArg(1)->getLocStart(),
"parameter should have type \"float\"");
return nullptr;
}

Intrinsic::ID ID;
switch (BuiltinID) {
case AMDGPU::BI__builtin_amdgcn_ds_faddf:
ID = Intrinsic::amdgcn_ds_fadd;
break;
case AMDGPU::BI__builtin_amdgcn_ds_fminf:
ID = Intrinsic::amdgcn_ds_fmin;
break;
case AMDGPU::BI__builtin_amdgcn_ds_fmaxf:
ID = Intrinsic::amdgcn_ds_fmax;
break;
default:
llvm_unreachable("Unknown BuiltinID");
}
Value *F = CGM.getIntrinsic(ID);
return Builder.CreateCall(F, Args);
}

// amdgcn workitem		// amdgcn workitem
case AMDGPU::BI__builtin_amdgcn_workitem_id_x:		case AMDGPU::BI__builtin_amdgcn_workitem_id_x:
return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_x, 0, 1024);		return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_x, 0, 1024);
case AMDGPU::BI__builtin_amdgcn_workitem_id_y:		case AMDGPU::BI__builtin_amdgcn_workitem_id_y:
return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_y, 0, 1024);		return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_y, 0, 1024);
case AMDGPU::BI__builtin_amdgcn_workitem_id_z:		case AMDGPU::BI__builtin_amdgcn_workitem_id_z:
return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_z, 0, 1024);		return emitRangedBuiltin(*this, Intrinsic::amdgcn_workitem_id_z, 0, 1024);

(1,195 lines not shown)

lib/Sema/SemaExpr.cpp


(5,147 lines not shown)	for (QualType ParamType : FT->param_types()) {
if (!ParamType->isPointerType() ||		if (!ParamType->isPointerType() ||
ParamType.getQualifiers().hasAddressSpace() ||		ParamType.getQualifiers().hasAddressSpace() ||
!ArgType->isPointerType() ||		!ArgType->isPointerType() ||
!ArgType->getPointeeType().getQualifiers().hasAddressSpace()) {		!ArgType->getPointeeType().getQualifiers().hasAddressSpace()) {
OverloadParams.push_back(ParamType);		OverloadParams.push_back(ParamType);
continue;		continue;
}		}

		QualType PointeeType = ParamType->getPointeeType();
		if (PointeeType.getQualifiers().hasAddressSpace())
		continue;

NeedsNewDecl = true;		NeedsNewDecl = true;
LangAS AS = ArgType->getPointeeType().getAddressSpace();		LangAS AS = ArgType->getPointeeType().getAddressSpace();

QualType PointeeType = ParamType->getPointeeType();
PointeeType = Context.getAddrSpaceQualType(PointeeType, AS);		PointeeType = Context.getAddrSpaceQualType(PointeeType, AS);
OverloadParams.push_back(Context.getPointerType(PointeeType));		OverloadParams.push_back(Context.getPointerType(PointeeType));
}		}

if (!NeedsNewDecl)		if (!NeedsNewDecl)
return nullptr;		return nullptr;

FunctionProtoType::ExtProtoInfo EPI;		FunctionProtoType::ExtProtoInfo EPI;
(9,991 lines not shown)

test/CodeGenCUDA/builtins-amdgcn.cu

This file was added.

				// RUN: %clang_cc1 -triple amdgcn -fcuda-is-device -emit-llvm %s -o - | FileCheck %s
				#include "Inputs/cuda.h"

				// CHECK-LABEL: @_Z16use_dispatch_ptrPi(
				// CHECK: %2 = call i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr()
				// CHECK: %3 = addrspacecast i8 addrspace(4)* %2 to i8 addrspace(4)**
				__global__ void use_dispatch_ptr(int* out) {
				const int* dispatch_ptr = (const int*)__builtin_amdgcn_dispatch_ptr();
				out = dispatch_ptr;
				}

				// CHECK-LABEL: @_Z12test_ds_fmaxf(
				// CHECK: call float @llvm.amdgcn.ds.fmax(float addrspace(3)* @_ZZ12test_ds_fmaxfE6shared, float %2, i32 0, i32 0, i1 false)
				__global__
				void test_ds_fmax(float src) {
				__shared__ float shared;
				volatile float x = __builtin_amdgcn_ds_fmaxf(&shared, src, 0, 0, false);
				}

test/CodeGenOpenCL/builtins-amdgcn.cl

	// REQUIRES: amdgpu-registered-target			// REQUIRES: amdgpu-registered-target
	// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -S -emit-llvm -o - %s | FileCheck %s			// RUN: %clang_cc1 -cl-std=CL2.0 -triple amdgcn-unknown-unknown -S -emit-llvm -o - %s | FileCheck -enable-var-scope %s
	// RUN: %clang_cc1 -triple amdgcn-unknown-unknown-opencl -S -emit-llvm -o - %s | FileCheck %s

				yaxunl: Please remove this line, since we no longer use opencl in the triple.
	#pragma OPENCL EXTENSION cl_khr_fp64 : enable			#pragma OPENCL EXTENSION cl_khr_fp64 : enable

	typedef unsigned long ulong;			typedef unsigned long ulong;
	typedef unsigned int uint;			typedef unsigned int uint;

	// CHECK-LABEL: @test_div_scale_f64			// CHECK-LABEL: @test_div_scale_f64
	// CHECK: call { double, i1 } @llvm.amdgcn.div.scale.f64(double %a, double %b, i1 true)			// CHECK: call { double, i1 } @llvm.amdgcn.div.scale.f64(double %a, double %b, i1 true)
	// CHECK-DAG: [[FLAG:%.+]] = extractvalue { double, i1 } %{{.+}}, 1			// CHECK-DAG: [[FLAG:%.+]] = extractvalue { double, i1 } %{{.+}}, 1
	// CHECK-DAG: [[VAL:%.+]] = extractvalue { double, i1 } %{{.+}}, 0			// CHECK-DAG: [[VAL:%.+]] = extractvalue { double, i1 } %{{.+}}, 0
	// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i32			// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i32
	// CHECK: store i32 [[FLAGEXT]]			// CHECK: store i32 [[FLAGEXT]]
	void test_div_scale_f64(global double* out, global int* flagout, double a, double b)			void test_div_scale_f64(global double* out, global int* flagout, double a, double b)
	{			{
	bool flag;			bool flag;
	*out = __builtin_amdgcn_div_scale(a, b, true, &flag);			*out = __builtin_amdgcn_div_scale(a, b, true, &flag);
	*flagout = flag;			*flagout = flag;
	}			}

	// CHECK-LABEL: @test_div_scale_f32			// CHECK-LABEL: @test_div_scale_f32(
	// CHECK: call { float, i1 } @llvm.amdgcn.div.scale.f32(float %a, float %b, i1 true)			// CHECK: call { float, i1 } @llvm.amdgcn.div.scale.f32(float %a, float %b, i1 true)
	// CHECK-DAG: [[FLAG:%.+]] = extractvalue { float, i1 } %{{.+}}, 1			// CHECK-DAG: [[FLAG:%.+]] = extractvalue { float, i1 } %{{.+}}, 1
	// CHECK-DAG: [[VAL:%.+]] = extractvalue { float, i1 } %{{.+}}, 0			// CHECK-DAG: [[VAL:%.+]] = extractvalue { float, i1 } %{{.+}}, 0
	// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i32			// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i8
	// CHECK: store i32 [[FLAGEXT]]			// CHECK: store i8 [[FLAGEXT]]
	void test_div_scale_f32(global float* out, global int* flagout, float a, float b)			void test_div_scale_f32(global float* out, global bool* flagout, float a, float b)
	{			{
	bool flag;			bool flag;
	*out = __builtin_amdgcn_div_scalef(a, b, true, &flag);			*out = __builtin_amdgcn_div_scalef(a, b, true, &flag);
	*flagout = flag;			*flagout = flag;
	}			}

				// CHECK-LABEL: @test_div_scale_f32_global_ptr(
				// CHECK: call { float, i1 } @llvm.amdgcn.div.scale.f32(float %a, float %b, i1 true)
				// CHECK-DAG: [[FLAG:%.+]] = extractvalue { float, i1 } %{{.+}}, 1
				// CHECK-DAG: [[VAL:%.+]] = extractvalue { float, i1 } %{{.+}}, 0
				// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i8
				// CHECK: store i8 [[FLAGEXT]]
				void test_div_scale_f32_global_ptr(global float* out, global int* flagout, float a, float b, global bool* flag)
				{
				*out = __builtin_amdgcn_div_scalef(a, b, true, flag);
				}

				// CHECK-LABEL: @test_div_scale_f32_generic_ptr(
				// CHECK: call { float, i1 } @llvm.amdgcn.div.scale.f32(float %a, float %b, i1 true)
				// CHECK-DAG: [[FLAG:%.+]] = extractvalue { float, i1 } %{{.+}}, 1
				// CHECK-DAG: [[VAL:%.+]] = extractvalue { float, i1 } %{{.+}}, 0
				// CHECK: [[FLAGEXT:%.+]] = zext i1 [[FLAG]] to i8
				// CHECK: store i8 [[FLAGEXT]]
				void test_div_scale_f32_generic_ptr(global float* out, global int* flagout, float a, float b, global bool* flag_arg)
				{
				generic bool* flag = flag_arg;
				*out = __builtin_amdgcn_div_scalef(a, b, true, flag);
				}

	// CHECK-LABEL: @test_div_fmas_f32			// CHECK-LABEL: @test_div_fmas_f32
	// CHECK: call float @llvm.amdgcn.div.fmas.f32			// CHECK: call float @llvm.amdgcn.div.fmas.f32
	void test_div_fmas_f32(global float* out, float a, float b, float c, int d)			void test_div_fmas_f32(global float* out, float a, float b, float c, int d)
	{			{
	*out = __builtin_amdgcn_div_fmasf(a, b, c, d);			*out = __builtin_amdgcn_div_fmasf(a, b, c, d);
	}			}

	// CHECK-LABEL: @test_div_fmas_f64			// CHECK-LABEL: @test_div_fmas_f64
	(365 lines not shown)

	// CHECK-LABEL: @test_cubema(			// CHECK-LABEL: @test_cubema(
	// CHECK: call float @llvm.amdgcn.cubema(float %a, float %b, float %c)			// CHECK: call float @llvm.amdgcn.cubema(float %a, float %b, float %c)
	void test_cubema(global float* out, float a, float b, float c) {			void test_cubema(global float* out, float a, float b, float c) {
	*out = __builtin_amdgcn_cubema(a, b, c);			*out = __builtin_amdgcn_cubema(a, b, c);
	}			}

	// CHECK-LABEL: @test_read_exec(			// CHECK-LABEL: @test_read_exec(
	// CHECK: call i64 @llvm.read_register.i64(metadata ![[EXEC:[0-9]+]]) #[[READ_EXEC_ATTRS:[0-9]+]]			// CHECK: call i64 @llvm.read_register.i64(metadata ![[$EXEC:[0-9]+]]) #[[$READ_EXEC_ATTRS:[0-9]+]]
	void test_read_exec(global ulong* out) {			void test_read_exec(global ulong* out) {
	*out = __builtin_amdgcn_read_exec();			*out = __builtin_amdgcn_read_exec();
	}			}

	// CHECK: declare i64 @llvm.read_register.i64(metadata) #[[NOUNWIND_READONLY:[0-9]+]]			// CHECK: declare i64 @llvm.read_register.i64(metadata) #[[$NOUNWIND_READONLY:[0-9]+]]

	// CHECK-LABEL: @test_read_exec_lo(			// CHECK-LABEL: @test_read_exec_lo(
	// CHECK: call i32 @llvm.read_register.i32(metadata ![[EXEC_LO:[0-9]+]]) #[[READ_EXEC_ATTRS]]			// CHECK: call i32 @llvm.read_register.i32(metadata ![[$EXEC_LO:[0-9]+]]) #[[$READ_EXEC_ATTRS]]
	void test_read_exec_lo(global uint* out) {			void test_read_exec_lo(global uint* out) {
	*out = __builtin_amdgcn_read_exec_lo();			*out = __builtin_amdgcn_read_exec_lo();
	}			}

	// CHECK-LABEL: @test_read_exec_hi(			// CHECK-LABEL: @test_read_exec_hi(
	// CHECK: call i32 @llvm.read_register.i32(metadata ![[EXEC_HI:[0-9]+]]) #[[READ_EXEC_ATTRS]]			// CHECK: call i32 @llvm.read_register.i32(metadata ![[$EXEC_HI:[0-9]+]]) #[[$READ_EXEC_ATTRS]]
	void test_read_exec_hi(global uint* out) {			void test_read_exec_hi(global uint* out) {
	*out = __builtin_amdgcn_read_exec_hi();			*out = __builtin_amdgcn_read_exec_hi();
	}			}

	// CHECK-LABEL: @test_dispatch_ptr			// CHECK-LABEL: @test_dispatch_ptr
	// CHECK: call i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr()			// CHECK: call i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr()
	void test_dispatch_ptr(__attribute__((address_space(4))) unsigned char ** out)			void test_dispatch_ptr(__constant unsigned char ** out)
	{			{
	*out = __builtin_amdgcn_dispatch_ptr();			*out = __builtin_amdgcn_dispatch_ptr();
	}			}

	// CHECK-LABEL: @test_kernarg_segment_ptr			// CHECK-LABEL: @test_kernarg_segment_ptr
	// CHECK: call i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()			// CHECK: call i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
	void test_kernarg_segment_ptr(__attribute__((address_space(4))) unsigned char ** out)			void test_kernarg_segment_ptr(__constant unsigned char ** out)
	{			{
	*out = __builtin_amdgcn_kernarg_segment_ptr();			*out = __builtin_amdgcn_kernarg_segment_ptr();
	}			}

	// CHECK-LABEL: @test_implicitarg_ptr			// CHECK-LABEL: @test_implicitarg_ptr
	// CHECK: call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()			// CHECK: call i8 addrspace(4)* @llvm.amdgcn.implicitarg.ptr()
	void test_implicitarg_ptr(__attribute__((address_space(4))) unsigned char ** out)			void test_implicitarg_ptr(__constant unsigned char ** out)
	{			{
	*out = __builtin_amdgcn_implicitarg_ptr();			*out = __builtin_amdgcn_implicitarg_ptr();
	}			}

	// CHECK-LABEL: @test_get_group_id(			// CHECK-LABEL: @test_get_group_id(
	// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.x()			// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.x()
	// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.y()			// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.y()
	// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.z()			// CHECK: tail call i32 @llvm.amdgcn.workgroup.id.z()
	(14 lines not shown)
	void test_s_getreg(volatile global uint *out)			void test_s_getreg(volatile global uint *out)
	{			{
	*out = __builtin_amdgcn_s_getreg(0);			*out = __builtin_amdgcn_s_getreg(0);
	*out = __builtin_amdgcn_s_getreg(1);			*out = __builtin_amdgcn_s_getreg(1);
	*out = __builtin_amdgcn_s_getreg(65535);			*out = __builtin_amdgcn_s_getreg(65535);
	}			}

	// CHECK-LABEL: @test_get_local_id(			// CHECK-LABEL: @test_get_local_id(
	// CHECK: tail call i32 @llvm.amdgcn.workitem.id.x(), !range [[WI_RANGE:![0-9]*]]			// CHECK: tail call i32 @llvm.amdgcn.workitem.id.x(), !range [[$WI_RANGE:![0-9]*]]
	// CHECK: tail call i32 @llvm.amdgcn.workitem.id.y(), !range [[WI_RANGE]]			// CHECK: tail call i32 @llvm.amdgcn.workitem.id.y(), !range [[$WI_RANGE]]
	// CHECK: tail call i32 @llvm.amdgcn.workitem.id.z(), !range [[WI_RANGE]]			// CHECK: tail call i32 @llvm.amdgcn.workitem.id.z(), !range [[$WI_RANGE]]
	void test_get_local_id(int d, global int *out)			void test_get_local_id(int d, global int *out)
	{			{
	switch (d) {			switch (d) {
	case 0: *out = __builtin_amdgcn_workitem_id_x(); break;			case 0: *out = __builtin_amdgcn_workitem_id_x(); break;
	case 1: *out = __builtin_amdgcn_workitem_id_y(); break;			case 1: *out = __builtin_amdgcn_workitem_id_y(); break;
	case 2: *out = __builtin_amdgcn_workitem_id_z(); break;			case 2: *out = __builtin_amdgcn_workitem_id_z(); break;
	default: *out = 0;			default: *out = 0;
	}			}
	}			}

	// CHECK-LABEL: @test_fmed3_f32			// CHECK-LABEL: @test_fmed3_f32
	// CHECK: call float @llvm.amdgcn.fmed3.f32(			// CHECK: call float @llvm.amdgcn.fmed3.f32(
	void test_fmed3_f32(global float* out, float a, float b, float c)			void test_fmed3_f32(global float* out, float a, float b, float c)
	{			{
	*out = __builtin_amdgcn_fmed3f(a, b, c);			*out = __builtin_amdgcn_fmed3f(a, b, c);
	}			}

	// CHECK-LABEL: @test_s_getpc			// CHECK-LABEL: @test_s_getpc
	// CHECK: call i64 @llvm.amdgcn.s.getpc()			// CHECK: call i64 @llvm.amdgcn.s.getpc()
	void test_s_getpc(global ulong* out)			void test_s_getpc(global ulong* out)
	{			{
	*out = __builtin_amdgcn_s_getpc();			*out = __builtin_amdgcn_s_getpc();
	}			}

	// CHECK-DAG: [[WI_RANGE]] = !{i32 0, i32 1024}			// CHECK-DAG: [[$WI_RANGE]] = !{i32 0, i32 1024}
	// CHECK-DAG: attributes #[[NOUNWIND_READONLY:[0-9]+]] = { nounwind readonly }			// CHECK-DAG: attributes #[[$NOUNWIND_READONLY:[0-9]+]] = { nounwind readonly }
	// CHECK-DAG: attributes #[[READ_EXEC_ATTRS]] = { convergent }			// CHECK-DAG: attributes #[[$READ_EXEC_ATTRS]] = { convergent }
	// CHECK-DAG: ![[EXEC]] = !{!"exec"}			// CHECK-DAG: ![[$EXEC]] = !{!"exec"}
	// CHECK-DAG: ![[EXEC_LO]] = !{!"exec_lo"}			// CHECK-DAG: ![[$EXEC_LO]] = !{!"exec_lo"}
	// CHECK-DAG: ![[EXEC_HI]] = !{!"exec_hi"}			// CHECK-DAG: ![[$EXEC_HI]] = !{!"exec_hi"}

test/CodeGenOpenCL/numbered-address-space.cl

This file was added.

				// REQUIRES: amdgpu-registered-target
				// RUN: %clang_cc1 -cl-std=CL2.0 -triple amdgcn-unknown-unknown -target-cpu tonga -S -emit-llvm -O0 -o - %s | FileCheck %s

				// Make sure using numbered address spaces doesn't trigger crashes when a
				// builtin has an address space parameter.

				// CHECK-LABEL: @test_numbered_as_to_generic(
				// CHECK: addrspacecast i32 addrspace(42)* %0 to i32*
				void test_numbered_as_to_generic(__attribute__((address_space(42))) int *arbitary_numbered_ptr) {
				generic int* generic_ptr = arbitary_numbered_ptr;
				*generic_ptr = 4;
				}

				// CHECK-LABEL: @test_numbered_as_to_builtin(
				// CHECK: addrspacecast i32 addrspace(42)* %0 to float addrspace(3)*
				void test_numbered_as_to_builtin(__attribute__((address_space(42))) int *arbitary_numbered_ptr, float src) {
				volatile float result = __builtin_amdgcn_ds_fmaxf(arbitary_numbered_ptr, src, 0, 0, false);
				}

				// CHECK-LABEL: @test_generic_as_to_builtin_parameter_explicit_cast(
				// CHECK: addrspacecast i32 addrspace(3)* %0 to i32*
				void test_generic_as_to_builtin_parameter_explicit_cast(__local int *local_ptr, float src) {
				generic int* generic_ptr = local_ptr;
				volatile float result = __builtin_amdgcn_ds_fmaxf((__local float*) generic_ptr, src, 0, 0, false);
				}

				// CHECK-LABEL: @test_generic_as_to_builtin_parameter_implicit_cast(
				// CHECK: addrspacecast i32* %2 to float addrspace(3)*
				void test_generic_as_to_builtin_parameter_implicit_cast(__local int *local_ptr, float src) {
				generic int* generic_ptr = local_ptr;

				volatile float result = __builtin_amdgcn_ds_fmaxf(generic_ptr, src, 0, 0, false);
				}

				Anastasia: `__attribute__((address_space(N)))` is not an OpenCL feature, and I think it's not specified in C either? But I think non-matching address spaces generally don't compile in Clang, so it might be useful to disallow this?
				arsenm (author): I'm pretty sure it's a C extension. The way things seem to work now is that address spaces are accepted anywhere and everywhere.
				Anastasia: Yes, the line below should give an error for OpenCL? `generic int* generic_ptr = local_ptr;`
				arsenm (author): This does not error. The wording of the spec seems to leave some room for interpretation for other address spaces. It whitelists the valid address spaces for implicit casts, and blacklists constant for implicit or explicit casts. My reading between the lines is that an explicit cast would be OK. I think this is a separate fix, since it is independent of the builtins.
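
The whitelist/blacklist reading of the spec described in the comment above can be sketched as a small classifier. This is a minimal model of the OpenCL C 2.0 rules for the named address spaces only; the names `AS`, `Conv`, and `classify` are hypothetical, and numbered address spaces such as `address_space(42)` are deliberately left out because the spec does not say how they should behave.

```cpp
#include <cassert>

// Named OpenCL 2.0 address spaces (numbered ones are not modeled here).
enum class AS { Private, Global, Local, Constant, Generic };

// Outcome of converting a pointer from one address space to another.
enum class Conv { Same, Implicit, ExplicitOnly, Invalid };

// Classifies a pointer conversion between two named address spaces.
Conv classify(AS From, AS To) {
  if (From == To)
    return Conv::Same;
  // constant is blacklisted for both implicit and explicit casts.
  if (From == AS::Constant || To == AS::Constant)
    return Conv::Invalid;
  // global, local and private are whitelisted for implicit conversion
  // to generic.
  if (To == AS::Generic)
    return Conv::Implicit;
  // generic to a named address space needs an explicit cast.
  if (From == AS::Generic)
    return Conv::ExplicitOnly;
  // Two distinct named address spaces cannot be converted into each other.
  return Conv::Invalid;
}
```

Under this model, `classify(AS::Local, AS::Generic)` is `Conv::Implicit`, which matches the behavior of `generic int* generic_ptr = local_ptr;` in the test above. Where a numbered address space should fall is exactly the open question in this thread.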

test/SemaOpenCL/numbered-address-space.cl

This file was added.

				// REQUIRES: amdgpu-registered-target
				// RUN: %clang_cc1 -cl-std=CL2.0 -triple amdgcn-amd-amdhsa -verify -pedantic -fsyntax-only %s

				void test_numeric_as_to_generic_implicit_cast(__attribute__((address_space(3))) int *as3_ptr, float src) {
				generic int* generic_ptr = as3_ptr; // FIXME: This should error
				}

				void test_numeric_as_to_generic_explicit_cast(__attribute__((address_space(3))) int *as3_ptr, float src) {
				generic int* generic_ptr = (generic int*) as3_ptr; // Should maybe be valid?
				}
				Anastasia: Ideally this is not governed by any specification. Generic compiler support for such an `addrspacecast` is not possible, but a vendor can implement custom support. I think we can leave it as is until we have a better idea of how this should be supported.

				void test_generic_to_numeric_as_implicit_cast() {
				generic int* generic_ptr = 0;
				__attribute__((address_space(3))) int *as3_ptr = generic_ptr; // expected-error{{initializing '__attribute__((address_space(3))) int *' with an expression of type '__generic int *' changes address space of pointer}}
				}

				void test_generic_to_numeric_as_explicit_cast() {
				generic int* generic_ptr = 0;
				__attribute__((address_space(3))) int *as3_ptr = (__attribute__((address_space(3))) int *)generic_ptr;
				}

				void test_generic_as_to_builtin_parameter_explicit_cast_numeric(__attribute__((address_space(3))) int *as3_ptr, float src) {
				generic int* generic_ptr = as3_ptr; // FIXME: This should error
				volatile float result = __builtin_amdgcn_ds_fmaxf((__attribute__((address_space(3))) float *) generic_ptr, src, 0, 0, false); // expected-error {{passing '__attribute__((address_space(3))) float *' to parameter of type '__local float *' changes address space of pointer}}
				}

				void test_generic_as_to_builtin_parameterimplicit_cast_numeric(__attribute__((address_space(3))) int *as3_ptr, float src) {
				generic int* generic_ptr = as3_ptr;
				volatile float result = __builtin_amdgcn_ds_fmaxf(generic_ptr, src, 0, 0, false); // expected-warning {{incompatible pointer types passing '__generic int *' to parameter of type '__local float *'}}
				}