This is an archive of the discontinued LLVM Phabricator instance.

Differential D98143

[HIP] Diagnose aggregate args containing half types
Needs Review · Public

Authored by yaxunl on Mar 7 2021, 7:34 AM.

Details

Reviewers

tra

Summary

gcc and clang currently do not have a consistent ABI
for half precision types. Passing aggregate args containing half precision
types between clang and gcc can cause UB.

This patch adds an option -fhip-allow-half-arg. When off, clang
will diagnose aggregate arguments containing half precision
types in host functions.

Diff Detail

Event Timeline

yaxunl created this revision. Mar 7 2021, 7:34 AM

Herald added subscribers: jansvoboda11, dexonsmith, dang. Mar 7 2021, 7:34 AM

yaxunl requested review of this revision. Mar 7 2021, 7:34 AM

Harbormaster completed remote builds in B92555: Diff 328861. Mar 7 2021, 8:15 AM

fix test and clang-tidy warnings

Harbormaster completed remote builds in B92636: Diff 328979. Mar 8 2021, 6:27 AM

Is this patch still relevant? Looks like I've missed it.

What exactly is the difference between gcc and clang regarding fp16 and why does it matter for aggregate arguments?
On a trivial example both clang and gcc appear to treat _Float16 similarly: https://godbolt.org/z/8WxK95zTj

In D98143#3034419, @tra wrote:

Is this patch still relevant? Looks like I've missed it.

What exactly is the difference between gcc and clang regarding fp16 and why does it matter for aggregate arguments?
On a trivial example both clang and gcc appear to treat _Float16 similarly: https://godbolt.org/z/8WxK95zTj

On gcc 11 and below, gcc does not support fp16, so it is common practice to use short to pass fp16 values in a struct. In that case gcc and clang use different ABIs: https://godbolt.org/z/zqhT7x7qo

Basically, if one compiles a function taking a struct argument that contains _Float16 with clang, and then compiles a caller passing a struct that contains short with gcc, it will not work.

With gcc trunk and _Float16 there is no such issue, since the ABI is the same.
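The workaround mentioned above, carrying fp16 values as 16-bit integers inside a struct, can be sketched as follows. This is only an illustration under stated assumptions: `HalfPair`, `half_bits_to_float`, `sum_pair` and `example` are hypothetical names that do not appear in the patch, and the decoder assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). Because the struct holds only integer fields, both compilers classify it the same way, and both sides of the call must then use this same definition.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical interchange struct: each half value travels as raw bits.
struct HalfPair {
  std::uint16_t a_bits;
  std::uint16_t b_bits;
};
static_assert(sizeof(HalfPair) == 2 * sizeof(std::uint16_t),
              "two 16-bit fields, no padding expected");

// Decode IEEE 754 binary16 bits into a float.
inline float half_bits_to_float(std::uint16_t h) {
  const std::uint32_t sign = (h >> 15) & 0x1u;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  const std::uint32_t frac = h & 0x3FFu;
  float magnitude;
  if (exp == 0) {
    // Zero or subnormal: frac * 2^-24.
    magnitude = std::ldexp(static_cast<float>(frac), -24);
  } else if (exp == 31) {
    // Infinity (frac == 0) or NaN (frac != 0).
    magnitude = frac ? std::nanf("") : INFINITY;
  } else {
    // Normal: (1024 + frac) * 2^(exp - 25).
    magnitude = std::ldexp(static_cast<float>(1024u + frac),
                           static_cast<int>(exp) - 25);
  }
  return sign ? -magnitude : magnitude;
}

// A host function whose aggregate argument contains no half precision type.
inline float sum_pair(HalfPair p) {
  return half_bits_to_float(p.a_bits) + half_bits_to_float(p.b_bits);
}

// Usage: 0x3C00 encodes 1.0 and 0x3800 encodes 0.5 in binary16.
inline void example() {
  assert(sum_pair(HalfPair{0x3C00, 0x3800}) == 1.5f);
}
```

The mismatch described in the comment arises when one side declares the fields as short and the other as _Float16 under the same struct name; the sketch avoids that by never exposing a half type in the signature.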

In D98143#3034672, @yaxunl wrote:

On gcc 11 and below, gcc does not support fp16, so it is common practice to use short to pass fp16 values in a struct. In that case gcc and clang use different ABIs: https://godbolt.org/z/zqhT7x7qo

Basically, if one compiles a function taking a struct argument that contains _Float16 with clang, and then compiles a caller passing a struct that contains short with gcc, it will not work.

If both compilers use the same type they both agree on, then there's no problem. If they pass data using different types but identical names, that's an ODR violation and it's not specific to fp16.

Given that the fp16 ABI is not stable, the diagnostics provided by the patch may be a useful safeguard to have.

That said, I'm not quite sure how one would use it in practice. Is that purely to check that fp16 is not passed around anywhere in the host code?
If we compile everything with the check enabled that would potentially produce a lot of irrelevant/false-positive results.
E.g. there may be legitimate host-only code passing fp16 parameters and used only from clang-compiled files.
Perhaps the check should be limited to externally visible functions only, and, maybe, make it a warning?
We can not conclusively tell whether use of fp16 by the callee will be a problem if we don't know anything about the caller.

Are there specific cases where this particular issue popped up?
I think I did run into an issue with some ROCm headers that were using different fp16 types depending on how a file was compiled, even though in both cases compiler was clang. It's been a while ago, so I do not recall the exact details.
It might've been this code: https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/c560f436cea24a7c5b775042464bc4c2989744ca/library/include/internal/rocblas-types.h#L73

clang/include/clang/Basic/LangOptions.def
256–258	Nit: Maybe simplify it to `Allow using half precision types in host function parameter and return types.` Also, this issue would technically affect CUDA, too, so GPUAllowHalfArg might work better.
clang/lib/Sema/SemaDecl.cpp
8916–8917	I'd split it into separate early exits for the flag and for the attribute checks, with an added comment that we only check host functions. That raises the question of how we should handle HD functions. Right now they would not be checked, which makes it possible to run into the issue you're trying to solve if that function is used on the host.
clang/test/SemaCUDA/half-arg.cu
9–10	What happens if I have a lambda that captures an fp16 value? Granted, it will be inadvisable to pass across compilers, but, nevertheless it's another kind of an aggregate we may end up passing to a function.
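The early-exit split suggested for SemaDecl.cpp could take roughly the following shape. This is a sketch only, modeled with hypothetical stand-in structs (`HalfArgOptions`, `FunctionSides`) rather than clang's `Sema`, `LangOptions` or `FunctionDecl`. Checking `__host__ __device__` functions is one possible answer to the HD question raised above, not something the patch as posted does.

```cpp
#include <cassert>

// Hypothetical stand-ins for the state the real check would read from
// the language options and from the function's CUDA/HIP attributes.
struct HalfArgOptions {
  bool AllowHalfArg; // models the proposed -fhip-allow-half-arg flag
};
struct FunctionSides {
  bool IsKernel; // __global__
  bool IsDevice; // __device__
  bool IsHost;   // explicit __host__
};

// Returns true if the function's parameter and return types should be
// inspected for half precision types.
bool shouldCheckHalfArgs(const HalfArgOptions &Opts, const FunctionSides &F) {
  // Early exit 1: the user allows half precision arguments.
  if (Opts.AllowHalfArg)
    return false;
  // Early exit 2: kernels are launched, not called through the host ABI.
  if (F.IsKernel)
    return false;
  // Early exit 3: device-only functions never run on the host.
  // A __host__ __device__ function falls through and is checked.
  if (F.IsDevice && !F.IsHost)
    return false;
  return true;
}

// Usage: with the flag off, a plain host function is checked and a
// device-only function is skipped.
inline void exampleEarlyExits() {
  const HalfArgOptions Off{false};
  assert(shouldCheckHalfArgs(Off, FunctionSides{false, false, false}));
  assert(!shouldCheckHalfArgs(Off, FunctionSides{false, true, false}));
}
```

Keeping each condition in its own `if` leaves room for the per-case comments the reviewer asked for.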

Revision Contents

Path	Size
clang/include/clang/Basic/DiagnosticSemaKinds.td	2 lines
clang/include/clang/Basic/LangOptions.def	3 lines
clang/include/clang/Driver/Options.td	6 lines
clang/lib/Headers/__clang_hip_cmath.h	4 lines
clang/lib/Sema/SemaDecl.cpp	305 lines
clang/test/SemaCUDA/half-arg.cu	136 lines

Diff 328979

clang/include/clang/Basic/DiagnosticSemaKinds.td

[... 9,852 lines not shown ...]	def err_opencl_scalar_type_rank_greater_than_vector_type : Error<
"element. (%0 and %1)">;		"element. (%0 and %1)">;
def err_bad_kernel_param_type : Error<		def err_bad_kernel_param_type : Error<
"%0 cannot be used as the type of a kernel parameter">;		"%0 cannot be used as the type of a kernel parameter">;
def err_opencl_implicit_function_decl : Error<		def err_opencl_implicit_function_decl : Error<
"implicit declaration of function %0 is invalid in OpenCL">;		"implicit declaration of function %0 is invalid in OpenCL">;
def err_record_with_pointers_kernel_param : Error<		def err_record_with_pointers_kernel_param : Error<
"%select{struct|union}0 kernel parameters may not contain pointers">;		"%select{struct|union}0 kernel parameters may not contain pointers">;
def note_within_field_of_type : Note<		def note_within_field_of_type : Note<
"within field of type %0 declared here">;		"within field %select{|or base class }1of type %0 declared here">;
def note_illegal_field_declared_here : Note<		def note_illegal_field_declared_here : Note<
"field of illegal %select{type|pointer type}0 %1 declared here">;		"field of illegal %select{type|pointer type}0 %1 declared here">;
def err_opencl_type_struct_or_union_field : Error<		def err_opencl_type_struct_or_union_field : Error<
"the %0 type cannot be used to declare a structure or union field">;		"the %0 type cannot be used to declare a structure or union field">;
def err_event_t_addr_space_qual : Error<		def err_event_t_addr_space_qual : Error<
"the event_t type can only be used with __private address space qualifier">;		"the event_t type can only be used with __private address space qualifier">;
def err_expected_kernel_void_return_type : Error<		def err_expected_kernel_void_return_type : Error<
"kernel must have void return type">;		"kernel must have void return type">;
[... 1,280 lines not shown ...]

clang/include/clang/Basic/LangOptions.def

	[... 247 lines not shown ...]
	LANGOPT(GPUDeferDiag, 1, 0, "defer host/device related diagnostic messages for CUDA/HIP")			LANGOPT(GPUDeferDiag, 1, 0, "defer host/device related diagnostic messages for CUDA/HIP")
	LANGOPT(GPUExcludeWrongSideOverloads, 1, 0, "always exclude wrong side overloads in overloading resolution for CUDA/HIP")			LANGOPT(GPUExcludeWrongSideOverloads, 1, 0, "always exclude wrong side overloads in overloading resolution for CUDA/HIP")

	LANGOPT(SYCL , 1, 0, "SYCL")			LANGOPT(SYCL , 1, 0, "SYCL")
	LANGOPT(SYCLIsDevice , 1, 0, "Generate code for SYCL device")			LANGOPT(SYCLIsDevice , 1, 0, "Generate code for SYCL device")
	ENUM_LANGOPT(SYCLVersion , SYCLMajorVersion, 1, SYCL_None, "Version of the SYCL standard used")			ENUM_LANGOPT(SYCLVersion , SYCLMajorVersion, 1, SYCL_None, "Version of the SYCL standard used")

	LANGOPT(HIPUseNewLaunchAPI, 1, 0, "Use new kernel launching API for HIP")			LANGOPT(HIPUseNewLaunchAPI, 1, 0, "Use new kernel launching API for HIP")
				LANGOPT(HIPAllowHalfArg, 1, 1, "Allow half precision types or aggregate types "
				"containing half precision types as host "
				"function parameter and return types for HIP")

	LANGOPT(SizedDeallocation , 1, 0, "sized deallocation")			LANGOPT(SizedDeallocation , 1, 0, "sized deallocation")
	LANGOPT(AlignedAllocation , 1, 0, "aligned allocation")			LANGOPT(AlignedAllocation , 1, 0, "aligned allocation")
	LANGOPT(AlignedAllocationUnavailable, 1, 0, "aligned allocation functions are unavailable")			LANGOPT(AlignedAllocationUnavailable, 1, 0, "aligned allocation functions are unavailable")
	LANGOPT(NewAlignOverride , 32, 0, "maximum alignment guaranteed by '::operator new(size_t)'")			LANGOPT(NewAlignOverride , 32, 0, "maximum alignment guaranteed by '::operator new(size_t)'")
	LANGOPT(ConceptSatisfactionCaching , 1, 1, "enable satisfaction caching for C++20 Concepts")			LANGOPT(ConceptSatisfactionCaching , 1, 1, "enable satisfaction caching for C++20 Concepts")
	BENIGN_LANGOPT(ModulesCodegen , 1, 0, "Modules code generation")			BENIGN_LANGOPT(ModulesCodegen , 1, 0, "Modules code generation")
	BENIGN_LANGOPT(ModulesDebugInfo , 1, 0, "Modules debug info")			BENIGN_LANGOPT(ModulesDebugInfo , 1, 0, "Modules debug info")
	[... 158 lines not shown ...]

clang/include/clang/Driver/Options.td

	[... 915 lines not shown ...]
	def hip_version_EQ : Joined<["--"], "hip-version=">,			def hip_version_EQ : Joined<["--"], "hip-version=">,
	HelpText<"HIP version in the format of major.minor.patch">;			HelpText<"HIP version in the format of major.minor.patch">;
	def fhip_dump_offload_linker_script : Flag<["-"], "fhip-dump-offload-linker-script">,			def fhip_dump_offload_linker_script : Flag<["-"], "fhip-dump-offload-linker-script">,
	Group<f_Group>, Flags<[NoArgumentUnused, HelpHidden]>;			Group<f_Group>, Flags<[NoArgumentUnused, HelpHidden]>;
	defm hip_new_launch_api : BoolFOption<"hip-new-launch-api",			defm hip_new_launch_api : BoolFOption<"hip-new-launch-api",
	LangOpts<"HIPUseNewLaunchAPI">, DefaultFalse,			LangOpts<"HIPUseNewLaunchAPI">, DefaultFalse,
	PosFlag<SetTrue, [CC1Option], "Use">, NegFlag<SetFalse, [], "Don't use">,			PosFlag<SetTrue, [CC1Option], "Use">, NegFlag<SetFalse, [], "Don't use">,
	BothFlags<[], " new kernel launching API for HIP">>;			BothFlags<[], " new kernel launching API for HIP">>;
				defm hip_allow_half_arg : BoolFOption<"hip-allow-half-arg",
				LangOpts<"HIPAllowHalfArg">, DefaultTrue,
				PosFlag<SetTrue, [], "Allow">, NegFlag<SetFalse, [CC1Option], "Don't allow">,
				BothFlags<[], " half precision types or aggregate types containing half "
				"precision types as host function parameter type or return type">>,
				ShouldParseIf<hip.KeyPath>;
	defm gpu_allow_device_init : BoolFOption<"gpu-allow-device-init",			defm gpu_allow_device_init : BoolFOption<"gpu-allow-device-init",
	LangOpts<"GPUAllowDeviceInit">, DefaultFalse,			LangOpts<"GPUAllowDeviceInit">, DefaultFalse,
	PosFlag<SetTrue, [CC1Option], "Allow">, NegFlag<SetFalse, [], "Don't allow">,			PosFlag<SetTrue, [CC1Option], "Allow">, NegFlag<SetFalse, [], "Don't allow">,
	BothFlags<[], " device side init function in HIP">>,			BothFlags<[], " device side init function in HIP">>,
	ShouldParseIf<hip.KeyPath>;			ShouldParseIf<hip.KeyPath>;
	defm gpu_defer_diag : BoolFOption<"gpu-defer-diag",			defm gpu_defer_diag : BoolFOption<"gpu-defer-diag",
	LangOpts<"GPUDeferDiag">, DefaultFalse,			LangOpts<"GPUDeferDiag">, DefaultFalse,
	PosFlag<SetTrue, [CC1Option], "Defer">, NegFlag<SetFalse, [], "Don't defer">,			PosFlag<SetTrue, [CC1Option], "Defer">, NegFlag<SetFalse, [], "Don't defer">,
	[... 5,282 lines not shown ...]

clang/lib/Headers/__clang_hip_cmath.h

	/*===---- __clang_hip_cmath.h - HIP cmath decls -----------------------------===			/*===---- __clang_hip_cmath.h - HIP cmath decls -----------------------------===
	*			*
	* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	* See https://llvm.org/LICENSE.txt for license information.			* See https://llvm.org/LICENSE.txt for license information.
	* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	*			*
	*===-----------------------------------------------------------------------===			*===-----------------------------------------------------------------------===
	*/			*/

	#ifndef __CLANG_HIP_CMATH_H__			#ifndef __CLANG_HIP_CMATH_H__
	#define __CLANG_HIP_CMATH_H__			#define __CLANG_HIP_CMATH_H__

	#if !defined(__HIP__)			#if !defined(__HIP__)
	#error "This file is for HIP and OpenMP AMDGCN device compilation only."			#error "This file is for HIP and OpenMP AMDGCN device compilation only."
	#endif			#endif

	#if defined(__cplusplus)			#if defined(__cplusplus)
	#include <limits>			#include <limits>
	#include <type_traits>			#include <type_traits>
	#include <utility>			#include <utility>
	#endif			#endif
	#include <limits.h>			#include <limits.h>
	#include <stdint.h>			#include <stdint.h>

	// __DEVICE__ is a helper macro with common set of attributes for the wrappers			// __DEVICE__ is a helper macro with common set of attributes for the wrappers
	// we implement in this file. We need static in order to avoid emitting unused			// we implement in this file. We need static in order to avoid emitting unused
	// functions.			// functions.
	#pragma push_macro("__DEVICE__")			#pragma push_macro("__DEVICE__")
	#ifdef __OPENMP_AMDGCN__			#ifdef __OPENMP_AMDGCN__
	#define __DEVICE__ static constexpr __attribute__((always_inline, nothrow))			#define __DEVICE__ static constexpr __attribute__((always_inline, nothrow))
	#else			#else
	#define __DEVICE__ static __device__ inline __attribute__((always_inline))			#define __DEVICE__ static __device__ inline __attribute__((always_inline))
	#endif			#endif

	// Start with functions that cannot be defined by DEF macros below.			// Start with functions that cannot be defined by DEF macros below.
	#if defined(__cplusplus)			#if defined(__cplusplus)
	__DEVICE__ double abs(double __x) { return ::fabs(__x); }			__DEVICE__ double abs(double __x) { return ::fabs(__x); }
	__DEVICE__ float abs(float __x) { return ::fabsf(__x); }			__DEVICE__ float abs(float __x) { return ::fabsf(__x); }
	__DEVICE__ long long abs(long long __n) { return ::llabs(__n); }			__DEVICE__ long long abs(long long __n) { return ::llabs(__n); }
	__DEVICE__ long abs(long __n) { return ::labs(__n); }			__DEVICE__ long abs(long __n) { return ::labs(__n); }
	__DEVICE__ float fma(float __x, float __y, float __z) {			__DEVICE__ float fma(float __x, float __y, float __z) {
	return ::fmaf(__x, __y, __z);			return ::fmaf(__x, __y, __z);
	}			}
	__DEVICE__ int fpclassify(float __x) {			__DEVICE__ int fpclassify(float __x) {
	return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,			return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,
	FP_ZERO, __x);			FP_ZERO, __x);
	}			}
	__DEVICE__ int fpclassify(double __x) {			__DEVICE__ int fpclassify(double __x) {
	return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,			return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,
	FP_ZERO, __x);			FP_ZERO, __x);
	[... 169 lines not shown ...]

	// decltype is only available in C++11 and above.			// decltype is only available in C++11 and above.
	#if __cplusplus >= 201103L			#if __cplusplus >= 201103L
	// __hip_promote			// __hip_promote
	namespace __hip {			namespace __hip {

	template <class _Tp> struct __numeric_type {			template <class _Tp> struct __numeric_type {
	static void __test(...);			static void __test(...);
	static _Float16 __test(_Float16);			// _Float16 is not allowed as host function arguments until ABI compatibility
				// issue with gcc is resolved.
				static __device__ _Float16 __test(_Float16);
	static float __test(float);			static float __test(float);
	static double __test(char);			static double __test(char);
	static double __test(int);			static double __test(int);
	static double __test(unsigned);			static double __test(unsigned);
	static double __test(long);			static double __test(long);
	static double __test(unsigned long);			static double __test(unsigned long);
	static double __test(long long);			static double __test(long long);
	static double __test(unsigned long long);			static double __test(unsigned long long);
	[... 439 lines not shown ...]

clang/lib/Sema/SemaDecl.cpp

[... 8,585 lines not shown ...]	if (Name.getNameKind() == DeclarationName::CXXConstructorName) {
// prototype. This true when:		// prototype. This true when:
// - we're in C++ (where every function has a prototype),		// - we're in C++ (where every function has a prototype),
return FunctionDecl::Create(SemaRef.Context, DC, D.getBeginLoc(), NameInfo,		return FunctionDecl::Create(SemaRef.Context, DC, D.getBeginLoc(), NameInfo,
R, TInfo, SC, isInline, true /*HasPrototype*/,		R, TInfo, SC, isInline, true /*HasPrototype*/,
ConstexprKind, TrailingRequiresClause);		ConstexprKind, TrailingRequiresClause);
}		}
}		}

		// Result type returned by the functor checking struct field.
		enum class CheckFieldResult {
		Valid, // The field is valid
		Recurse, // The field is a struct which needs to be checked recursively
		Invalid, // The field is invalid
		};

		// Check whether struct or array type contains invalid fields or elements by
		// recursively visiting fields of the structs with the functor CheckField.
		// Returns true if the type is valid. CheckField returns Valid if the field is
		// valid, emits a diagnostic message and returns Invalid if the field is
		// invalid, returns Recurse if the field is a struct which needs further check.
		// ValidTypes contain known valid types.
		static bool
		checkStructOrArrayType(Sema &S, QualType PT,
		llvm::SmallPtrSetImpl<const Type *> &ValidTypes,
		std::function<CheckFieldResult(QualType)> CheckFieldType,
		std::function<void(QualType)> DiagInvalidParam) {
		// Track nested structs we will inspect
		SmallVector<const Decl *, 4> VisitStack;

		// Track where we are in the nested structs. Items will migrate from
		// VisitStack to HistoryStack as we do the DFS for bad field.
		SmallVector<const FieldDecl *, 4> HistoryStack;
		HistoryStack.push_back(nullptr);

		// At this point we already handled everything except of a RecordType or
		// an ArrayType of a RecordType.
		assert((PT->isArrayType() || PT->isRecordType()) && "Unexpected type.");
		const RecordType *RecTy =
		PT->getPointeeOrArrayElementType()->getAs<RecordType>();
		const RecordDecl *OrigRecDecl = RecTy->getDecl();

		VisitStack.push_back(RecTy->getDecl());
		assert(VisitStack.back() && "First decl null?");

		do {
		const Decl *Next = VisitStack.pop_back_val();
		if (!Next) {
		// HistoryStack is empty if a struct has no fields or base.
		if (HistoryStack.empty())
		continue;
		// Found a marker, we have gone up a level
		if (const FieldDecl *Hist = HistoryStack.pop_back_val())
		ValidTypes.insert(Hist->getType().getTypePtr());

		continue;
		}

		// Adds everything except the original parameter declaration (which is not a
		// field itself) to the history stack.
		const RecordDecl *RD;
		if (const FieldDecl *Field = dyn_cast<FieldDecl>(Next)) {
		HistoryStack.push_back(Field);

		QualType FieldTy = Field->getType();
		// Other field types (known to be valid or invalid) are handled while we
		// walk around RecordDecl::fields().
		assert((FieldTy->isArrayType() || FieldTy->isRecordType()) &&
		"Unexpected type.");
		const Type *FieldRecTy = FieldTy->getPointeeOrArrayElementType();

		RD = FieldRecTy->castAs<RecordType>()->getDecl();
		} else {
		RD = cast<RecordDecl>(Next);
		}

		RD = RD->getDefinition();
		// A struct return type can be undefined.
		if (!RD)
		continue;

		// Add a null marker so we know when we've gone back up a level
		VisitStack.push_back(nullptr);

		if (const auto *CXXRD = dyn_cast<CXXRecordDecl>(RD))
		for (auto Base : CXXRD->bases()) {
		// Skip non-record type, e.g. TemplateSpecializationType
		if (const auto *RT =
		Base.getType().getCanonicalType()->getAs<RecordType>()) {
		VisitStack.push_back(RT->getDecl());
		}
		}

		for (const auto *FD : RD->fields()) {
		QualType QT = FD->getType();

		if (ValidTypes.count(QT.getTypePtr()))
		continue;

		auto Result = CheckFieldType(QT);

		if (Result == CheckFieldResult::Valid)
		continue;

		if (Result == CheckFieldResult::Recurse) {
		VisitStack.push_back(FD);
		continue;
		}

		assert(Result == CheckFieldResult::Invalid);
		DiagInvalidParam(QT);
		S.Diag(OrigRecDecl->getLocation(), diag::note_within_field_of_type)
		<< OrigRecDecl->getDeclName() << S.getLangOpts().CPlusPlus;

		// We have an error, now let's go back up through history and show where
		// the offending field came from
		for (ArrayRef<const FieldDecl *>::const_iterator
		I = HistoryStack.begin() + 1,
		E = HistoryStack.end();
		I != E; ++I) {
		const FieldDecl *OuterField = *I;
		S.Diag(OuterField->getLocation(), diag::note_within_field_of_type)
		<< OuterField->getType() << S.getLangOpts().CPlusPlus;
		}

		S.Diag(FD->getLocation(), diag::note_illegal_field_declared_here)
		<< QT->isPointerType() << QT;
		return false;
		}
		} while (!VisitStack.empty());
		return true;
		}

enum OpenCLParamType {		enum OpenCLParamType {
ValidKernelParam,		ValidKernelParam,
PtrPtrKernelParam,		PtrPtrKernelParam,
PtrKernelParam,		PtrKernelParam,
InvalidAddrSpacePtrKernelParam,		InvalidAddrSpacePtrKernelParam,
InvalidKernelParam,		InvalidKernelParam,
RecordKernelParam		RecordKernelParam
};		};
[... 145 lines not shown ...]	static void checkIsValidOpenCLKernelParameter(
case ValidKernelParam:		case ValidKernelParam:
ValidTypes.insert(PT.getTypePtr());		ValidTypes.insert(PT.getTypePtr());
return;		return;

case RecordKernelParam:		case RecordKernelParam:
break;		break;
}		}

// Track nested structs we will inspect		auto DiagInvalidParam = [&](QualType ParamTy) {
SmallVector<const Decl *, 4> VisitStack;		OpenCLParamType ParamType = getOpenCLKernelParameterType(S, ParamTy);

// Track where we are in the nested structs. Items will migrate from
// VisitStack to HistoryStack as we do the DFS for bad field.
SmallVector<const FieldDecl *, 4> HistoryStack;
HistoryStack.push_back(nullptr);

// At this point we already handled everything except of a RecordType or
// an ArrayType of a RecordType.
assert((PT->isArrayType() || PT->isRecordType()) && "Unexpected type.");
const RecordType *RecTy =
PT->getPointeeOrArrayElementType()->getAs<RecordType>();
const RecordDecl *OrigRecDecl = RecTy->getDecl();

VisitStack.push_back(RecTy->getDecl());
assert(VisitStack.back() && "First decl null?");

do {
const Decl *Next = VisitStack.pop_back_val();
if (!Next) {
assert(!HistoryStack.empty());
// Found a marker, we have gone up a level
if (const FieldDecl *Hist = HistoryStack.pop_back_val())
ValidTypes.insert(Hist->getType().getTypePtr());

continue;
}

// Adds everything except the original parameter declaration (which is not a
// field itself) to the history stack.
const RecordDecl *RD;
if (const FieldDecl *Field = dyn_cast<FieldDecl>(Next)) {
HistoryStack.push_back(Field);

QualType FieldTy = Field->getType();
// Other field types (known to be valid or invalid) are handled while we
// walk around RecordDecl::fields().
assert((FieldTy->isArrayType() || FieldTy->isRecordType()) &&
"Unexpected type.");
const Type *FieldRecTy = FieldTy->getPointeeOrArrayElementType();

RD = FieldRecTy->castAs<RecordType>()->getDecl();
} else {
RD = cast<RecordDecl>(Next);
}

// Add a null marker so we know when we've gone back up a level
VisitStack.push_back(nullptr);

for (const auto *FD : RD->fields()) {
QualType QT = FD->getType();

if (ValidTypes.count(QT.getTypePtr()))
continue;

OpenCLParamType ParamType = getOpenCLKernelParameterType(S, QT);
if (ParamType == ValidKernelParam)
continue;

if (ParamType == RecordKernelParam) {
VisitStack.push_back(FD);
continue;
}

// OpenCL v1.2 s6.9.p:		// OpenCL v1.2 s6.9.p:
// Arguments to kernel functions that are declared to be a struct or union		// Arguments to kernel functions that are declared to be a struct or union
// do not allow OpenCL objects to be passed as elements of the struct or		// do not allow OpenCL objects to be passed as elements of the struct or
// union.		// union.
if (ParamType == PtrKernelParam || ParamType == PtrPtrKernelParam ||		if (ParamType == PtrKernelParam || ParamType == PtrPtrKernelParam ||
ParamType == InvalidAddrSpacePtrKernelParam) {		ParamType == InvalidAddrSpacePtrKernelParam) {
S.Diag(Param->getLocation(),		S.Diag(Param->getLocation(), diag::err_record_with_pointers_kernel_param)
diag::err_record_with_pointers_kernel_param)		<< PT->isUnionType() << PT;
<< PT->isUnionType()
<< PT;
} else {		} else {
S.Diag(Param->getLocation(), diag::err_bad_kernel_param_type) << PT;		S.Diag(Param->getLocation(), diag::err_bad_kernel_param_type) << PT;
}		}
		};
		auto CheckFieldType = [&](QualType QT) {
		OpenCLParamType ParamType = getOpenCLKernelParameterType(S, QT);
		if (ParamType == ValidKernelParam)
		return CheckFieldResult::Valid;

S.Diag(OrigRecDecl->getLocation(), diag::note_within_field_of_type)		if (ParamType == RecordKernelParam) {
<< OrigRecDecl->getDeclName();		return CheckFieldResult::Recurse;

// We have an error, now let's go back up through history and show where
// the offending field came from
for (ArrayRef<const FieldDecl *>::const_iterator
I = HistoryStack.begin() + 1,
E = HistoryStack.end();
I != E; ++I) {
const FieldDecl *OuterField = *I;
S.Diag(OuterField->getLocation(), diag::note_within_field_of_type)
<< OuterField->getType();
}		}

S.Diag(FD->getLocation(), diag::note_illegal_field_declared_here)		return CheckFieldResult::Invalid;
<< QT->isPointerType()		};
<< QT;		if (!checkStructOrArrayType(S, PT, ValidTypes, CheckFieldType,
		DiagInvalidParam))
D.setInvalidType();		D.setInvalidType();
		}

		// Check whether a HIP host function has parameters of half precision type or
		// struct type containing half precision type and diagnose them. This is
		// because gcc and clang do not have a consistent ABI for half precision
		// types for now.
		// ToDo: disable the diagnostics once gcc and clang have a consistent ABI
		// for half precision types.
		static void checkHIPFunctionParameters(Sema &S, FunctionDecl *FD) {
		if (S.getLangOpts().HIPAllowHalfArg || FD->hasAttr<CUDADeviceAttr>() ||
		FD->hasAttr<CUDAGlobalAttr>())
		return;

		auto IsInvalidType = [](QualType T) {
		if (T->isArrayType())
		T = QualType(T->getPointeeOrArrayElementType(), 0);
		if (T->isVectorType())
		T = T->getAs<VectorType>()->getElementType();
		return T->isFloat16Type() || T->isHalfType();
		};

		// Check field type.
		auto CheckFieldType = [&](QualType FT) {
		if (IsInvalidType(FT)) {
		return CheckFieldResult::Invalid;
		}
		if (FT->isRecordType())
		return CheckFieldResult::Recurse;
		return CheckFieldResult::Valid;
		};

		// Cache for known valid types to avoid repeated check.
		llvm::SmallPtrSet<const Type *, 16> ValidTypes;

		// Information about parameter or return types to be checked.
		struct TypeCheckInfo {
		QualType Ty;
		SourceLocation Loc;
		bool IsRet; // Whether it is return type
		TypeCheckInfo(QualType T, SourceLocation L, bool _IsRet)
		: Ty(T), Loc(L), IsRet(_IsRet) {}
		};
		llvm::SmallVector<TypeCheckInfo, 8> TCInfo;
		for (auto *ParmVar : FD->parameters())
		TCInfo.emplace_back(
		TypeCheckInfo{ParmVar->getType(), ParmVar->getLocation(), false});
		TCInfo.emplace_back(
		TypeCheckInfo{FD->getReturnType(), FD->getLocation(), true});

		for (auto Info : TCInfo) {
		QualType T = Info.Ty;

		// Diagnose invalid parameter type for the current parameter.
		auto DiagInvalidType = [&](QualType Ty) {
		unsigned DiagID = S.getDiagnostics().getCustomDiagID(
		DiagnosticsEngine::Error,
		"Invalid function %select{parameter|return}0 type: %1");
		S.Diag(Info.Loc, DiagID) << Info.IsRet << T;
		};

		if (IsInvalidType(T)) {
		DiagInvalidType(T);
		FD->setInvalidDecl();
		continue;
		}

		if (!T->isRecordType() && !T->isArrayType())
		continue;

		if (!checkStructOrArrayType(S, T, ValidTypes, CheckFieldType,
		DiagInvalidType)) {
		FD->setInvalidDecl();
return;		return;
}		}
} while (!VisitStack.empty());		}
}		}

/// Find the DeclContext in which a tag is implicitly declared if we see an		/// Find the DeclContext in which a tag is implicitly declared if we see an
/// elaborated type specifier in the specified context, and lookup finds		/// elaborated type specifier in the specified context, and lookup finds
/// nothing.		/// nothing.
static DeclContext *getTagInjectionContext(DeclContext *DC) {		static DeclContext *getTagInjectionContext(DeclContext *DC) {
while (!DC->isFileContext() && !DC->isFunctionOrMethod())		while (!DC->isFileContext() && !DC->isFunctionOrMethod())
DC = DC->getParent();		DC = DC->getParent();
		if (OtherUnmarkedIter != Previous.end()) {

NewFD->addAttr(OverloadableAttr::CreateImplicit(Context));		NewFD->addAttr(OverloadableAttr::CreateImplicit(Context));
}		}
}		}

if (LangOpts.OpenMP)		if (LangOpts.OpenMP)
ActOnFinishedFunctionDefinitionInOpenMPAssumeScope(NewFD);		ActOnFinishedFunctionDefinitionInOpenMPAssumeScope(NewFD);

		// Check HIP host function parameter types.
		if (getLangOpts().HIP)
		checkHIPFunctionParameters(*this, NewFD);

// Semantic checking for this function declaration (in isolation).		// Semantic checking for this function declaration (in isolation).

if (getLangOpts().CPlusPlus) {		if (getLangOpts().CPlusPlus) {
// C++-specific checks.		// C++-specific checks.
if (CXXConstructorDecl *Constructor = dyn_cast<CXXConstructorDecl>(NewFD)) {		if (CXXConstructorDecl *Constructor = dyn_cast<CXXConstructorDecl>(NewFD)) {
CheckConstructor(Constructor);		CheckConstructor(Constructor);
} else if (CXXDestructorDecl *Destructor =		} else if (CXXDestructorDecl *Destructor =
dyn_cast<CXXDestructorDecl>(NewFD)) {		dyn_cast<CXXDestructorDecl>(NewFD)) {
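The check added in SemaDecl.cpp, together with the reviewer's suggestion to use separate early exits, can be modelled outside of clang. The following is only a minimal sketch under stated assumptions: `ToyType`, `ToyFunction`, `containsHalf` and `shouldDiagnose` are hypothetical stand-ins that do not exist in clang, and pointers and references (which the patch accepts) are not modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for clang's type and attribute queries.
enum class Kind { Int, Half, Array, Vector, Record };

struct ToyType {
  Kind kind;
  std::vector<const ToyType *> children; // element type or record fields/bases
};

struct ToyFunction {
  bool isDevice;        // models FD->hasAttr<CUDADeviceAttr>()
  bool isGlobal;        // models FD->hasAttr<CUDAGlobalAttr>()
  const ToyType *param; // a single by-value parameter, for brevity
};

// True if the type is half precision, or an array, vector or record that
// (transitively) contains a half precision type.
inline bool containsHalf(const ToyType &t) {
  if (t.kind == Kind::Half)
    return true;
  for (const ToyType *child : t.children)
    if (containsHalf(*child))
      return true;
  return false;
}

inline bool shouldDiagnose(const ToyFunction &f, bool allowHalfArg) {
  // Early exit 1: the option disables the check entirely.
  if (allowHalfArg)
    return false;
  // Early exit 2: only host-only functions are checked. A __host__ __device__
  // function carries the device attribute, so it is skipped here as well.
  if (f.isDevice || f.isGlobal)
    return false;
  return containsHalf(*f.param);
}

// Usage: mirrors fa5(D), fa16(K) and fc1(A) from the test file.
inline void demo() {
  ToyType half{Kind::Half, {}};
  ToyType i{Kind::Int, {}};
  ToyType a{Kind::Record, {&half}}; // struct A { _Float16 x; };
  ToyType d{Kind::Record, {&a}};    // struct D { A x; };
  ToyType k{Kind::Record, {&i}};    // struct K { int x; };

  assert(shouldDiagnose(ToyFunction{false, false, &d}, false));  // host: error
  assert(!shouldDiagnose(ToyFunction{false, false, &k}, false)); // no half
  assert(!shouldDiagnose(ToyFunction{true, false, &a}, false));  // __device__
  assert(!shouldDiagnose(ToyFunction{false, false, &d}, true));  // option on
}
```

In this model a `__host__ __device__` function is never diagnosed, which is the gap the inline comment points out.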

clang/test/SemaCUDA/half-arg.cu

This file was added.

				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify -fno-hip-allow-half-arg -x hip %s
				// RUN: %clang_cc1 -std=c++11 -fcuda-is-device -fsyntax-only -verify -fno-hip-allow-half-arg -x hip %s
				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify=allow -x hip %s

				// allow-no-diagnostics

				#include "Inputs/cuda.h"

				// Check that _Float16/__fp16 or structs containing them are not allowed as
				// function parameters in HIP host functions.
				tra (inline comment): What happens if I have a lambda that captures an fp16 value? Granted, it would be inadvisable to pass it across compilers, but nevertheless it's another kind of aggregate we may end up passing to a function.

				typedef _Float16 half;

				typedef _Float16 half2 __attribute__((ext_vector_type(2)));

				struct A { // expected-note 4{{within field or base class of type 'A' declared here}}
				_Float16 x; // expected-note 7{{field of illegal type '_Float16' declared here}}
				};

				struct B { // expected-note {{within field or base class of type 'B' declared here}}
				_Float16 x[2]; // expected-note {{field of illegal type '_Float16 [2]' declared here}}
				};

				struct C { // expected-note {{within field or base class of type 'C' declared here}}
				_Float16 x[2][2]; // expected-note {{field of illegal type '_Float16 [2][2]' declared here}}
				};

				struct D { // expected-note {{within field or base class of type 'D' declared here}}
				A x; // expected-note {{within field or base class of type 'A' declared here}}
				};

				struct E : public A { // expected-note {{within field or base class of type 'E' declared here}}
				};

				struct F : virtual public A { // expected-note {{within field or base class of type 'F' declared here}}
				};

				struct G { // expected-note {{within field or base class of type 'G' declared here}}
				__fp16 x; // expected-note {{field of illegal type '__fp16' declared here}}
				};

				struct H {
				void f(A x);
				// expected-error@-1 {{Invalid function parameter type: 'A'}}
				};

				template<typename T>
				struct I {
				T x;
				void f(T x);
				// expected-error@-1 {{Invalid function parameter type: 'A'}}
				};

				struct J { // expected-note {{within field or base class of type 'J' declared here}}
				half2 v; // expected-note {{field of illegal type 'half2' (vector of 2 '_Float16' values) declared here}}
				};

				struct empty {};

				struct K : public empty {
				int x;
				};

				struct undefined;

				void fa1(_Float16 x);
				// expected-error@-1 {{Invalid function parameter type: '_Float16'}}

				void fa2(A x);
				// expected-error@-1 {{Invalid function parameter type: 'A'}}

				void fa3(B x);
				// expected-error@-1 {{Invalid function parameter type: 'B'}}

				void fa4(C x);
				// expected-error@-1 {{Invalid function parameter type: 'C'}}

				void fa5(D x);
				// expected-error@-1 {{Invalid function parameter type: 'D'}}

				void fa6(E x);
				// expected-error@-1 {{Invalid function parameter type: 'E'}}

				void fa7(F x);
				// expected-error@-1 {{Invalid function parameter type: 'F'}}

				void fa8(G x);
				// expected-error@-1 {{Invalid function parameter type: 'G'}}

				template<typename T> void fa9(T x);
				// expected-error@-1 {{Invalid function parameter type: 'A'}}
				// expected-note@-2 {{candidate template ignored: substitution failure [with T = A]}}
				void fa9_caller() {
				A x;
				fa9(x);
				// expected-error@-1 {{no matching function for call to 'fa9'}}
				// expected-note@-2 {{in instantiation of function template specialization 'fa9<A>' requested here}}
				}

				void fa10() {
				I<A> x;
				// expected-note@-1 {{in instantiation of template class 'I<A>' requested here}}
				}

				void fa11(half x);
				// expected-error@-1 {{Invalid function parameter type: 'half' (aka '_Float16')}}

				void fa12(half2 x);
				// expected-error@-1 {{Invalid function parameter type: 'half2' (vector of 2 '_Float16' values)}}

				void fa13(J x);
				// expected-error@-1 {{Invalid function parameter type: 'J'}}

				void fa14(int x, _Float16 y);
				// expected-error@-1 {{Invalid function parameter type: '_Float16'}}

				_Float16 fa15();
				// expected-error@-1 {{Invalid function return type: '_Float16'}}

				void fa16(K x);

				undefined fa17();

				// Check reference or pointers to _Float16/__fp16 or structs containing
				// them are allowed as function parameters in HIP host functions.

				void fb1(_Float16 &x);
				void fb2(_Float16 *x);
				void fb3(A &x);
				void fb4(A *x);

				// Check device function can use _Float16/__fp16 or struct containing
				// them as parameter type.
				__device__ void fc1(A x);
				__global__ void fc2(A x);
				__host__ __device__ void fc3(A x);
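The lambda question raised in the inline comment is not covered by the test above. As a rough illustration of why a capturing lambda is relevant, the sketch below shows that a closure capturing a 16-bit value by copy is a class-type object that can be passed to a function by value. It is only a sketch: `HalfBits` is a hypothetical `uint16_t` stand-in so that the code compiles without `_Float16` support, and `callByValue` and `lambdaDemo` are illustrative names. Whether the patch's check would diagnose such a closure is not established by the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a half precision value (raw 16-bit storage).
using HalfBits = std::uint16_t;

// Takes any callable by value, the way a closure object would be passed
// to a host function template.
template <typename F>
auto callByValue(F f) -> decltype(f()) {
  return f();
}

inline void lambdaDemo() {
  HalfBits h = 0x3C00; // bit pattern of 1.0 in IEEE 754 binary16
  auto byCopy = [h] { return h; };

  // The closure type is a class type that stores a copy of h.
  static_assert(std::is_class<decltype(byCopy)>::value,
                "closure types are class types");
  assert(sizeof(byCopy) >= sizeof(HalfBits));

  // Passing the closure by value copies the captured 16-bit value with it.
  assert(callByValue(byCopy) == 0x3C00);
}
```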