This is an archive of the discontinued LLVM Phabricator instance.

Differential D40112

[CodeGen][X86] Fix handling of __fp16 vectors
ClosedPublic

Authored by ahatanak on Nov 15 2017, 4:46 PM.

Details

Reviewers

craig.topper
bruno
ab
RKSimon

Commits

rG502775a2ee08: [CodeGen][X86] Fix handling of __fp16 vectors.
rL320215: [CodeGen][X86] Fix handling of __fp16 vectors.
rC320215: [CodeGen][X86] Fix handling of __fp16 vectors.

Summary

IRGen for __fp16 vectors on X86 is currently completely broken. For example, when the following code is compiled:

half4 hv0, hv1, hv2; // these are vectors of __fp16.

void foo221() {
  hv0 = hv1 + hv2;
}

clang generates the following IR, in which the two vectors are loaded as <4 x i16> and combined with an integer add instead of a floating-point add:

@hv1 = common global <4 x i16> zeroinitializer, align 8
@hv2 = common global <4 x i16> zeroinitializer, align 8
@hv0 = common global <4 x i16> zeroinitializer, align 8

define void @foo221() {
entry:
  %0 = load <4 x i16>, <4 x i16>* @hv1, align 8
  %1 = load <4 x i16>, <4 x i16>* @hv2, align 8
  %add = add <4 x i16> %0, %1
  store <4 x i16> %add, <4 x i16>* @hv0, align 8
  ret void
}
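
To see why the integer add in the IR above is wrong, here is a minimal sketch (not part of the patch) that decodes IEEE 754 binary16 bit patterns by hand. The helper names `halfBitsToFloat` and `integerAdd` are made up for illustration, and the decoder deliberately skips infinities and NaNs.

```cpp
#include <cstdint>

// Decode an IEEE 754 binary16 bit pattern into a float.
// Covers zero, subnormal, and normal values; inf/NaN are out of scope here.
constexpr float halfBitsToFloat(std::uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;
  const int mantissa = bits & 0x3FF;

  // Subnormals have no implicit leading 1 and use a fixed exponent of -14.
  float value = exponent == 0 ? mantissa / 1024.0f
                              : 1.0f + mantissa / 1024.0f;
  int power = exponent == 0 ? -14 : exponent - 15;
  for (; power > 0; --power) value *= 2.0f;
  for (; power < 0; ++power) value *= 0.5f;
  return sign ? -value : value;
}

// What the broken IR computes per element: an integer add of the bit patterns.
constexpr std::uint16_t integerAdd(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(a + b);
}

constexpr std::uint16_t kOne = 0x3C00; // 1.0 encoded as binary16

static_assert(halfBitsToFloat(kOne) == 1.0f, "0x3C00 encodes 1.0");
static_assert(integerAdd(kOne, kOne) == 0x7800, "bit patterns add as integers");
static_assert(halfBitsToFloat(0x7800) == 32768.0f, "the integer sum is not 2.0");
static_assert(halfBitsToFloat(0x4000) == 2.0f, "0x4000 is the correct sum");
```

Adding the bit patterns of 1.0 and 1.0 gives 0x7800, which decodes to 32768.0 rather than 2.0.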

To fix IRGen for __fp16 vectors, this patch builds on the code committed in r314056, which modified clang to promote __fp16 vectors to float vectors, and truncate them back, in the AST. In addition, as a first step toward doing away with the fp16 conversion intrinsics such as @llvm.convert.to.fp16 (see http://lists.llvm.org/pipermail/llvm-dev/2014-July/074689.html), I changed IRGen for __fp16 scalars so that it emits fpext/fptrunc instructions instead of the conversion intrinsics it currently emits. This also fixes another IRGen bug, in which a short value is assigned to an __fp16 variable without any integer-to-floating-point conversion, as shown in the following example:

C code

__fp16 a;
short b;

void foo1() {
  a = b;
}

generated IR

@b = common global i16 0, align 2
@a = common global i16 0, align 2

define void @foo1() #0 {
entry:
  %0 = load i16, i16* @b, align 2
  store i16 %0, i16* @a, align 2
  ret void
}
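
As a rough illustration of what goes wrong in foo1, the sketch below contrasts a raw bit copy with a value conversion. It is not code from the patch: `intToHalfBits` is a hypothetical helper that only handles integers with magnitude below 2048, all of which are exactly representable in binary16.

```cpp
#include <cstdint>
#include <stdexcept>

// Encode a small integer as IEEE 754 binary16 bits (value conversion).
// Assumption: |v| < 2048, so no rounding is ever needed.
constexpr std::uint16_t intToHalfBits(int v) {
  const unsigned sign = v < 0 ? 0x8000u : 0u;
  unsigned mag = v < 0 ? 0u - static_cast<unsigned>(v)
                       : static_cast<unsigned>(v);
  if (mag >= 2048)
    throw std::out_of_range("sketch only handles |v| < 2048");
  if (mag == 0)
    return 0;

  int e = 0; // e = floor(log2(mag))
  while ((mag >> (e + 1)) != 0)
    ++e;
  // Drop the implicit leading 1 and left-align the rest in 10 mantissa bits.
  const unsigned mantissa = (mag << (10 - e)) & 0x3FFu;
  return static_cast<std::uint16_t>(sign | (static_cast<unsigned>(e + 15) << 10) |
                                    mantissa);
}

// What the buggy IR does for `a = b`: store the short's bits unchanged.
constexpr std::uint16_t bitCopy(short b) {
  return static_cast<std::uint16_t>(b);
}

static_assert(bitCopy(1) == 0x0001, "raw copy keeps the integer bits");
static_assert(intToHalfBits(1) == 0x3C00, "1.0 as binary16");
static_assert(intToHalfBits(-2) == 0xC000, "-2.0 as binary16");
```

With b == 1, the raw copy stores 0x0001, which is the smallest binary16 subnormal (2^-24), whereas a real conversion produces 0x3C00.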

I haven't spent too much time inspecting the code the X86 backend emits, but the code I've seen so far seems at least functionally correct (although it doesn't look very efficient since the backend scalarizes __fp16 vectors).

Diff Detail

Repository: rL LLVM

Event Timeline

ahatanak created this revision. Nov 15 2017, 4:46 PM

Herald added a subscriber: javed.absar. Nov 15 2017, 4:46 PM

bruno added inline comments. Nov 27 2017, 11:53 AM

lib/CodeGen/CGExprScalar.cpp
Line 954 (on Diff #123100): This (and the other places in the patch) means that, regardless of the `HalfArgsAndReturns` state, we want to generate an intrinsic call if `useFP16ConversionIntrinsics()` is true. Is that always the intended behavior?

ahatanak added inline comments. Nov 27 2017, 12:08 PM

lib/CodeGen/CGExprScalar.cpp
Line 954 (on Diff #123100): Yes, that is the intended behavior. HalfArgsAndReturns is currently used here to determine whether intrinsic calls should be emitted, but it seems to me that it should only be used to indicate whether returning or passing half types is allowed. Currently ARM and ARM64 are the only targets that are allowed to return or pass half types, but I think it's possible to allow other targets to do so too if that's desirable.

RKSimon added a reviewer: RKSimon. Nov 29 2017, 5:27 AM

Any other comments from anyone?

LGTM with the minor comment below.

include/clang/Basic/TargetInfo.h
Line 563 (on Diff #123100): Add a "FIXME".

This revision is now accepted and ready to land. Dec 4 2017, 11:29 AM

Closed by commit rC320215: [CodeGen][X86] Fix handling of __fp16 vectors. (authored by ahatanak). Dec 8 2017, 4:03 PM

Closed by commit rL320215: [CodeGen][X86] Fix handling of __fp16 vectors. (authored by ahatanak).

This revision was automatically updated to reflect the committed changes.

Revision Contents

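Path                                                    Size
cfe/trunk/include/clang/Basic/TargetInfo.h              8 lines
cfe/trunk/lib/Basic/Targets/AArch64.h                   4 lines
cfe/trunk/lib/Basic/Targets/ARM.h                       4 lines
cfe/trunk/lib/Basic/Targets/X86.h                       4 lines
cfe/trunk/lib/CodeGen/CGExprConstant.cpp                2 lines
cfe/trunk/lib/CodeGen/CGExprScalar.cpp                  12 lines
cfe/trunk/lib/CodeGen/CodeGenTypes.cpp                  8 lines
cfe/trunk/lib/Sema/SemaExpr.cpp                         3 lines
cfe/trunk/test/CodeGen/fp16-ops.c                       55 lines
cfe/trunk/test/CodeGen/fp16vec-ops.c                    1 line
cfe/trunk/test/CodeGenCXX/float16-declarations.cpp      21 lines
cfe/trunk/test/CodeGenCXX/fp16-mangle.cpp               4 lines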

Diff 126229

cfe/trunk/include/clang/Basic/TargetInfo.h

[553 lines hidden; context: public:]
   }

   /// \brief Check whether _Complex long double should use the "fp2ret" flavor
   /// of Objective-C message passing on this target.
   bool useObjCFP2RetForComplexLongDouble() const {
     return ComplexLongDoubleUsesFP2Ret;
   }

+  /// Check whether llvm intrinsics such as llvm.convert.to.fp16 should be used
+  /// to convert to and from __fp16.
+  /// FIXME: This function should be removed once all targets stop using the
+  /// conversion intrinsics.
+  virtual bool useFP16ConversionIntrinsics() const {
+    return true;
+  }
+
   /// \brief Specify if mangling based on address space map should be used or
   /// not for language specific address spaces
   bool useAddressSpaceMapMangling() const {
     return UseAddrSpaceMapMangling;
   }

   ///===---- Other target property query methods --------------------------===//

[543 lines hidden]
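
The effect of the new hook on scalar IRGen can be summarized with a simplified model. This is only a sketch under stated assumptions: `LangOptsModel`, `TargetModel`, and the two predicates are hypothetical stand-ins, not the real clang `LangOptions`/`TargetInfo` classes, and they only mirror the conditions visible in the CGExprScalar.cpp hunks of this diff.

```cpp
// Hypothetical stand-in for the two language options the diff consults.
struct LangOptsModel {
  bool NativeHalfType;
  bool HalfArgsAndReturns;
};

// Hypothetical stand-in for TargetInfo::useFP16ConversionIntrinsics():
// true by default, overridden to false by AArch64, ARM, and X86 in this patch.
struct TargetModel {
  bool UseFP16ConversionIntrinsics;
};

// Before the patch: the HalfArgsAndReturns option decided whether the
// llvm.convert.{to,from}.fp16 intrinsics were emitted.
constexpr bool emitsIntrinsicBefore(LangOptsModel L) {
  return !L.NativeHalfType && !L.HalfArgsAndReturns;
}

// After the patch: the target hook decides; HalfArgsAndReturns is ignored.
constexpr bool emitsIntrinsicAfter(LangOptsModel L, TargetModel T) {
  return !L.NativeHalfType && T.UseFP16ConversionIntrinsics;
}

constexpr LangOptsModel kDefaultOpts{false, false};
constexpr LangOptsModel kAllowHalfArgs{false, true};
constexpr TargetModel kX86{false};         // overrides the hook to false
constexpr TargetModel kOtherTarget{true};  // keeps the default

static_assert(emitsIntrinsicBefore(kDefaultOpts), "old: intrinsic by default");
static_assert(!emitsIntrinsicAfter(kDefaultOpts, kX86), "new: fpext/fptrunc on X86");
static_assert(emitsIntrinsicAfter(kAllowHalfArgs, kOtherTarget),
              "new: the language option no longer matters");
```

This matches the inline discussion above: after the patch, HalfArgsAndReturns no longer influences whether the conversion intrinsics are emitted.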

cfe/trunk/lib/Basic/Targets/AArch64.h

[42 lines hidden; context: public:]
   AArch64TargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts);

   StringRef getABI() const override;
   bool setABI(const std::string &Name) override;

   bool isValidCPUName(StringRef Name) const override;
   bool setCPU(const std::string &Name) override;

+  bool useFP16ConversionIntrinsics() const override {
+    return false;
+  }
+
   void getTargetDefinesARMV81A(const LangOptions &Opts,
                                MacroBuilder &Builder) const;
   void getTargetDefinesARMV82A(const LangOptions &Opts,
                                MacroBuilder &Builder) const;
   void getTargetDefines(const LangOptions &Opts,
                         MacroBuilder &Builder) const override;

   ArrayRef<Builtin::Info> getTargetBuiltins() const override;
[105 lines hidden]

cfe/trunk/lib/Basic/Targets/ARM.h

[120 lines hidden; context: public:]

   bool hasFeature(StringRef Feature) const override;

   bool isValidCPUName(StringRef Name) const override;
   bool setCPU(const std::string &Name) override;

   bool setFPMath(StringRef Name) override;

+  bool useFP16ConversionIntrinsics() const override {
+    return false;
+  }
+
   void getTargetDefinesARMV81A(const LangOptions &Opts,
                                MacroBuilder &Builder) const;

   void getTargetDefinesARMV82A(const LangOptions &Opts,
                                MacroBuilder &Builder) const;
   void getTargetDefines(const LangOptions &Opts,
                         MacroBuilder &Builder) const override;

[116 lines hidden]

cfe/trunk/lib/Basic/Targets/X86.h

[187 lines hidden; context: case 'Y':]
       if ((++I != E) && ((*I == '0') || (*I == 'z')))
         return "xmm0";
     default:
       break;
     }
     return "";
   }

+  bool useFP16ConversionIntrinsics() const override {
+    return false;
+  }
+
   void getTargetDefines(const LangOptions &Opts,
                         MacroBuilder &Builder) const override;

   static void setSSELevel(llvm::StringMap<bool> &Features, X86SSEEnum Level,
                           bool Enabled);

   static void setMMXLevel(llvm::StringMap<bool> &Features, MMX3DNowEnum Level,
                           bool Enabled);
[594 lines hidden]

cfe/trunk/lib/CodeGen/CGExprConstant.cpp

[1,819 lines hidden; context: case APValue::ComplexInt: {]
     llvm::StructType *STy =
         llvm::StructType::get(Complex[0]->getType(), Complex[1]->getType());
     return llvm::ConstantStruct::get(STy, Complex);
   }
   case APValue::Float: {
     const llvm::APFloat &Init = Value.getFloat();
     if (&Init.getSemantics() == &llvm::APFloat::IEEEhalf() &&
         !CGM.getContext().getLangOpts().NativeHalfType &&
-        !CGM.getContext().getLangOpts().HalfArgsAndReturns)
+        CGM.getContext().getTargetInfo().useFP16ConversionIntrinsics())
       return llvm::ConstantInt::get(CGM.getLLVMContext(),
                                     Init.bitcastToAPInt());
     else
       return llvm::ConstantFP::get(CGM.getLLVMContext(), Init);
   }
   case APValue::ComplexFloat: {
     llvm::Constant *Complex[2];

[287 lines hidden]

cfe/trunk/lib/CodeGen/CGExprScalar.cpp

[945 lines hidden; context: if (DstType->isBooleanType())]
     return EmitConversionToBool(Src, SrcType);

   llvm::Type *DstTy = ConvertType(DstType);

   // Cast from half through float if half isn't a native type.
   if (SrcType->isHalfType() && !CGF.getContext().getLangOpts().NativeHalfType) {
     // Cast to FP using the intrinsic if the half type itself isn't supported.
     if (DstTy->isFloatingPointTy()) {
-      if (!CGF.getContext().getLangOpts().HalfArgsAndReturns)
+      if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics())
         return Builder.CreateCall(
             CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_from_fp16, DstTy),
             Src);
     } else {
       // Cast to other types through float, using either the intrinsic or FPExt,
       // depending on whether the half type itself is supported
       // (as opposed to operations on half, available with NativeHalfType).
-      if (!CGF.getContext().getLangOpts().HalfArgsAndReturns) {
+      if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics()) {
         Src = Builder.CreateCall(
             CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_from_fp16,
                                  CGF.CGM.FloatTy),
             Src);
       } else {
         Src = Builder.CreateFPExt(Src, CGF.CGM.FloatTy, "conv");
       }
       SrcType = CGF.getContext().FloatTy;
[92 lines hidden; context: EmitFloatConversionCheck(OrigSrc, OrigSrcType, Src, SrcType, DstType, DstTy,]
                              Loc);

   // Cast to half through float if half isn't a native type.
   if (DstType->isHalfType() && !CGF.getContext().getLangOpts().NativeHalfType) {
     // Make sure we cast in a single step if from another FP type.
     if (SrcTy->isFloatingPointTy()) {
       // Use the intrinsic if the half type itself isn't supported
       // (as opposed to operations on half, available with NativeHalfType).
-      if (!CGF.getContext().getLangOpts().HalfArgsAndReturns)
+      if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics())
         return Builder.CreateCall(
             CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_to_fp16, SrcTy), Src);
       // If the half type is supported, just use an fptrunc.
       return Builder.CreateFPTrunc(Src, DstTy);
     }
     DstTy = CGF.FloatTy;
   }

[19 lines hidden; context: assert(SrcTy->isFloatingPointTy() && DstTy->isFloatingPointTy() &&]
            "Unknown real conversion");
     if (DstTy->getTypeID() < SrcTy->getTypeID())
       Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
     else
       Res = Builder.CreateFPExt(Src, DstTy, "conv");
   }

   if (DstTy != ResTy) {
-    if (!CGF.getContext().getLangOpts().HalfArgsAndReturns) {
+    if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics()) {
       assert(ResTy->isIntegerTy(16) && "Only half FP requires extra conversion");
       Res = Builder.CreateCall(
         CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_to_fp16, CGF.CGM.FloatTy),
         Res);
     } else {
       Res = Builder.CreateFPTrunc(Res, ResTy, "conv");
     }
   }
[907 lines hidden; context: ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,]

   // Floating point.
   } else if (type->isRealFloatingType()) {
     // Add the inc/dec to the real part.
     llvm::Value *amt;

     if (type->isHalfType() && !CGF.getContext().getLangOpts().NativeHalfType) {
       // Another special case: half FP increment should be done via float
-      if (!CGF.getContext().getLangOpts().HalfArgsAndReturns) {
+      if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics()) {
         value = Builder.CreateCall(
             CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_from_fp16,
                                  CGF.CGM.FloatTy),
             input, "incdec.conv");
       } else {
         value = Builder.CreateFPExt(input, CGF.CGM.FloatTy, "incdec.conv");
       }
     }
[18 lines hidden; context: else {]
       else
         FS = &CGF.getTarget().getLongDoubleFormat();
       F.convert(*FS, llvm::APFloat::rmTowardZero, &ignored);
       amt = llvm::ConstantFP::get(VMContext, F);
     }
     value = Builder.CreateFAdd(value, amt, isInc ? "inc" : "dec");

     if (type->isHalfType() && !CGF.getContext().getLangOpts().NativeHalfType) {
-      if (!CGF.getContext().getLangOpts().HalfArgsAndReturns) {
+      if (CGF.getContext().getTargetInfo().useFP16ConversionIntrinsics()) {
         value = Builder.CreateCall(
             CGF.CGM.getIntrinsic(llvm::Intrinsic::convert_to_fp16,
                                  CGF.CGM.FloatTy),
             value, "incdec.conv");
       } else {
         value = Builder.CreateFPTrunc(value, input->getType(), "incdec.conv");
       }
     }
[1,995 lines hidden]

cfe/trunk/lib/CodeGen/CodeGenTypes.cpp

[445 lines hidden; context: case Type::Builtin: {]
     case BuiltinType::Float16:
       ResultType =
           getTypeForFormat(getLLVMContext(), Context.getFloatTypeSemantics(T),
                            /* UseNativeHalf = */ true);
       break;

     case BuiltinType::Half:
       // Half FP can either be storage-only (lowered to i16) or native.
-      ResultType =
-          getTypeForFormat(getLLVMContext(), Context.getFloatTypeSemantics(T),
-                           Context.getLangOpts().NativeHalfType ||
-                               Context.getLangOpts().HalfArgsAndReturns);
+      ResultType = getTypeForFormat(
+          getLLVMContext(), Context.getFloatTypeSemantics(T),
+          Context.getLangOpts().NativeHalfType ||
+              !Context.getTargetInfo().useFP16ConversionIntrinsics());
       break;
     case BuiltinType::Float:
     case BuiltinType::Double:
     case BuiltinType::LongDouble:
     case BuiltinType::Float128:
       ResultType = getTypeForFormat(getLLVMContext(),
                                     Context.getFloatTypeSemantics(T),
                                     /* UseNativeHalf = */ false);
[319 lines hidden]

cfe/trunk/lib/Sema/SemaExpr.cpp

[11,559 lines hidden; context: CorrectDelayedTyposInBinOp(Sema &S, BinaryOperatorKind Opc, Expr *LHSExpr,]
   return std::make_pair(LHS, RHS);
 }

 /// Returns true if conversion between vectors of halfs and vectors of floats
 /// is needed.
 static bool needsConversionOfHalfVec(bool OpRequiresConversion, ASTContext &Ctx,
                                      QualType SrcType) {
   return OpRequiresConversion && !Ctx.getLangOpts().NativeHalfType &&
-         Ctx.getLangOpts().HalfArgsAndReturns && isVector(SrcType, Ctx.HalfTy);
+         !Ctx.getTargetInfo().useFP16ConversionIntrinsics() &&
+         isVector(SrcType, Ctx.HalfTy);
 }

 /// CreateBuiltinBinOp - Creates a new built-in binary operation with
 /// operator @p Opc at location @c TokLoc. This routine only supports
 /// built-in operations; ActOnBinOp handles overloaded operators.
 ExprResult Sema::CreateBuiltinBinOp(SourceLocation OpLoc,
                                     BinaryOperatorKind Opc,
                                     Expr *LHSExpr, Expr *RHSExpr) {
[4,513 lines hidden]

cfe/trunk/test/CodeGen/fp16-ops.c

 // REQUIRES: arm-registered-target
-// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi %s | FileCheck %s --check-prefix=NOHALF --check-prefix=CHECK
-// RUN: %clang_cc1 -emit-llvm -o - -triple aarch64-none-linux-gnueabi %s | FileCheck %s --check-prefix=NOHALF --check-prefix=CHECK
-// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=HALF --check-prefix=CHECK
-// RUN: %clang_cc1 -emit-llvm -o - -triple aarch64-none-linux-gnueabi -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=HALF --check-prefix=CHECK
+// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi %s | FileCheck %s --check-prefix=NOTNATIVE --check-prefix=CHECK
+// RUN: %clang_cc1 -emit-llvm -o - -triple aarch64-none-linux-gnueabi %s | FileCheck %s --check-prefix=NOTNATIVE --check-prefix=CHECK
+// RUN: %clang_cc1 -emit-llvm -o - -triple x86_64-linux-gnu %s | FileCheck %s --check-prefix=NOTNATIVE --check-prefix=CHECK
+// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=NOTNATIVE --check-prefix=CHECK
+// RUN: %clang_cc1 -emit-llvm -o - -triple aarch64-none-linux-gnueabi -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=NOTNATIVE --check-prefix=CHECK
 // RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi -fnative-half-type %s \
 // RUN: | FileCheck %s --check-prefix=NATIVE-HALF
 // RUN: %clang_cc1 -emit-llvm -o - -triple aarch64-none-linux-gnueabi -fnative-half-type %s \
 // RUN: | FileCheck %s --check-prefix=NATIVE-HALF
 // RUN: %clang_cc1 -emit-llvm -o - -x renderscript %s \
 // RUN: | FileCheck %s --check-prefix=NATIVE-HALF
 typedef unsigned cond_t;

 volatile cond_t test;
 volatile int i0;
 volatile __fp16 h0 = 0.0, h1 = 1.0, h2;
 volatile float f0, f1, f2;
 volatile double d0;
+short s0;

 void foo(void) {
   // CHECK-LABEL: define void @foo()

   // Check unary ops

-  // NOHALF: [[F16TOF32:call float @llvm.convert.from.fp16.f32]]
-  // HALF: [[F16TOF32:fpext half]]
+  // NOTNATIVE: [[F16TOF32:fpext half]]
   // CHECK: fptoui float
   // NATIVE-HALF: fptoui half
   test = (h0);
   // CHECK: uitofp i32
-  // NOHALF: [[F32TOF16:call i16 @llvm.convert.to.fp16.f32]]
-  // HALF: [[F32TOF16:fptrunc float]]
+  // NOTNATIVE: [[F32TOF16:fptrunc float]]
   // NATIVE-HALF: uitofp i32 {{.*}} to half
   h0 = (test);
   // CHECK: [[F16TOF32]]
   // CHECK: fcmp une float
   // NATIVE-HALF: fcmp une half
   test = (!h1);
   // CHECK: [[F16TOF32]]
   // CHECK: fsub float
-  // NOHALF: [[F32TOF16]]
-  // HALF: [[F32TOF16]]
+  // NOTNATIVE: [[F32TOF16]]
   // NATIVE-HALF: fsub half
   h1 = -h1;
   // CHECK: [[F16TOF32]]
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: load volatile half
   // NATIVE-HALF-NEXT: store volatile half
   h1 = +h1;
   // CHECK: [[F16TOF32]]
[20 lines hidden; context: void foo(void) {]
   // Check binary ops with various operands
   // CHECK: [[F16TOF32]]
   // CHECK: [[F16TOF32]]
   // CHECK: fmul float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fmul half
   h1 = h0 * h2;
   // CHECK: [[F16TOF32]]
-  // NOHALF: [[F32TOF16]]
-  // NOHALF: [[F16TOF32]]
   // CHECK: fmul float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fmul half
   h1 = h0 * (__fp16) -2.0f;
   // CHECK: [[F16TOF32]]
   // CHECK: fmul float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fpext half
[13 lines hidden; context: void foo(void) {]

   // CHECK: [[F16TOF32]]
   // CHECK: [[F16TOF32]]
   // CHECK: fdiv float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fdiv half
   h1 = (h0 / h2);
   // CHECK: [[F16TOF32]]
-  // NOHALF: [[F16TOF32]]
   // CHECK: fdiv float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fdiv half
   h1 = (h0 / (__fp16) -2.0f);
   // CHECK: [[F16TOF32]]
   // CHECK: fdiv float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fpext half
[13 lines hidden; context: void foo(void) {]

   // CHECK: [[F16TOF32]]
   // CHECK: [[F16TOF32]]
   // CHECK: fadd float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fadd half
   h1 = (h2 + h0);
   // CHECK: [[F16TOF32]]
-  // NOHALF: [[F16TOF32]]
   // CHECK: fadd float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fadd half
   h1 = ((__fp16)-2.0 + h0);
   // CHECK: [[F16TOF32]]
   // CHECK: fadd float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fpext half
[13 lines hidden; context: void foo(void) {]

   // CHECK: [[F16TOF32]]
   // CHECK: [[F16TOF32]]
   // CHECK: fsub float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fsub half
   h1 = (h2 - h0);
   // CHECK: [[F16TOF32]]
-  // NOHALF: [[F16TOF32]]
   // CHECK: fsub float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fsub half
   h1 = ((__fp16)-2.0f - h0);
   // CHECK: [[F16TOF32]]
   // CHECK: fsub float
   // CHECK: [[F32TOF16]]
   // NATIVE-HALF: fpext half
[12 lines hidden; context: void foo(void) {]
   h1 = (h0 - i0);

   // CHECK: [[F16TOF32]]
   // CHECK: [[F16TOF32]]
   // CHECK: fcmp olt float
   // NATIVE-HALF: fcmp olt half
   test = (h2 < h0);
   // CHECK: [[F16TOF32]]
-  // NOHALF: [[F16TOF32]]
   // CHECK: fcmp olt float
   // NATIVE-HALF: fcmp olt half
   test = (h2 < (__fp16)42.0);
   // CHECK: [[F16TOF32]]
   // CHECK: fcmp olt float
   // NATIVE-HALF: fpext half
   // NATIVE-HALF: fcmp olt float
   test = (h2 < f0);
Show All 12 Lines	void foo(void) {
test = (h0 < i0);		test = (h0 < i0);

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp ogt float		// CHECK: fcmp ogt float
// NATIVE-HALF: fcmp ogt half		// NATIVE-HALF: fcmp ogt half
test = (h0 > h2);		test = (h0 > h2);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fcmp ogt float		// CHECK: fcmp ogt float
// NATIVE-HALF: fcmp ogt half		// NATIVE-HALF: fcmp ogt half
test = ((__fp16)42.0 > h2);		test = ((__fp16)42.0 > h2);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp ogt float		// CHECK: fcmp ogt float
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
// NATIVE-HALF: fcmp ogt float		// NATIVE-HALF: fcmp ogt float
test = (h0 > f2);		test = (h0 > f2);
Show All 12 Lines	void foo(void) {
test = (h0 > i0);		test = (h0 > i0);

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp ole float		// CHECK: fcmp ole float
// NATIVE-HALF: fcmp ole half		// NATIVE-HALF: fcmp ole half
test = (h2 <= h0);		test = (h2 <= h0);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fcmp ole float		// CHECK: fcmp ole float
// NATIVE-HALF: fcmp ole half		// NATIVE-HALF: fcmp ole half
test = (h2 <= (__fp16)42.0);		test = (h2 <= (__fp16)42.0);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp ole float		// CHECK: fcmp ole float
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
// NATIVE-HALF: fcmp ole float		// NATIVE-HALF: fcmp ole float
test = (h2 <= f0);		test = (h2 <= f0);
Show All 13 Lines	void foo(void) {


// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp oge float		// CHECK: fcmp oge float
// NATIVE-HALF: fcmp oge half		// NATIVE-HALF: fcmp oge half
test = (h0 >= h2);		test = (h0 >= h2);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fcmp oge float		// CHECK: fcmp oge float
// NATIVE-HALF: fcmp oge half		// NATIVE-HALF: fcmp oge half
test = (h0 >= (__fp16)-2.0);		test = (h0 >= (__fp16)-2.0);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp oge float		// CHECK: fcmp oge float
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
// NATIVE-HALF: fcmp oge float		// NATIVE-HALF: fcmp oge float
test = (h0 >= f2);		test = (h0 >= f2);
Show All 12 Lines	void foo(void) {
test = (h0 >= i0);		test = (h0 >= i0);

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp oeq float		// CHECK: fcmp oeq float
// NATIVE-HALF: fcmp oeq half		// NATIVE-HALF: fcmp oeq half
test = (h1 == h2);		test = (h1 == h2);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fcmp oeq float		// CHECK: fcmp oeq float
// NATIVE-HALF: fcmp oeq half		// NATIVE-HALF: fcmp oeq half
test = (h1 == (__fp16)1.0);		test = (h1 == (__fp16)1.0);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp oeq float		// CHECK: fcmp oeq float
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
// NATIVE-HALF: fcmp oeq float		// NATIVE-HALF: fcmp oeq float
test = (h1 == f1);		test = (h1 == f1);
[... 12 lines not shown; context: void foo(void) { ...]
test = (h0 == i0);		test = (h0 == i0);

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp une float		// CHECK: fcmp une float
// NATIVE-HALF: fcmp une half		// NATIVE-HALF: fcmp une half
test = (h1 != h2);		test = (h1 != h2);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fcmp une float		// CHECK: fcmp une float
// NATIVE-HALF: fcmp une half		// NATIVE-HALF: fcmp une half
test = (h1 != (__fp16)1.0);		test = (h1 != (__fp16)1.0);
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fcmp une float		// CHECK: fcmp une float
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
// NATIVE-HALF: fcmp une float		// NATIVE-HALF: fcmp une float
test = (h1 != f1);		test = (h1 != f1);
[... 15 lines not shown; context: void foo(void) { ...]
// CHECK: fcmp une float		// CHECK: fcmp une float
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fcmp une half {{.*}}, 0xH0000		// NATIVE-HALF: fcmp une half {{.*}}, 0xH0000
h1 = (h1 ? h2 : h0);		h1 = (h1 ? h2 : h0);
// Check assignments (inc. compound)		// Check assignments (inc. compound)
h0 = h1;		h0 = h1;
// NOHALF: [[F32TOF16]]		// NOTNATIVE: store {{.*}} half 0xHC000
// HALF: store {{.*}} half 0xHC000
// NATIVE-HALF: store {{.*}} half 0xHC000		// NATIVE-HALF: store {{.*}} half 0xHC000
h0 = (__fp16)-2.0f;		h0 = (__fp16)-2.0f;
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fptrunc float		// NATIVE-HALF: fptrunc float
h0 = f0;		h0 = f0;

// CHECK: sitofp i32 {{.*}} to float		// CHECK: sitofp i32 {{.*}} to float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: sitofp i32 {{.*}} to half		// NATIVE-HALF: sitofp i32 {{.*}} to half
h0 = i0;		h0 = i0;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fptosi float {{.*}} to i32		// CHECK: fptosi float {{.*}} to i32
// NATIVE-HALF: fptosi half {{.*}} to i32		// NATIVE-HALF: fptosi half {{.*}} to i32
i0 = h0;		i0 = h0;

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fadd float		// CHECK: fadd float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fadd half		// NATIVE-HALF: fadd half
h0 += h1;		h0 += h1;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fadd float		// CHECK: fadd float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fadd half		// NATIVE-HALF: fadd half
h0 += (__fp16)1.0f;		h0 += (__fp16)1.0f;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fadd float		// CHECK: fadd float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
[... 18 lines not shown; context: void foo(void) { ...]

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fsub float		// CHECK: fsub float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fsub half		// NATIVE-HALF: fsub half
h0 -= h1;		h0 -= h1;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fsub float		// CHECK: fsub float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fsub half		// NATIVE-HALF: fsub half
h0 -= (__fp16)1.0;		h0 -= (__fp16)1.0;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fsub float		// CHECK: fsub float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
[... 18 lines not shown; context: void foo(void) { ...]

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fmul float		// CHECK: fmul float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fmul half		// NATIVE-HALF: fmul half
h0 *= h1;		h0 *= h1;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fmul float		// CHECK: fmul float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fmul half		// NATIVE-HALF: fmul half
h0 *= (__fp16)1.0;		h0 *= (__fp16)1.0;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fmul float		// CHECK: fmul float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
[... 18 lines not shown; context: void foo(void) { ...]

// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fdiv float		// CHECK: fdiv float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fdiv half		// NATIVE-HALF: fdiv half
h0 /= h1;		h0 /= h1;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// NOHALF: [[F16TOF32]]
// CHECK: fdiv float		// CHECK: fdiv float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fdiv half		// NATIVE-HALF: fdiv half
h0 /= (__fp16)1.0;		h0 /= (__fp16)1.0;
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fdiv float		// CHECK: fdiv float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: fpext half		// NATIVE-HALF: fpext half
[... 12 lines not shown; context: void foo(void) { ...]
// CHECK: [[F16TOF32]]		// CHECK: [[F16TOF32]]
// CHECK: fdiv float		// CHECK: fdiv float
// CHECK: [[F32TOF16]]		// CHECK: [[F32TOF16]]
// NATIVE-HALF: sitofp i32 {{.*}} to half		// NATIVE-HALF: sitofp i32 {{.*}} to half
// NATIVE-HALF: fdiv half		// NATIVE-HALF: fdiv half
h0 /= i0;		h0 /= i0;

// Check conversions to/from double		// Check conversions to/from double
// NOHALF: call i16 @llvm.convert.to.fp16.f64(		// NOTNATIVE: fptrunc double {{.*}} to half
// HALF: fptrunc double {{.*}} to half
// NATIVE-HALF: fptrunc double {{.*}} to half		// NATIVE-HALF: fptrunc double {{.*}} to half
h0 = d0;		h0 = d0;

// CHECK: [[MID:%.*]] = fptrunc double {{%.*}} to float		// CHECK: [[MID:%.*]] = fptrunc double {{%.*}} to float
// NOHALF: call i16 @llvm.convert.to.fp16.f32(float [[MID]])		// NOTNATIVE: fptrunc float [[MID]] to half
// HALF: fptrunc float [[MID]] to half
// NATIVE-HALF: [[MID:%.*]] = fptrunc double {{%.*}} to float		// NATIVE-HALF: [[MID:%.*]] = fptrunc double {{%.*}} to float
// NATIVE-HALF: fptrunc float {{.*}} to half		// NATIVE-HALF: fptrunc float {{.*}} to half
h0 = (float)d0;		h0 = (float)d0;

// NOHALF: call double @llvm.convert.from.fp16.f64(		// NOTNATIVE: fpext half {{.*}} to double
// HALF: fpext half {{.*}} to double
// NATIVE-HALF: fpext half {{.*}} to double		// NATIVE-HALF: fpext half {{.*}} to double
d0 = h0;		d0 = h0;

// NOHALF: [[MID:%.*]] = call float @llvm.convert.from.fp16.f32(		// NOTNATIVE: [[MID:%.*]] = fpext half {{.*}} to float
// HALF: [[MID:%.*]] = fpext half {{.*}} to float
// CHECK: fpext float [[MID]] to double		// CHECK: fpext float [[MID]] to double
// NATIVE-HALF: [[MID:%.*]] = fpext half {{.*}} to float		// NATIVE-HALF: [[MID:%.*]] = fpext half {{.*}} to float
// NATIVE-HALF: fpext float [[MID]] to double		// NATIVE-HALF: fpext float [[MID]] to double
d0 = (float)h0;		d0 = (float)h0;

		// NOTNATIVE: [[V1:%.*]] = load i16, i16* @s0
		// NOTNATIVE: [[CONV:%.*]] = sitofp i16 [[V1]] to float
		// NOTNATIVE: [[TRUNC:%.*]] = fptrunc float [[CONV]] to half
		// NOTNATIVE: store volatile half [[TRUNC]], half* @h0
		h0 = s0;
}		}
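
The `h0 = s0` checks added above expect a value conversion (`sitofp` to float, then `fptrunc` to half) rather than a raw copy of the 16 bits. The following is a minimal portable C sketch of that difference, not code from the patch. The helper names are hypothetical, and it assumes `float` is a 32-bit IEEE 754 type and that `fptrunc` rounds to nearest-even.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): round a float to IEEE 754
   binary16 using round-to-nearest-even and return the raw 16 bits.
   This models what `fptrunc float ... to half` is expected to compute. */
static uint16_t float_to_half_bits(float f) {
    uint32_t x;
    assert(sizeof f == sizeof x);   /* assumes a 32-bit IEEE 754 float */
    memcpy(&x, &f, sizeof x);

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp  = (x >> 23) & 0xFFu;
    uint32_t mant = x & 0x7FFFFFu;

    if (exp == 0xFFu)               /* infinity or NaN */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));

    int e = (int)exp - 127 + 15;    /* rebias the exponent for binary16 */
    if (e >= 31)
        return (uint16_t)(sign | 0x7C00u);   /* too large: infinity */

    if (e <= 0) {                   /* result is subnormal or zero */
        if (e < -10)
            return (uint16_t)sign;  /* too small: signed zero */
        mant |= 0x800000u;          /* make the implicit leading 1 explicit */
        uint32_t shift = (uint32_t)(14 - e);
        uint32_t q = mant >> shift;
        uint32_t rem = mant & ((1u << shift) - 1u);
        uint32_t half = 1u << (shift - 1u);
        if (rem > half || (rem == half && (q & 1u)))
            q++;                    /* round to nearest, ties to even */
        return (uint16_t)(sign | q);
    }

    uint32_t h = ((uint32_t)e << 10) | (mant >> 13);
    uint32_t rem = mant & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (h & 1u)))
        h++;                        /* a carry into the exponent is correct */
    return (uint16_t)(sign | h);
}

/* Models the expected lowering: sitofp i16 -> float, then fptrunc -> half. */
static uint16_t short_to_half_converted(short s) {
    return float_to_half_bits((float)s);
}

/* Models the behavior described in the summary as a bug: the short's bits
   are stored unchanged, with no integer-to-floating-point conversion. */
static uint16_t short_to_half_reinterpreted(short s) {
    return (uint16_t)s;
}
```

For a short holding 2, `short_to_half_converted` yields `0x4000` (the half encoding of 2.0), whereas `short_to_half_reinterpreted` yields `0x0002`, which read as a half is a tiny subnormal.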

cfe/trunk/test/CodeGen/fp16vec-ops.c

	// REQUIRES: arm-registered-target			// REQUIRES: arm-registered-target
	// RUN: %clang_cc1 -triple arm64-apple-ios9 -emit-llvm -o - -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=CHECK			// RUN: %clang_cc1 -triple arm64-apple-ios9 -emit-llvm -o - -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=CHECK
	// RUN: %clang_cc1 -triple armv7-apple-ios9 -emit-llvm -o - -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=CHECK			// RUN: %clang_cc1 -triple armv7-apple-ios9 -emit-llvm -o - -fallow-half-arguments-and-returns %s | FileCheck %s --check-prefix=CHECK
				// RUN: %clang_cc1 -triple x86_64-apple-macos10.13 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK

	typedef __fp16 half4 __attribute__ ((vector_size (8)));			typedef __fp16 half4 __attribute__ ((vector_size (8)));
	typedef short short4 __attribute__ ((vector_size (8)));			typedef short short4 __attribute__ ((vector_size (8)));

	half4 hv0, hv1;			half4 hv0, hv1;
	short4 sv0;			short4 sv0;

	// CHECK-LABEL: testFP16Vec0			// CHECK-LABEL: testFP16Vec0
	[... 151 lines not shown ...]

cfe/trunk/test/CodeGenCXX/float16-declarations.cpp

// RUN: %clang -std=c++11 --target=aarch64-arm--eabi -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64		// RUN: %clang -std=c++11 --target=aarch64-arm--eabi -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64
// RUN: %clang -std=c++11 --target=x86_64 -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-X86		// RUN: %clang -std=c++11 --target=x86_64 -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-X86

/* Various contexts where type _Float16 can appear. */		/* Various contexts where type _Float16 can appear. */


/* Namespace */		/* Namespace */

namespace {		namespace {
_Float16 f1n;		_Float16 f1n;
// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH0000, align 2		// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH0000, align 2

_Float16 f2n = 33.f16;		_Float16 f2n = 33.f16;
// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2		// CHECK-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2
// CHECK-X86-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global i16 20512, align 2

_Float16 arr1n[10];		_Float16 arr1n[10];
// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 2		// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 2
// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 16		// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 16

_Float16 arr2n[] = { 1.2, 3.0, 3.e4 };		_Float16 arr2n[] = { 1.2, 3.0, 3.e4 };
// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2		// CHECK-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2
// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x i16] [i16 15565, i16 16896, i16 30547], align 2

const volatile _Float16 func1n(const _Float16 &arg) {		const volatile _Float16 func1n(const _Float16 &arg) {
return arg + f2n + arr1n[4] - arr2n[1];		return arg + f2n + arr1n[4] - arr2n[1];
}		}
}		}
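
The old X86 checks in this namespace block use `i16` constants (for example `i16 20512`) where the new shared checks use `half` constants (for example `half 0xH5020`). These appear to be the same 16 bits read two ways. A small C sketch that decodes the bits is given here to make that visible. The helper names are hypothetical, not part of the patch, and the code assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode IEEE 754 binary16 bits into a float.
   Every binary16 value is exactly representable as a 32-bit float. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float out;

    assert(sizeof out == sizeof bits);  /* assumes a 32-bit IEEE 754 float */

    if (exp == 0) {
        /* Zero or subnormal: value is mant * 2^-24, exact in float. */
        out = (float)mant / 16777216.0f;
        return sign ? -out : out;
    }
    if (exp == 31)
        bits = sign | 0x7F800000u | (mant << 13);        /* infinity or NaN */
    else
        bits = sign | ((exp + 112u) << 23) | (mant << 13); /* rebias 15 -> 127 */

    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Hypothetical helper: the same 16 bits read as a signed integer, which is
   how an `i16` constant is printed in LLVM IR. */
static int16_t half_bits_as_i16(uint16_t h) {
    int16_t v;
    memcpy(&v, &h, sizeof v);
    return v;
}
```

With these helpers, `0x5020` decodes to 33.0 and reads as the integer 20512, matching the `f2n` checks. Likewise `0x4200` decodes to 3.0 (integer 16896) and `0x7753` decodes to 30000.0 (integer 30547), matching the last two elements of `arr2n`.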


/* File */		/* File */

_Float16 f1f;		_Float16 f1f;
// CHECK-AARCH64-DAG: @f1f = global half 0xH0000, align 2		// CHECK-AARCH64-DAG: @f1f = global half 0xH0000, align 2
// CHECK-X86-DAG: @f1f = global half 0xH0000, align 2		// CHECK-X86-DAG: @f1f = global half 0xH0000, align 2

_Float16 f2f = 32.4;		_Float16 f2f = 32.4;
// CHECK-AARCH64-DAG: @f2f = global half 0xH500D, align 2		// CHECK-DAG: @f2f = global half 0xH500D, align 2
// CHECK-X86-DAG: @f2f = global i16 20493, align 2

_Float16 arr1f[10];		_Float16 arr1f[10];
// CHECK-AARCH64-DAG: @arr1f = global [10 x half] zeroinitializer, align 2		// CHECK-AARCH64-DAG: @arr1f = global [10 x half] zeroinitializer, align 2
// CHECK-X86-DAG: @arr1f = global [10 x half] zeroinitializer, align 16		// CHECK-X86-DAG: @arr1f = global [10 x half] zeroinitializer, align 16

_Float16 arr2f[] = { -1.2, -3.0, -3.e4 };		_Float16 arr2f[] = { -1.2, -3.0, -3.e4 };
// CHECK-AARCH64-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0xHC200, half 0xHF753], align 2		// CHECK-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0xHC200, half 0xHF753], align 2
// CHECK-X86-DAG: @arr2f = global [3 x i16] [i16 -17203, i16 -15872, i16 -2221], align 2

_Float16 func1f(_Float16 arg);		_Float16 func1f(_Float16 arg);


/* Class */		/* Class */

class C1 {		class C1 {
_Float16 f1c;		_Float16 f1c;
[... 49 lines not shown; context: // CHECK-DAG: store half 0xH8000, half* %{{.*}}, align 2 ...]
_Float16 f3l = 1.000976562;		_Float16 f3l = 1.000976562;
// CHECK-DAG: store half 0xH3C01, half* %{{.*}}, align 2		// CHECK-DAG: store half 0xH3C01, half* %{{.*}}, align 2

C1 c1(f1l);		C1 c1(f1l);
// CHECK-DAG: [[F1L:%[a-z0-9]+]] = load half, half* %{{.*}}, align 2		// CHECK-DAG: [[F1L:%[a-z0-9]+]] = load half, half* %{{.*}}, align 2
// CHECK-DAG: call void @_ZN2C1C2EDF16_(%class.C1* %{{.*}}, half %{{.*}})		// CHECK-DAG: call void @_ZN2C1C2EDF16_(%class.C1* %{{.*}}, half %{{.*}})

S1<_Float16> s1 = { 132.f16 };		S1<_Float16> s1 = { 132.f16 };
// CHECK-AARCH64-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant %struct.S1 { half 0xH5820 }, align 2		// CHECK-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant %struct.S1 { half 0xH5820 }, align 2
// CHECK-X86-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant { i16 } { i16 22560 }, align 2
// CHECK-DAG: [[S1:%[0-9]+]] = bitcast %struct.S1* %{{.*}} to i8*		// CHECK-DAG: [[S1:%[0-9]+]] = bitcast %struct.S1* %{{.*}} to i8*
// CHECK-AARCH64-DAG: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[S1]], i8* bitcast (%struct.S1* @_ZZ4mainE2s1 to i8*), i64 2, i32 2, i1 false)		// CHECK-DAG: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[S1]], i8* bitcast (%struct.S1* @_ZZ4mainE2s1 to i8*), i64 2, i32 2, i1 false)
// CHECK-X86-DAG: call void @llvm.memcpy.p0i8.p0i8.i64(i8* %{{.*}}, i8* bitcast ({ i16 }* @_ZZ4mainE2s1 to i8*), i64 2, i32 2, i1 false)

_Float16 f4l = func1n(f1l) + func1f(f2l) + c1.func1c(f3l) + c1.func2c(f1l) +		_Float16 f4l = func1n(f1l) + func1f(f2l) + c1.func1c(f3l) + c1.func2c(f1l) +
func1t(f1l) + s1.mem2 - f1n + f2n;		func1t(f1l) + s1.mem2 - f1n + f2n;

auto f5l = -1.f16, *f6l = &f2l, f7l = func1t(f3l);		auto f5l = -1.f16, *f6l = &f2l, f7l = func1t(f3l);
// CHECK-DAG: store half 0xHBC00, half* %{{.*}}, align 2		// CHECK-DAG: store half 0xHBC00, half* %{{.*}}, align 2
// CHECK-DAG: store half* %{{.*}}, half** %{{.*}}, align 8		// CHECK-DAG: store half* %{{.*}}, half** %{{.*}}, align 8

_Float16 f8l = f4l++;		_Float16 f8l = f4l++;
// CHECK-DAG: %{{.*}} = load half, half* %{{.*}}, align 2		// CHECK-DAG: %{{.*}} = load half, half* %{{.*}}, align 2
// CHECK-DAG: [[INC:%[a-z0-9]+]] = fadd half {{.*}}, 0xH3C00		// CHECK-DAG: [[INC:%[a-z0-9]+]] = fadd half {{.*}}, 0xH3C00
// CHECK-DAG: store half [[INC]], half* %{{.*}}, align 2		// CHECK-DAG: store half [[INC]], half* %{{.*}}, align 2

_Float16 arr1l[] = { -1.f16, -0.f16, -11.f16 };		_Float16 arr1l[] = { -1.f16, -0.f16, -11.f16 };
// CHECK-AARCH64-DAG: @_ZZ4mainE5arr1l = private unnamed_addr constant [3 x half] [half 0xHBC00, half 0xH8000, half 0xHC980], align 2		// CHECK-DAG: @_ZZ4mainE5arr1l = private unnamed_addr constant [3 x half] [half 0xHBC00, half 0xH8000, half 0xHC980], align 2
// CHECK-X86-DAG: @_ZZ4mainE5arr1l = private unnamed_addr constant [3 x i16] [i16 -17408, i16 -32768, i16 -13952], align 2

float cvtf = f2n;		float cvtf = f2n;
//CHECK-DAG: [[H2F:%[a-z0-9]+]] = fpext half {{%[0-9]+}} to float		//CHECK-DAG: [[H2F:%[a-z0-9]+]] = fpext half {{%[0-9]+}} to float
//CHECK-DAG: store float [[H2F]], float* %{{.*}}, align 4		//CHECK-DAG: store float [[H2F]], float* %{{.*}}, align 4

double cvtd = f2n;		double cvtd = f2n;
//CHECK-DAG: [[H2D:%[a-z0-9]+]] = fpext half {{%[0-9]+}} to double		//CHECK-DAG: [[H2D:%[a-z0-9]+]] = fpext half {{%[0-9]+}} to double
//CHECK-DAG: store double [[H2D]], double* %{{.*}}, align 8		//CHECK-DAG: store double [[H2D]], double* %{{.*}}, align 8
[... 15 lines not shown ...]

cfe/trunk/test/CodeGenCXX/fp16-mangle.cpp

	// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi %s | FileCheck %s			// RUN: %clang_cc1 -emit-llvm -o - -triple arm-none-linux-gnueabi %s | FileCheck %s

	// CHECK: @_ZN1SIDhDhE1iE = global i32 3			// CHECK: @_ZN1SIDhDhE1iE = global i32 3
	template <typename T, typename U> struct S { static int i; };			template <typename T, typename U> struct S { static int i; };
	template <> int S<__fp16, __fp16>::i = 3;			template <> int S<__fp16, __fp16>::i = 3;

	// CHECK-LABEL: define void @_Z1fPDh(i16* %x)			// CHECK-LABEL: define void @_Z1fPDh(half* %x)
	void f (__fp16 *x) { }			void f (__fp16 *x) { }

	// CHECK-LABEL: define void @_Z1gPDhS_(i16* %x, i16* %y)			// CHECK-LABEL: define void @_Z1gPDhS_(half* %x, half* %y)
	void g (__fp16 *x, __fp16 *y) { }			void g (__fp16 *x, __fp16 *y) { }


Revision Contents

Diff 126229
