This is an archive of the discontinued LLVM Phabricator instance.

Differential D86453

[AArch64] Support conversion between fp16 and fp128
Abandoned · Public

Authored by zatrazz on Aug 24 2020, 6:07 AM.

Details

Reviewers

SjoerdMeijer
bryanpkc
t.p.northover
javed.absar
howard.hinnant
efriedma

Summary

This is an updated version of https://reviews.llvm.org/D50685.
The main changes are:

- I added a CMake test that checks whether the compiler supports _Float16. This avoids requiring a more recent gcc to build a stage1 compiler, and avoids creating potentially invalid soft-fp symbols (as happens with the ARM fp16 ones when built with a compiler configured for hard-float).
- I adjusted the internal tests to use _Float16 as well when the compiler supports it.

This issue was reported by a developer from Apache TVM, where the
missing fp conversion triggers a compiler ICE.

This patch adds both extendhftf2 and trunctfhf2 to support
conversion between half-precision and quad-precision floating-point
values. They are enabled iff the compiler supports _Float16.

It also adjusts __extendhfsf2, __truncdfhf2, and __truncsfhf2 to use
_Float16 when the compiler supports it. On AArch64 this allows use of
the native FP16 ABI, while on other architectures (arm, for instance)
the current semantics are preserved.
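
As an illustration of what a half-precision extension computes, the following is a minimal sketch that widens a binary16 bit pattern to `float`. It is not the compiler-rt implementation (the patch reuses fp_extend_impl.inc); `half_bits_to_float` is a hypothetical name, and the input is taken as a raw `uint16_t` so that the sketch builds even with a compiler that lacks `_Float16`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of compiler-rt: widen an IEEE 754 binary16
 * bit pattern to a float. Every binary16 value is exactly representable in
 * binary32, so no rounding is involved. */
static float half_bits_to_float(uint16_t h) {
  uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1fu; /* 5-bit biased exponent */
  uint32_t frac = h & 0x3ffu;       /* 10-bit significand field */
  uint32_t bits;
  float f;

  if (exp == 0x1fu) {
    /* Inf or NaN: all-ones exponent, payload moved to the top bits. */
    bits = sign | 0x7f800000u | (frac << 13);
  } else if (exp != 0) {
    /* Normal number: rebias the exponent from 15 to 127. */
    bits = sign | ((exp + 112u) << 23) | (frac << 13);
  } else if (frac != 0) {
    /* Subnormal: shift left until the implicit bit (bit 10) is set. */
    uint32_t e = 113u; /* biased binary32 exponent of 2^-14 */
    while ((frac & 0x400u) == 0) {
      frac <<= 1;
      --e;
    }
    bits = sign | (e << 23) | ((frac & 0x3ffu) << 13);
  } else {
    bits = sign; /* signed zero */
  }

  memcpy(&f, &bits, sizeof f); /* bit cast without aliasing issues */
  return f;
}
```

Under these assumptions, `half_bits_to_float(0x4248)` yields 3.140625 and `half_bits_to_float(0x0001)` yields 0x1p-24, which matches the expected values used in the updated extendhfsf2_test.c.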

Diff Detail

Event Timeline

zatrazz created this revision. Aug 24 2020, 6:07 AM

Herald added a project: Restricted Project. Aug 24 2020, 6:07 AM

Herald added subscribers: Restricted Project, danielkiss, hiraditya and 2 others.

zatrazz requested review of this revision. Aug 24 2020, 6:07 AM

Harbormaster completed remote builds in B69309: Diff 287364. Aug 24 2020, 6:39 AM

atrosinenko added a subscriber: atrosinenko. Aug 24 2020, 7:12 AM

atrosinenko added inline comments.

compiler-rt/lib/builtins/extendhftf2.c
2	`extendhfsf2.c` should probably be `extendhftf2.c`.
compiler-rt/test/builtins/Unit/fp_test.h
86	Shouldn't it have `(TYPE_FP16 result, uint16_t expected)` arguments like other compareResultX functions (as if `fp_t result, rep_t expected`)? The same for `test_truncXfhf2(... , uint16_t expected)`: if I get it right, the common tradition for them is expressing the reference value as corresponding `uintXX_t`. Meanwhile, this is probably related to your changes to second test case from extendhfsf2_test.c - the original sources look quite suspicious to me.

atrosinenko mentioned this in D84877: Support for soft fp16 to fp64 IEEE conversions. Aug 24 2020, 10:02 AM

zatrazz added inline comments. Aug 24 2020, 1:47 PM

compiler-rt/lib/builtins/extendhftf2.c
2	Indeed, I will update it.
compiler-rt/test/builtins/Unit/fp_test.h
86	Yeah, I agree it should follow the other compare functions and use a uintXX_t as the second argument; I will fix it. The extendhfsf2_test.c issue is in fact that it uses the wrong comparison function, which does not make explicit what it is currently testing. test__extendhfsf2 has a 'float' as its second argument, but it calls compareResultH, which expects a uint16_t (without this change). This makes the compiler silently convert the arguments, so the expected results can be misleading. I think it should call compareResultF with toRep32 instead, to make explicit that it compares against the uint32_t representation of the float, as extenddftf2_test.c and extendsftf2_test.c already do. I will change it as well.
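
The silent conversion described above can be reproduced in isolation. In the sketch below, `compare_bits16`, `compare_float_bits`, `float_to_rep32`, `misleading_check` and `explicit_check` are hypothetical stand-ins, not the fp_test.h helpers. Passing a `float` where a `uint16_t` parameter is expected compiles without a diagnostic under default warning settings, and the value is converted to an integer instead of being reinterpreted as bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a helper whose expected value is a 16-bit
 * representation. Returns 0 on match. */
static int compare_bits16(uint16_t result, uint16_t expected) {
  return result == expected ? 0 : 1;
}

/* Hypothetical stand-in: bit pattern of a float as a uint32_t. */
static uint32_t float_to_rep32(float x) {
  uint32_t rep;
  memcpy(&rep, &x, sizeof rep);
  return rep;
}

/* Hypothetical stand-in for a helper that compares a float against the
 * 32-bit representation of the expected value. Returns 0 on match. */
static int compare_float_bits(float result, uint32_t expected) {
  return float_to_rep32(result) == expected ? 0 : 1;
}

/* Misleading: both floats are implicitly converted to uint16_t by value
 * (truncating toward zero), so the fractional parts never get compared. */
static int misleading_check(float result, float expected) {
  return compare_bits16(result, expected);
}

/* Explicit: compares the exact 32-bit patterns. */
static int explicit_check(float result, float expected) {
  return compare_float_bits(result, float_to_rep32(expected));
}
```

Here `misleading_check(3.140625f, 3.1415926535f)` reports a match because both arguments are converted to the integer 3, while `explicit_check` on the same inputs reports a mismatch.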

Updated version based on previous comments.

atrosinenko added inline comments. Aug 25 2020, 3:28 AM

compiler-rt/lib/builtins/extendhftf2.c
13	Technically, the linter has been known to complain about `\` not being at the 80th column in multi-line macros, but I'm not sure all these rules apply to the compiler-rt/builtins library.
compiler-rt/test/builtins/Unit/fp_test.h
6–10	This could probably be moved to fp_lib.h (or to int_types.h - looks like "int" means "internal" there...), possibly switching to `typedef`s. This should make other code more idiomatic: there would be single-line `typedef`s for src_t and dst_t in fp_extend.h and fp_trunc.h, just as for other precisions, and the precision-converting libcalls (extend / trunc) would look more idiomatic: COMPILER_RT_ABI concrete_return_type __libcallXfYf2(concrete_argument_type) { ... } ... with one of those types being the TYPE_F16 alias instead of src_t / dst_t.
86	It would probably be even more idiomatic to declare test__extendhfsf2 with the second argument being `uint32_t expected`, but this may cause quite large changes that are not strictly related to this patch.

zatrazz added inline comments. Aug 25 2020, 6:47 AM

compiler-rt/lib/builtins/extendhftf2.c
13	Right, I will fix it.
compiler-rt/test/builtins/Unit/fp_test.h
6–10	fp_lib.h currently defines only a single floating-point type (one must define either SINGLE_PRECISION, DOUBLE_PRECISION, or QUAD_PRECISION). For instance, extendsftf2.c defines QUAD_PRECISION and includes fp_lib.h. Moving the float16 definition to fp_lib.h would require adding the possibility of defining multiple types, similar to what fp_extend.h/fp_trunc.h already do. I am not sure that is better, since the second fp type is currently used only by the extend/trunc builtins. And I don't think int_types.h is the correct place: as far as I know, _Float16 is currently not defined on all architectures, so it is not really related to an integer type. Ideally, I think compiler-rt should define all 'hf' builtins to use _Float16 and build them iff the ABI implements the type (meaning the compiler actually emits libcalls to it). For ABIs that support float16 operations without supporting the _Float16 type, for instance ARM, which supports __fp16, it would be better to move the libcall implementation to the arch-specific folders.
86	I think it should be feasible; it would require redefining all the inputs in the test below, though.

ldrumm added a subscriber: ldrumm. Aug 25 2020, 7:17 AM

ebevhan added a subscriber: ebevhan. Aug 26 2020, 12:50 AM

Hi atrosinenko, do you think this patch needs any more changes on the testing side?
The fp_lib.h/int_lib.h change would most likely result in a more complex patch without
much gain in organization, imho.

In D86453#2241702, @zatrazz wrote:

Hi atrosinenko, do you think this patch needs any more changes on the testing side?
The fp_lib.h/int_lib.h change would most likely result in a more complex patch without
much gain in organization, imho.

Sorry for the delay. Provided the particular test cases are correct, I have found just a single fromRep16 that was probably used instead of toRep16. The other comments are merely random thoughts, just in case.

compiler-rt/test/builtins/Unit/extendhfsf2_test.c
10	As previously discussed, I would rather use `uint32_t expected` instead. But after all, the objective of this patch is not to fix `test__extendhfsf2`, so even if this would be implemented, then it should probably go to a separate patch and not clutter this one. Commenting this just for completeness.
17	`toRep16(a)` is probably expected.
compiler-rt/test/builtins/Unit/fp_test.h
6–10	Agree, looks like there is no more suitable place for that define right now.
18	If I get it right, this could be performed unconditionally. On the other hand, the `#else` branch may be either a good illustration for peculiarities of this function when no native `_Float16` is available or some misleading stuff...

zatrazz added inline comments. Aug 31 2020, 10:42 AM

compiler-rt/test/builtins/Unit/extendhfsf2_test.c
10	Using `uint32_t` indeed seems a better approach, and it aligns somewhat with the change to use _Float16 where applicable. I will send an updated version with this fix.
17	Ack.
compiler-rt/test/builtins/Unit/fp_test.h
18	I don't have a strong preference either; it should be optimized away by the compiler for `!COMPILER_RT_HAS_FLOAT16` anyway. I will keep it to make explicit that with `COMPILER_RT_HAS_FLOAT16`, `TYPE_FP16` is expected to be different from `uint16_t`.
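
The distinction between the half type and its 16-bit representation can be sketched as follows. The fp_test.h definitions from the patch are not shown in this part of the page, so this is only an assumed shape: the macro `COMPILER_RT_HAS_FLOAT16` comes from the patch, while `sketch_fp16_t`, `sketch_fromRep16` and `sketch_toRep16` are hypothetical names chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumed type selection: a native half type when the build system has
 * detected _Float16, a plain 16-bit integer carrier otherwise. */
#ifdef COMPILER_RT_HAS_FLOAT16
typedef _Float16 sketch_fp16_t;
#else
typedef uint16_t sketch_fp16_t;
#endif

/* Bit pattern -> half value. With the uint16_t carrier this is an identity
 * copy; with _Float16 it is a bit cast, not a value conversion. */
static sketch_fp16_t sketch_fromRep16(uint16_t rep) {
  sketch_fp16_t x;
  memcpy(&x, &rep, sizeof x);
  return x;
}

/* Half value -> bit pattern, the inverse of sketch_fromRep16. */
static uint16_t sketch_toRep16(sketch_fp16_t x) {
  uint16_t rep;
  memcpy(&rep, &x, sizeof rep);
  return rep;
}
```

In this shape, a format such as `%#.4x` needs the integer representation, so a diagnostic message would print `sketch_toRep16(a)` and not the half value itself, which is the direction of the `toRep16(a)` remark above.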

Updated patch based on previous comments.

MaskRay added a subscriber: MaskRay. Aug 31 2020, 10:50 AM

MaskRay added inline comments.

compiler-rt/lib/builtins/fp_extend.h
43	`#ifdef` ? ditto below
compiler-rt/test/builtins/Unit/trunctfhf2_test.c
15	Use 2-space indentation. Place `{` at the end of the line. There is no need to follow the style of the non-conforming files in this directory.

I have reformatted the new files with clang-format and fixed the minor style issues pointed out in previous comments.

Ping.

MaskRay added inline comments. Sep 8 2020, 10:48 AM

compiler-rt/lib/builtins/fp_extend.h
43	You can mark the comment above as "Done". _Float16 and uint16_t are different types. Do all the `hf` functions change their signatures when `_Float16` is supported? This still looks strange.

MaskRay added a reviewer: efriedma. Sep 8 2020, 10:48 AM

zatrazz marked an inline comment as done. Sep 8 2020, 11:51 AM

zatrazz added inline comments.

compiler-rt/lib/builtins/fp_extend.h
43	I think using _Float16 for the generic support makes more sense, since float16 support is really an extension, with a different calling convention on each architecture and even across different ABIs. While it is uint16_t for armv6 due to the soft-fp ABI, it is a completely different type with a different calling convention on aarch64. And I expect other architectures to use similar strategies now that it is supported by vector extensions in some new chips (POWER10, for instance). Also, for ABIs that expect hf to use 'uint16_t' as the calling convention, it would be better to compartmentalize it in the arch-specific folder, for instance for arm (this is what libgcc does).

MaskRay added a subscriber: ab. Sep 8 2020, 12:12 PM

MaskRay added inline comments.

compiler-rt/lib/builtins/fp_extend.h
43	If 'uint16_t' is more of an anomaly, @ab should the arm softfp implementation be moved to the specific arch folder?

zatrazz added inline comments. Sep 8 2020, 12:58 PM

compiler-rt/lib/builtins/fp_extend.h
43	I would say so, although I am not fully sure whether any other architecture uses such routines (which in that case would need such routines added as well).

Ping.

Ping (x2).

Ping (x3).

zatrazz abandoned this revision. Oct 26 2020, 11:00 AM

zatrazz mentioned this in D90175: [AArch64] Support conversion between fp16 and fp128.

kito-cheng mentioned this in D98670: [RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not. Mar 30 2021, 7:24 AM

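Revision Contents

Path                                                  Size
compiler-rt/cmake/builtin-config-ix.cmake             7 lines
compiler-rt/lib/builtins/CMakeLists.txt               4 lines
compiler-rt/lib/builtins/extendhfsf2.c                6 lines
compiler-rt/lib/builtins/extendhftf2.c                23 lines
compiler-rt/lib/builtins/fp_extend.h                  4 lines
compiler-rt/lib/builtins/fp_trunc.h                   4 lines
compiler-rt/lib/builtins/truncdfhf2.c                 4 lines
compiler-rt/lib/builtins/truncsfhf2.c                 6 lines
compiler-rt/lib/builtins/trunctfhf2.c                 21 lines
compiler-rt/test/builtins/CMakeLists.txt              2 lines
compiler-rt/test/builtins/Unit/extendhfsf2_test.c     69 lines
compiler-rt/test/builtins/Unit/extendhftf2_test.c     97 lines
compiler-rt/test/builtins/Unit/fp_test.h              30 lines
compiler-rt/test/builtins/Unit/truncdfhf2_test.c      8 lines
compiler-rt/test/builtins/Unit/truncsfhf2_test.c      6 lines
compiler-rt/test/builtins/Unit/trunctfhf2_test.c      127 lines
llvm/include/llvm/IR/RuntimeLibcalls.def              1 line
llvm/lib/CodeGen/TargetLoweringBase.cpp               2 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp       2 lines
llvm/test/CodeGen/AArch64/arm64-fp128.ll              14 lines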

Diff 287488

compiler-rt/cmake/builtin-config-ix.cmake

	[16 lines not shown]
	builtin_check_c_compiler_source(COMPILER_RT_HAS_ATOMIC_KEYWORD			builtin_check_c_compiler_source(COMPILER_RT_HAS_ATOMIC_KEYWORD
	"			"
	int foo(int x, int y) {			int foo(int x, int y) {
	_Atomic int result = x * y;			_Atomic int result = x * y;
	return result;			return result;
	}			}
	")			")

				builtin_check_c_compiler_source(COMPILER_RT_HAS_FLOAT16
				"
				_Float16 foo(_Float16 x) {
				return x;
				}
				"
				)

	set(ARM64 aarch64)			set(ARM64 aarch64)
	set(ARM32 arm armhf armv6m armv7m armv7em armv7 armv7s armv7k)			set(ARM32 arm armhf armv6m armv7m armv7em armv7 armv7s armv7k)
	set(HEXAGON hexagon)			set(HEXAGON hexagon)
	set(X86 i386)			set(X86 i386)
	set(X86_64 x86_64)			set(X86_64 x86_64)
	set(MIPS32 mips mipsel)			set(MIPS32 mips mipsel)
	set(MIPS64 mips64 mips64el)			set(MIPS64 mips64 mips64el)
	[164 lines not shown]

compiler-rt/lib/builtins/CMakeLists.txt

	[163 lines not shown]
	# TODO: Several "tf" files (and divtc3.c, but not multc3.c) are in			# TODO: Several "tf" files (and divtc3.c, but not multc3.c) are in
	# GENERIC_SOURCES instead of here.			# GENERIC_SOURCES instead of here.
	set(GENERIC_TF_SOURCES			set(GENERIC_TF_SOURCES
	addtf3.c			addtf3.c
	comparetf2.c			comparetf2.c
	divtc3.c			divtc3.c
	divtf3.c			divtf3.c
	extenddftf2.c			extenddftf2.c
				extendhftf2.c
	extendsftf2.c			extendsftf2.c
	fixtfdi.c			fixtfdi.c
	fixtfsi.c			fixtfsi.c
	fixtfti.c			fixtfti.c
	fixunstfdi.c			fixunstfdi.c
	fixunstfsi.c			fixunstfsi.c
	fixunstfti.c			fixunstfti.c
	floatditf.c			floatditf.c
	floatsitf.c			floatsitf.c
	floattitf.c			floattitf.c
	floatunditf.c			floatunditf.c
	floatunsitf.c			floatunsitf.c
	floatuntitf.c			floatuntitf.c
	multc3.c			multc3.c
	multf3.c			multf3.c
	powitf2.c			powitf2.c
	subtf3.c			subtf3.c
	trunctfdf2.c			trunctfdf2.c
				trunctfhf2.c
	trunctfsf2.c			trunctfsf2.c
	)			)

	option(COMPILER_RT_EXCLUDE_ATOMIC_BUILTIN			option(COMPILER_RT_EXCLUDE_ATOMIC_BUILTIN
	"Skip the atomic builtin (these should normally be provided by a shared library)"			"Skip the atomic builtin (these should normally be provided by a shared library)"
	On)			On)

	if(NOT FUCHSIA AND NOT COMPILER_RT_BAREMETAL_BUILD)			if(NOT FUCHSIA AND NOT COMPILER_RT_BAREMETAL_BUILD)
	[404 lines not shown]

	if (APPLE)			if (APPLE)
	add_subdirectory(Darwin-excludes)			add_subdirectory(Darwin-excludes)
	add_subdirectory(macho_embedded)			add_subdirectory(macho_embedded)
	darwin_add_builtin_libraries(${BUILTIN_SUPPORTED_OS})			darwin_add_builtin_libraries(${BUILTIN_SUPPORTED_OS})
	else ()			else ()
	set(BUILTIN_CFLAGS "")			set(BUILTIN_CFLAGS "")

				append_list_if(COMPILER_RT_HAS_FLOAT16 -DCOMPILER_RT_HAS_FLOAT16 BUILTIN_CFLAGS)

	append_list_if(COMPILER_RT_HAS_STD_C11_FLAG -std=c11 BUILTIN_CFLAGS)			append_list_if(COMPILER_RT_HAS_STD_C11_FLAG -std=c11 BUILTIN_CFLAGS)

	# These flags would normally be added to CMAKE_C_FLAGS by the llvm			# These flags would normally be added to CMAKE_C_FLAGS by the llvm
	# cmake step. Add them manually if this is a standalone build.			# cmake step. Add them manually if this is a standalone build.
	if(COMPILER_RT_STANDALONE_BUILD)			if(COMPILER_RT_STANDALONE_BUILD)
	if(COMPILER_RT_BUILTINS_ENABLE_PIC)			if(COMPILER_RT_BUILTINS_ENABLE_PIC)
	append_list_if(COMPILER_RT_HAS_FPIC_FLAG -fPIC BUILTIN_CFLAGS)			append_list_if(COMPILER_RT_HAS_FPIC_FLAG -fPIC BUILTIN_CFLAGS)
	endif()			endif()
	[53 lines not shown]

compiler-rt/lib/builtins/extendhfsf2.c

	//===-- lib/extendhfsf2.c - half -> single conversion -------------- C --===//			//===-- lib/extendhfsf2.c - half -> single conversion -------------- C --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#define SRC_HALF			#define SRC_HALF
	#define DST_SINGLE			#define DST_SINGLE
	#include "fp_extend_impl.inc"			#include "fp_extend_impl.inc"

	// Use a forwarding definition and noinline to implement a poor man's alias,			// Use a forwarding definition and noinline to implement a poor man's alias,
	// as there isn't a good cross-platform way of defining one.			// as there isn't a good cross-platform way of defining one.
	COMPILER_RT_ABI NOINLINE float __extendhfsf2(uint16_t a) {			COMPILER_RT_ABI NOINLINE float __extendhfsf2(src_t a) {
	return __extendXfYf2__(a);			return __extendXfYf2__(a);
	}			}

	COMPILER_RT_ABI float __gnu_h2f_ieee(uint16_t a) { return __extendhfsf2(a); }			COMPILER_RT_ABI float __gnu_h2f_ieee(src_t a) { return __extendhfsf2(a); }

	#if defined(__ARM_EABI__)			#if defined(__ARM_EABI__)
	#if defined(COMPILER_RT_ARMHF_TARGET)			#if defined(COMPILER_RT_ARMHF_TARGET)
	AEABI_RTABI float __aeabi_h2f(uint16_t a) { return __extendhfsf2(a); }			AEABI_RTABI float __aeabi_h2f(src_t a) { return __extendhfsf2(a); }
	#else			#else
	COMPILER_RT_ALIAS(__extendhfsf2, __aeabi_h2f)			COMPILER_RT_ALIAS(__extendhfsf2, __aeabi_h2f)
	#endif			#endif
	#endif			#endif

compiler-rt/lib/builtins/extendhftf2.c

This file was added.

				//===-- lib/extendhftf2.c - half -> quad conversion ---------------- C --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is dual licensed under the MIT and the University of Illinois Open
				// Source Licenses. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#define QUAD_PRECISION
				#include "fp_lib.h"

				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT) && \
				defined(COMPILER_RT_HAS_FLOAT16)
				#define SRC_HALF
				#define DST_QUAD
				#include "fp_extend_impl.inc"

				COMPILER_RT_ABI long double __extendhftf2(src_t a) {
				return __extendXfYf2__(a);
				}

				#endif

compiler-rt/lib/builtins/fp_extend.h

[34 lines not shown; enclosing context: #else]
if (a & REP_C(0xffffffff00000000))		if (a & REP_C(0xffffffff00000000))
return __builtin_clz(a >> 32);		return __builtin_clz(a >> 32);
else		else
return 32 + __builtin_clz(a & REP_C(0xffffffff));		return 32 + __builtin_clz(a & REP_C(0xffffffff));
#endif		#endif
}		}

#elif defined SRC_HALF		#elif defined SRC_HALF
		#if defined COMPILER_RT_HAS_FLOAT16
		typedef _Float16 src_t;
		#else
typedef uint16_t src_t;		typedef uint16_t src_t;
		#endif
typedef uint16_t src_rep_t;		typedef uint16_t src_rep_t;
#define SRC_REP_C UINT16_C		#define SRC_REP_C UINT16_C
static const int srcSigBits = 10;		static const int srcSigBits = 10;
#define src_rep_t_clz __builtin_clz		#define src_rep_t_clz __builtin_clz

#else		#else
#error Source should be half, single, or double precision!		#error Source should be half, single, or double precision!
#endif // end source precision		#endif // end source precision
[44 lines not shown]

compiler-rt/lib/builtins/fp_trunc.h

	[44 lines not shown]

	#elif defined DST_SINGLE			#elif defined DST_SINGLE
	typedef float dst_t;			typedef float dst_t;
	typedef uint32_t dst_rep_t;			typedef uint32_t dst_rep_t;
	#define DST_REP_C UINT32_C			#define DST_REP_C UINT32_C
	static const int dstSigBits = 23;			static const int dstSigBits = 23;

	#elif defined DST_HALF			#elif defined DST_HALF
				#if defined COMPILER_RT_HAS_FLOAT16
				typedef _Float16 dst_t;
				#else
	typedef uint16_t dst_t;			typedef uint16_t dst_t;
				#endif
	typedef uint16_t dst_rep_t;			typedef uint16_t dst_rep_t;
	#define DST_REP_C UINT16_C			#define DST_REP_C UINT16_C
	static const int dstSigBits = 10;			static const int dstSigBits = 10;

	#else			#else
	#error Destination should be single precision or double precision!			#error Destination should be single precision or double precision!
	#endif // end destination precision			#endif // end destination precision

	[20 lines not shown]

compiler-rt/lib/builtins/truncdfhf2.c

	//===-- lib/truncdfhf2.c - double -> half conversion --------------- C --===//			//===-- lib/truncdfhf2.c - double -> half conversion --------------- C --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#define SRC_DOUBLE			#define SRC_DOUBLE
	#define DST_HALF			#define DST_HALF
	#include "fp_trunc_impl.inc"			#include "fp_trunc_impl.inc"

	COMPILER_RT_ABI uint16_t __truncdfhf2(double a) { return __truncXfYf2__(a); }			COMPILER_RT_ABI dst_t __truncdfhf2(double a) { return __truncXfYf2__(a); }

	#if defined(__ARM_EABI__)			#if defined(__ARM_EABI__)
	#if defined(COMPILER_RT_ARMHF_TARGET)			#if defined(COMPILER_RT_ARMHF_TARGET)
	AEABI_RTABI uint16_t __aeabi_d2h(double a) { return __truncdfhf2(a); }			AEABI_RTABI dst_t __aeabi_d2h(double a) { return __truncdfhf2(a); }
	#else			#else
	COMPILER_RT_ALIAS(__truncdfhf2, __aeabi_d2h)			COMPILER_RT_ALIAS(__truncdfhf2, __aeabi_d2h)
	#endif			#endif
	#endif			#endif

compiler-rt/lib/builtins/truncsfhf2.c

	//===-- lib/truncsfhf2.c - single -> half conversion --------------- C --===//			//===-- lib/truncsfhf2.c - single -> half conversion --------------- C --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#define SRC_SINGLE			#define SRC_SINGLE
	#define DST_HALF			#define DST_HALF
	#include "fp_trunc_impl.inc"			#include "fp_trunc_impl.inc"

	// Use a forwarding definition and noinline to implement a poor man's alias,			// Use a forwarding definition and noinline to implement a poor man's alias,
	// as there isn't a good cross-platform way of defining one.			// as there isn't a good cross-platform way of defining one.
	COMPILER_RT_ABI NOINLINE uint16_t __truncsfhf2(float a) {			COMPILER_RT_ABI NOINLINE dst_t __truncsfhf2(float a) {
	return __truncXfYf2__(a);			return __truncXfYf2__(a);
	}			}

	COMPILER_RT_ABI uint16_t __gnu_f2h_ieee(float a) { return __truncsfhf2(a); }			COMPILER_RT_ABI dst_t __gnu_f2h_ieee(float a) { return __truncsfhf2(a); }

	#if defined(__ARM_EABI__)			#if defined(__ARM_EABI__)
	#if defined(COMPILER_RT_ARMHF_TARGET)			#if defined(COMPILER_RT_ARMHF_TARGET)
	AEABI_RTABI uint16_t __aeabi_f2h(float a) { return __truncsfhf2(a); }			AEABI_RTABI dst_t __aeabi_f2h(float a) { return __truncsfhf2(a); }
	#else			#else
	COMPILER_RT_ALIAS(__truncsfhf2, __aeabi_f2h)			COMPILER_RT_ALIAS(__truncsfhf2, __aeabi_f2h)
	#endif			#endif
	#endif			#endif

compiler-rt/lib/builtins/trunctfhf2.c

This file was added.

				//===-- lib/trunctfhf2.c - quad -> half conversion ----------------- C --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is dual licensed under the MIT and the University of Illinois Open
				// Source Licenses. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#define QUAD_PRECISION
				#include "fp_lib.h"

				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT) && \
				defined(COMPILER_RT_HAS_FLOAT16)
				#define SRC_QUAD
				#define DST_HALF
				#include "fp_trunc_impl.inc"

				COMPILER_RT_ABI dst_t __trunctfhf2(long double a) { return __truncXfYf2__(a); }

				#endif

compiler-rt/test/builtins/CMakeLists.txt

[38 lines not shown; enclosing context: if (${arch} STREQUAL "armhf")]
string(REPLACE ";" " " BUILTINS_TEST_TARGET_CFLAGS "${BUILTINS_TEST_TARGET_CFLAGS}")		string(REPLACE ";" " " BUILTINS_TEST_TARGET_CFLAGS "${BUILTINS_TEST_TARGET_CFLAGS}")
endif()		endif()

if (${arch} STREQUAL "riscv32")		if (${arch} STREQUAL "riscv32")
list(APPEND BUILTINS_TEST_TARGET_CFLAGS -fforce-enable-int128)		list(APPEND BUILTINS_TEST_TARGET_CFLAGS -fforce-enable-int128)
string(REPLACE ";" " " BUILTINS_TEST_TARGET_CFLAGS "${BUILTINS_TEST_TARGET_CFLAGS}")		string(REPLACE ";" " " BUILTINS_TEST_TARGET_CFLAGS "${BUILTINS_TEST_TARGET_CFLAGS}")
endif()		endif()

		append_list_if(COMPILER_RT_HAS_FLOAT16 -DCOMPILER_RT_HAS_FLOAT16 BUILTINS_TEST_TARGET_CFLAGS)

# Compute builtins available in library and add them as lit features.		# Compute builtins available in library and add them as lit features.
if(APPLE)		if(APPLE)
# TODO: Support other Apple platforms.		# TODO: Support other Apple platforms.
set(BUILTIN_LIB_TARGET_NAME "clang_rt.builtins_${arch}_osx")		set(BUILTIN_LIB_TARGET_NAME "clang_rt.builtins_${arch}_osx")
else()		else()
set(BUILTIN_LIB_TARGET_NAME "clang_rt.builtins-${arch}")		set(BUILTIN_LIB_TARGET_NAME "clang_rt.builtins-${arch}")
endif()		endif()
if (NOT TARGET "${BUILTIN_LIB_TARGET_NAME}")		if (NOT TARGET "${BUILTIN_LIB_TARGET_NAME}")
[34 lines not shown]

compiler-rt/test/builtins/Unit/extendhfsf2_test.c

	// RUN: %clang_builtins %s %librt -o %t && %run %t			// RUN: %clang_builtins %s %librt -o %t && %run %t
	// REQUIRES: librt_has_extendhfsf2			// REQUIRES: librt_has_extendhfsf2

	#include <stdio.h>			#include <stdio.h>

	#include "fp_test.h"			#include "fp_test.h"

	float __extendhfsf2(uint16_t a);			float __extendhfsf2(TYPE_FP16 a);

	int test__extendhfsf2(uint16_t a, float expected)			int test__extendhfsf2(TYPE_FP16 a, float expected)
	{			{
	float x = __extendhfsf2(a);			float x = __extendhfsf2(a);
	int ret = compareResultH(x, expected);			int ret = compareResultF(x, toRep32(expected));

	if (ret){			if (ret){
	printf("error in test__extendhfsf2(%#.4x) = %f, "			printf("error in test__extendhfsf2(%#.4x) = %f, "
	"expected %f\n", a, x, expected);			"expected %f\n", fromRep16(a), x, expected);
	}			}
	return ret;			return ret;
	}			}

	char assumption_1[sizeof(__fp16) * CHAR_BIT == 16] = {0};			char assumption_1[sizeof(TYPE_FP16) * CHAR_BIT == 16] = {0};

	int main()			int main()
	{			{
	// qNaN			// qNaN
	if (test__extendhfsf2(UINT16_C(0x7e00),			if (test__extendhfsf2(fromRep16(0x7e00),
	makeQNaN32()))			makeQNaN32()))
	return 1;			return 1;
	// NaN			// NaN
	if (test__extendhfsf2(UINT16_C(0x7e00),			if (test__extendhfsf2(fromRep16(0x7d00),
	makeNaN32(UINT32_C(0x8000))))			makeQNaN32()))
	return 1;			return 1;
	// inf			// inf
	if (test__extendhfsf2(UINT16_C(0x7c00),			if (test__extendhfsf2(fromRep16(0x7c00),
	makeInf32()))			makeInf32()))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0xfc00),			if (test__extendhfsf2(fromRep16(0xfc00),
	-makeInf32()))			-makeInf32()))
	return 1;			return 1;
	// zero			// zero
	if (test__extendhfsf2(UINT16_C(0x0),			if (test__extendhfsf2(fromRep16(0x0),
	0.0f))			0.0f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x8000),			if (test__extendhfsf2(fromRep16(0x8000),
	-0.0f))			-0.0f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x4248),			if (test__extendhfsf2(fromRep16(0x4248),
	3.1415926535f))			3.140625f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0xc248),			if (test__extendhfsf2(fromRep16(0xc248),
	-3.1415926535f))			-3.140625f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x7c00),			if (test__extendhfsf2(fromRep16(0x7c00),
	0x1.987124876876324p+100f))			makeInf32()))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x6e62),			if (test__extendhfsf2(fromRep16(0x6e62),
	0x1.988p+12f))			0x1.988p+12f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x3c00),			if (test__extendhfsf2(fromRep16(0x3c00),
	0x1.0p+0f))			0x1.0p+0f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x0400),			if (test__extendhfsf2(fromRep16(0x0400),
	0x1.0p-14f))			0x1.0p-14f))
	return 1;			return 1;
	// denormal			// denormal
	if (test__extendhfsf2(UINT16_C(0x0010),			if (test__extendhfsf2(fromRep16(0x0010),
	0x1.0p-20f))			0x1.0p-20f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x0001),			if (test__extendhfsf2(fromRep16(0x0001),
	0x1.0p-24f))			0x1.0p-24f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x8001),			if (test__extendhfsf2(fromRep16(0x8001),
	-0x1.0p-24f))			-0x1.0p-24f))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x0001),			if (test__extendhfsf2(fromRep16(0x0001),
	0x1.5p-25f))			0x1p-24))
	return 1;			return 1;
	// and back to zero			// and back to zero
	if (test__extendhfsf2(UINT16_C(0x0000),			if (test__extendhfsf2(fromRep16(0x0000),
	0x1.0p-25f))			0x0p+0))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0x8000),			if (test__extendhfsf2(fromRep16(0x8000),
	-0x1.0p-25f))			-0x0p+0))
	return 1;			return 1;
	// max (precise)			// max (precise)
	if (test__extendhfsf2(UINT16_C(0x7bff),			if (test__extendhfsf2(fromRep16(0x7bff),
	65504.0f))			65504.0f))
	return 1;			return 1;
	// max (rounded)			// max (rounded)
	if (test__extendhfsf2(UINT16_C(0x7bff),			if (test__extendhfsf2(fromRep16(0x7bff),
	65504.0f))			65504.0f))
	return 1;			return 1;
	// max (to +inf)			// max (to +inf)
	if (test__extendhfsf2(UINT16_C(0x7c00),			if (test__extendhfsf2(fromRep16(0x7c00),
	makeInf32()))			makeInf32()))
	return 1;			return 1;
	if (test__extendhfsf2(UINT16_C(0xfc00),			if (test__extendhfsf2(fromRep16(0xfc00),
	-makeInf32()))			-makeInf32()))
	return 1;			return 1;
	return 0;			return 0;
	}			}
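The expected values on the new side of this test follow directly from the binary16 encoding: 0x4248 decodes to exactly 3.140625, which is why the updated test no longer compares against 3.1415926535f. A minimal sketch of that decoding is below. The helper names `scale_pow2`, `half_bits_to_double`, and `check_decoding` are hypothetical and are not part of compiler-rt.

```c
#include <assert.h>
#include <math.h>   /* only for the NAN and INFINITY macros */
#include <stdint.h>

/* Hypothetical helper: multiply x by 2^e without calling into libm. */
static double scale_pow2(double x, int e) {
  for (; e > 0; --e) x *= 2.0;
  for (; e < 0; ++e) x *= 0.5;
  return x;
}

/* Hypothetical helper: decode IEEE 754 binary16 bits into a double. */
static double half_bits_to_double(uint16_t h) {
  int negative = (h >> 15) & 1;
  int exponent = (h >> 10) & 0x1f;
  int fraction = h & 0x3ff;
  double magnitude;

  if (exponent == 0x1f)        /* all-ones exponent: infinity or NaN */
    magnitude = fraction ? NAN : INFINITY;
  else if (exponent == 0)      /* zero or subnormal: fraction * 2^-24 */
    magnitude = scale_pow2((double)fraction, -24);
  else                         /* normal: (1024 + fraction) * 2^(exponent - 25) */
    magnitude = scale_pow2((double)(fraction | 0x400), exponent - 25);
  return negative ? -magnitude : magnitude;
}

/* Spot checks taken from the inputs used in the test above. */
static void check_decoding(void) {
  assert(half_bits_to_double(0x4248) == 3.140625);
  assert(half_bits_to_double(0x7bff) == 65504.0);
  assert(half_bits_to_double(0x0001) == 0x1.0p-24);
}
```

Because every binary16 value is exactly representable in float, an extend test can compare bit-for-bit against values such as 3.140625f with no rounding tolerance.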

compiler-rt/test/builtins/Unit/extendhftf2_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t
				// REQUIRES: librt_has_extendhftf2

				#include "int_lib.h"
				#include <stdio.h>

				#if __LDBL_MANT_DIG__ == 113 && defined (COMPILER_RT_HAS_FLOAT16)

				#include "fp_test.h"

				COMPILER_RT_ABI long double __extendhftf2(TYPE_FP16 a);

				int test__extendhftf2(TYPE_FP16 a, uint64_t expectedHi, uint64_t expectedLo)
				{
				long double x = __extendhftf2(a);
				int ret = compareResultLD(x, expectedHi, expectedLo);

				if (ret)
				{
				printf("error in test__extendhftf2(%#.4x) = %.20Lf, "
				"expected %.20Lf\n", toRep16(a), x,
				fromRep128(expectedHi, expectedLo));
				}
				return ret;
				}

				char assumption_1[sizeof(TYPE_FP16) * CHAR_BIT == 16] = {0};

				#endif

				int main()
				{
				#if __LDBL_MANT_DIG__ == 113 && defined (COMPILER_RT_HAS_FLOAT16)
				// qNaN
				if (test__extendhftf2(makeQNaN16(),
				UINT64_C(0x7fff800000000000),
				UINT64_C(0x0)))
				return 1;
				// NaN
				if (test__extendhftf2(makeNaN16(UINT16_C(0x0100)),
				UINT64_C(0x7fff400000000000),
				UINT64_C(0x0)))
				return 1;
				// inf
				if (test__extendhftf2(makeInf16(),
				UINT64_C(0x7fff000000000000),
				UINT64_C(0x0)))
				return 1;
				if (test__extendhftf2(-makeInf16(),
				UINT64_C(0xffff000000000000),
				UINT64_C(0x0)))
				return 1;
				// zero
				if (test__extendhftf2(fromRep16(0x0U),
				UINT64_C(0x0), UINT64_C(0x0)))
				return 1;
				if (test__extendhftf2(fromRep16(0x8000U),
				UINT64_C(0x8000000000000000),
				UINT64_C(0x0)))
				return 1;
				// denormal
				if (test__extendhftf2(fromRep16(0x0010U),
				UINT64_C(0x3feb000000000000),
				UINT64_C(0x0000000000000000)))
				return 1;
				if (test__extendhftf2(fromRep16(0x0001U),
				UINT64_C(0x3fe7000000000000),
				UINT64_C(0x0000000000000000)))
				return 1;
				if (test__extendhftf2(fromRep16(0x8001U),
				UINT64_C(0xbfe7000000000000),
				UINT64_C(0x0000000000000000)))
				return 1;

				// pi
				if (test__extendhftf2(fromRep16(0x4248U),
				UINT64_C(0x4000920000000000),
				UINT64_C(0x0000000000000000)))
				return 1;
				if (test__extendhftf2(fromRep16(0xc248U),
				UINT64_C(0xc000920000000000),
				UINT64_C(0x0000000000000000)))
				return 1;

				if (test__extendhftf2(fromRep16(0x508cU),
				UINT64_C(0x4004230000000000),
				UINT64_C(0x0)))
				return 1;
				if (test__extendhftf2(fromRep16(0x1bb7U),
				UINT64_C(0x3ff6edc000000000),
				UINT64_C(0x0)))
				return 1;
				#else
				printf("skipped\n");
				#endif
				return 0;
				}
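The `expectedHi` constants in this test can be derived by hand: the sign bit moves to bit 63, the exponent is rebiased from 15 to 16383, and the 10 fraction bits land at the top of the 48 fraction bits held in the high word. The low word is always zero. The sketch below assumes the IEEE 754 binary128 layout; `half_bits_to_quad_hi` and `check_quad_hi` are hypothetical names used only for illustration, and this is not the `__extendhftf2` implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: high 64 bits of the binary128 encoding of the
   binary16 value whose bit pattern is h. */
static uint64_t half_bits_to_quad_hi(uint16_t h) {
  uint64_t sign = (uint64_t)(h & 0x8000u) << 48;
  uint64_t exponent = (h >> 10) & 0x1fu;
  uint64_t fraction = h & 0x3ffu;

  if (exponent == 0x1fu)  /* infinity or NaN: keep the payload bits */
    return sign | (UINT64_C(0x7fff) << 48) | (fraction << 38);
  if (exponent == 0) {
    if (fraction == 0)
      return sign;        /* signed zero */
    /* Subnormal: normalize so the leading 1 becomes the implicit bit. */
    exponent = 16369;     /* 16383 - 14 */
    while ((fraction & 0x400u) == 0) {
      fraction <<= 1;
      --exponent;
    }
    fraction &= 0x3ffu;
  } else {
    exponent += 16368;    /* 16383 - 15 */
  }
  return sign | (exponent << 48) | (fraction << 38);
}

/* Spot checks against constants used in the test above. */
static void check_quad_hi(void) {
  assert(half_bits_to_quad_hi(0x4248) == UINT64_C(0x4000920000000000));
  assert(half_bits_to_quad_hi(0x0001) == UINT64_C(0x3fe7000000000000));
}
```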

compiler-rt/test/builtins/Unit/fp_test.h

#include <stdlib.h>		#include <stdlib.h>
#include <limits.h>		#include <limits.h>
#include <string.h>		#include <string.h>
#include <stdint.h>		#include <stdint.h>

		#ifdef COMPILER_RT_HAS_FLOAT16
		#define TYPE_FP16 _Float16
		#else
		#define TYPE_FP16 uint16_t
		#endif
		atrosinenko (Not Done): This could probably be moved to fp_lib.h (or to int_types.h; it looks like "int" means "internal" there), possibly switching to `typedef`s. That should make the other code more idiomatic: there would be single-line `typedef`s for src_t and dst_t in fp_extend.h and fp_trunc.h, just as for the other precisions, and the precision-converting libcalls (extend / trunc) would read more idiomatically as COMPILER_RT_ABI concrete_return_type __libcallXfYf2(concrete_argument_type) { ... }, with one of those types being the TYPE_FP16 alias instead of src_t / dst_t.
		zatrazz (author, Done): fp_lib.h currently defines only a single floating-point type (one must define either SINGLE_PRECISION, DOUBLE_PRECISION, or QUAD_PRECISION). For instance, extendsftf2.c defines QUAD_PRECISION and includes fp_lib.h. Moving the float16 definition to fp_lib.h would require adding the ability to define multiple types, similar to what is already done in fp_extend.h/fp_trunc.h. I am not sure that is better, since the second fp type is currently used only by the extend/trunc builtins. And I don't think int_types.h is the correct place: as far as I know, _Float16 is currently not defined on all architectures, so it is not really related to an integer type. Ideally I think compiler-rt should define all 'hf' builtins to use _Float16 and build them iff the ABI implements the type (meaning the compiler actually emits libcalls to it). For ABIs that support float16 operations without supporting the _Float16 type, for instance ARM, which supports __fp16, it would be better to move the libcall implementation to the arch-specific folders.
		atrosinenko (Done): Agreed, it looks like there is no more suitable place for that define right now.

enum EXPECTED_RESULT {		enum EXPECTED_RESULT {
LESS_0, LESS_EQUAL_0, EQUAL_0, GREATER_0, GREATER_EQUAL_0, NEQUAL_0		LESS_0, LESS_EQUAL_0, EQUAL_0, GREATER_0, GREATER_EQUAL_0, NEQUAL_0
};		};

static inline uint16_t fromRep16(uint16_t x)		static inline TYPE_FP16 fromRep16(uint16_t x)
{		{
		#ifdef COMPILER_RT_HAS_FLOAT16
		atrosinenko (Not Done): If I get it right, this could be performed unconditionally. On the other hand, the `#else` branch may be either a good illustration of the peculiarities of this function when no native `_Float16` is available, or some misleading stuff...
		zatrazz (author, Done): I don't have a strong preference either; it should be optimized away by the compiler for `!COMPILER_RT_HAS_FLOAT16` anyway. I will keep it, to make it explicit that for `COMPILER_RT_HAS_FLOAT16` `TYPE_FP16` is expected to be different from `uint16_t`.
		TYPE_FP16 ret;
		memcpy (&ret, &x, sizeof (ret));
		return ret;
		#else
return x;		return x;
		#endif
}		}
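The two review suggestions on fromRep16 above (a `typedef` instead of a macro, and an unconditional `memcpy`) can be sketched together as follows. This is only an illustration of the alternative that was discussed, not what the patch does: the alias `fp16_t` and the functions `from_rep16`, `to_rep16`, and `check_round_trip` are hypothetical names, and the patch itself keeps the `TYPE_FP16` macro and the `#ifdef` branches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alias; the patch uses the TYPE_FP16 macro instead. */
#ifdef COMPILER_RT_HAS_FLOAT16
typedef _Float16 fp16_t;
#else
typedef uint16_t fp16_t;
#endif

/* Reinterpret a 16-bit pattern as fp16_t. The memcpy is valid for both
   alias choices, so no #ifdef is needed inside the function. */
static inline fp16_t from_rep16(uint16_t x) {
  fp16_t ret;
  memcpy(&ret, &x, sizeof ret);
  return ret;
}

/* Inverse of from_rep16: recover the raw 16-bit pattern. */
static inline uint16_t to_rep16(fp16_t x) {
  uint16_t ret;
  memcpy(&ret, &x, sizeof ret);
  return ret;
}

/* The bit pattern should survive a round trip unchanged. */
static void check_round_trip(void) {
  assert(sizeof(fp16_t) == sizeof(uint16_t));
  assert(to_rep16(from_rep16(0x7bffu)) == 0x7bffu);
}
```

When `fp16_t` is `uint16_t`, the copy degenerates to a plain assignment, which matches the author's remark that the compiler should optimize it away.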

static inline float fromRep32(uint32_t x)		static inline float fromRep32(uint32_t x)
{		{
float ret;		float ret;
memcpy(&ret, &x, 4);		memcpy(&ret, &x, 4);
return ret;		return ret;
}		}
Show All 10 Lines
{		{
__uint128_t x = ((__uint128_t)hi << 64) + lo;		__uint128_t x = ((__uint128_t)hi << 64) + lo;
long double ret;		long double ret;
memcpy(&ret, &x, 16);		memcpy(&ret, &x, 16);
return ret;		return ret;
}		}
#endif		#endif

static inline uint16_t toRep16(uint16_t x)		static inline uint16_t toRep16(TYPE_FP16 x)
{		{
		#ifdef COMPILER_RT_HAS_FLOAT16
		uint16_t ret;
		memcpy (&ret, &x, sizeof (ret));
		return ret;
		#else
return x;		return x;
		#endif
}		}

static inline uint32_t toRep32(float x)		static inline uint32_t toRep32(float x)
{		{
uint32_t ret;		uint32_t ret;
memcpy(&ret, &x, 4);		memcpy(&ret, &x, 4);
return ret;		return ret;
}		}
Show All 9 Lines
static inline __uint128_t toRep128(long double x)		static inline __uint128_t toRep128(long double x)
{		{
__uint128_t ret;		__uint128_t ret;
memcpy(&ret, &x, 16);		memcpy(&ret, &x, 16);
return ret;		return ret;
}		}
#endif		#endif

static inline int compareResultH(uint16_t result,		static inline int compareResultH(TYPE_FP16 result,
uint16_t expected)		uint16_t expected)
		atrosinenko (Not Done): Shouldn't it have `(TYPE_FP16 result, uint16_t expected)` arguments like the other compareResultX functions (as if `fp_t result, rep_t expected`)? The same goes for `test_truncXfhf2(... , uint16_t expected)`: if I get it right, the common tradition for them is to express the reference value as the corresponding `uintXX_t`. Meanwhile, this is probably related to your changes to the second test case in extendhfsf2_test.c; the original sources look quite suspicious to me.
		zatrazz (author, Done): Yeah, I agree it should follow the other compare functions and use a uintXX_t as the second argument; I will fix it. The extendhfsf2_test.c issue is in fact that it uses the wrong comparison function, which does not make explicit what is currently being tested. test__extendhfsf2 has a 'float' as its second argument but calls compareResultH, which expects a uint16_t (without this change). This makes the compiler silently cast the value, so the expected results could be misleading. I think it should call compareResultF with toRep32 instead, to make explicit that it compares against the uint32_t representation of the float, the same as extenddftf2_test.c and extendsftf2_test.c already do. I will change it as well.
		atrosinenko (Not Done): It would probably be even more idiomatic to declare test__extendhfsf2 with the second argument being `uint32_t expected`, but this may cause quite large changes that are not strictly related to this patch.
		zatrazz (author, Done): I think it should be feasible; it would require redefining all the inputs in the test below, though.
{		{
uint16_t rep = toRep16(result);		uint16_t rep = toRep16(result);

if (rep == expected){		if (rep == expected){
return 0;		return 0;
}		}
// test other possible NaN representation(signal NaN)		// test other possible NaN representation(signal NaN)
else if (expected == 0x7e00U){		else if (expected == 0x7e00U){
▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	switch(expected){
case GREATER_0:		case GREATER_0:
return ">0";		return ">0";
default:		default:
return "";		return "";
}		}
return "";		return "";
}		}
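The compareResultH discussion above proposes checking the extend result against the `uint32_t` representation of the expected float, rather than letting a `float` be converted implicitly to `uint16_t`. A rough sketch of such a comparison is shown here. The names `to_rep32`, `compare_float_rep`, and `check_compare` are hypothetical, and the NaN handling is an assumption modelled on the "other possible NaN representation" branch visible in compareResultH, not a copy of the real compareResultF.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float. */
static uint32_t to_rep32(float x) {
  uint32_t ret;
  memcpy(&ret, &x, sizeof ret);
  return ret;
}

/* Returns 0 on match and 1 on mismatch, like the compareResultX helpers.
   Assumption: when the quiet NaN pattern is expected, any NaN is accepted. */
static int compare_float_rep(float result, uint32_t expected) {
  uint32_t rep = to_rep32(result);

  if (rep == expected)
    return 0;
  if (expected == 0x7fc00000u && (rep & 0x7fffffffu) > 0x7f800000u)
    return 0;
  return 1;
}

/* 0x40490000 is the float encoding of 3.140625, the exact value of the
   half 0x4248; the nearest float to pi has a different bit pattern. */
static void check_compare(void) {
  assert(compare_float_rep(3.140625f, 0x40490000u) == 0);
  assert(compare_float_rep(3.1415926535f, 0x40490000u) == 1);
}
```

Comparing representations this way also distinguishes +0.0f from -0.0f, which an ordinary `==` on floats would not.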

static inline uint16_t makeQNaN16(void)		static inline TYPE_FP16 makeQNaN16(void)
{		{
return fromRep16(0x7e00U);		return fromRep16(0x7e00U);
}		}

static inline float makeQNaN32(void)		static inline float makeQNaN32(void)
{		{
return fromRep32(0x7fc00000U);		return fromRep32(0x7fc00000U);
}		}

static inline double makeQNaN64(void)		static inline double makeQNaN64(void)
{		{
return fromRep64(0x7ff8000000000000UL);		return fromRep64(0x7ff8000000000000UL);
}		}

#if __LDBL_MANT_DIG__ == 113		#if __LDBL_MANT_DIG__ == 113
static inline long double makeQNaN128(void)		static inline long double makeQNaN128(void)
{		{
return fromRep128(0x7fff800000000000UL, 0x0UL);		return fromRep128(0x7fff800000000000UL, 0x0UL);
}		}
#endif		#endif

static inline uint16_t makeNaN16(uint16_t rand)		static inline TYPE_FP16 makeNaN16(uint16_t rand)
{		{
return fromRep16(0x7c00U | (rand & 0x7fffU));		return fromRep16(0x7c00U | (rand & 0x7fffU));
}		}

static inline float makeNaN32(uint32_t rand)		static inline float makeNaN32(uint32_t rand)
{		{
return fromRep32(0x7f800000U | (rand & 0x7fffffU));		return fromRep32(0x7f800000U | (rand & 0x7fffffU));
}		}

static inline double makeNaN64(uint64_t rand)		static inline double makeNaN64(uint64_t rand)
{		{
return fromRep64(0x7ff0000000000000UL | (rand & 0xfffffffffffffUL));		return fromRep64(0x7ff0000000000000UL | (rand & 0xfffffffffffffUL));
}		}

#if __LDBL_MANT_DIG__ == 113		#if __LDBL_MANT_DIG__ == 113
static inline long double makeNaN128(uint64_t rand)		static inline long double makeNaN128(uint64_t rand)
{		{
return fromRep128(0x7fff000000000000UL | (rand & 0xffffffffffffUL), 0x0UL);		return fromRep128(0x7fff000000000000UL | (rand & 0xffffffffffffUL), 0x0UL);
}		}
#endif		#endif

static inline uint16_t makeInf16(void)		static inline TYPE_FP16 makeInf16(void)
{		{
return fromRep16(0x7c00U);		return fromRep16(0x7c00U);
}		}

static inline float makeInf32(void)		static inline float makeInf32(void)
{		{
return fromRep32(0x7f800000U);		return fromRep32(0x7f800000U);
}		}
Show All 12 Lines

compiler-rt/test/builtins/Unit/truncdfhf2_test.c

	// RUN: %clang_builtins %s %librt -o %t && %run %t			// RUN: %clang_builtins %s %librt -o %t && %run %t
	// REQUIRES: librt_has_truncdfhf2			// REQUIRES: librt_has_truncdfhf2

	#include <stdio.h>			#include <stdio.h>

	#include "fp_test.h"			#include "fp_test.h"

	uint16_t __truncdfhf2(double a);			TYPE_FP16 __truncdfhf2(double a);

	int test__truncdfhf2(double a, uint16_t expected)			int test__truncdfhf2(double a, uint16_t expected)
	{			{
	uint16_t x = __truncdfhf2(a);			TYPE_FP16 x = __truncdfhf2(a);
	int ret = compareResultH(x, expected);			int ret = compareResultH(x, expected);

	if (ret){			if (ret){
	printf("error in test__truncdfhf2(%f) = %#.4x, "			printf("error in test__truncdfhf2(%lf) = %#.4x, "
	"expected %#.4x\n", a, x, fromRep16(expected));			"expected %#.4x\n", a, toRep16(x), expected);
	}			}
	return ret;			return ret;
	}			}

	char assumption_1[sizeof(__fp16) * CHAR_BIT == 16] = {0};			char assumption_1[sizeof(__fp16) * CHAR_BIT == 16] = {0};

	int main()			int main()
	{			{
	▲ Show 20 Lines • Show All 79 Lines • Show Last 20 Lines

compiler-rt/test/builtins/Unit/truncsfhf2_test.c

	// RUN: %clang_builtins %s %librt -o %t && %run %t			// RUN: %clang_builtins %s %librt -o %t && %run %t
	// REQUIRES: librt_has_truncsfhf2			// REQUIRES: librt_has_truncsfhf2

	#include <stdio.h>			#include <stdio.h>

	#include "fp_test.h"			#include "fp_test.h"

	uint16_t __truncsfhf2(float a);			TYPE_FP16 __truncsfhf2(float a);

	int test__truncsfhf2(float a, uint16_t expected)			int test__truncsfhf2(float a, uint16_t expected)
	{			{
	uint16_t x = __truncsfhf2(a);			TYPE_FP16 x = __truncsfhf2(a);
	int ret = compareResultH(x, expected);			int ret = compareResultH(x, expected);

	if (ret){			if (ret){
	printf("error in test__truncsfhf2(%f) = %#.4x, "			printf("error in test__truncsfhf2(%f) = %#.4x, "
	"expected %#.4x\n", a, x, fromRep16(expected));			"expected %#.4x\n", a, toRep16(x), expected);
	}			}
	return ret;			return ret;
	}			}

	char assumption_1[sizeof(__fp16) * CHAR_BIT == 16] = {0};			char assumption_1[sizeof(__fp16) * CHAR_BIT == 16] = {0};

	int main()			int main()
	{			{
	▲ Show 20 Lines • Show All 79 Lines • Show Last 20 Lines

compiler-rt/test/builtins/Unit/trunctfhf2_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t
				// REQUIRES: librt_has_trunctfhf2

				#include "int_lib.h"
				#include <stdio.h>

				#if __LDBL_MANT_DIG__ == 113 && defined (COMPILER_RT_HAS_FLOAT16)

				#include "fp_test.h"

				TYPE_FP16 __trunctfhf2(long double a);

				int test__trunctfhf2(long double a, uint16_t expected)
				{
				TYPE_FP16 x = __trunctfhf2(a);
				MaskRay (Not Done): Use 2-space indentation. Place `{` at the end of the line. There is no need to follow the style of some violating files in this directory.
				int ret = compareResultH(x, expected);

				if (ret)
				{
				printf("error in test__trunctfhf2(%.20Lf) = %#.4x, "
				"expected %#.4x\n", a, toRep16(x), expected);
				}
				return ret;
				}

				char assumption_1[sizeof(TYPE_FP16) * CHAR_BIT == 16] = {0};

				#endif

				int main()
				{
				#if __LDBL_MANT_DIG__ == 113 && defined (COMPILER_RT_HAS_FLOAT16)
				// qNaN
				if (test__trunctfhf2(makeQNaN128(),
				UINT16_C(0x7e00)))
				return 1;
				// NaN
				if (test__trunctfhf2(makeNaN128(UINT64_C(0x810000000000)),
				UINT16_C(0x7e00)))
				return 1;
				// inf
				if (test__trunctfhf2(makeInf128(),
				UINT16_C(0x7c00)))
				return 1;
				if (test__trunctfhf2(-makeInf128(),
				UINT16_C(0xfc00)))
				return 1;
				// zero
				if (test__trunctfhf2(0.0L, UINT16_C(0x0)))
				return 1;
				if (test__trunctfhf2(-0.0L, UINT16_C(0x8000)))
				return 1;

				if (test__trunctfhf2(3.1415926535L,
				UINT16_C(0x4248)))
				return 1;
				if (test__trunctfhf2(-3.1415926535L,
				UINT16_C(0xc248)))
				return 1;
				if (test__trunctfhf2(0x1.987124876876324p+100L,
				UINT16_C(0x7c00)))
				return 1;
				if (test__trunctfhf2(0x1.987124876876324p+12L,
				UINT16_C(0x6e62)))
				return 1;
				if (test__trunctfhf2(0x1.0p+0L,
				UINT16_C(0x3c00)))
				return 1;
				if (test__trunctfhf2(0x1.0p-14L,
				UINT16_C(0x0400)))
				return 1;
				// denormal
				if (test__trunctfhf2(0x1.0p-20L,
				UINT16_C(0x0010)))
				return 1;
				if (test__trunctfhf2(0x1.0p-24L,
				UINT16_C(0x0001)))
				return 1;
				if (test__trunctfhf2(-0x1.0p-24L,
				UINT16_C(0x8001)))
				return 1;
				if (test__trunctfhf2(0x1.5p-25L,
				UINT16_C(0x0001)))
				return 1;
				// and back to zero
				if (test__trunctfhf2(0x1.0p-25L,
				UINT16_C(0x0000)))
				return 1;
				if (test__trunctfhf2(-0x1.0p-25L,
				UINT16_C(0x8000)))
				return 1;
				// max (precise)
				if (test__trunctfhf2(65504.0L,
				UINT16_C(0x7bff)))
				return 1;
				// max (rounded)
				if (test__trunctfhf2(65519.0L,
				UINT16_C(0x7bff)))
				return 1;
				// max (to +inf)
				if (test__trunctfhf2(65520.0L,
				UINT16_C(0x7c00)))
				return 1;
				if (test__trunctfhf2(65536.0L,
				UINT16_C(0x7c00)))
				return 1;
				if (test__trunctfhf2(-65520.0L,
				UINT16_C(0xfc00)))
				return 1;

				if (test__trunctfhf2(0x1.23a2abb4a2ddee355f36789abcdep+5L,
				UINT16_C(0x508f)))
				return 1;
				if (test__trunctfhf2(0x1.e3d3c45bd3abfd98b76a54cc321fp-9L,
				UINT16_C(0x1b8f)))
				return 1;
				if (test__trunctfhf2(0x1.234eebb5faa678f4488693abcdefp+453L,
				UINT16_C(0x7c00)))
				return 1;
				if (test__trunctfhf2(0x1.edcba9bb8c76a5a43dd21f334634p-43L,
				UINT16_C(0x0)))
				return 1;
				#else
				printf("skipped\n");
				#endif
				return 0;
				}
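The boundary cases in this test (65519 stays at 0x7bff, 65520 becomes infinity, 0x1.5p-25 rounds up to the smallest subnormal, 0x1.0p-25 ties to zero) are all consequences of round-to-nearest-even. The sketch below reproduces them for a `float` input so that it runs on targets without a 128-bit `long double`. It is a simplified illustration under that assumption, not the `__trunctfhf2` implementation, and `float_to_bits`, `float_to_half_bits`, and `check_rounding` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float. */
static uint32_t float_to_bits(float x) {
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits);
  return bits;
}

/* Hypothetical helper: narrow a float to binary16 bits, rounding to
   nearest with ties to even. NaN payloads are dropped in this sketch. */
static uint16_t float_to_half_bits(float x) {
  uint32_t bits = float_to_bits(x);
  uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
  uint32_t exponent = (bits >> 23) & 0xffu;
  uint32_t mantissa = bits & 0x7fffffu;
  uint32_t quotient, remainder, halfway;

  if (exponent == 0xffu)   /* infinity or NaN */
    return (uint16_t)(sign | (mantissa ? 0x7e00u : 0x7c00u));
  if (exponent >= 143u)    /* magnitude >= 2^16: overflow to infinity */
    return (uint16_t)(sign | 0x7c00u);
  if (exponent >= 113u) {  /* result is a normal half (before rounding) */
    quotient = ((exponent - 112u) << 10) | (mantissa >> 13);
    remainder = mantissa & 0x1fffu;
    halfway = 0x1000u;
  } else {                 /* result is subnormal or zero */
    uint32_t shift = 126u - exponent;
    uint32_t significand = mantissa | 0x800000u;
    if (shift > 24u)       /* below half of the smallest subnormal */
      return sign;
    quotient = significand >> shift;
    remainder = significand & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1u);
  }
  /* Round up when above the midpoint, or at the midpoint with an odd
     result. A carry out of the fraction bumps the exponent, which also
     turns 0x7bff into 0x7c00 (infinity). */
  if (remainder > halfway || (remainder == halfway && (quotient & 1u)))
    ++quotient;
  return (uint16_t)(sign | quotient);
}

/* Spot checks mirroring the "max" cases in the test above. */
static void check_rounding(void) {
  assert(float_to_half_bits(65519.0f) == 0x7bff);
  assert(float_to_half_bits(65520.0f) == 0x7c00);
}
```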

llvm/include/llvm/IR/RuntimeLibcalls.def

	Show First 20 Lines • Show All 280 Lines • ▼ Show 20 Lines
	HANDLE_LIBCALL(LLRINT_PPCF128, "llrintl")			HANDLE_LIBCALL(LLRINT_PPCF128, "llrintl")

	// Conversion			// Conversion
	HANDLE_LIBCALL(FPEXT_F32_PPCF128, "__gcc_stoq")			HANDLE_LIBCALL(FPEXT_F32_PPCF128, "__gcc_stoq")
	HANDLE_LIBCALL(FPEXT_F64_PPCF128, "__gcc_dtoq")			HANDLE_LIBCALL(FPEXT_F64_PPCF128, "__gcc_dtoq")
	HANDLE_LIBCALL(FPEXT_F80_F128, "__extendxftf2")			HANDLE_LIBCALL(FPEXT_F80_F128, "__extendxftf2")
	HANDLE_LIBCALL(FPEXT_F64_F128, "__extenddftf2")			HANDLE_LIBCALL(FPEXT_F64_F128, "__extenddftf2")
	HANDLE_LIBCALL(FPEXT_F32_F128, "__extendsftf2")			HANDLE_LIBCALL(FPEXT_F32_F128, "__extendsftf2")
				HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")
	HANDLE_LIBCALL(FPEXT_F32_F64, "__extendsfdf2")			HANDLE_LIBCALL(FPEXT_F32_F64, "__extendsfdf2")
	HANDLE_LIBCALL(FPEXT_F16_F32, "__gnu_h2f_ieee")			HANDLE_LIBCALL(FPEXT_F16_F32, "__gnu_h2f_ieee")
	HANDLE_LIBCALL(FPROUND_F32_F16, "__gnu_f2h_ieee")			HANDLE_LIBCALL(FPROUND_F32_F16, "__gnu_f2h_ieee")
	HANDLE_LIBCALL(FPROUND_F64_F16, "__truncdfhf2")			HANDLE_LIBCALL(FPROUND_F64_F16, "__truncdfhf2")
	HANDLE_LIBCALL(FPROUND_F80_F16, "__truncxfhf2")			HANDLE_LIBCALL(FPROUND_F80_F16, "__truncxfhf2")
	HANDLE_LIBCALL(FPROUND_F128_F16, "__trunctfhf2")			HANDLE_LIBCALL(FPROUND_F128_F16, "__trunctfhf2")
	HANDLE_LIBCALL(FPROUND_PPCF128_F16, "__trunctfhf2")			HANDLE_LIBCALL(FPROUND_PPCF128_F16, "__trunctfhf2")
	HANDLE_LIBCALL(FPROUND_F64_F32, "__truncdfsf2")			HANDLE_LIBCALL(FPROUND_F64_F32, "__truncdfsf2")
	▲ Show 20 Lines • Show All 262 Lines • Show Last 20 Lines

llvm/lib/CodeGen/TargetLoweringBase.cpp

	Show First 20 Lines • Show All 218 Lines • ▼ Show 20 Lines
	}			}

	/// getFPEXT - Return the FPEXT__ value for the given types, or			/// getFPEXT - Return the FPEXT__ value for the given types, or
	/// UNKNOWN_LIBCALL if there is none.			/// UNKNOWN_LIBCALL if there is none.
	RTLIB::Libcall RTLIB::getFPEXT(EVT OpVT, EVT RetVT) {			RTLIB::Libcall RTLIB::getFPEXT(EVT OpVT, EVT RetVT) {
	if (OpVT == MVT::f16) {			if (OpVT == MVT::f16) {
	if (RetVT == MVT::f32)			if (RetVT == MVT::f32)
	return FPEXT_F16_F32;			return FPEXT_F16_F32;
				if (RetVT == MVT::f128)
				return FPEXT_F16_F128;
	} else if (OpVT == MVT::f32) {			} else if (OpVT == MVT::f32) {
	if (RetVT == MVT::f64)			if (RetVT == MVT::f64)
	return FPEXT_F32_F64;			return FPEXT_F32_F64;
	if (RetVT == MVT::f128)			if (RetVT == MVT::f128)
	return FPEXT_F32_F128;			return FPEXT_F32_F128;
	if (RetVT == MVT::ppcf128)			if (RetVT == MVT::ppcf128)
	return FPEXT_F32_PPCF128;			return FPEXT_F32_PPCF128;
	} else if (OpVT == MVT::f64) {			} else if (OpVT == MVT::f64) {
	▲ Show 20 Lines • Show All 1,963 Lines • Show Last 20 Lines
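Taken together, the RuntimeLibcalls.def entry and the new branch in getFPEXT map an (operand type, result type) pair to a runtime symbol name. The C sketch below models only that mapping, for the extension pairs visible in the excerpts above. The enum `fp_kind` and the functions `fpext_libcall_name` and `check_fpext_names` are hypothetical, and pairs not shown in the excerpts simply return NULL here, standing in for UNKNOWN_LIBCALL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the MVT floating-point types used above. */
enum fp_kind { FP_F16, FP_F32, FP_F64, FP_F128 };

/* Hypothetical lookup: symbol name for extending `from` to `to`,
   or NULL when the excerpts above show no entry for the pair. */
static const char *fpext_libcall_name(enum fp_kind from, enum fp_kind to) {
  if (from == FP_F16) {
    if (to == FP_F32)
      return "__gnu_h2f_ieee";
    if (to == FP_F128)
      return "__extendhftf2"; /* the pair added by this patch */
  } else if (from == FP_F32) {
    if (to == FP_F64)
      return "__extendsfdf2";
    if (to == FP_F128)
      return "__extendsftf2";
  } else if (from == FP_F64) {
    if (to == FP_F128)
      return "__extenddftf2";
  }
  return NULL;
}

/* The half to quad pair resolves to the new builtin. */
static void check_fpext_names(void) {
  assert(strcmp(fpext_libcall_name(FP_F16, FP_F128), "__extendhftf2") == 0);
}
```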

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp


Show First 20 Lines • Show All 354 Lines • ▼ Show 20 Lines	AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i128, Custom);		setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i128, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);		setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::UINT_TO_FP, MVT::i128, Custom);		setOperationAction(ISD::UINT_TO_FP, MVT::i128, Custom);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i32, Custom);		setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i32, Custom);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Custom);		setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i128, Custom);		setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i128, Custom);
		setOperationAction(ISD::FP_ROUND, MVT::f16, Custom);
setOperationAction(ISD::FP_ROUND, MVT::f32, Custom);		setOperationAction(ISD::FP_ROUND, MVT::f32, Custom);
setOperationAction(ISD::FP_ROUND, MVT::f64, Custom);		setOperationAction(ISD::FP_ROUND, MVT::f64, Custom);
		setOperationAction(ISD::STRICT_FP_ROUND, MVT::f16, Custom);
setOperationAction(ISD::STRICT_FP_ROUND, MVT::f32, Custom);		setOperationAction(ISD::STRICT_FP_ROUND, MVT::f32, Custom);
setOperationAction(ISD::STRICT_FP_ROUND, MVT::f64, Custom);		setOperationAction(ISD::STRICT_FP_ROUND, MVT::f64, Custom);

// Variable arguments.		// Variable arguments.
setOperationAction(ISD::VASTART, MVT::Other, Custom);		setOperationAction(ISD::VASTART, MVT::Other, Custom);
setOperationAction(ISD::VAARG, MVT::Other, Custom);		setOperationAction(ISD::VAARG, MVT::Other, Custom);
setOperationAction(ISD::VACOPY, MVT::Other, Custom);		setOperationAction(ISD::VACOPY, MVT::Other, Custom);
setOperationAction(ISD::VAEND, MVT::Other, Expand);		setOperationAction(ISD::VAEND, MVT::Other, Expand);
▲ Show 20 Lines • Show All 15,226 Lines • Show Last 20 Lines

llvm/test/CodeGen/AArch64/arm64-fp128.ll

	Show First 20 Lines • Show All 213 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: %bb.			; CHECK-NEXT: %bb.
	; CHECK-NEXT: mov v[[VAL:[0-9]+]].16b, v0.16b			; CHECK-NEXT: mov v[[VAL:[0-9]+]].16b, v0.16b
	; CHECK-NEXT: [[IFFALSE]]:			; CHECK-NEXT: [[IFFALSE]]:
	; CHECK: str q[[VAL]], [{{x[0-9]+}}, :lo12:lhs]			; CHECK: str q[[VAL]], [{{x[0-9]+}}, :lo12:lhs]
	ret void			ret void
	; CHECK: ret			; CHECK: ret
	}			}

				@varhalf = global half 0.0, align 2
	@varfloat = global float 0.0, align 4			@varfloat = global float 0.0, align 4
	@vardouble = global double 0.0, align 8			@vardouble = global double 0.0, align 8

	define void @test_round() {			define void @test_round() {
	; CHECK-LABEL: test_round:			; CHECK-LABEL: test_round:

	%val = load fp128, fp128* @lhs, align 16			%val = load fp128, fp128* @lhs, align 16

				%half = fptrunc fp128 %val to half
				store half %half, half* @varhalf, align 2
				; CHECK: ldr q0, [{{x[0-9]+}}, :lo12:lhs]
				; CHECK: bl __trunctfhf2
				; CHECK: str h0, [{{x[0-9]+}}, :lo12:varhalf]

	%float = fptrunc fp128 %val to float			%float = fptrunc fp128 %val to float
	store float %float, float* @varfloat, align 4			store float %float, float* @varfloat, align 4
	; CHECK: bl __trunctfsf2			; CHECK: bl __trunctfsf2
	; CHECK: str s0, [{{x[0-9]+}}, :lo12:varfloat]			; CHECK: str s0, [{{x[0-9]+}}, :lo12:varfloat]

	%double = fptrunc fp128 %val to double			%double = fptrunc fp128 %val to double
	store double %double, double* @vardouble, align 8			store double %double, double* @vardouble, align 8
	; CHECK: bl __trunctfdf2			; CHECK: bl __trunctfdf2
	; CHECK: str d0, [{{x[0-9]+}}, :lo12:vardouble]			; CHECK: str d0, [{{x[0-9]+}}, :lo12:vardouble]

	ret void			ret void
	}			}

	define void @test_extend() {			define void @test_extend() {
	; CHECK-LABEL: test_extend:			; CHECK-LABEL: test_extend:

	%val = load fp128, fp128* @lhs, align 16			%val = load fp128, fp128* @lhs, align 16

				%half = load half, half* @varhalf
				%fromhalf = fpext half %half to fp128
				store volatile fp128 %fromhalf, fp128* @lhs, align 16
				; CHECK: ldr h0, [{{x[0-9]+}}, :lo12:varhalf]
				; CHECK: bl __extendhftf2
				; CHECK: str q0, [{{x[0-9]+}}, :lo12:lhs]

	%float = load float, float* @varfloat			%float = load float, float* @varfloat
	%fromfloat = fpext float %float to fp128			%fromfloat = fpext float %float to fp128
	store volatile fp128 %fromfloat, fp128* @lhs, align 16			store volatile fp128 %fromfloat, fp128* @lhs, align 16
	; CHECK: bl __extendsftf2			; CHECK: bl __extendsftf2
	; CHECK: str q0, [{{x[0-9]+}}, :lo12:lhs]			; CHECK: str q0, [{{x[0-9]+}}, :lo12:lhs]

	%double = load double, double* @vardouble			%double = load double, double* @vardouble
	%fromdouble = fpext double %double to fp128			%fromdouble = fpext double %double to fp128
	Show All 25 Lines

Revision Contents

Diff 287488
