This is an archive of the discontinued LLVM Phabricator instance.

Differential D113036

[libc] refactor atof string parsing
ClosedPublic

Authored by michaelrj on Nov 2 2021, 11:16 AM.

Details

Reviewers

sivachandra
lntue

Commits

rG8298424cae9b: [libc] refactor atof string parsing

Summary

Split the code for parsing hexadecimal floating point numbers from the
code for parsing the decimal floating point numbers so that the parsing
can be faster for both of them.

This decreases the time for the benchmark in release mode by about 15%,
which noticeably beats GLibc.

Old version: 2.299s
New version: 1.893s
GLibc: 2.133s

Tests run by running the following command 10 times for each version:
time ~/llvm-project/build/bin/libc_str_to_float_comparison_test ~/parse-number-fxx-test-data/data/*

The parse-number-fxx-test-data repository is here:
https://github.com/nigeltao/parse-number-fxx-test-data/tree/fe94de252c691900982050c8e7c503d1efd1299a

It's important to build llvm-libc in Release mode for accurate
performance comparisons against glibc (set -DCMAKE_BUILD_TYPE=Release in
your cmake).
You also have to build the libc_str_to_float_comparison_test target.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

michaelrj created this revision. Nov 2 2021, 11:16 AM

michaelrj requested review of this revision. Nov 2 2021, 11:16 AM

Harbormaster completed remote builds in B132016: Diff 384168. Nov 2 2021, 11:36 AM

add __builtin_clz
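As a rough illustration of what this update is about, the sketch below contrasts a portable leading-zero count with versions backed by the GCC/Clang builtins `__builtin_clz` and `__builtin_clzll`. The function names `leadingZeroesPortable`, `leadingZeroes32` and `leadingZeroes64` are made up for this example and are not the names used in the patch; the portable loop is also a simpler linear scan, not the binary search in str_to_float.h.

```cpp
#include <cstdint>

// Portable fallback (hypothetical name): scan from the top bit downwards and
// count how many bits are clear before the first set bit.
template <class T> inline uint32_t leadingZeroesPortable(T value) {
  constexpr uint32_t bitsInT = sizeof(T) * 8;
  uint32_t count = 0;
  T mask = static_cast<T>(T(1) << (bitsInT - 1));
  while (mask != 0 && (value & mask) == 0) {
    ++count;
    mask = static_cast<T>(mask >> 1);
  }
  return count;
}

// Builtin-backed versions (hypothetical names). The builtins are specific to
// GCC and Clang, and their result is undefined for an input of 0, so zero is
// handled explicitly before calling them.
inline uint32_t leadingZeroes32(uint32_t value) {
  return value == 0 ? 32 : static_cast<uint32_t>(__builtin_clz(value));
}

inline uint32_t leadingZeroes64(uint64_t value) {
  return value == 0 ? 64 : static_cast<uint32_t>(__builtin_clzll(value));
}
```

Both approaches should agree on every input; the builtin versions typically compile down to a single instruction on targets that have one.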

Overall the patch looks good to me. I can see where the possible rounding errors come from, and we can definitely improve it to support parsing strings longer than what fits into the integer types. But we can leave those for follow-ups.

libc/src/__support/str_to_float.h
468	BITS_IN_BITSTYPE is only used in this expression, and this part is also constexpr. So maybe merge this into the BITS_IN_BITSTYPE definition and change its name appropriately?
598–605	What do you think about restructuring this part to:
    if (*src == DECIMAL_POINT) {
      if (afterDecimal) {
        // comments
        break;
      }
      afterDecimal = true;
      ++src;
      continue;
    }
627	Is this the same as src = tempStrEnd?
673–680	Similar structure as the above comment for decimal?
685	Does it mean that we cannot parse the string longer than 16 hexadecimal digits?
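To make the suggested decimal-point structure concrete, here is a minimal, self-contained sketch of a digit-scanning loop that uses it. `ParsedDigits` and `scanDecimalDigits` are hypothetical names for this example only; the real `decimalStringToFloat` also bounds the mantissa, tracks truncation and parses an exponent suffix, all of which are left out here.

```cpp
#include <cstdint>

struct ParsedDigits {
  uint64_t mantissa; // all digits read as one integer
  int32_t exponent;  // power of ten the mantissa must be scaled by
  const char *end;   // first character that was not consumed
};

// Hypothetical helper: reads decimal digits with at most one decimal point.
// Mantissa overflow is not handled in this sketch.
inline ParsedDigits scanDecimalDigits(const char *src,
                                      char decimalPoint = '.') {
  ParsedDigits out{0, 0, src};
  bool afterDecimal = false;
  while ((*src >= '0' && *src <= '9') || *src == decimalPoint) {
    if (*src == decimalPoint) {
      if (afterDecimal)
        break; // a second decimal point ends the number
      afterDecimal = true;
      ++src;
      continue;
    }
    out.mantissa = out.mantissa * 10 + static_cast<uint64_t>(*src - '0');
    if (afterDecimal)
      --out.exponent; // each fractional digit divides the value by ten
    ++src;
  }
  out.end = src;
  return out;
}
```

For the input "12.5.7" this reads "12.5" as mantissa 125 with exponent -1 and stops at the second decimal point.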

This revision is now accepted and ready to land. Nov 4 2021, 11:04 AM

Harbormaster completed remote builds in B132498: Diff 384810. Nov 4 2021, 12:33 PM

sivachandra added inline comments. Nov 4 2021, 11:58 PM

libc/src/__support/str_to_float.h
459	This function is mostly `NormalFloat::operator T`. It would be nice if we can extend it as required and avoid the mostly duplicated logic. AFAICT, setting of `errno` is the difference? Input exponent needs to be corrected of course like on line 480, which can be done as a preprocessing step. You can do it as a cleanup later if it can be done.
500	Can the shift and round operation here lead to an overflow? See this for why it can happen: https://github.com/llvm/llvm-project/blob/main/libc/src/__support/FPUtil/NormalFloat.h#L124
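The overflow asked about here can be reproduced with a small standalone sketch. It assumes a 23-bit stored mantissa (as in a 32-bit float) and uses hypothetical helpers `shiftRightRoundEven` and `roundToWidth`, which are not the functions in the patch. The point is that rounding up a mantissa whose kept bits are all ones carries into one extra bit, so the result has to be shifted once more and the exponent bumped.

```cpp
#include <cstdint>

// Shift right by `amount` bits (1 <= amount < 64), rounding to nearest with
// ties going to the even result. Hypothetical helper for this sketch.
inline uint64_t shiftRightRoundEven(uint64_t value, uint32_t amount) {
  const uint64_t half = uint64_t(1) << (amount - 1);
  const uint64_t truncated = value & ((uint64_t(1) << amount) - 1);
  uint64_t result = value >> amount;
  if (truncated > half || (truncated == half && (result & 1)))
    ++result;
  return result;
}

struct Normalized {
  uint64_t mantissa;
  int32_t exp2;
};

// Reduce an integer mantissa to at most `width + 1` significant bits (the
// leading one plus `width` stored bits), adjusting the base 2 exponent.
// Assumes width < 63.
inline Normalized roundToWidth(uint64_t mantissa, int32_t exp2,
                               uint32_t width) {
  const uint64_t overflowed = uint64_t(1) << (width + 1);
  uint32_t shift = 0;
  while ((mantissa >> shift) >= overflowed)
    ++shift;
  if (shift > 0) {
    mantissa = shiftRightRoundEven(mantissa, shift);
    exp2 += static_cast<int32_t>(shift);
    if (mantissa == overflowed) { // rounding carried into a new top bit
      mantissa >>= 1;
      exp2 += 1;
    }
  }
  return {mantissa, exp2};
}
```

With width 23, the input 0xFFFFFFFF rounds up to 1 << 24, which the final check folds back to 0x800000 with exponent 9, i.e. 2^32. That is the value the float bit pattern 0x4f800000 encodes.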

clean up code in response to the comments, some of which improved speed significantly.

michaelrj edited the summary of this revision. Nov 5 2021, 3:34 PM

michaelrj added inline comments.

libc/src/__support/str_to_float.h
459	I think it's possible. What I'd probably want to do for that is add a way to initialize a NormalFloat with a mantissa that uses all the bits of a `UIntType`. Then it would be a relatively trivial process.
500	ah, yes it can. This is now fixed here and above, and a test has been added.
598–605	Doing the change you suggested was a noticeable speed improvement.
627	ah, I think I misunderstood how restrict worked. This is fixed now.
685	correct, but that's because that's just how many bits there is space for in a double. The mantissa actually stores a little less; the extra bits are there so that the rounding is more accurate.
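The 16-digit limit discussed in the last reply can be illustrated with a simplified two-loop scan over hexadecimal digits, assuming a 64-bit accumulator and input with the "0x" prefix already stripped. `HexDigits`, `hexValue` and `scanHexInteger` are hypothetical names for this sketch; the patch itself uses `b36_char_to_int`, also handles the decimal point, and multiplies the digit count by 4 after the loop instead of adding 4 per digit.

```cpp
#include <cstdint>
#include <limits>

struct HexDigits {
  uint64_t mantissa; // leading digits that fit in 64 bits
  int32_t exp2;      // power of two implied by the digits that did not fit
  bool truncated;    // true if any dropped digit was nonzero
};

// Hypothetical helper: value of one hexadecimal digit, or -1 if not a digit.
inline int hexValue(char c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  c |= 32; // fold ASCII letters to lowercase
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

inline HexDigits scanHexInteger(const char *src) {
  HexDigits out{0, 0, false};
  constexpr uint64_t maxDivBase = std::numeric_limits<uint64_t>::max() / 16;
  // First loop: accumulate digits while another digit is sure to fit.
  while (hexValue(*src) >= 0 && out.mantissa < maxDivBase) {
    out.mantissa = out.mantissa * 16 + static_cast<uint64_t>(hexValue(*src));
    ++src;
  }
  // Second loop: skip the remaining digits, remembering their magnitude.
  while (hexValue(*src) >= 0) {
    if (hexValue(*src) > 0)
      out.truncated = true;
    out.exp2 += 4; // each dropped hex digit is four binary places
    ++src;
  }
  return out;
}
```

For the 18-digit input "123456789123456789", the accumulator holds the first 16 digits and the last two are recorded only as exp2 = 8 with truncated = true, which is the information later rounding needs.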

Harbormaster completed remote builds in B132790: Diff 385207. Nov 5 2021, 3:38 PM

lntue accepted this revision. Nov 8 2021, 4:23 PM

Closed by commit rG8298424cae9b: [libc] refactor atof string parsing (authored by michaelrj). Nov 9 2021, 10:12 AM

This revision was automatically updated to reflect the committed changes.

michaelrj added a commit: rG8298424cae9b: [libc] refactor atof string parsing.

Revision Contents

Path	Size
libc/src/__support/str_to_float.h	449 lines
libc/test/src/stdlib/strtof_test.cpp	4 lines

Diff 385874

libc/src/__support/str_to_float.h

Show All 34 Lines	if (truncated < (1 << (amountToShift - 1))) {
return result;		return result;
} else if (truncated > (1 << (amountToShift - 1))) {		} else if (truncated > (1 << (amountToShift - 1))) {
return result + 1;		return result + 1;
} else {		} else {
return result + (result & 1); // This rounds towards even.		return result + (result & 1); // This rounds towards even.
}		}
}		}

template <class T> uint32_t leadingZeroes(T inputNumber) {		template <class T> uint32_t inline leadingZeroes(T inputNumber) {
// TODO(michaelrj): investigate the portability of using something like		// TODO(michaelrj): investigate the portability of using something like
// __builtin_clz for specific types.		// __builtin_clz for specific types.
constexpr uint32_t bitsInT = sizeof(T) * 8;		constexpr uint32_t bitsInT = sizeof(T) * 8;
if (inputNumber == 0) {		if (inputNumber == 0) {
return bitsInT;		return bitsInT;
}		}
uint32_t curGuess = bitsInT / 2;		uint32_t curGuess = bitsInT / 2;
uint32_t rangeSize = bitsInT / 2;		uint32_t rangeSize = bitsInT / 2;
Show All 14 Lines	while (((inputNumber >> curGuess) > 0) ||
}		}
}		}
if (inputNumber >> curGuess > 0) {		if (inputNumber >> curGuess > 0) {
curGuess++;		curGuess++;
}		}
return bitsInT - curGuess;		return bitsInT - curGuess;
}		}

		template <> uint32_t inline leadingZeroes<uint32_t>(uint32_t inputNumber) {
		return inputNumber == 0 ? 32 : __builtin_clz(inputNumber);
		}

		template <> uint32_t inline leadingZeroes<uint64_t>(uint64_t inputNumber) {
		return inputNumber == 0 ? 64 : __builtin_clzll(inputNumber);
		}

static inline uint64_t low64(__uint128_t num) {		static inline uint64_t low64(__uint128_t num) {
return static_cast<uint64_t>(num & 0xffffffffffffffff);		return static_cast<uint64_t>(num & 0xffffffffffffffff);
}		}

static inline uint64_t high64(__uint128_t num) {		static inline uint64_t high64(__uint128_t num) {
return static_cast<uint64_t>(num >> 64);		return static_cast<uint64_t>(num >> 64);
}		}

▲ Show 20 Lines • Show All 355 Lines • ▼ Show 20 Lines	if (eiselLemire<T>(mantissa, exp10, outputMantissa, outputExp2)) {
}		}
}		}

simpleDecimalConversion<T>(numStart, outputMantissa, outputExp2);		simpleDecimalConversion<T>(numStart, outputMantissa, outputExp2);

return;		return;
}		}

		// Takes a mantissa and base 2 exponent and converts it into its closest
		// floating point type T equivalent. Since the exponent is already in the right
		// form, this is mostly just shifting and rounding. This is used for hexadecimal
		// numbers since a base 16 exponent multiplied by 4 is the base 2 exponent.
		template <class T>
		static inline void
		binaryExpToFloat(typename fputil::FPBits<T>::UIntType mantissa, int32_t exp2,
		typename fputil::FPBits<T>::UIntType *outputMantissa,
		uint32_t *outputExp2) {
		using BitsType = typename fputil::FPBits<T>::UIntType;

		// This is the number of leading zeroes a properly normalized float of type T
		// should have.
		constexpr uint32_t NORMALIZED_LEADING_ZEROES =
		(sizeof(BitsType) * 8) - fputil::FloatProperties<T>::mantissaWidth - 1;
		constexpr BitsType OVERFLOWED_MANTISSA =
		BitsType(1) << (fputil::FloatProperties<T>::mantissaWidth + 1);

		// Normalization
		int32_t amountToShift =
		NORMALIZED_LEADING_ZEROES - leadingZeroes<BitsType>(mantissa);
		if (amountToShift < 0) {
		mantissa <<= -amountToShift;
		} else {
		mantissa = shiftRightAndRound(mantissa, amountToShift);
		if (mantissa == OVERFLOWED_MANTISSA) {
		mantissa >>= 1;
		exp2 += 1;
		}
		}
		exp2 += amountToShift;

		// Account for the fact that the mantissa represented an integer
		// previously, but now represents the fractional part of a normalized
		// number.
		exp2 += fputil::FloatProperties<T>::mantissaWidth;

		int32_t biasedExponent = exp2 + fputil::FPBits<T>::exponentBias;
		// handle subnormals
		if (biasedExponent <= 0) {

		// the most mantissa is currently normalized, meaning that the msb is
		// one bit left of where the decimal point should go.
		amountToShift = 1;
		BitsType mantissaCopy = mantissa >> 1;
		while (biasedExponent < 0 && mantissaCopy > 0) {
		mantissaCopy = mantissaCopy >> 1;
		++amountToShift;
		++biasedExponent;
		}
		// If we cut off any bits to fit this number into a subnormal, then it's
		// out of range for this size of float.
		if ((mantissa & ((1 << amountToShift) - 1)) > 0) {
		errno = ERANGE; // NOLINT
		}
		mantissa = shiftRightAndRound(mantissa, amountToShift);
		if (mantissa == OVERFLOWED_MANTISSA) {
		mantissa >>= 1;
		exp2 += 1;
		} else if (mantissa == 0) {
		biasedExponent = 0;
		}
		}
		// handle numbers that're too large and get squashed to inf
		else if (biasedExponent >
		(1 << fputil::FloatProperties<T>::exponentWidth) - 1) {
		// This indicates an overflow, so we make the result INF and set errno.
		biasedExponent = (1 << fputil::FloatProperties<T>::exponentWidth) - 1;
		mantissa = 0;
		errno = ERANGE; // NOLINT
		}
		*outputMantissa = mantissa;
		*outputExp2 = biasedExponent;
		}

// checks if the next 4 characters of the string pointer are the start of a		// checks if the next 4 characters of the string pointer are the start of a
// hexadecimal floating point number. Does not advance the string pointer.		// hexadecimal floating point number. Does not advance the string pointer.
static inline bool is_float_hex_start(const char *__restrict src,		static inline bool is_float_hex_start(const char *__restrict src,
const char decimalPoint) {		const char decimalPoint) {
if (!(*src == '0' && (*(src + 1) | 32) == 'x')) {		if (!(*src == '0' && (*(src + 1) | 32) == 'x')) {
return false;		return false;
}		}
if (*(src + 2) == decimalPoint) {		if (*(src + 2) == decimalPoint) {
return isalnum(*(src + 3)) && b36_char_to_int(*(src + 3)) < 16;		return isalnum(*(src + 3)) && b36_char_to_int(*(src + 3)) < 16;
} else {		} else {
return isalnum(*(src + 2)) && b36_char_to_int(*(src + 2)) < 16;		return isalnum(*(src + 2)) && b36_char_to_int(*(src + 2)) < 16;
}		}
}		}

// Takes a pointer to a string and a pointer to a string pointer. This function		// Takes the start of a string representing a decimal float, as well as the
// is used as the backend for all of the string to float functions.		// local decimalPoint. It returns if it succeeded in parsing any digits, and if
		// the return value is true then the outputs are pointer to the end of the
		// number, and the mantissa and exponent for the closest float T representation.
		// If the return value is false, then it is assumed that there is no number
		// here.
template <class T>		template <class T>
static inline T strtofloatingpoint(const char *__restrict src,		static inline bool
char **__restrict strEnd) {		decimalStringToFloat(const char *__restrict src, const char DECIMAL_POINT,
		char **__restrict strEnd,
		typename fputil::FPBits<T>::UIntType *outputMantissa,
		uint32_t *outputExponent) {
using BitsType = typename fputil::FPBits<T>::UIntType;		using BitsType = typename fputil::FPBits<T>::UIntType;
fputil::FPBits<T> result = fputil::FPBits<T>();		constexpr uint32_t BASE = 10;
const char *originalSrc = src;		constexpr char EXPONENT_MARKER = 'e';

		const char *__restrict numStart = src;
		bool truncated = false;
bool seenDigit = false;		bool seenDigit = false;
src = first_non_whitespace(src);		bool afterDecimal = false;
		BitsType mantissa = 0;
		int32_t exponent = 0;

if (*src == '+' || *src == '-') {		// The goal for the first step of parsing is to convert the number in src to
if (*src == '-') {		// the format mantissa * (base ^ exponent)
result.setSign(true);
		// The first loop fills the mantissa with as many digits as it can hold
		const BitsType BITSTYPE_MAX_DIV_BY_BASE =
		__llvm_libc::cpp::NumericLimits<BitsType>::max() / BASE;
		while ((isdigit(*src) || *src == DECIMAL_POINT) &&
		mantissa < BITSTYPE_MAX_DIV_BY_BASE) {
		if (*src == DECIMAL_POINT) {
		if (afterDecimal) {
		break; // this means that *src points to a second decimal point, ending
		// the number.
		} else {
		afterDecimal = true;
		++src;
		continue;
		}
		}
		uint32_t digit = *src - '0';

		mantissa = (mantissa * BASE) + digit;
		seenDigit = true;
		if (afterDecimal) {
		--exponent;
}		}

++src;		++src;
}		}

static constexpr char DECIMAL_POINT = '.';		if (!seenDigit)
static const char *INF_STRING = "infinity";		return false;
static const char *NAN_STRING = "nan";

bool truncated = false;		// The second loop is to run through the remaining digits after we've filled
		// the mantissa.
		while (isdigit(*src) || *src == DECIMAL_POINT) {
		if (*src == DECIMAL_POINT) {
		if (afterDecimal) {
		break; // this means that *src points to a second decimal point, ending
		// the number.
		} else {
		afterDecimal = true;
		++src;
		continue;
		}
		}
		uint32_t digit = *src - '0';

if (isdigit(*src) || *src == DECIMAL_POINT) { // regular number		if (digit > 0)
int base = 10;		truncated = true;
char exponentMarker = 'e';
if (is_float_hex_start(src, DECIMAL_POINT)) {		if (!afterDecimal)
base = 16;		++exponent;
src += 2;
exponentMarker = 'p';		++src;
seenDigit = true;
}		}
const char *__restrict numStart = src;
bool afterDecimal = false;

		if ((*src | 32) == EXPONENT_MARKER) {
		if (*(src + 1) == '+' || *(src + 1) == '-' || isdigit(*(src + 1))) {
		++src;
		char *tempStrEnd;
		int32_t add_to_exponent = strtointeger<int32_t>(src, &tempStrEnd, 10);
		if (add_to_exponent > 100000)
		add_to_exponent = 100000;
		else if (add_to_exponent < -100000)
		add_to_exponent = -100000;

		src = tempStrEnd;
		exponent += add_to_exponent;
		}
		}

		*strEnd = const_cast<char *>(src);
		if (mantissa == 0) { // if we have a 0, then also 0 the exponent.
		*outputMantissa = 0;
		*outputExponent = 0;
		} else {
		decimalExpToFloat<T>(mantissa, exponent, numStart, truncated,
		outputMantissa, outputExponent);
		}
		return true;
		}

		// Takes the start of a string representing a hexadecimal float, as well as the
		// local decimal point. It returns if it succeeded in parsing any digits, and if
		// the return value is true then the outputs are pointer to the end of the
		// number, and the mantissa and exponent for the closest float T representation.
		// If the return value is false, then it is assumed that there is no number
		// here.
		template <class T>
		static inline bool
		hexadecimalStringToFloat(const char *__restrict src, const char DECIMAL_POINT,
		char **__restrict strEnd,
		typename fputil::FPBits<T>::UIntType *outputMantissa,
		uint32_t *outputExponent) {
		using BitsType = typename fputil::FPBits<T>::UIntType;
		constexpr uint32_t BASE = 16;
		constexpr char EXPONENT_MARKER = 'p';

		bool truncated = false;
		bool seenDigit = false;
		bool afterDecimal = false;
BitsType mantissa = 0;		BitsType mantissa = 0;
int32_t exponent = 0;		int32_t exponent = 0;

// The goal for the first step of parsing is to convert the number in src to		// The goal for the first step of parsing is to convert the number in src to
// the format mantissa * (base ^ exponent)		// the format mantissa * (base ^ exponent)

constexpr BitsType MANTISSA_MAX =		// The first loop fills the mantissa with as many digits as it can hold
BitsType(1) << (fputil::FloatProperties<T>::mantissaWidth +
1); // The extra bit is to give space for the implicit 1
const BitsType BITSTYPE_MAX_DIV_BY_BASE =		const BitsType BITSTYPE_MAX_DIV_BY_BASE =
__llvm_libc::cpp::NumericLimits<BitsType>::max() / base;		__llvm_libc::cpp::NumericLimits<BitsType>::max() / BASE;
while ((isalnum(*src) || *src == DECIMAL_POINT) &&		while ((isalnum(*src) || *src == DECIMAL_POINT) &&
mantissa < BITSTYPE_MAX_DIV_BY_BASE) {		mantissa < BITSTYPE_MAX_DIV_BY_BASE) {
if (*src == DECIMAL_POINT && afterDecimal) {		if (*src == DECIMAL_POINT) {
		if (afterDecimal) {
break; // this means that *src points to a second decimal point, ending		break; // this means that *src points to a second decimal point, ending
// the number.		// the number.
} else if (*src == DECIMAL_POINT) {		} else {
afterDecimal = true;		afterDecimal = true;
++src;		++src;
continue;		continue;
}		}
int digit = b36_char_to_int(*src);
if (digit >= base) {
break;
}		}
		uint32_t digit = b36_char_to_int(*src);
		if (digit >= BASE)
		break;

mantissa = (mantissa * base) + digit;		mantissa = (mantissa * BASE) + digit;
seenDigit = true;		seenDigit = true;
if (afterDecimal) {		if (afterDecimal)
--exponent;		--exponent;
}

++src;		++src;
}		}

		if (!seenDigit)
		return false;

// The second loop is to run through the remaining digits after we've filled		// The second loop is to run through the remaining digits after we've filled
// the mantissa.		// the mantissa.
while (isalnum(*src) || *src == DECIMAL_POINT) {		while (isalnum(*src) || *src == DECIMAL_POINT) {
if (*src == DECIMAL_POINT && afterDecimal) {		if (*src == DECIMAL_POINT) {
		if (afterDecimal) {
break; // this means that *src points to a second decimal point, ending		break; // this means that *src points to a second decimal point, ending
// the number.		// the number.
} else if (*src == DECIMAL_POINT) {		} else {
afterDecimal = true;		afterDecimal = true;
++src;		++src;
continue;		continue;
}		}
int digit = b36_char_to_int(*src);
if (digit >= base) {
break;
}		}
		uint32_t digit = b36_char_to_int(*src);
		if (digit >= BASE)
		break;

if (digit > 0) {		if (digit > 0)
truncated = true;		truncated = true;
}

if (!afterDecimal) {		if (!afterDecimal)
exponent++;		++exponent;
}

++src;		++src;
}		}

// if our base is 16 then convert the exponent to base 2		// Convert the exponent from having a base of 16 to having a base of 2.
if (base == 16) {
exponent *= 4;		exponent *= 4;
}

if ((*src | 32) == exponentMarker) {		if ((*src | 32) == EXPONENT_MARKER) {
if (*(src + 1) == '+' || *(src + 1) == '-' || isdigit(*(src + 1))) {		if (*(src + 1) == '+' || *(src + 1) == '-' || isdigit(*(src + 1))) {
++src;		++src;
char *tempStrEnd;		char *tempStrEnd;
int32_t add_to_exponent = strtointeger<int32_t>(src, &tempStrEnd, 10);		int32_t add_to_exponent = strtointeger<int32_t>(src, &tempStrEnd, 10);
if (add_to_exponent > 100000) {		if (add_to_exponent > 100000)
add_to_exponent = 100000;		add_to_exponent = 100000;
} else if (add_to_exponent < -100000) {		else if (add_to_exponent < -100000)
add_to_exponent = -100000;		add_to_exponent = -100000;
}		src = tempStrEnd;
src += tempStrEnd - src;
exponent += add_to_exponent;		exponent += add_to_exponent;
}		}
}		}
		*strEnd = const_cast<char *>(src);
if (mantissa == 0) { // if we have a 0, then also 0 the exponent.		if (mantissa == 0) { // if we have a 0, then also 0 the exponent.
exponent = 0;		*outputMantissa = 0;
} else if (base == 16) {		*outputExponent = 0;
		} else {
// These two loops should normalize the number if we assume the decimal		binaryExpToFloat<T>(mantissa, exponent, outputMantissa, outputExponent);
// point is after the bit at mantissaWidth.
// For example if type T is a 32 bit float, this should result in a
// mantissa with its most significant 1 being at bit 23.
while (mantissa < (MANTISSA_MAX >> 1)) {
mantissa = mantissa << 1;
--exponent;
}		}
BitsType mantissaCopy = mantissa;		return true;
unsigned int amountToShift = 0;
while (mantissaCopy > MANTISSA_MAX) {
mantissaCopy = mantissaCopy >> 1;
++amountToShift;
}		}
exponent += amountToShift;
mantissa = shiftRightAndRound(mantissa, amountToShift);

// Account for the fact that the mantissa represented an integer		// Takes a pointer to a string and a pointer to a string pointer. This function
// previously, but now represents the fractional part of a normalized		// is used as the backend for all of the string to float functions.
// number.		template <class T>
exponent += fputil::FloatProperties<T>::mantissaWidth;		static inline T strtofloatingpoint(const char *__restrict src,
		char **__restrict strEnd) {
int32_t biasedExponent = exponent + fputil::FPBits<T>::exponentBias;		using BitsType = typename fputil::FPBits<T>::UIntType;
if (biasedExponent <= 0) {		fputil::FPBits<T> result = fputil::FPBits<T>();
// handle subnormals here		const char *originalSrc = src;
		bool seenDigit = false;
		src = first_non_whitespace(src);

// the most mantissa is currently normalized, meaning that the msb is		if (*src == '+' || *src == '-') {
// one bit left of where the decimal point should go.		if (*src == '-') {
amountToShift = 1;		result.setSign(true);
mantissaCopy = mantissa >> 1;
while (biasedExponent < 0 && mantissaCopy > 0) {
mantissaCopy = mantissaCopy >> 1;
++amountToShift;
++biasedExponent;
}
// If we cut off any bits to fit this number into a subnormal, then it's
// out of range for this size of float.
if ((mantissa & ((1 << amountToShift) - 1)) > 0) {
errno = ERANGE; // NOLINT
}		}
mantissa = shiftRightAndRound(mantissa, amountToShift);		++src;
if (mantissa == 0) {
biasedExponent = 0;
}		}
} else if (biasedExponent > result.maxExponent) {
// This indicates an overflow, so we make the result INF and set errno.		static constexpr char DECIMAL_POINT = '.';
biasedExponent = result.maxExponent;		static const char *INF_STRING = "infinity";
mantissa = 0;		static const char *NAN_STRING = "nan";
errno = ERANGE; // NOLINT
		// bool truncated = false;

		if (isdigit(*src) || *src == DECIMAL_POINT) { // regular number
		int base = 10;
		char exponentMarker = 'e';
		if (is_float_hex_start(src, DECIMAL_POINT)) {
		base = 16;
		src += 2;
		exponentMarker = 'p';
		seenDigit = true;
}		}
		char *newStrEnd = nullptr;

result.setUnbiasedExponent(biasedExponent);
result.setMantissa(mantissa);
} else { // base is 10
BitsType outputMantissa = 0;		BitsType outputMantissa = 0;
uint32_t outputExponent = 0;		uint32_t outputExponent = 0;
decimalExpToFloat<T>(mantissa, exponent, numStart, truncated,		if (base == 16) {
		seenDigit = hexadecimalStringToFloat<T>(src, DECIMAL_POINT, &newStrEnd,
		&outputMantissa, &outputExponent);
		} else { // base is 10
		seenDigit = decimalStringToFloat<T>(src, DECIMAL_POINT, &newStrEnd,
&outputMantissa, &outputExponent);		&outputMantissa, &outputExponent);
		}

		if (seenDigit) {
		src += newStrEnd - src;
result.setMantissa(outputMantissa);		result.setMantissa(outputMantissa);
result.setUnbiasedExponent(outputExponent);		result.setUnbiasedExponent(outputExponent);
}		}

} else if ((*src | 32) == 'n') { // NaN		} else if ((*src | 32) == 'n') { // NaN
if ((src[1] | 32) == NAN_STRING[1] && (src[2] | 32) == NAN_STRING[2]) {		if ((src[1] | 32) == NAN_STRING[1] && (src[2] | 32) == NAN_STRING[2]) {
seenDigit = true;		seenDigit = true;
src += 3;		src += 3;
BitsType NaNMantissa = 0;		BitsType NaNMantissa = 0;
if (*src == '(') {		if (*src == '(') {
char *tempSrc = 0;		char *tempSrc = 0;
if (isdigit(*(src + 1)) || *(src + 1) == ')') {		if (isdigit(*(src + 1)) || *(src + 1) == ')') {
(50 lines not shown)

libc/test/src/stdlib/strtof_test.cpp

Show First 20 Lines • Show All 126 Lines • ▼ Show 20 Lines	TEST_F(LlvmLibcStrToFTest, HexadecimalNormalRoundingTests) {
// This gets rounded down to even		// This gets rounded down to even
runTest("0x123456500", 11, 0x4f91a2b2);		runTest("0x123456500", 11, 0x4f91a2b2);
// This doesn't get rounded at all		// This doesn't get rounded at all
runTest("0x123456600", 11, 0x4f91a2b3);		runTest("0x123456600", 11, 0x4f91a2b3);
// This gets rounded up to even		// This gets rounded up to even
runTest("0x123456700", 11, 0x4f91a2b4);		runTest("0x123456700", 11, 0x4f91a2b4);
}		}

		TEST_F(LlvmLibcStrToFTest, HexadecimalsWithRoundingProblems) {
		runTest("0xFFFFFFFF", 10, 0x4f800000);
		}

TEST_F(LlvmLibcStrToFTest, HexadecimalOutOfRangeTests) {		TEST_F(LlvmLibcStrToFTest, HexadecimalOutOfRangeTests) {
runTest("0x123456789123456789123456789123456789", 38, 0x7f800000, ERANGE);		runTest("0x123456789123456789123456789123456789", 38, 0x7f800000, ERANGE);
runTest("-0x123456789123456789123456789123456789", 39, 0xff800000, ERANGE);		runTest("-0x123456789123456789123456789123456789", 39, 0xff800000, ERANGE);
runTest("0x0.00000000000000000000000000000000000001", 42, 0x0, ERANGE);		runTest("0x0.00000000000000000000000000000000000001", 42, 0x0, ERANGE);
}		}

TEST_F(LlvmLibcStrToFTest, InfTests) {		TEST_F(LlvmLibcStrToFTest, InfTests) {
runTest("INF", 3, 0x7f800000);		runTest("INF", 3, 0x7f800000);
(16 lines not shown)