This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Implements 128-bit support in to_chars.
ClosedPublic

Authored by Mordante on Jun 30 2022, 10:51 AM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project
Commits
rG3f78683353e2: [libc++] Implements 128-bit support in to_chars.
Summary

This is required by the Standard and makes it possible to add full
128-bit support to format.

The patch also fixes 128-bit from_chars "support". One unit test
required a too large value, this failed on 128-bit; the fix was to add
more characters to the input.

Note only base 10 has been optimized. Other bases can be optimized.

Note the 128-bit lookup table could be made smaller. This will be done later. I
really want to get 128-bit working in to_chars and format in the upcomming
LLVM 15 release, these optimizations aren't critical.

Diff Detail

Event Timeline

Mordante created this revision.Jun 30 2022, 10:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2022, 10:51 AM
Mordante requested review of this revision.Jun 30 2022, 10:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2022, 10:51 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Mordante updated this revision to Diff 441478.Jun 30 2022, 12:01 PM

Fixes some CI issues.

Mordante updated this revision to Diff 441484.Jun 30 2022, 12:26 PM

Another CI fix.

Mordante updated this revision to Diff 441507.Jun 30 2022, 1:46 PM

Attempts to fix platforms not supporting 128-bit integerals.

nilayvaish added inline comments.
libcxx/include/__bits
46

Why have space between # and ifndef?

56–58

Would this not be identified by the compiler as constexpr if _LIBCPP_STD_VER > 11? Essentially I am trying to understand why we need #if on _LIBCPP_STD_VER.

libcxx/include/__charconv/tables.h
39

What's FMT?

119

Should this be 10^0 = 1? I did notice that this matches the tables for 32 bits and 64 bits.

119–158

Is it possible to use quotes (they are only allowed since C++14) after every three digits, starting from the least-significant digit? Else, maybe we should generate these individual items using pre-processing. I think in the current form it is hard to know if the number of digits is correct.

libcxx/include/__charconv/to_chars_base_10.h
146

I think the MSB is missing.

Mordante marked 3 inline comments as done.Jul 6 2022, 8:22 AM

Thanks for the review!

libcxx/include/__bits
46

This is our clang-format style. In some places, mainly __config we have a lot of nested preprocessor directives. Indenting them improves readability.

56–58

The above isn't valid in C++11 due to language limitations. But this isn't very readable so I added a more readable version for newer C++ versions.

libcxx/include/__charconv/tables.h
39

FMT is short for format, I added this for std::format, so I should clean this up for std::format.

119

No it shouldn't. The length determination algorithm uses __builtin_clzll which requires the value to be not 0 so there the value is adjusted (orred by 1). That way a value of zero has a width of 1 and needs no further adjustment.

I agree the naming used is a bit misleading, but I decided to keep the naming already used before.

119–158

It's not possible since this needs to compile in C++11. (In fact in another to_chars patch I used quotes until C++11 complained.) In the current form it's indeed looking at the pattern for a shift of one, not great. Do you have a suggestion how to do this cleanly using pre-processing; usually I feel pre-processing doesn't help to make things readable.

libcxx/include/__charconv/to_chars_base_10.h
146

What do you exactly mean with "the MSB is missing"?

ldionne accepted this revision.Jul 6 2022, 9:22 AM
ldionne added a subscriber: ldionne.
ldionne added inline comments.
libcxx/include/__bits
1

Can you please document our support for __int128_t & its unsigned counterpart under Libc++ Extensions in libcxx/docs/UsingLibcxx.rst?

56–58

IMO we should have only one version since both are equivalent. The comment can be left in place to explain why the implementation looks that way.

libcxx/include/__charconv/tables.h
115

I don't really understand the purpose of this function. Why don't we simply multiply the operands below?

139

And so on.

libcxx/include/__charconv/to_chars_base_10.h
66

I'm not sure this comment provides a lot of value. If that's really important, we should have enable_if or static_assert. If the function really does work with pretty much any integral type, then I think the comment just doesn't have its place.

68–69

Same thing for this comment.

libcxx/include/charconv
175

Can you add a reference to the documentation for this trick?

488

This is a leftover comment, I think we should be casting to uint64_t.

libcxx/test/std/utilities/charconv/charconv.from.chars/integral.pass.cpp
43–44

Weird indentation.

libcxx/test/support/charconv_test_helpers.h
153

That would make it clear it's a helper function only. Same comment applies to the other false_type version. And to fromchars above.

This revision is now accepted and ready to land.Jul 6 2022, 9:22 AM
nilayvaish added inline comments.Jul 6 2022, 12:40 PM
libcxx/include/__charconv/to_chars_base_10.h
146

2^64 = 18446744073709551616. The most significant digit would be 1.

libcxx/include/charconv
175

+1. I don't quite understand what we are trying to achieve here.

Mordante marked 18 inline comments as done.Jul 6 2022, 11:10 PM

Thanks for the reviews. I'll upload a new version and land it when the CI passes.

libcxx/include/__bits
1

I had a look at this page, but nothing in this file has documentation. So instead of adding it in this review I'll crate a separate review which documents this file and its new extension.

56–58

Done, based on the time we spend discussing the C++11 version I added comment to explain what this does.

libcxx/include/__charconv/tables.h
119–158

I've adjusted as suggested by @ldionne above I think that version is easier to inspect.

libcxx/include/__charconv/to_chars_base_10.h
146

Thanks, good catch! I thought you means the sign bit with MSB, which makes no sense for an unsigned value. Now I see you mean the leading digit, which indeed is the MSB.

I've updated the comment and made some minor changes to the algorithm since step 2 can actually process 19 digits.
(step 3 already used 19 digits which would be invalid if the current value were correct.)

libcxx/include/charconv
175

+1 ;-) I had intended to do that after I investigated this trick in the 32 and 64-bit code. Thanks for the reminder.

In the comment I also explain why the pow table starts with 0 instead of 1.

488

It's indeed a left-over comment. However we can't cast to uint64_t here. That would break the 128-bit code. (I tested that locally.)

Mordante updated this revision to Diff 442784.Jul 6 2022, 11:11 PM
Mordante marked 5 inline comments as done.

Addresses review comments.

This revision was automatically updated to reflect the committed changes.