Microsoft would like to contribute its implementation of floating-point to_chars to libc++. This uses the impossibly fast Ryu and Ryu Printf algorithms invented by Ulf Adams at Google. Upstream repos: https://github.com/microsoft/STL and https://github.com/ulfjack/ryu .
Licensing notes: MSVC's STL is available under the Apache License v2.0 with LLVM Exception, intentionally chosen to match libc++. We've used Ryu under the Boost Software License.
This patch contains minor changes from Jorg Brown at Google, to adapt the code to libc++. He verified that it works in Google's Linux-based environment, but then I applied more changes on top of his, so any compiler errors are my fault. (I haven't tried to build and test libc++ yet.) Please tell me if we need to do anything else in order to follow https://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes .
Notes:
- libc++'s integer charconv is unchanged (except for a small refactoring). MSVC's integer charconv hasn't been tuned for performance yet, so you're not missing anything.
- Floating-point from_chars isn't part of this patch because Jorg found that MSVC's implementation (derived from our CRT's strtod) was slower than Abseil's. If you're unable to use Abseil or another implementation due to licensing or technical considerations, Microsoft would be delighted if you used MSVC's from_chars (and you can just take it, or ask us to provide a patch like this). Ulf is also working on a novel algorithm for from_chars.
- This assumes that float is IEEE 32-bit, double is IEEE 64-bit, and long double is also IEEE 64-bit.
- I have added MSVC's charconv tests (the whole thing: integer/floating from_chars/to_chars), but haven't adapted them to libcxx's harness at all. (These tests will be available in the microsoft/STL repo soon.)
- Jorg added int128 codepaths. These were originally present in upstream Ryu, and I removed them from microsoft/STL purely for performance reasons (MSVC doesn't support int128; Clang on Windows does, but I found that x64 intrinsics were slightly faster).
- The implementation is split into 3 headers. In MSVC's STL, charconv contains only Microsoft-written code. xcharconv_ryu.h contains code derived from Ryu (with significant modifications and additions). xcharconv_ryu_tables.h contains Ryu's large lookup tables (they were sufficiently large to make editing inconvenient, hence the separate file). The xmeow.h convention is MSVC's for internal headers; you may wish to rename them.
- You should consider separately compiling the lookup tables (see https://github.com/microsoft/STL/issues/172 ) for compiler throughput and reduced object file size.
- See https://github.com/StephanTLavavej/llvm-project/commits/charconv for fine-grained history. (If necessary, I can perform some rebase surgery to show you what Jorg changed relative to the microsoft/STL repo; currently that's all fused into the first commit.)
Questions:
- MSVC's STL conventionally uses _Single_uppercase_ugly identifiers, but xcharconv_ryu.h (and tables) also use __double_lowercase_ugly in Ryu-derived code, for ease of search-and-replacing (and comparing) with upstream Ryu. (That file still uses _Single_uppercase_ugly for additions; it looks weird but helped me keep track of what surgery I was performing.) I've compared these files against MSVC's and the diffs are very readable, so I'd like to keep the identifiers following the same convention so our codebases don't diverge too much. Does libc++ want to use __double_lowercase_ugly identifiers? (But apparently you use _Single_uppercase_ugly for template parameters?) If so, I can change the identifiers in this patch, and in microsoft/STL charconv upstream.
- I have included Microsoft's Apache+LLVM and Ulf's Boost banners, as they have appeared in the microsoft/STL repo. Would you like something different?
Some of my comments are still alive, including some that don't make sense in this context. Maybe mention that I wrote the initial code?