This is an archive of the discontinued LLVM Phabricator instance.

[libc++][mingw] Remove setlocale from snprintf_l
AbandonedPublic

Authored by alvinhochun on Jun 26 2022, 12:26 AM.

Details

Reviewers
mstorsjo
EricWF
Group Reviewers
Restricted Project
Summary

The previous code used __libcpp_locale_guard, which caused severe
performance issues with certain usage of streams because setlocale is
_that_ slow on Windows. This change speeds up number-to-string
conversion using ostringstream by a factor of 5x to 100x.

Fixes https://github.com/llvm/llvm-project/issues/56202

Diff Detail

Event Timeline

alvinhochun created this revision.Jun 26 2022, 12:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2022, 12:26 AM
Herald added a subscriber: mstorsjo. · View Herald Transcript
alvinhochun published this revision for review.Jun 26 2022, 2:01 AM
alvinhochun added reviewers: mstorsjo, EricWF.
alvinhochun added a subscriber: thomasanderson.

This is an alternative of D59525 and D59727 for mingw. I think this implementation using _vsnprintf_s_l and _vscprintf_l should replicate the expected behaviour of snprintf_l, with NULL terminator and such. @mstorsjo should be able to confirm whether These two functions are available on both UCRT and MSVCRT configuration of mingw.

In case of insufficient buffer, this implementation has to process the format string twice, but doing this is still much faster than having to call setlocale multiple times.

I have not run any of the libcxx test suites on Windows and I don't have a build setup to do so.

I did test the same benchmark in https://github.com/llvm/llvm-project/issues/56202#issuecomment-1166229891, first built using the vanilla llvm-mingw-20220323-ucrt-x86_64 native toolchain and ran, then ran with a replacement libc++.dll built from llvmorg-14.0.0 with this patch applied. The performance difference is significant.

Without patch:

ns/opop/serr%totalbenchmark
--------------------:--------------------:--------:----------::----------
89.3211,195,659.270.8%2.39[ref] osstream with filler string
28,366.4835,252.880.7%2.42osstream << int (not C locale)
1,045.82956,183.670.7%2.41osstream << int (is C locale)
9.08110,145,251.380.4%2.42std::to_string(int)
4.65215,153,197.410.4%2.40std::to_chars(int)
28,814.2534,705.051.0%2.42osstream << double (not C locale)
1,368.22730,877.021.4%2.29osstream << double (is C locale)
348.632,868,353.501.8%2.41std::to_string(double)
32.6430,638,029.720.4%2.42std::to_chars(double)

With patch:

ns/opop/serr%totalbenchmark
--------------------:--------------------:--------:----------::----------
90.2311,083,182.990.9%2.39[ref] osstream with filler string
222.784,488,729.771.0%2.43osstream << int (not C locale)
223.614,472,112.671.1%2.42osstream << int (is C locale)
8.82113,315,748.741.6%2.38std::to_string(int)
4.76209,994,184.812.0%2.44std::to_chars(int)
482.472,072,672.211.4%2.41osstream << double (not C locale)
477.902,092,487.710.7%2.42osstream << double (is C locale)
321.763,107,894.840.3%2.42std::to_string(double)
32.5530,721,425.550.6%2.46std::to_chars(double)
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2022, 2:01 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
alvinhochun planned changes to this revision.Jun 26 2022, 2:02 AM

long double is an issue though...

long double is an issue though...

Yup, long double on x86 is a tricky thing. In the CI configuration, we build with -DLIBCXX_EXTRA_SITE_DEFINES="__USE_MINGW_ANSI_STDIO=1", which makes sure that both the library is built with this define, and the define is set whenever any user of the library includes libc++ headers (I think libstdc++ does the same, where including libstdc++ also enables the same for your code).

(This isn't the configuration I built it for llvm-mingw releases though, but maybe I should.)

So for this case, one (messy) possible way forward could be to use the modern locale APIs if __USE_MINGW_ANSI_STDIO isn't set (or is 0) which would help for the actual llvm-mingw releases for now, while using the slow approach when x86 long doubles are needed. (That codepath wouldn't be covered by CI unfortunately then, though.)

I guess the more complete/proper way forward would make a mingw-w64 specific implementation of these (or some similar) functions too, if __USE_MINGW_ANSI_STDIO is set to 1. (No idea offhand how hard that would be though...)

I guess the more complete/proper way forward would make a mingw-w64 specific implementation of these (or some similar) functions too, if __USE_MINGW_ANSI_STDIO is set to 1. (No idea offhand how hard that would be though...)

Hypothetically, if this means implementing custom formatting routines, wouldn't it be better to just use the same implementation for both msvc and mingw-w64?

(I may experiment with this for a bit, perhaps it can reuse some parts of the std::to_chars implementation...)

I guess the more complete/proper way forward would make a mingw-w64 specific implementation of these (or some similar) functions too, if __USE_MINGW_ANSI_STDIO is set to 1. (No idea offhand how hard that would be though...)

Hypothetically, if this means implementing custom formatting routines, wouldn't it be better to just use the same implementation for both msvc and mingw-w64?

I didn’t mean implementing formatters from scratch (fwiw, libc++ does have that already for the modern std::format stuff too). Mingw-w64 already has got its own formatters that are used when __USE_MINGW_ANSI_STDIO is defined - I meant extending that with an entry point that takes a locale parameter. (Not sure if that’s straightforward or requires touching everything though.)

But on the other hand, now that libc++ does have its own formatters, I guess it could be useful to use that for these older formatting cases too, instead of relying on the libc?

Libcxx wants to finish all outstanding reviews for the migration to github PRs. I guess we should abandon this one for now then, and then reopen as PR when we've got more time to spend effort on it? (It's annoying as this solution is 95% there for all practical cases, but handling the long double case is really annoying...)

alvinhochun abandoned this revision.Sep 8 2023, 1:09 AM

Yeah, I don't have time to work on this now, and I don't have a good idea how to solve this either.