Page MenuHomePhabricator

Avoid copy of __atoms when char_type is char

Authored by hiraditya on Feb 22 2017, 12:47 PM.



The function num_get<_CharT>::stage2_int_prep makes unnecessary copy of src into atoms when char_type is char. This can be avoided by creating
a switch on type and just returning __src when char_type is char. Running a synthetic benchmark shows the impact of this change:

The test case can be found here:

Without the change with llvm-project/trunk

$ export LD_LIBRARY_PATH=/work/llvm-project/install/lib; ./cxx/stringstream.bench.cpp.out
Run on (24 X 1200 MHz CPU s)
2017-02-22 14:37:34
Benchmark                        Time           CPU Iterations
BM_Istream_numbers/32         8328 ns       8336 ns      83121
BM_Istream_numbers/64         8312 ns       8320 ns      83754
BM_Istream_numbers/128        8301 ns       8309 ns      83975
BM_Istream_numbers/256        8298 ns       8306 ns      84349
BM_Istream_numbers/512        8303 ns       8311 ns      84308
BM_Istream_numbers/1024       8301 ns       8309 ns      84316

With the change on llvm-project/trunk

$ export LD_LIBRARY_PATH=/work/llvm-project/install-sstream/lib; ./cxx/stringstream.bench.cpp.out
Run on (24 X 1200 MHz CPU s)
2017-02-22 14:37:55
Benchmark                        Time           CPU Iterations
BM_Istream_numbers/32         7465 ns       7472 ns      91957
BM_Istream_numbers/64         7460 ns       7467 ns      93824
BM_Istream_numbers/128        7457 ns       7464 ns      93875
BM_Istream_numbers/256        7456 ns       7463 ns      93781
BM_Istream_numbers/512        7455 ns       7462 ns      93793
BM_Istream_numbers/1024       7457 ns       7464 ns      93757

Diff Detail


Event Timeline

hiraditya created this revision.Feb 22 2017, 12:47 PM
hiraditya edited the summary of this revision. (Show Details)
EricWF requested changes to this revision.Feb 28 2017, 9:53 PM

__num_get is externally instantiated. Adding non-template methods or changing method signatures is ABI breaking (As shown by the check-cxx-abilist rule output).

This change needs to be made in a way that maintains ABI compatibility, or it needs to be guarded under a _LIBCPP_ABI_FOO macro and only exposed in ABI v2.

This revision now requires changes to proceed.Feb 28 2017, 9:53 PM
hiraditya updated this revision to Diff 90214.Mar 1 2017, 11:37 AM
hiraditya edited edge metadata.

Guarded the new implementation with _LIBCPP_ABI_OPTIMIZED_LOCALE macro to avoid ABI incompatibilities.

OK, so ABI wise this seems good. I'll need a fresh head to review the locale changes for correctness. I'll get around to that tomorrow.

891 ↗(On Diff #90214)

Would it be possible to reduce the amount of duplicate code between the two implementations? Currently it seems like there is a lot.

hiraditya updated this revision to Diff 96159.Apr 21 2017, 9:38 AM

Minimized the diff by putting #ifdefs inside the function.

891 ↗(On Diff #90214)

Sorry for the late response, I have tried to minimize the duplication.

This revision now requires changes to proceed.May 1 2017, 9:12 AM
EricWF accepted this revision.Jun 13 2017, 12:31 PM


Could you please add the benchmark under libcxx/benchmarks.

Could you also make sure that the existing tests (under test/std) for number parsing have sufficient test coverages for the possible formats it might need to parse? Just to ensure this change isn't missing anything (or the old change isn't)

Once again sorry for the massive delay. <locale> patches are tricky, and I barely understand <locale> as is.

79 ↗(On Diff #96159)

Could you make the macro name slightly more descriptive, in case we have further optimizations to locale :-)

Also please add a short comment describing the change.

969 ↗(On Diff #96159)

Could you lift the 26 into a const int __atoms_size = 42 that's shared between both #if branches?

This revision is now accepted and ready to land.Jun 13 2017, 12:31 PM

Addressed comments from Eric.

This revision was automatically updated to reflect the committed changes.