
[libc++][format] Improves width estimate.
Needs Review · Public

Authored by Mordante on Feb 21 2023, 8:37 AM.

Details

Reviewers
ldionne
vitaut
tahonermann
Group Reviewers
Restricted Project
Summary

As is obvious from the paper's title, this is an LWG issue and is thus
retroactively applied to C++20. This change may alter the output for certain
code points:

1. Considers 8477 extra code points as having a width of 2 (as of Unicode 15),
   mostly Tangut Ideographs
2. Changes the width of 85 unassigned code points from 2 to 1
3. Changes the width of 8 code points (in the range U+3248 CIRCLED NUMBER
   TEN ON BLACK SQUARE ... U+324F CIRCLED NUMBER EIGHTY ON BLACK
   SQUARE) from 2 to 1, because it seems questionable to make an exception
   for those without input from Unicode

Note that libc++ already uses Unicode 15, while the Standard requires Unicode 12.
(The last time I checked MSVC STL used Unicode 14.)

So in practice the only notable change is item 3.

Implements

P2675 LWG3780: The Paper (format's width estimation is too approximate and not forward compatible)

Benchmark before these changes

Benchmark                       Time       CPU  Iterations

BM_ascii_text<char>          3928 ns   3928 ns      178131
BM_unicode_text<char>       75231 ns  75230 ns        9158
BM_cyrillic_text<char>      59837 ns  59834 ns       11529
BM_japanese_text<char>      39842 ns  39832 ns       17501
BM_emoji_text<char>          3931 ns   3930 ns      177750
BM_ascii_text<wchar_t>       4024 ns   4024 ns      174190
BM_unicode_text<wchar_t>    63756 ns  63751 ns       11136
BM_cyrillic_text<wchar_t>   44639 ns  44638 ns       15597
BM_japanese_text<wchar_t>   34425 ns  34424 ns       20283
BM_emoji_text<wchar_t>       3937 ns   3937 ns      177684

Benchmark after these changes

Benchmark                       Time       CPU  Iterations

BM_ascii_text<char>          3914 ns   3913 ns      178814
BM_unicode_text<char>       70380 ns  70378 ns        9694
BM_cyrillic_text<char>      51889 ns  51877 ns       13488
BM_japanese_text<char>      41707 ns  41705 ns       16723
BM_emoji_text<char>          3908 ns   3907 ns      177912
BM_ascii_text<wchar_t>       3949 ns   3948 ns      177525
BM_unicode_text<wchar_t>    64591 ns  64587 ns       10649
BM_cyrillic_text<wchar_t>   44089 ns  44078 ns       15721
BM_japanese_text<wchar_t>   39369 ns  39367 ns       17779
BM_emoji_text<wchar_t>       3936 ns   3934 ns      177821

Benchmarks without "if(code_point < (entries[0] >> 14))"

Benchmark                       Time       CPU  Iterations

BM_ascii_text<char>          3922 ns   3922 ns      178587
BM_unicode_text<char>       94474 ns  94474 ns        7351
BM_cyrillic_text<char>      69202 ns  69200 ns       10157
BM_japanese_text<char>      42735 ns  42692 ns       16382
BM_emoji_text<char>          3920 ns   3919 ns      178704
BM_ascii_text<wchar_t>       3951 ns   3950 ns      177224
BM_unicode_text<wchar_t>    81003 ns  80988 ns        8668
BM_cyrillic_text<wchar_t>   57020 ns  57018 ns       12048
BM_japanese_text<wchar_t>   39695 ns  39687 ns       17582
BM_emoji_text<wchar_t>       3977 ns   3976 ns      176479

This optimization does carry its weight for the Unicode and Cyrillic
tests: without it they are roughly 25% to 34% slower than with it. For
the Japanese tests the gains are minor, and for the ASCII and emoji
tests it seems to have no effect.
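
The early exit measured above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in the patch: the names `pack`, `entries`, `table_upper_bound` and `estimated_width` are hypothetical, the table is a tiny illustrative subset, and the packing (first code point of a width-2 range in the upper 18 bits, number of following code points in the lower 14 bits) is an assumption inferred from the expression `entries[0] >> 14`.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Assumed packing: upper 18 bits hold the first code point of a width-2
// range, lower 14 bits hold how many code points follow it in that range.
constexpr std::uint32_t pack(std::uint32_t first, std::uint32_t last) {
  return (first << 14) | (last - first);
}

// Illustrative subset only, sorted by first code point. Not the real table.
inline constexpr std::array<std::uint32_t, 4> entries{
    pack(0x1100, 0x115f),   // Hangul Jamo initial consonants
    pack(0x3041, 0x3096),   // Hiragana letters
    pack(0xac00, 0xd7a3),   // Hangul Syllables
    pack(0x1f300, 0x1f64f), // pictographs and emoticons
};

// Last code point covered by the table above.
inline constexpr std::uint32_t table_upper_bound = 0x1f64f;

inline int estimated_width(char32_t code_point) noexcept {
  // Everything above the last range has width 1. This check also guarantees
  // that the shift by 14 below cannot overflow 32 bits.
  if (code_point > table_upper_bound)
    return 1;

  // Early exit: code points below the first range (ASCII, Latin, Cyrillic,
  // Greek, ...) skip the binary search entirely.
  if (code_point < (entries[0] >> 14))
    return 1;

  // Find the first entry whose range starts after code_point. Setting all 14
  // size bits in the key makes an entry that starts exactly at code_point
  // compare as not greater than the key.
  const std::uint32_t key = (static_cast<std::uint32_t>(code_point) << 14) | 0x3fffu;
  auto it = std::upper_bound(entries.begin(), entries.end(), key);

  // The early exit guarantees entries[0] <= key, so it != entries.begin().
  --it;
  const std::uint32_t last = (*it >> 14) + (*it & 0x3fffu);
  return code_point <= last ? 2 : 1;
}
```

With this layout, text made of code points below the first table entry never reaches the binary search, which is consistent with the Unicode and Cyrillic benchmarks being the ones that benefit most.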

Diff Detail

Unit Tests: Failed

Time  Test
12,090 ms  libcxx CI Clang-cl (Static) > llvm-libc++-static-clangcl-cfg-in.std/thread/thread_mutex/thread_lock/thread_lock_unique/thread_lock_unique_cons::mutex_duration.pass.cpp
Script: -- : 'COMPILED WITH'; 'C:/Program Files/LLVM/bin/clang-cl.exe' C:\ws\w7\llvm-project\libcxx-ci\libcxx\test\std\thread\thread.mutex\thread.lock\thread.lock.unique\thread.lock.unique.cons\mutex_duration.pass.cpp --driver-mode=g++ --target=x86_64-pc-windows-msvc -nostdinc++ -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/libcxx/test/support -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_STDIO_ISO_WIDE_SPECIFIERS -DNOMINMAX -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -llibc++experimental -nostdlib -L C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/lib -llibc++ -lmsvcrt -lmsvcprt -loldnames -o C:\ws\w7\llvm-project\libcxx-ci\build\clang-cl-static\test\std\thread\thread.mutex\thread.lock\thread.lock.unique\thread.lock.unique.cons\Output\mutex_duration.pass.cpp.dir\t.tmp.exe
12,350 ms  libcxx CI Clang-cl (Static) > llvm-libc++-static-clangcl-cfg-in.std/thread/thread_mutex/thread_lock/thread_lock_unique/thread_lock_unique_cons::mutex_time_point.pass.cpp
Script: -- : 'COMPILED WITH'; 'C:/Program Files/LLVM/bin/clang-cl.exe' C:\ws\w7\llvm-project\libcxx-ci\libcxx\test\std\thread\thread.mutex\thread.lock\thread.lock.unique\thread.lock.unique.cons\mutex_time_point.pass.cpp --driver-mode=g++ --target=x86_64-pc-windows-msvc -nostdinc++ -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/libcxx/test/support -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_STDIO_ISO_WIDE_SPECIFIERS -DNOMINMAX -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -llibc++experimental -nostdlib -L C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/lib -llibc++ -lmsvcrt -lmsvcprt -loldnames -o C:\ws\w7\llvm-project\libcxx-ci\build\clang-cl-static\test\std\thread\thread.mutex\thread.lock\thread.lock.unique\thread.lock.unique.cons\Output\mutex_time_point.pass.cpp.dir\t.tmp.exe
12,470 ms  libcxx CI Clang-cl (Static) > llvm-libc++-static-clangcl-cfg-in.std/thread/thread_mutex/thread_mutex_requirements/thread_sharedtimedmutex_requirements/thread_sharedtimedmutex_class::try_lock_for.pass.cpp
Script: -- : 'COMPILED WITH'; 'C:/Program Files/LLVM/bin/clang-cl.exe' C:\ws\w7\llvm-project\libcxx-ci\libcxx\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\try_lock_for.pass.cpp --driver-mode=g++ --target=x86_64-pc-windows-msvc -nostdinc++ -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/libcxx/test/support -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_STDIO_ISO_WIDE_SPECIFIERS -DNOMINMAX -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -llibc++experimental -nostdlib -L C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/lib -llibc++ -lmsvcrt -lmsvcprt -loldnames -o C:\ws\w7\llvm-project\libcxx-ci\build\clang-cl-static\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\Output\try_lock_for.pass.cpp.dir\t.tmp.exe
12,800 ms  libcxx CI Clang-cl (Static) > llvm-libc++-static-clangcl-cfg-in.std/thread/thread_mutex/thread_mutex_requirements/thread_sharedtimedmutex_requirements/thread_sharedtimedmutex_class::try_lock_shared.pass.cpp
Script: -- : 'COMPILED WITH'; 'C:/Program Files/LLVM/bin/clang-cl.exe' C:\ws\w7\llvm-project\libcxx-ci\libcxx\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\try_lock_shared.pass.cpp --driver-mode=g++ --target=x86_64-pc-windows-msvc -nostdinc++ -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/libcxx/test/support -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_STDIO_ISO_WIDE_SPECIFIERS -DNOMINMAX -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -llibc++experimental -nostdlib -L C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/lib -llibc++ -lmsvcrt -lmsvcprt -loldnames -o C:\ws\w7\llvm-project\libcxx-ci\build\clang-cl-static\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\Output\try_lock_shared.pass.cpp.dir\t.tmp.exe
18,090 ms  libcxx CI Clang-cl (Static) > llvm-libc++-static-clangcl-cfg-in.std/thread/thread_mutex/thread_mutex_requirements/thread_sharedtimedmutex_requirements/thread_sharedtimedmutex_class::try_lock_shared_for.pass.cpp
Script: -- : 'COMPILED WITH'; 'C:/Program Files/LLVM/bin/clang-cl.exe' C:\ws\w7\llvm-project\libcxx-ci\libcxx\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\try_lock_shared_for.pass.cpp --driver-mode=g++ --target=x86_64-pc-windows-msvc -nostdinc++ -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/include/c++/v1 -I C:/ws/w7/llvm-project/libcxx-ci/libcxx/test/support -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_STDIO_ISO_WIDE_SPECIFIERS -DNOMINMAX -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -llibc++experimental -nostdlib -L C:/ws/w7/llvm-project/libcxx-ci/build/clang-cl-static/lib -llibc++ -lmsvcrt -lmsvcprt -loldnames -o C:\ws\w7\llvm-project\libcxx-ci\build\clang-cl-static\test\std\thread\thread.mutex\thread.mutex.requirements\thread.sharedtimedmutex.requirements\thread.sharedtimedmutex.class\Output\try_lock_shared_for.pass.cpp.dir\t.tmp.exe
View Full Test Results (6 Failed)

Event Timeline

Mordante created this revision. · Feb 21 2023, 8:37 AM
Herald added a project: Restricted Project. · Feb 21 2023, 8:37 AM
Herald added a subscriber: arichardson.
Mordante updated this revision to Diff 499195. · Feb 21 2023, 8:39 AM

Retrigger CI.

Mordante updated this revision to Diff 499521. · Feb 22 2023, 8:21 AM

CI fixes.

Mordante updated this revision to Diff 499573. · Feb 22 2023, 10:09 AM

Fixes copy-paste error.

Mordante updated this revision to Diff 500421. · Sat, Feb 25, 5:18 AM

Rebased to trigger CI.

Mordante published this revision for review. · Thu, Mar 9, 8:50 AM
Mordante added reviewers: ldionne, vitaut, tahonermann.
Herald added a project: Restricted Project. · Thu, Mar 9, 8:50 AM
Herald added a reviewer: Restricted Project.

I'd like the other folks on the review to take a look first, since they are more knowledgeable about Unicode.