Page MenuHomePhabricator

[libc++] Forward to std::{,w}memcmp in std::find
Needs ReviewPublic

Authored by philnik on Feb 20 2023, 6:57 AM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project

Diff Detail

Unit TestsFailed

TimeTest
2,070 mslibcxx CI C++03 > llvm-libc++-shared-cfg-in.libcxx::clang_tidy.sh.cpp
Script: -- : 'RUN: at line 15'; clang-tidy-16 /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/libcxx/clang_tidy.sh.cpp --warnings-as-errors=* -header-filter=.* --checks='-*,libcpp-*' --load=/home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/libcxx/test/tools/clang_tidy_checks/libcxx-tidy.plugin -- -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++03 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -fno-modules
2,070 mslibcxx CI C++03 > llvm-libc++-shared-cfg-in.libcxx::clang_tidy.sh.cpp
Script: -- : 'RUN: at line 15'; clang-tidy-16 /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/libcxx/clang_tidy.sh.cpp --warnings-as-errors=* -header-filter=.* --checks='-*,libcpp-*' --load=/home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/libcxx/test/tools/clang_tidy_checks/libcxx-tidy.plugin -- -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++03 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -fno-modules
2,070 mslibcxx CI C++03 > llvm-libc++-shared-cfg-in.libcxx::clang_tidy.sh.cpp
Script: -- : 'RUN: at line 15'; clang-tidy-16 /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/libcxx/clang_tidy.sh.cpp --warnings-as-errors=* -header-filter=.* --checks='-*,libcpp-*' --load=/home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/libcxx/test/tools/clang_tidy_checks/libcxx-tidy.plugin -- -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++03 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -fno-modules
2,070 mslibcxx CI C++03 > llvm-libc++-shared-cfg-in.libcxx::clang_tidy.sh.cpp
Script: -- : 'RUN: at line 15'; clang-tidy-16 /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/libcxx/clang_tidy.sh.cpp --warnings-as-errors=* -header-filter=.* --checks='-*,libcpp-*' --load=/home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/libcxx/test/tools/clang_tidy_checks/libcxx-tidy.plugin -- -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++03 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -fno-modules
2,070 mslibcxx CI C++03 > llvm-libc++-shared-cfg-in.libcxx::clang_tidy.sh.cpp
Script: -- : 'RUN: at line 15'; clang-tidy-16 /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/libcxx/clang_tidy.sh.cpp --warnings-as-errors=* -header-filter=.* --checks='-*,libcpp-*' --load=/home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/libcxx/test/tools/clang_tidy_checks/libcxx-tidy.plugin -- -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/build/generic-cxx03/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/5c9ee79be048-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++03 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -fno-modules
View Full Test Results (38,255 Failed)

Event Timeline

philnik created this revision.Feb 20 2023, 6:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2023, 6:57 AM

Benchmarks:

---------------------------------------------------------------
Benchmark                                   old             new
---------------------------------------------------------------
bm_find<char>/1                         10.5 ns         10.5 ns
bm_find<char>/2                         18.3 ns         10.4 ns
bm_find<char>/3                         21.1 ns         10.4 ns
bm_find<char>/4                         22.4 ns         10.3 ns
bm_find<char>/5                         23.3 ns         10.3 ns
bm_find<char>/6                         23.7 ns         10.4 ns
bm_find<char>/7                         24.0 ns         10.3 ns
bm_find<char>/8                         24.5 ns         10.2 ns
bm_find<char>/16                        26.2 ns         10.2 ns
bm_find<char>/64                        29.9 ns         20.8 ns
bm_find<char>/512                       80.2 ns         36.0 ns
bm_find<char>/4096                       523 ns         43.0 ns
bm_find<char>/32768                     3838 ns          105 ns
bm_find<char>/262144                   30832 ns          988 ns
bm_find<char>/1048576                 122541 ns         4854 ns
bm_find<short>/1                        10.5 ns         10.4 ns
bm_find<short>/2                        18.3 ns         18.5 ns
bm_find<short>/3                        21.2 ns         21.4 ns
bm_find<short>/4                        22.6 ns         22.7 ns
bm_find<short>/5                        23.6 ns         23.6 ns
bm_find<short>/6                        24.1 ns         24.2 ns
bm_find<short>/7                        24.5 ns         24.6 ns
bm_find<short>/8                        24.6 ns         24.8 ns
bm_find<short>/16                       26.1 ns         26.6 ns
bm_find<short>/64                       33.3 ns         31.3 ns
bm_find<short>/512                       134 ns         80.4 ns
bm_find<short>/4096                      967 ns          499 ns
bm_find<short>/32768                    7670 ns         3842 ns
bm_find<short>/262144                  61106 ns        31371 ns
bm_find<short>/1048576                244510 ns       122120 ns
bm_find<int>/1                          10.4 ns         12.1 ns
bm_find<int>/2                          18.4 ns         12.0 ns
bm_find<int>/3                          21.9 ns         12.0 ns
bm_find<int>/4                          23.3 ns         11.9 ns
bm_find<int>/5                          24.2 ns         12.1 ns
bm_find<int>/6                          24.7 ns         12.0 ns
bm_find<int>/7                          24.8 ns         11.9 ns
bm_find<int>/8                          25.2 ns         11.9 ns
bm_find<int>/16                         26.2 ns         21.9 ns
bm_find<int>/64                         30.5 ns         33.8 ns
bm_find<int>/512                        80.4 ns         40.4 ns
bm_find<int>/4096                        497 ns         71.7 ns
bm_find<int>/32768                      3844 ns          498 ns
bm_find<int>/262144                    30722 ns         4901 ns
bm_find<int>/1048576                  123303 ns        22864 ns
bm_ranges_find<char>/1                  10.4 ns         10.4 ns
bm_ranges_find<char>/2                  18.3 ns         10.4 ns
bm_ranges_find<char>/3                  21.1 ns         10.4 ns
bm_ranges_find<char>/4                  22.3 ns         10.3 ns
bm_ranges_find<char>/5                  23.2 ns         10.3 ns
bm_ranges_find<char>/6                  23.6 ns         10.3 ns
bm_ranges_find<char>/7                  24.0 ns         10.4 ns
bm_ranges_find<char>/8                  24.4 ns         10.5 ns
bm_ranges_find<char>/16                 26.0 ns         10.3 ns
bm_ranges_find<char>/64                 30.6 ns         20.9 ns
bm_ranges_find<char>/512                80.2 ns         36.2 ns
bm_ranges_find<char>/4096                496 ns         43.1 ns
bm_ranges_find<char>/32768              3844 ns          106 ns
bm_ranges_find<char>/262144            30632 ns          994 ns
bm_ranges_find<char>/1048576          123127 ns         5099 ns
bm_ranges_find<short>/1                 10.5 ns         10.5 ns
bm_ranges_find<short>/2                 18.4 ns         18.3 ns
bm_ranges_find<short>/3                 21.2 ns         21.2 ns
bm_ranges_find<short>/4                 22.7 ns         22.5 ns
bm_ranges_find<short>/5                 23.6 ns         23.4 ns
bm_ranges_find<short>/6                 24.1 ns         23.8 ns
bm_ranges_find<short>/7                 24.4 ns         24.1 ns
bm_ranges_find<short>/8                 24.7 ns         24.5 ns
bm_ranges_find<short>/16                26.2 ns         26.1 ns
bm_ranges_find<short>/64                33.2 ns         30.4 ns
bm_ranges_find<short>/512                134 ns         80.4 ns
bm_ranges_find<short>/4096               969 ns          506 ns
bm_ranges_find<short>/32768             7648 ns         3853 ns
bm_ranges_find<short>/262144           61198 ns        30507 ns
bm_ranges_find<short>/1048576         248057 ns       122080 ns
bm_ranges_find<int>/1                   10.5 ns         11.8 ns
bm_ranges_find<int>/2                   18.5 ns         11.9 ns
bm_ranges_find<int>/3                   22.0 ns         11.9 ns
bm_ranges_find<int>/4                   23.4 ns         11.8 ns
bm_ranges_find<int>/5                   24.3 ns         11.8 ns
bm_ranges_find<int>/6                   24.7 ns         12.0 ns
bm_ranges_find<int>/7                   24.8 ns         11.9 ns
bm_ranges_find<int>/8                   25.1 ns         11.8 ns
bm_ranges_find<int>/16                  26.2 ns         21.9 ns
bm_ranges_find<int>/64                  30.7 ns         33.8 ns
bm_ranges_find<int>/512                 80.4 ns         39.7 ns
bm_ranges_find<int>/4096                 498 ns         72.6 ns
bm_ranges_find<int>/32768               3842 ns          506 ns
bm_ranges_find<int>/262144             30566 ns         5311 ns
bm_ranges_find<int>/1048576           123103 ns        22743 ns
philnik updated this revision to Diff 505808.Thu, Mar 16, 8:01 AM

Fix formatting

ldionne published this revision for review.Thu, Mar 16, 9:27 AM
ldionne added a subscriber: ldionne.
ldionne added inline comments.
libcxx/include/__algorithm/ranges_find.h
42–48

Maybe this conveys the intent a bit better?

libcxx/include/__string/constexpr_c_functions.h
65–87

I would suggest reorganizing like this, to make it more obvious what the intent of using __builtin_char_memchr is.

if (__libcpp_is_constant_evaluated()) {
  // use __builtin_char_memchr to optimize constexpr evaluation if we can
#if __has_builtin(__builtin_char_memchr) 
  if (is_same<_Tp, char>::value)
    return __builtin_char_memchr(__str, __char, __count);
#endif

  for (; __count; --__count) {
    if (*__str == __char)
      return __str;
    ++__str;
  }
  return nullptr;
} else {
  return static_cast<_Tp*>(__builtin_memchr(__str, __char, __count));
}
67
libcxx/include/cwchar
231

static_assert("diagnostic")

You also seem to be missing an important requirement on _Up? You need _Up to be trivially copyable to wchar_t or else your std::wmemcpy may not be valid? I would actually just replace _Up by wchar_t since we don't need the additional generalization, at least for now.

232

You probably want to remove_cv_t<_Tp> here to also catch the case where _Tp* == wchar_t const*.

233–234

Now you don't need this anymore if you take wchar_t __char as input.

235

Does reintepret_cast<T>(object-of-type-T) work inside constexpr? If not then this doesn't compile when constant-evaluated, and it means your test coverage is also insufficient in that area. Either way please make sure that is tested.

Herald added a project: Restricted Project. · View Herald TranscriptThu, Mar 16, 9:27 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne requested changes to this revision.Thu, Mar 16, 9:27 AM
This revision now requires changes to proceed.Thu, Mar 16, 9:27 AM
philnik updated this revision to Diff 506290.Sat, Mar 18, 5:48 AM
philnik marked 7 inline comments as done.

Address comments

philnik updated this revision to Diff 506371.Sun, Mar 19, 3:00 AM

Try to fix CI

Thanks for working on this, the benchmarks look very nice!

libcxx/benchmarks/algorithms/find.bench.cpp
9

I really would like the benchmark results you posted in the commit message.

libcxx/include/__algorithm/find.h
54

Why not test is_same_v<_Tp, wchar_t>?

58

Does this function work with char16_t or char32_t (depending on sizeof(wchar_t).)

libcxx/include/__string/constexpr_c_functions.h
73–83

Note that the current way it's written generates unneeded code

libcxx/include/cwchar
240

This seems duplicated code from the char version, can we combine it in a templated function.
The same issue with not using else in if constexpr applies here too,

philnik marked 2 inline comments as done.Wed, Mar 22, 1:45 PM
philnik added inline comments.
libcxx/include/__algorithm/find.h
54

Because that would disable the optimization for other types that have the same size as wchar_t. For these types this optimization is probably UB, but that shouldn't be a problem in practice, since it's on an ABI boundary and no compiler currently optimizes wmemchr specifically (and IMO no compiler should ever optimize based on this, since there is no potential gain I can see).

58

See my comment above.

libcxx/include/__string/constexpr_c_functions.h
73–83

This looks quite surprising to me, and I'm not sure it's worth the little bit of time the compiler spends on this.

libcxx/include/cwchar
240

I don't think it's worth it. The loop is really simple, and deduplicating this seems like a lot of extra complexity compared to the benefit.

ldionne added inline comments.Thu, Mar 23, 10:20 AM
libcxx/include/cwchar
232

Should we also assert the we have the same alignment as wchar_t here?

248–249

For the non-constexpr case, I would do this to make sure that we really never exploit any UB, even if it may seem to be benign right now:

if constexpr (std::is_trivial_v<_Tp>) { // we know it's safe to reinterpret cast (at least in C++20 with implicit lifetime types)
    return reinterpret_cast<_Tp*>(std::wmemchr(reinterpret_cast<__apply_cv<_Tp, wchar_t>::type*>(__str), __char, __count));
} else {
    // loop
}

Concretely, this won't change anything since we already only call this function with integral types, however if we were to e.g. specialize __is_trivially_equality_comparable for a non-trivial type, then my suggested code would work whereas the original code might be (benign) UB. WDYT?

libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find.pass.cpp
10

We should also update the tests for ranges::find.

38

We should also have a test that checks that if multiple elements match, we return the first one.

53

Let's also add a test for an empty range.

57

This shouldn't do the same compare that would be generated anyway, otherwise we are not testing properly that we don't use memchr.

59

We also need to test with types that have different sizes so we trigger the memchr and the wmemchr code paths. Right now we don't trigger both with non-integral types.

Mordante added inline comments.Fri, Mar 24, 10:23 AM
libcxx/include/__algorithm/find.h
54

The function you call has the following signature

inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 const wchar_t*
__constexpr_wmemchr(const wchar_t* __str, wchar_t __char, size_t __count)

So how can this work when _Tp is not a wchar_t? That requires a reinterpret_cast which is not allowed during constant evaluation. What am I missing?

libcxx/include/__string/constexpr_c_functions.h
73–83

This is something we have used before. Now it will generate dead code.

libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find.pass.cpp
78

I really would like to see tests for all types that will use this optimization. Especially to make sure the code is valid during constant evaluation.

philnik marked 2 inline comments as done.Fri, Mar 24, 10:40 AM
philnik added inline comments.
libcxx/include/__algorithm/find.h
54

Your signature is wrong, __constexpr_wmemchr is templated.

libcxx/include/__string/constexpr_c_functions.h
73–83

clang doesn't even emit the condition from the fronted AFAICT, let alone the backend: https://godbolt.org/z/87bvzbhbT. This is at best a minimal compile time speedup.

libcxx/test/std/algorithms/alg.nonmodifying/alg.find/find.pass.cpp
78

We currently only have integral types and pointers marked trivially equality comparable, and I don't think it's worth testing for pointers specifically, since they are for all intents and purposes integers with a funny syntax in this case.