This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Forward to std::memcmp for trivially comparable types in equal
ClosedPublic

Authored by philnik on Dec 7 2022, 9:42 AM.

Diff Detail

Event Timeline

philnik created this revision.Dec 7 2022, 9:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2022, 9:42 AM
philnik updated this revision to Diff 481166.Dec 8 2022, 12:20 AM

Add a benchmark and tests

philnik updated this revision to Diff 481274.Dec 8 2022, 7:43 AM

Generate files

philnik updated this revision to Diff 481463.Dec 8 2022, 3:28 PM

Fix some stuff

philnik retitled this revision from [libc++] Forward to std::memcmp for trivially comparable types to [libc++] Forward to std::memcmp for trivially comparable types in equal.Dec 8 2022, 4:33 PM
philnik updated this revision to Diff 481490.Dec 8 2022, 6:09 PM

Try to fix CI

philnik updated this revision to Diff 481597.Dec 9 2022, 3:53 AM

Try to fix CI

philnik updated this revision to Diff 481867.Dec 10 2022, 9:18 AM
philnik published this revision for review.Dec 10 2022, 3:23 PM
--------------------------------------------------------
Benchmark                              old           new
--------------------------------------------------------
bm_equal_iter/1                   0.926 ns       2.08 ns
bm_equal_iter/2                    1.36 ns       2.20 ns
bm_equal_iter/3                    1.78 ns       2.10 ns
bm_equal_iter/4                    2.26 ns       2.10 ns
bm_equal_iter/5                    2.72 ns       2.09 ns
bm_equal_iter/6                    3.20 ns       2.10 ns
bm_equal_iter/7                    3.63 ns       2.10 ns
bm_equal_iter/8                    4.11 ns       2.10 ns
bm_equal_iter/16                   7.84 ns       2.10 ns
bm_equal_iter/64                   30.6 ns       1.89 ns
bm_equal_iter/512                   241 ns       6.41 ns
bm_equal_iter/4096                 1895 ns       40.9 ns
bm_equal_iter/32768               15152 ns        485 ns
bm_equal_iter/262144             120923 ns       4331 ns
bm_equal_iter/1048576            484883 ns      20014 ns
bm_equal/1                         1.16 ns       2.31 ns
bm_equal/2                         1.51 ns       2.32 ns
bm_equal/3                         1.85 ns       2.32 ns
bm_equal/4                         2.31 ns       2.32 ns
bm_equal/5                         2.87 ns       2.32 ns
bm_equal/6                         3.27 ns       2.32 ns
bm_equal/7                         3.75 ns       2.32 ns
bm_equal/8                         4.19 ns       2.32 ns
bm_equal/16                        7.88 ns       2.32 ns
bm_equal/64                        30.4 ns       2.09 ns
bm_equal/512                        241 ns       6.54 ns
bm_equal/4096                      1892 ns       41.1 ns
bm_equal/32768                    15078 ns        483 ns
bm_equal/262144                  120725 ns       4318 ns
bm_equal/1048576                 482791 ns      19228 ns
bm_ranges_equal/1                  1.16 ns       2.08 ns
bm_ranges_equal/2                  1.50 ns       2.09 ns
bm_ranges_equal/3                  1.85 ns       2.09 ns
bm_ranges_equal/4                  2.20 ns       2.09 ns
bm_ranges_equal/5                  2.54 ns       2.09 ns
bm_ranges_equal/6                  2.89 ns       2.09 ns
bm_ranges_equal/7                  3.24 ns       2.10 ns
bm_ranges_equal/8                  3.58 ns       2.09 ns
bm_ranges_equal/16                 6.36 ns       2.10 ns
bm_ranges_equal/64                 28.5 ns       1.87 ns
bm_ranges_equal/512                 183 ns       6.41 ns
bm_ranges_equal/4096               1428 ns       41.3 ns
bm_ranges_equal/32768             11390 ns        483 ns
bm_ranges_equal/262144            91268 ns       4187 ns
bm_ranges_equal/1048576          363866 ns      20177 ns

Note that this decreases binary size at the same time, since this forwards to a C library function now.

Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2022, 3:23 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
--------------------------------------------------------
Benchmark                              old           new
--------------------------------------------------------
bm_equal_iter/1                   0.926 ns       2.08 ns
bm_equal_iter/2                    1.36 ns       2.20 ns
bm_equal_iter/3                    1.78 ns       2.10 ns

I assume these are slower due to the extra function call overhead.

bm_equal_iter/32768 15152 ns 485 ns
bm_equal_iter/262144 120923 ns 4331 ns
bm_equal_iter/1048576 484883 ns 20014 ns

nice improvements!

Note that this decreases binary size at the same time, since this forwards to a C library function now.

For my curiosity can you share these numbers?

I'm in general happy with the patch, but I've some concern regarding one of the new type traits. I think it would be good to have a second pair of eyes for ranges experts.

libcxx/benchmarks/CMakeLists.txt
111

Why is this needed? Is it portable on all (Linux) platforms?

libcxx/benchmarks/algorithms/equal.bench.cpp
2

Please add the copyright header.

libcxx/include/__algorithm/comp.h
14

Can you add these to the type_traits header too.

libcxx/include/__algorithm/ranges_equal.h
24

Was pair already required?
Or is this to fix the modular build?
When it's the latter, the proper fix would be to export this header from __algorithm/unwrap_range.h.

libcxx/include/__type_traits/is_comparable.h
38

This looks really odd, and it would be nice to have comment why this design is chosen.
Why are the following types not in the set, floating-point values, enums, pointers.

Is __is_trivially_equality_comparable<int, float> true? The other way around isn't.

I really want some internal tests for this type trait.

libcxx/test/std/algorithms/robust_against_adl.compile.pass.cpp
14

Not entirely happy with this, but lets see what gcc-13 brings in the future.

philnik updated this revision to Diff 482961.Dec 14 2022, 12:24 PM
philnik marked 4 inline comments as done.

Address comments

libcxx/benchmarks/CMakeLists.txt
111

I'm not actually sure why it's needed, but on my system libbenchmark.a is in lib64 and not lib. I don't really see how it wouldn't be portable, since the directory just gets ignored if it doesn't exist. I didn't actually plan to change it in this patch, but it shouldn't be a huge problem.

libcxx/include/__type_traits/is_comparable.h
38

This looks really odd, and it would be nice to have comment why this design is chosen.

Could you elaborate on what's really odd about this design?

Why are the following types not in the set, floating-point values, enums, pointers.

floating-point: 0.f == -0.f is true, but they have different binary representations.
enum: AFAIK you are allowed to define your own operator== for enums, so we don't know that UserEnum::Value1 == UserEnum::Value2 is actually the same as comparing their binary representations unless we have compiler support. (That would even allow us to optimize for structs that have bool operator==(const Struct&) = default;)
pointers: are added through the specialization in line 40.

Is __is_trivially_equality_comparable<int, float> true? The other way around isn't.

No, is_same_v<inf, float> is false.

I really want some internal tests for this type trait.

I'll add some.

Mordante added inline comments.Dec 15 2022, 10:30 AM
libcxx/benchmarks/CMakeLists.txt
111

I've no strong objection against it, if breaks the benchmarks for users we can revert it.

libcxx/include/__algorithm/unwrap_iter.h
69

This will only be used in C++20 or later.

70
libcxx/include/__type_traits/is_comparable.h
38

Thanks I see I missed the is_same. I still feel it would be good to add some comment to this file what you mean with trivially equality comparable. It's not a "normal" term and it may be misleading

struct foo {
  int bar;
  bool operator==(const foo&) = default;
};

This struct is trivial, but not trivially equality comparable. I would not be surprised to see a paper proposing a similar type trait.

libcxx/test/libcxx/type_traits/is_trivially_comparable.compile.pass.cpp
37

Please add static_assert(!std::__is_trivially_equality_comparable<float, float>::value, ""); test

philnik updated this revision to Diff 487156.Jan 8 2023, 3:43 AM
philnik marked 3 inline comments as done.

Address comments

libcxx/include/__type_traits/is_comparable.h
38

I would actually consider your example to be trivially equality comparable, since memcmp would give the correct result. It's just that we can't detect this as trivially equality comparable without compiler support. (kind-of a tangent: would it be possible to detect with reflection? That would be really neat.)

philnik updated this revision to Diff 492360.Jan 26 2023, 1:48 AM

Try to fix CI

philnik updated this revision to Diff 494103.Feb 1 2023, 4:08 PM

Try to fix CI

ldionne requested changes to this revision.Feb 2 2023, 9:52 AM
ldionne added a subscriber: ldionne.

I like this change a lot, we seem to be the only implementation that doesn't do that yet.

libcxx/benchmarks/CMakeLists.txt
110–112

Otherwise you are also duplicating the other flags (sanitizer flags and -nodefaultlibs).

libcxx/benchmarks/algorithms/equal.bench.cpp
12

#include <vector>

libcxx/include/__algorithm/equal.h
1

We should release-note this change.

36–37

__equal_iter_trivial_impl is confusing -- it's the function that is used for non-trivially comparable types.

47

With the second suggestion below, you can actually inline __equal_iter_loop into this function directly.

58–62

I would simplify it to this. The current code has the following problems iMO:

  • It's checking that we're not in a constant expression and then it's calling __constexpr_memcmp, which is confusing.
  • It's checking for sizeof(_Tp) == 1 && !is_same<__remove_cv_t<_Tp>, bool>::value which seems rather clang specific (clang's __builtin_memcmp apparently works at compile-time with types that satisfy that). Since this is only a compile-time optimization, I would keep the code simple instead.

Otherwise, the other option I can see is:

template <class _Tp>
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 int
__constexpr_memcmp(const _Tp* __lhs, const _Tp* __rhs, size_t __count) {
  ... additional logic to be fast at compile-time when we're on Clang ...
}

__equal_iter_impl(_Tp* __first1, _Tp* __last1, _Up* __first2, _BinaryPredicate& /* unused */) {
  return std::__constexpr_memcmp(__first1, __first2, (__last1 - __first1) * sizeof(_Tp)) == 0;
}

I actually think I like this second option best.

88

Since we're about to make clang-format mandatory, I think this should be formatted as

_LIBCPP_NODISCARD inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
bool __equal_trivial_impl(args...) {
  ...
}

We have a lot of long lists of attributes in libc++ so this is going to be a common occurrence. Leaving the tiny name of the function alone at the end of the line IMO makes this really hard to read. If there's no way to make this work with clang-format built-in, I would do this:

_LIBCPP_NODISCARD inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 //
bool __equal_trivial_impl(args...) {
  ...
}

If clang-format had an option to prefer breaking after the attributes (or after the return type), that would be better than the status quo.

116–119

Same suggestion here.

libcxx/include/__algorithm/make_projected.h
58–66

I would considering inlining this below. I don't think it pulls its weigh anymore after this simplification.

101–102
libcxx/include/__type_traits/is_comparable.h
10

Let's name this is_equality_comparable.h since this is about equality.

35–36

Please write a comment explaining which types are considered trivially comparable and what's the reasoning behind that. In particular, please cover why e.g. float is not included, why it's OK to disregard cv-qualifiers, etc. This is a pretty fundamental type trait, it's worth putting some effort into it.

42–43

Can you please explain in a comment why this specialization is the way it is. What are the pitfalls, etc.

libcxx/include/__type_traits/is_predicate_kind.h
2

I would suggest naming this file predicate_traits.h. We can then have

__is_trivial_equality_predicate<Pred, T1, T2>
__is_trivial_weak_order_predicate<Pred, T1, T2> // to be added in a separate patch

Also __is_simple_comparator from sort.h should move here instead.

libcxx/include/cstring
117 ↗(On Diff #494103)

I think we should catch the case where we're calling this function with types for which memcmp doesn't do the right thing. I'd be happy with even a static_assert, I just think it would be silly to have a bug where we e.g. compare virtual-base/derived pointers and do the wrong thing at runtime when it could be avoided so easily.

libcxx/test/support/test_iterators.h
1328–1329

I think it would be better to instead introduce cv_qualified_versions<T>. Then you call it as cv_qualified_versions<T> and apply the pointer outside of that -- that'll be more reusable if we need this facility without a pointer type.

This revision now requires changes to proceed.Feb 2 2023, 9:52 AM
philnik updated this revision to Diff 494823.Feb 4 2023, 6:59 AM
philnik marked 18 inline comments as done.

Address comments

libcxx/include/__algorithm/equal.h
88

Actually, it looks like clang-format breaks after the return type if all the arguments fit on the same line. In this case the argument list is just so long that it doesn't fit anymore, so it splits after the parenthesis instead.

libcxx/include/__type_traits/is_comparable.h
42–43

I've put this in the general comment about trivially equality comparable.

philnik updated this revision to Diff 494837.Feb 4 2023, 9:06 AM

Generate files

philnik updated this revision to Diff 496750.Feb 12 2023, 3:36 AM

Try to fix CI

ldionne accepted this revision.EditedFeb 16 2023, 8:41 AM

LGTM with feedback applied. In particular, *please* address the test related feedback, since it's not clear to me that you are currently testing the actual changes in the patch (although the new equal tests are much much better than the previous ones, no doubt).

libcxx/include/__functional/operations.h
346–347

__is_trivial_equal_to_predicate_for<equal_to<>, _Tp, _Tp> should also be true, right?

libcxx/include/__type_traits/predicate_traits.h
22 ↗(On Diff #497269)

I think __is_trivial_equality_predicate is a better name -- when reading __is_trivial_equal_to_predicate_for, you can read either __is_trivial_equalto_predicate_for, or __is_trivial_equal_topredicatefor which keeps throwing me off.

libcxx/include/cstring
125–143 ↗(On Diff #497269)

I suggest this instead:

  if (__libcpp_is_constant_evaluated()) {
    // On Clang, we can use __builtin_memcmp at compile-time for non-bool 1-byte types.
#ifdef _LIBCPP_COMPILER_CLANG_BASED
    if constexpr (sizeof(_Tp) == 1 && !is_same<_Tp, bool>::value) {
      return __builtin_memcmp(__lhs, __rhs, __count);
    }
#endif

    while (...) { ... }
  } else {
    return __builtin_memcmp(__lhs, __rhs, __count);
  }
libcxx/test/libcxx/type_traits/is_trivially_comparable.compile.pass.cpp
37

You actually don't have float, float -- please add it.

Also please add tests for pointers to classes w/ inheritance.

libcxx/test/std/algorithms/alg.nonmodifying/alg.equal/equal.pass.cpp
15

Please make sure to test with an identity projection in ranges::equal, and also with std::equal_to<> and ranges::equal_to<>. Otherwise, the actual improvements you make in this patch are not tested.

This revision is now accepted and ready to land.Feb 16 2023, 8:41 AM
philnik updated this revision to Diff 498309.Feb 17 2023, 3:43 AM
philnik marked 5 inline comments as done.

Address comments

libcxx/test/std/algorithms/alg.nonmodifying/alg.equal/equal.pass.cpp
15

ranges::equal with identity and ranges::equal_to is checked in ranges.equal.pass.cpp. We don't have a negative test though, added one. I've added also tests with std::equal_to<>.

philnik updated this revision to Diff 498412.Feb 17 2023, 9:32 AM

Try to fix CI

philnik updated this revision to Diff 498592.Feb 18 2023, 8:37 AM

Try to fix CTry to fix CI

philnik updated this revision to Diff 499128.Feb 21 2023, 5:39 AM

Try to fix CI

This revision was landed with ongoing or failed builds.Feb 21 2023, 8:11 AM
This revision was automatically updated to reflect the committed changes.