This is an archive of the discontinued LLVM Phabricator instance.

P1144 "Trivially relocatable" (3/3): optimize std::vector reallocate/insert and std::swap for trivially relocatable types
AbandonedPublic

Authored by Mordante on Sep 12 2019, 2:51 PM.

Details

Reviewers
ldionne
zoecarver
EricWF
Quuxplusone
Group Reviewers
Restricted Project
Summary

Part 0/3 is D50119: it adds __has_extension(__is_trivially_relocatable) to the compiler.
Part 1/3 is D61761: it implements the library traits and algorithms mandated by P1144.
Part 2/3 is D63620: it implements Quality-of-Implementation features to warrant certain std library types as "trivially relocatable."
Part 3/3 is D67524: it implements Quality-of-Implementation features to optimize certain std library functions for better performance on types that have been warranted "trivially relocatable."

This patch has no ABI implications (or if it does, that's a bug, please tell me about it).

Notes on optimizations that I've made for https://p1144.godbolt.org/z/q2N_Ln but which are not part of this patch:

  • std::any can be made trivially relocatable, by changing _IsSmallObject to be false for non-trivially-relocatable types. However, that would break ABI.

Diff Detail

Event Timeline

Note: std::is_constant_evaluated() is implemented and you can use __libcpp_is_constant_evaluated before C++2a.

I'm going to play around with this patch (and the others) a bit. Then, I can give a more full review.

include/memory
1614 ↗(On Diff #220005)

Maybe leave a comment about what the various overloads mean.

Also, maybe add a default value (in case we want to use this outside of std::vector in the future).

5760 ↗(On Diff #220005)

Nit: it might be less confusing to just use ::allocator_type.

5770 ↗(On Diff #220005)

The more we can use _And the better. That way, at some point in the future, we can create an __and builtin type trait and improve build performance a bit.

include/vector
479 ↗(On Diff #220005)

We don't need to do this anymore! Every compiler that we support supports rvalues (I think).

Quuxplusone marked 3 inline comments as done.Sep 12 2019, 7:32 PM
Quuxplusone added inline comments.
include/memory
1614 ↗(On Diff #220005)

Defaulted function arguments are the devil.
I consider it a benefit that now it's impossible to call __construct_forward without explicitly saying whether you want to use memcpy or not; this is because the call to __construct_forward will usually be followed by a call to __destroy_at_end, and the final parameter to both calls must agree (or else you might call a constructor without the matching destructor, or vice versa).

Adding a comment is a good idea, but I'm not sure the best place to put it. Incidentally, in a previous revision I'd had something like

template<class X, class UseMemcpy>
static void foo(X x, UseMemcpy) { ... }
template<class X>
static void foo(X x, false_type) { ... }

where the first overload matches true_type; but that was unnecessary, error-prone, and not really any less confusing, so I trashed it.

5760 ↗(On Diff #220005)

This is actually doing SFINAE on the value of __is_uncustomized — which in user-defined specializations of allocator_traits will either be absent (if they went from scratch) or have the wrong value (if they derived from allocator_traits<allocator<T>> to fill in the boilerplate).

The __has_trivial_construct part of this patch is a watered-down version of D49317. For now and maybe forever, libc++ wisely doesn't permit users to specialize allocator_traits. (See https://quuxplusone.github.io/blog/2018/07/14/traits-classes/ .)
So I guess it would be fine for me to eliminate __has_customized_allocator_traits from this patch.

include/vector
479 ↗(On Diff #220005)

I think you're right, but then I'm confused by the similar code on the return type of move_if_noexcept (in trunk). What does _Tp&& mean, in C++03 mode? Is it a synonym for const _Tp&? If so, why does move_if_noexcept need that ifndef?

template <class _Tp>
inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR_AFTER_CXX11
#ifndef _LIBCPP_CXX03_LANG
typename conditional
<
    !is_nothrow_move_constructible<_Tp>::value && is_copy_constructible<_Tp>::value,
    const _Tp&,
    _Tp&&
>::type
#else  // _LIBCPP_CXX03_LANG
const _Tp&
#endif
move_if_noexcept(_Tp& __x) _NOEXCEPT
{
    return _VSTD::move(__x);
}
zoecarver added inline comments.Sep 13 2019, 9:38 AM
include/vector
479 ↗(On Diff #220005)

Clang supports rvalues in C++03 mode. So it will work just like in C++11 mode. move_if_noexcept probably just hasn't been updated. Example: https://godbolt.org/z/L2CJQa

Address most of @zoecarver's review comments. (Haven't looked into _And yet.)
Since __libcpp_is_constant_evaluated() is usable, use it in std::swap. (This passes all the test cases for swap, including the constexpr ones! Hooray!)

Quuxplusone retitled this revision from P1144 "Trivially relocatable" (3/3): optimize std::vector for trivially relocatable types to P1144 "Trivially relocatable" (3/3): optimize std::vector and std::swap for trivially relocatable types.Sep 13 2019, 4:24 PM
Quuxplusone edited the summary of this revision. (Show Details)
EricWF added a subscriber: EricWF.Sep 13 2019, 9:16 PM

How were the performance benefits measured? Are they visible in the existing benchmarks?

Add a new benchmark, and add a TriviallyRelocatableType to the matrix in algorithms.bench.cpp.

My expectation is that swap.bench.cpp should get noticeably faster after this patch.

My less confident expectation is that the new .*trivre.* test cases in algorithms.bench.cpp should initially be noticeably worse than the .*string.* cases, but after this patch they should catch up. What's happening there is that when you swap strings, you're calling string's own ADL overload of swap which is very fast, but when you swap TriviallyRelocatableTypes, you're calling the generic std::swap<TriviallyRelocatableType> which is not quite so fast until this patch makes it do memcpy.

I'm putting the benchmarks right here in this patch so that @EricWF can run them. (Zoe and I produced some good-looking benchmark numbers earlier tonight, with code similar to this code; but my build tree is too screwed up to reproduce them on demand.)

Fix a bug in my vector::insert; it wasn't exception-safe.

I'm putting the benchmarks right here in this patch so that @EricWF can run them. (Zoe and I produced some good-looking benchmark numbers earlier tonight, with code similar to this code; but my build tree is too screwed up to reproduce them on demand.)

@EricWF ping!

I'm putting the benchmarks right here in this patch so that @EricWF can run them. (Zoe and I produced some good-looking benchmark numbers earlier tonight, with code similar to this code; but my build tree is too screwed up to reproduce them on demand.)

@EricWF ping!

Rebased on master, and fixed an exception-safety bug in my patch to vector::insert/erase.

Quuxplusone marked 4 inline comments as done.Oct 29 2019, 2:35 PM

@EricWF ping!

Quuxplusone retitled this revision from P1144 "Trivially relocatable" (3/3): optimize std::vector and std::swap for trivially relocatable types to P1144 "Trivially relocatable" (3/3): optimize std::vector reallocate/insert and std::swap for trivially relocatable types.

Migrate to the monorepo.
@EricWF ping? What's the deal with this new benchmark?

Herald added a project: Restricted Project. · View Herald TranscriptJan 1 2020, 4:54 PM
ldionne added a reviewer: Restricted Project.Nov 2 2020, 2:25 PM
Mordante commandeered this revision.Nov 1 2023, 10:53 AM
Mordante added a reviewer: Quuxplusone.
Mordante added a subscriber: Mordante.

Since P1144 is still in flight in WG21 abandoning this patch for now. When the paper is approved we can look it up again and see how outdated it is.

Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2023, 10:53 AM
Herald added a subscriber: sunshaoce. · View Herald Transcript
Mordante abandoned this revision.Nov 1 2023, 10:53 AM