This is an archive of the discontinued LLVM Phabricator instance.

Use trivial relocation operations in std::vector, by porting D67524 and part of D61761 to work on top of the changes in D114732.
Needs ReviewPublic

Authored by devin.jeanpierre on Feb 9 2022, 4:02 PM.

Details

Reviewers
Quuxplusone
ldionne
philnik
Group Reviewers
Restricted Project
Summary

This change is based on work from @Quuxplusone in D67524 and D61761. All credit to them, all blame to me for any errors or problems, as I may have accidentally made it worse. :)

Previous change: D114732


The benefit here is that it can make trivially relocatable types -- by which, for now, we mean "types which are trivial for calls" -- easy to optimize. I have measured performance improvements of up to 0.25% on some large server macrobenchmarks.

For microbenchmarks, with my copy of clang, this change makes std::vector<std::unique_ptr<T>>::push_back ~38% faster when trivial_abi is enabled on unique_ptr.

This also also to unlocks further optimizations: it is, after this change, even more profitable to convert an existing type to trivial_abi, e.g. to enable it on even more types in the standard library than just the smart pointers. For example: it could become more worthwhile, after this change, to try to mark std::string as [[trivial_abi]].


Changes from D61761 and D67524:

  1. This change does not include support for std::swap (*relatively* low value, and very difficult to do correctly), just vector operations. It also does not include the relocate_at family of functions.
  2. (minor) Use __is_default_allocator instead of a new __is_exactly_std_allocator. At least to my eye, these seemed identical.
  3. bugfix to erase: it would memove N *bytes*, instead of N *items*.
  4. The tests now use std::conditional to enable trivial relocatability, as [[clang::trivial_abi]] doesn't take a bool parameter.
  5. Due to a (new?) warning that breaks the test, I had to add a layer of indirection to the countdown in insert_exception_safety_pass.
  6. in is_trivially_relocatable.pass.cpp:
    • I renamed the classes to suit my personal taste (e.g. NonEmpty->Polymorphic, as it's a nonempty polymorphic class).
      • I moved the static_asserts directly into main, so that I could see which one failed. This drops the test that is_trivially_relocatable and is_trivially_relocatable_v are identical, but this can be verified by inspection.
      • The test asserted that const MoveOnly2 was trivially relocatable, even though it was not trivially move-constructible due to the const. I reversed the test, but I'm not 100% sure whether that was the right fix.
  7. Finally, of course, some things in the source tree have changed, and I had to figure out where to put things. Not sure if I did it right!

The most important bit I want feedback on: are there any changes to the compiler needed to actually make it defined behavior to memcpy a trivially-relocatable type? I believe the answer is no, but want to double check. I chatted a bit with @rsmith, and the takeaway that I got was that Clang defines the behavior (or at least, doesn't exploit undefined behavior) for all non-polymorphic types. Since trivially relocatable types cannot be polymorphic, this should be a documentation-only change: we explicitly extend implicit-lifetime types to also include trivially-relocatable types (which are, at least for now, all trivial-abi types.)

It is possibly worthwhile to note that vector *already*, before this change, used memcpyor memmove on types which have nontrivial destructors, and are therefore not implicit-lifetime. So it already had in mind some extension to implicit-lifetime types, though not one which it describes explicitly. This change further extends it to also include all __is_trivially_relocatable types.

Diff Detail

Event Timeline

devin.jeanpierre requested review of this revision.Feb 9 2022, 4:02 PM
devin.jeanpierre created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2022, 4:02 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript

@Quuxplusone I'm not sure if this is ready for proper review yet (I only ran tests locally, couldn't figure out how to make it stop using the default gcc backend for tests, so it might be a bit broken at head), but I wanted your feedback on how this is structured. Does this look ~basically OK?

I'm also not sure how to do the crediting so that you get fair credit for this -- really all I did was copy, paste, and make some tweaks here and there. :)

Otherwise, I'm going to wait for buildbots to come back, and figure out how to run tests locally against my local build of clang (which includes __is_trivially_relocatable).

devin.jeanpierre edited the summary of this revision. (Show Details)Feb 9 2022, 4:06 PM
devin.jeanpierre edited the summary of this revision. (Show Details)Feb 9 2022, 4:09 PM
  • I think UNSUPPORTED: c++98, c++03 should be just UNSUPPORTED: c++03; we don't seem to use c++98 anywhere else in libcxx/test/ except two REQUIRES:es that should probably also be fixed.
  • I think you're selling std::swap short; it's used by rotate and sort and all kinds of cool algorithms that would be nice to speed up. We should figure out how to get std::swap to bitwise-swap trivially relocatable things, either by solving the problems or by decreeing that they shall not matter. :)
  • Credit-wise, I don't care; mention me in the commit message (as you've done) and that seems fine.
  • Might make sense to split out the addition of __libcpp_is_trivially_relocatable into its own PR, and then add the vector optimization as a child PR of that one, as I did with the D50119 patch series. However, it could also be argued that you shouldn't add __libcpp_is_trivially_relocatable without any consumers of it. So I'm ambivalent, and would wait to see what others think. (Similarly, there's the question of how to land it: one commit or two? If we end up having to revert it unexpectedly, should we have the option to revert just the vector optimization, or should we force ourselves to revert all of it to avoid leaving __libcpp_is_trivially_relocatable as a loose end with no consumers?)

@devin.jeanpierre re __is_default_allocator: You should take a look at https://github.com/Quuxplusone/llvm-project/compare/main...trivially-relocatable and specifically https://github.com/Quuxplusone/llvm-project/commit/dfbfc8b326e0f5a3bcc284e3a28893310b30d0bb , because that's the current source of truth for https://p1144.godbolt.org/ . That already includes some stuff like the memmove bugfix and the switch to __is_default_allocator.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.pass.cpp
90 ↗(On Diff #407331)

I would say that in a perfect world const TrivialMoveOnly ought to be trivially relocatable, because TrivialMoveOnly is a trivially relocatable type, and if you're moving-and-destroying it, then you don't really care that the source object was const-qualified.
However, I see that even under p1144r5, is_relocatable_v<const TrivialMoveOnly> would be false, and I did not intend that is_trivially_relocatable_v<T> && !is_relocatable_v<T> should ever be true for any T. So I'll have to change this part of p1144. Most library users, then, should probably use is_trivially_relocatable<remove_cvref_t<T>> for their purposes.

Relevant question for this PR: Should we permit the libc++ extension type std::vector<const X> to actually push_back and reallocate when is_trivially_relocatable_v<X>? In the past that was obviously impossible, but now it's kinda possible.

ldionne requested changes to this revision.Feb 10 2022, 7:15 AM
ldionne added a subscriber: ldionne.

Thanks for the patch! I have some comments and questions.

Also, I'd like to understand the relationship between this patch and Arthur's series of patches for trivially relocatable types -- is the intent that this one replaces the series (even though it's a slimmer version of Arthur's patches IIUC)?

libcxx/include/__memory/allocator_traits.h
401

I find this misleading. Indeed, the construction of _Tp might not be trivial at all, it's just that the allocator doesn't do something fancy in addition to constructing the object. In other words, I think something like __allocator_construct_is_placement_new is what we *mean*, but that name sucks. Other potential names: __has_default_allocator_construct, __overrides_allocator_construct, __has_custom_allocator_construct.

In particular, I would avoid mentioning "trivial" anywhere because it makes it look as though the operation itself is trivial, which is not what we mean.

The same comment applies to __has_trivial_destroy below.

libcxx/include/memory
862

I find those out-of-the-blue false_type and true_types awkward. Instead, I would much rather make the decision of which overload to take based on enable_if right here, in these functions. That way, std::vector would just call __construct_forward_with_exception_guarantees(a, b, e, b2) without having to pass __move_via_memcpy().

libcxx/include/type_traits
356–357

I'm fine if we make optimizations inside the library, but I don't want us to expose a new "customization point" that hasn't been standardized. That's just going to be a pain in the future, and it's a portability trap for our users.

This revision now requires changes to proceed.Feb 10 2022, 7:15 AM
libcxx/include/__memory/allocator_traits.h
1

Pulling this out into somewhere we can thread the discussion:

Also, I'd like to understand the relationship between this patch and Arthur's series of patches for trivially relocatable types -- is the intent that this one replaces the series (even though it's a slimmer version of Arthur's patches IIUC)?

As of D114732 / D119017, Clang now supports a built-in type trait __is_trivially_relocatable(T) which is currently true for types that are trivial for purposes of ABI and false for everyone else (including e.g. std::string and std::unique_ptr-with-the-default-ABI). The intent is that this state of affairs is not permanent, and eventually we'd like to extend the built-in type trait __is_trivially_relocatable(T) to encompass a greater set of types, perhaps corresponding to p1144, but we're not committing to exactly p1144 right now. But the people in charge of the [[clang::trivial_abi]] attribute are comfortable promising that those types will always be trivially relocatable in every possible sense of the word.

Clang does not currently support any way to "opt in" arbitrary types to trivial relocatability. For now, the only way to "become" trivially relocatable is to become also [[clang::trivial_abi]] at the same time.

This PR adds optimizations to std::vector<T> when __is_trivially_relocatable<T> (i.e., part of D67524), and also adds enough boilerplate in <type_traits> to support that optimization (i.e., part of D61761). It does not "opt in" any STL types — e.g., std::vector<int> itself will not be considered trivially relocatable even after this PR — because, as I said above, that's physically impossible in Clang so far.

This PR, if adopted, would shrink the diff between my trivially-relocatable branch and main, i.e., it's "friendly" in that sense.

I'm not saying it's a good idea, but I'm definitely not saying it's a bad idea. :)

401

The "construct" and "destroy" operations of the allocator are "trivial" in the sense that they run no (additional) code — kind of like a trivial default constructor or trivial destructor runs no code, except that ctor/dtor is a leaf on the call graph, whereas allocator::construct and allocator::destroy are non-leaves.

However, trunk already has traits named __has_construct and __has_destroy — and I think they could profitably be modified to just include the check for __is_default_allocator themselves, couldn't they? That should be its own PR, but if it works it'd be cool. It would basically mean that allocator_traits<allocator<T>>::construct would skip a level of indirection but should have exactly the same behavior.

libcxx/include/memory
862

@ldionne, you and I had this conversation 2 or 3 years ago, and I'm still of the same opinion: the condition is complicated and it's really really important that the caller understand exactly what the condition is, so that they can pass the matching type to both callees. See lines 923–939 in <vector>. I really think this way is clearer.
(That said, I agree we might think about ways to comment false_type with something like "don't use memcpy" and true_type with something like "just use memcpy" — I have no great suggestion for the wording or where it'd go exactly.)

libcxx/include/type_traits
356–357

Just like all type traits, it is UB for the user to specialize or otherwise customize this type-trait.
You can't say

template<> struct std::is_trivially_move_constructible<Widget> : std::true_type {};

The __libcpp_ should be good enough to warn people away from querying the trait willy-nilly.
Or are you just asking to remove this "documentation" from the synopsis?

devin.jeanpierre marked 2 inline comments as done.Feb 10 2022, 12:01 PM

No changes, just some quick replies and requests for clarification at the start of my day. :)

Thanks for the comments! If I didn't reply to one here, it's probably because I agree with it and will implement the change. :D


  • I think you're selling std::swap short; it's used by rotate and sort and all kinds of cool algorithms that would be nice to speed up. We should figure out how to get std::swap to bitwise-swap trivially relocatable things, either by solving the problems or by decreeing that they shall not matter. :)

We certainly can do std::swap correctly, but it's annoying! Doing it optimally would probably best be done via an additional compiler intrinsic to tell us the dsize of a type -- we can probably do without that intrinsic with a recursive templated function, but it's going to slow down compiles and be otherwise difficult.

The easiest way to make forward progress on std::swap is to start with std::has_unique_object_representations. That is overbroad but guarantees correctness AFAIK.

  • Might make sense to split out the addition of __libcpp_is_trivially_relocatable into its own PR, and then add the vector optimization as a child PR of that one, as I did with the D50119 patch series. However, it could also be argued that you shouldn't add __libcpp_is_trivially_relocatable without any consumers of it. So I'm ambivalent, and would wait to see what others think. (Similarly, there's the question of how to land it: one commit or two? If we end up having to revert it unexpectedly, should we have the option to revert just the vector optimization, or should we force ourselves to revert all of it to avoid leaving __libcpp_is_trivially_relocatable as a loose end with no consumers?)

These should *probably* be two commits, but I am still new to phabricator and didn't know how to do it. 👀

FWIW, I don't anticipate any users actually using __libcpp_is_trivially_relocatable, so it isn't very important to submit or roll back separately other than for convenience of review. (Anyone that wanted to use it would instead redefine it, so that their code works on other C++ standard library implementations.)

@devin.jeanpierre re __is_default_allocator: You should take a look at https://github.com/Quuxplusone/llvm-project/compare/main...trivially-relocatable and specifically https://github.com/Quuxplusone/llvm-project/commit/dfbfc8b326e0f5a3bcc284e3a28893310b30d0bb , because that's the current source of truth for https://p1144.godbolt.org/ . That already includes some stuff like the memmove bugfix and the switch to __is_default_allocator.

Thanks for the link! I'm going to try to do a diff and see if that pans out, or check if I need to look at it line by line :P

libcxx/include/__memory/allocator_traits.h
1

+1 to this summary.

libcxx/include/memory
862

Yeah, this surprised me too. But in fact, this pattern is already used elsewhere, e.g. __split_buffer: https://github.com/llvm-mirror/libcxx/blob/master/include/__split_buffer#L296-L312

I asked around and was told it's normal-looking code, and it was in the original revision, so I kept it as is. (I still need to add some kind of comment, as was requested in the original D67524.)

A problem is that if you're moving an element, then you only move via memcpy if it's trivially movable. If you're relocating via memcpy, then you additionally move via memcpy when it's trivially relocatable. And then, on the destruction side, you only destroy if it was a relocation operation, and was moved via memcpy.

So there are three options that I can think of:

  1. keep the code as it was before with the enable_if. Callers must, to do a trivial relocation, first reinterpret_cast their pointers to char so that it's picked up by the trivial enable_if. (This is what I would've done if I weren't copying an existing PR, but I'm not sure if we're allowed to use constexpr if inside vector. Also, that both __split_buffer and the pre-existing PR use the true_type/etc. trick, I'm worried this would be unusual.)
  2. keep the code as it was before with the enable_if, but also add a check for trivial relocatability. If the type is trivially relocatable, callers are obligated to release the underlying storage without invoking the destructor. Otherwise, this is a normal move operation. (This, then, becomes a relocation function, and callers must uphold relocation postconditions.)
  3. this version of the code: have an explicit selector for trivial vs nontrivial operations.

My biggest issue with #2 is that it hardcodes that this is relocation -- if someone wanted to move and keep the moved-from values, rather than relocate, we'd need to write a second function. I think in this case it's only used for relocation, so it's fine.

Do you have a preference among the 3 options?

libcxx/include/type_traits
356–357

Are you requesting I delete __libcpp_is_trivially_relocatable and __libcpp_is_trivially_relocatable_v, or just remove them from the file comment?

I'm fine deleting even the symbols themselves, but wouldn't know how best to share code in that case, as both vector and e.g. swap could want to use this.

(That said, I do at some point want to offer this up as an optimization opportunity for absl. Of course, absl would just do the same __has_keyword dance, regardless, as they can't depend specifically on libc++. I think that might be the portability trap you mention. :) )

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.pass.cpp
90 ↗(On Diff #407331)

Relevant question for this PR: Should we permit the libc++ extension type std::vector<const X> to actually push_back and reallocate when is_trivially_relocatable_v<X>? In the past that was obviously impossible, but now it's kinda possible.

My personal opinion is that this is too dangerous: people will rely on it by accident, not even knowing what they're doing (e.g. templated code with levels of indirection), and that would make it significantly harder to remove or change __is_trivially_relocatable.

Separately (bit of a cop-out), I think that should be a separate PR: it's a new feature, whereas this PR is "just" an optimization, if that makes sense.

90 ↗(On Diff #407331)

I would say that in a perfect world const TrivialMoveOnly ought to be trivially relocatable, because TrivialMoveOnly is a trivially relocatable type, and if you're moving-and-destroying it, then you don't really care that the source object was const-qualified.

Yeah, it is a bit unfortunate. FWIW I think the correct behavior right now is that it should not be trivially relocatable, as it is not trivially move constructible. If we want to make it considered trivially relocatable, we should also suitably alter the definition of triviality elsewhere -- I think, rather than searching for eligible constructors, the definition would search for eligible constructors without const qualification.

(No idea for volatile, I try not to think about it.)

Relevant question for this PR: Should we permit the libc++ extension type std::vector<const X> to actually push_back and reallocate when is_trivially_relocatable_v<X>? In the past that was obviously impossible, but now it's kinda possible.

My personal opinion is that this is too dangerous: people will rely on it by accident, not even knowing what they're doing (e.g. templated code with levels of indirection), and that would make it significantly harder to remove or change __is_trivially_relocatable.

Separately (bit of a cop-out), I think that should be a separate PR: it's a new feature, whereas this PR is "just" an optimization, if that makes sense.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.pass.cpp
90 ↗(On Diff #407331)

Relevant question for this PR: Should we permit the libc++ extension type std::vector<const X> to actually push_back and reallocate when is_trivially_relocatable_v<X>? In the past that was obviously impossible, but now it's kinda possible.

My personal opinion is that this is too dangerous: people will rely on it by accident, not even knowing what they're doing (e.g. templated code with levels of indirection), and that would make it significantly harder to remove or change __is_trivially_relocatable.

Good point. It would also be a library API extension that was supported only on Clang 14+ and not other compilers, which would be weird (even if it is an extension to an extension). Suggestion withdrawn.

devin.jeanpierre marked 7 inline comments as done.

Address comments.

Right now I'm looking at running a macrobenchmark internally to Google to test this change -- the first one I ran (late last year) gave only noise as a result, but I've been pointed at a better macrobenchmark that might give more reliable and useful numbers.

The idea being that this could fill out the big "TODO" in the PR description.

libcxx/include/__memory/allocator_traits.h
401

I renamed to __has_default_allocator_construct.

@Quuxplusone that's an good idea! Actually, since C++20 deletes construct from the standard allocator, that effectively means it's "just" backporting optimizations to C++17 and below, I think.

libcxx/include/memory
862

Kept as is, and added a comment.

libcxx/include/type_traits
356–357

Removed from the synopsis.

Add benchmark. Unfortunately, I've failed to actually run it against clang -- I don't know how to run benchmarks against the (unreleased) version of clang that's in the current git repo, so all I've done is run it locally (against gcc, I think).

I have a superset of this benchmark I ran at work (using the internal version of benchmark), shows a 40% performance improvement for trivially relocatable types when using the current version of clang. For example (using unique_ptr with trivial_abi):

name                                                        old cpu/op  new cpu/op  delta                                             
BM_Vector_Push_Copy<int*>                                   4.11ns ±38%  4.11ns ±44%   -0.05%  (p=0.021 n=2301+2238)             
BM_Vector_Push_Move<std::shared_ptr<int>>                   13.3ns ±37%   9.1ns ±42%  -31.71%  (p=0.000 n=1678+1483)                  
BM_Vector_Push_Move<std::unique_ptr<int>>                   7.05ns ±10%  4.36ns ± 8%  -38.11%  (p=0.000 n=1716+1957)

My expectation is that this benchmark I have added here would also show the same, if only I knew how to run it. :)

devin.jeanpierre edited the summary of this revision. (Show Details)Feb 17 2022, 12:41 PM
devin.jeanpierre edited the summary of this revision. (Show Details)

I think this is probably ready for review, I just wish I had the macrobenchmark results at hand. Do you all have any suggested reviewers?

devin.jeanpierre edited the summary of this revision. (Show Details)Feb 17 2022, 1:36 PM

Exclude C++03 from the added test case.

It makes sense to me to do this on top of __is_trivially_relocatable, and if that returns true for more types in the future, wonderful. Obviously, it needs to be used in non-ABI-impacting ways. I'm not the right person to review this patch, though.

devin.jeanpierre edited the summary of this revision. (Show Details)Feb 23 2022, 11:26 AM

I finished running a macrobenchmark. The benchmark itself is internal to Google, but I hope it is useful as a data point anyway -- I emulated the wording from the doc on adding trivial_abi to unique_ptr. And, of course, I want to remain somewhat cagey and say things like "up to", as the actual performance improvement depends on the application, and I only tested one application.

(0.25% is incredibly good in my books, I was not expecting this -- talked with some people and they explained that since this is more than just e.g. unique_ptr, and it unlocks optimizations in neighboring code, etc, the number is plausible.)

Friendly ping. Anyone I should send this review to in particular, perhaps? (I really don't know how LLVM reviews work. :))

Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2022, 6:43 AM
philnik added inline comments.
libcxx/include/memory
885

You can replace _VSTD with std in your changes. Why are you const_casting now?

libcxx/include/vector
329

Every standard library agrees that it's not allowed to have a cv-qualified type as the template parameter for a vector (https://godbolt.org/z/8xEnhEbYT). Ditto below.

718–719

Do we want this function in the dylib?

923–926

That looks very weird. Why is __move_via_memcpy move via memcpy or relocate via memcpy? It looks like you are using __vector_move_via_memcpy only in two places and both do this weird thing. Why not change __vector_move_via_memcpy? The naming for the second typedef looks also weird. (And you can use using instead)

1655–1667

Is this-> needed? If not, please remove it.

1660

Could you use std::move here? It should behave exactly the same and is easier better to read.

1697

noexcept isn't available in C++03 AFAIK.

1702

Could you rename this to something like __is_nothrow_constructible_alloc? __is_nothrow_constructible_a sounds a lot like you needed another name and you just slapped _a at the end.
Do you only have this to allow better optimizations in C++03? If that is the case I'd say just remove it. If you care about performace you should switch to C++11 or newer. We currently don't even enable the move overloads in C++03 in most (all?) classes.

libcxx/test/libcxx/containers/sequences/vector/insert_trivially_relocatable.pass.cpp
14

We don't support C++98. Ditto for the other tests

20

Could you make this into a local variable? These tests have to work during constant evaluation in the (hopefully near) future.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
1–8 ↗(On Diff #409779)

This looks like the wrong license header.

56–72 ↗(On Diff #409779)

Could you give these things a more meaningful name?

78–83 ↗(On Diff #409779)
85–86 ↗(On Diff #409779)

Please don't define this kind of stuff for convenience. What if in a new standard there will be std::is_trivially_relocatable? That's not that unlikely IMO and then we would have to rewrite this thing just because you felt like adding a macro here. It's not hard to replace the uses, so you can do it now.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.pass.cpp
16 ↗(On Diff #409779)

We don't support gcc-4.9.

94 ↗(On Diff #409779)

You can make this file a .compile.pass.cpp.

ldionne requested changes to this revision.Mar 18 2022, 8:47 AM

@devin.jeanpierre Would you be willing to get on a call with me to go over your patch? I think it would make it easier to make progress -- you have significantly more context than me around it, so it'd be helpful for me.

libcxx/include/memory
862

I continue to not understand why we want to make the caller aware of this. Take for example __construct_backward_with_exception_guarantees. it's always called with the same __move_via_memcpy() argument -- why not define that condition in enable_if_t in the function itself instead? Sorry if it's obvious to you, it's not to me.

Another way to see this: the true_type vs false_type is just a transparent optimization -- the semantics of __construct_backward_with_exception_guarantees are the same in both cases (right?). So the caller effectively doesn't need to know whether the optimization kicks in if __construct_backward_with_exception_guarantees can determine it alone. And if I'm wrong and true_type vs false_type does change the semantics, then we should have another function with a different name instead, because both would do different things.

877

This should be class _ContiguousIterator to document that effectively any contiguous iterator is what we expect? I assume this was changed from _Tp* because we had to support __wrap_iter?

libcxx/include/type_traits
3119–3123

I'd rather not define this at all, since it's an internal only type trait. Concretely, it's less complicated to say ::value everywhere than to check if we're in C++17 and use _v instead.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
1 ↗(On Diff #409779)

IIUC we're not adding std::is_trivially_relocatable, so this test should go away entirely.

1–8 ↗(On Diff #409779)

Yeah, this must have been before the LLVM re-licensing. Please make sure all your files are using the new license.

This revision now requires changes to proceed.Mar 18 2022, 8:47 AM
devin.jeanpierre marked 19 inline comments as done.

Update change in response to feedback. (Thank you, everyone!)

devin.jeanpierre added a comment.EditedMar 18 2022, 2:18 PM

@devin.jeanpierre Would you be willing to get on a call with me to go over your patch? I think it would make it easier to make progress -- you have significantly more context than me around it, so it'd be helpful for me.

Almost missed this request -- absolutely, I'd be happy to. I don't know the procedure around this stuff though.

Edit: sent wrong draft. Meant to add: I'm free during the day pacific time, you can schedule something with me out of band via email/chat to jeanpierreda [at] google [dot] com.

libcxx/include/memory
862

So it definitely does change semantics: if you call __construct_backward_with_exception_guarantees with true_type, but the object is not trivially-movable, then you are promising that 1) it is trivially relocatable (via [[clang::trivial_abi]], today), and 2) you will be releasing the underlying storage of the "moved"-from objects without invoking the destructor.

The compiler can infer (1): it could just check for trivially relocatability. That's what the callers do! But it cannot check for (2), the caller has to be careful here. So I do think it should probably be explicit.

If I were to do it myself, I'd probably pass in reinterpret_cast<const char*>(my_pointer), rather than defining a separate function. The benefit of having it be a true_type/false_type argument is, I suppose, that it's easy to call: it's just a parameter as to whether to use trivial relocation, and the caller can do neat so that they have just one call site with a varying parameter. I can see the merit to it. For example, one caller looks like so:

__construct_forward_with_exception_guarantees(this->__alloc(), __p, this->__end_, __v.__end_, __move_via_memcpy());

If we were to make it different functions, or require a reinterpret_cast in the caller, it would become:

if (__move_via_memcpy::value) {
  __trivial_construct_forward_with_exception_guarantees(this->__alloc(), __p, this->__end_, __v.__end__;
  // or: __construct_forward_with_exception_guarantees(this->__alloc(), reinterpret_cast<const char*(...),  reinterpret_cast<const char*(...), reinterpret_cast<const char*(...);
} else {
  __construct_forward_with_exception_guarantees(this->__alloc(), __p, this->__end_, __v.__end__;
}

So I can see the merit to it.

Of course, the magic boolean parameter seemed weird to me too. It was impressed on me when I asked a coworker that it is not so strange, and indeed cases already exist of it in llvm: for example, __move_assign in vector, __process in memory, etc.

So what I want to know is -- is this really that weird, and existing code is "legacy" code we should try not to emulate? Or is this relatively normal? I will do whatever the "normal" thing is. :P

Leaving your comment unresolved. :)

877

(I confess I'm not sure why this had to be changed. I can only assume because now it's selected in branches that, yes, use __wrap_iter, but it's very indirect.)

FWIW _Ptr was already in use in the other overload, so I would prefer to leave it unchanged for now. LMK if I should change both, or if it's acceptable/preferable to change just this one, and I'll do so.

885

The const_cast is to support vector<const T>, see e.g. libcxx/test/libcxx/containers/sequences/vector/const_value_type.pass.cpp.

I thought libc++ no longer supported const-qualified value types, but it seems that test is still there, and the change deleting it was rolled back. So until libc++ properly removes this, I need to retain the const_casting after all. Oops!

Note that this isn't "new" -- e.g. line 902 of the LHS revision at time of writing also does a const_cast for the same reason. However, I added a comment anyway, to avoid confusion.

Did my best to replace _VSTD with std.

libcxx/include/type_traits
3119–3123

That makes sense to me. In point of fact, I was thinking the same thing!

libcxx/include/vector
329

Deleted, and with joy in my heart.

Unfortunately I then proactively did the same for const, because I thought that change had gone in -- but the change deleting const-qualification support in libc++ was rolled back.

718–719

Unfortunately I'm not sure what this is asking, and probably don't have the background to give an informed answer. Anyone else want to step in here?

923–926

Yeah, I agree. WDYT about how I've done it now? Three categories:

  • trivial_relocate: relocation operations can be trivial (both move+destroy), because the type is trivially relocatable
  • relocate_trivial_move: relocation operations use memcpy -- maybe because it's trivially relocatable, maybe because it's trivially movable
  • relocate_trivial_destroy: relocation operations drop underlying storage without invoking the destructor -- maybe because it's trivially relocatable, maybe because (in a future change) it's trivially destructible?

I've made the corresponding changes to the source so you can see how it plays out.

1655–1667

Ah, sorry, I was just doing whatever the code already did. Looks like it was unnecessary.

1660

(Please forgive me if I'm misunderstanding something here.)

Not without casting to e.g. std::byte pointers first -- that would invoke move constructors, where here we should be copying raw bytes.

The difference in behavior comes up if you have a nontrivial but clang::trivial_abi type, such as:

struct [[clang::trivial_abi]] MyClass  {
  MyClass(const MyClass&) { ... }
  MyClass(MyClass&&) { ... }
  ~MyClass() {}
};

The intention of this change is to use memmove for this type, and not invoke move constructors etc.. So we skip across std::move and instead use memmove.

We also cannot move this optimization to std::move, because in general, std::move should still be calling move constructors: it is only in the specific case that the moved-from location is destroyed immediately afterwards that we can safely use memcpy/memmove: we can replace a move+destroy pair with a relocate + release-underlying-storage pair.

(That's not quite what's happening with erase, but we could imagine it was: erase is, I believe, allowed to work as if it moved everything after the erased point to a temporary buffer, destroyed the original objects, and then moved them back, destroying the temporary buffer.)

Hoping that addresses your comment!

1697

This is in the #else block, so that's fine, right? (Should only fire for C++11 and up, since e.g. 98 isn't supported.)

1702

Could you rename this to something like __is_nothrow_constructible_alloc? __is_nothrow_constructible_a sounds a lot like you needed another name and you just slapped _a at the end.

Agreed, and done!

Do you only have this to allow better optimizations in C++03? If that is the case I'd say just remove it. If you care about performace you should switch to C++11 or newer. We currently don't even enable the move overloads in C++03 in most (all?) classes.

Hmmm, correct me if I'm wrong, but I think it's the #else block that doesn't do anything. These two lines are basically equivalent:

struct __is_nothrow_constructible_a : integral_constant<bool, false> {};
struct __is_nothrow_constructible_a : integral_constant<bool, noexcept(allocator_traits<_Allocator>::construct(declval<_Allocator&>(), (_Tp*)nullptr, declval<_Up>()))> {};

... because allocator_traits<T>::construct is never noexcept unless you specialize it, which libc++ does not permit.

The intent of the check is clearly that the ::construct call will not throw, which we can determine, instead, by if the allocator defines construct at all. (Since we don't support overloading the trait).

Something like: is_nothrow_constructible<_Tp, _Up>::value && (__is_default_allocator<_Allocator>::value || !__has_construct<_Allocator, _Tp*, _Tp>::value

Does that look good?

libcxx/test/libcxx/containers/sequences/vector/insert_trivially_relocatable.pass.cpp
14

Apologies, I think earlier reviewers made the same comment and I accidentally failed to address it. My bad.

20

Not super sure what is allowed in constant evaluation, but hoping this is what you meant. At the very least, I believe it's an improvement.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
1 ↗(On Diff #409779)

Well, but we are adding __libcpp_is_trivially_relocatable. Perhaps this should go in a separate change, though, since morally speaking it's testing __is_trivially_relocatable? WDYT?

1–8 ↗(On Diff #409779)

Apologies, I didn't even think to check for that! Done -- looks like it was just the two tests.

56–72 ↗(On Diff #409779)

Agreed, and done. :)

Actually, I hadn't looked closely at this test. I think it can be simplified a bit too -- for example, rather than having a separate nontrivial key type, I just added operator== etc. to the nontrivial type. (Could do the same for the mutex type, but wasn't sure if that'd be even more confusing.)

78–83 ↗(On Diff #409779)

Ooh, indeed.

85–86 ↗(On Diff #409779)

You are 100% right, done.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.pass.cpp
16 ↗(On Diff #409779)

Apologies, and thanks for the correction!

94 ↗(On Diff #409779)

Good call, thank you!

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
243 ↗(On Diff #416606)

This should also be a .compile.pass.cpp. Just making the comment now before anyone else does -- I have to step away from my computer for a bit, but will get back to it.

This looks good so far. I think with Louis' and my comments addressed this should be good to go, but I want to do another thorough review with the comments addressed before approving this.

libcxx/include/vector
718–719

I think that's a question @ldionne has to answer. My question is essentially if we want to have this as part of the ABI or if it should be marked _LIBCPP_HIDE_FROM_ABI.

923–926

Yes, looks much better!

1660

Right, I missed that it's implicitly casting to void* in the call to std::memmove. Using std::move would look something like

std::move(reinterpret_cast<byte*>(__rawp + 1), reinterpret_cast<byte*>(__rawp + (__end_ - __p) * sizeof(_Tp)), __rawp);

. That's obviously not an improvement. Currently I also don't have any good idea to make it more readable.

1697

I think you misunderstood me a bit. We don't support C++98, but we do support C++03. I know there isn't a huge difference, but it's significant enough in a standard library implementation to point out. Although the C++03 support is very weird, but that's another story.

1702

It definitely makes sense to me. Maybe you could check if the allocator::construct member function is noexcept. That would include more allocators. Something like !__has_construct<_Allocator, _Tp*, _Tp>::value || __nothrow_invocable<_Allocator::construct, _Tp*, _Tp>::value should work. (I haven't tested it.)

1707

Is this a forward? If yes, please spell it as one.

libcxx/test/libcxx/containers/sequences/vector/insert_trivially_relocatable.pass.cpp
20

Yes, that's exactly what I had in mind.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
94–96 ↗(On Diff #416606)

Why aren't these trivially relocatable? These should be in the unstable ABI because they are marked [[clang::trivial_abi]], or am I not understanding something?

116–131 ↗(On Diff #416606)

Do you have any interest in making a follow-up PR for these? It should be possible to make these all trivially_relocatable, right?

devin.jeanpierre marked 4 inline comments as done.

Deleted libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp, changed static_cast to std::forward

Really sorry about the delay on this one, I stepped away to work on a build breakage for the predecessor PR, and then never came back to this. (And then I felt bad about never coming back to this, and... well, maybe you know how it is, "ugh field" etc.) I'll try to stay more on top of this, since I definitely want it to go in! I'm really excited about this change. :)

Let me know how to proceed. And, obviously, feel free to take your time, I don't have the right at this point to request speedy review.

libcxx/include/vector
1697

I'm still not following. My understanding had been that this should be fine, because noexcept is only used if _LIBCPP_CXX03_LANG is not defined, meaning that we only use noexcept on C++11 and above.

I think this is no longer relevant though, since that code snippet is deleted.

1702

It's more complicated than that, because the allocator is allowed not to define construct -- so we need to do the dance that __has_construct does to ensure that this fails gracefully (SFINAE) when it doesn't exist.

Can we defer this / not implement it now? I played around with this a bit, but as it turns out I'm really not very familiar with template metaprogramming in C++. If this is a strongly desired feature, I can write a followup as my very next change, but otherwise I'd actually prefer not to write it at all until a concrete use presents itself. :)

(I think such concrete uses may not exist at all, or be very rare -- it'd be a construct function that is defined only for types with noexcept constructors, which is I think impossible to guarantee in C++ (you can't enumerate all the constructors). So the only allocators that would satisfy this are those that are special-cased to known types to be noexcept, or which are noexcept even though the underlying constructors might throw.)

1707

Ah, yes, it is, since _Up is a template parameter. Fixed.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
94–96 ↗(On Diff #416606)

Great catch.

Um, actually, on reflection, this whole file is a nightmare of test failures depending on build mode. My experience with the predecessor PR D114732 is that this will be very hard to get right -- we will need ifdefs for win64 and for PS3, at the least -- not to mention, as you've noticed, checking for the unstable ABI.

Moreover, I think it doesn't really belong in this change. So at least for now, I'm reverting it entirely. If this change should go in, I think a separate PR would be best. (It's basically testing D114732, and possibly not this change.)

116–131 ↗(On Diff #416606)

Yes, I think we should put work into making many common datatypes [[clang::trivial_abi]], in the same way as already done for unique_ptr, and I want to do this for at least a couple common/easy ones (hoping std::vector, std::string qualify).

Others might require more substantial design work to make this happen (e.g. the map types), and it might be that yet others fundamentally can't (e.g. std::list -- something in there is going to be inherently non-trivially-relocatable, since it's self-referential by design).

philnik added inline comments.Apr 29 2022, 1:47 AM
libcxx/include/vector
718–719

We are in vector so this won't be exported in our dylib. I think this should be marked _LIBCPP_HIDE_FROM_ABI.

1702

You could have something like void construct(Args&&... args) noexcept(noexcept(T(args...))). I don't think that's too rare, but we can do that later if you want. If you don't feel comfortable writing that fun stuff I can try as well.

libcxx/test/libcxx/type_traits/is_trivially_relocatable.pass.cpp
116–131 ↗(On Diff #416606)

Great! I think starting with string would be good. That's like the most commonly used type in the library and there won't be that many more changes in the near future AFAIK.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.compile.pass.cpp
13

This shouldn't be here.

libcxx/test/std/containers/sequences/vector/vector.modifiers/insert_exception_safety.pass.cpp
65

What does that do? If it's just a cast to bool could you maybe just make if a functional cast? That's a lot less surprising.

devin.jeanpierre marked 6 inline comments as done.

Act on review comments. TY for the speedy review!

Manually reflow comment. (I had hoped that clang-format would do it for me, oops.)

Actually reflow comment. (SSH session had disconnected, saves were going to /dev/null. ... this actually is super discouraging behavior in vscode, I'm scared now.)

libcxx/include/vector
718–719

Sounds good to me, and thanks for following up here. Done.

1702

Ah, that goes to show my unfamiliarity with exceptions/noexcept. That's a compelling example, and I agree we should support it.

I think this would make a very decent standalone PR after the fact -- I don't think we need to do it in this PR. I will, uh, iron out the implementation with some coworkers who know the idioms here, and send a followup PR. So I'm marking as resolved for now. Should I add some form of TODO comment to mark this as a potential improvement, or no?

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.compile.pass.cpp
13

Definitely __libcpp_is_trivially_relocatable_v should be deleted, but I think __libcpp_is_trivially_relocatable does belong here -- it emulates the style of the other files. (Except it should start with // <type_traits>, not // type_traits).

For example, variant_size.pass.cpp opens with:

// <variant>

// template <class ...Types> class variant;

or, tuple_array_template_depth.pass.cpp opens with:

// <tuple>

// template <class... Types> class tuple;

// template <class Tuple, __tuple_assignable<Tuple, tuple> >
//   tuple & operator=(Tuple &&);

// This test checks that we do not evaluate __make_tuple_types
// on the array when it doesn't match the size of the tuple.

I see a similar pattern in every test file I open here: they (usually) list the header they're testing, and (always) list the specific function or type being tested.

(They do it below the UNSUPPORTED line though, so moved that.)

83

I have a feeling that this test is going to have failures on a few platforms (e.g. PS4, Win64), now that I've had experience with the predecessor change and its rollbacks. I'm asking in discord for how to run the tests locally, without doing a slow upload / look at build results loop. (I haven't even looked at the failures for this right here, actually.)

https://discord.com/channels/636084430946959380/642374147640524831/971015700665761833

Just leaving this (unresolved) comment to mark that additional work is needed here.

libcxx/test/std/containers/sequences/vector/vector.modifiers/insert_exception_safety.pass.cpp
65

Not sure what you mean by functional cast, but maybe it'd be clearer as a ;;?

I also added a comment to document the test itself and its behavior.

philnik accepted this revision as: philnik.May 3 2022, 9:26 AM

From me it's just nits at this point, so LGTM. Please wait for @ldionne to approve.

libcxx/include/memory
856–857

Why does this comment exist?

880

using is supported as an extension in C++03. You don't have to use using, but I think it's a lot nicer to read.

881

Could you reformulate this to make it a TODO?

libcxx/include/type_traits
3108–3117

I think this should be equivalent.

libcxx/include/vector
325–330
337–342
718–720
1702

I think adding a TODO would be good, since we know it can be improved relatively easily.

libcxx/test/libcxx/containers/sequences/vector/insert_trivially_relocatable.pass.cpp
14

Why exactly is it not supported in C++03?

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.compile.pass.cpp
13

I only meant the __libcpp_is_trivially_relocatable_v :)

15

Missed one.

libcxx/test/std/containers/sequences/vector/vector.modifiers/insert_exception_safety.pass.cpp
65

I mean that you could use bool(Mask & 1), .... I think it's actually called function-style cast, not functional cast.

devin.jeanpierre marked 10 inline comments as done.
This comment was removed by devin.jeanpierre.

A couple notes:

  • I was really struggling to run tests locally and only uploaded this to run the tests. Unfortunately, that also did things like mark comments as resolved, oops. Tomorrow I might try again to get local tests working, as I expect many rounds of test fixing, and CI is taking longer than expected.
  • I struggled a bit with git and lost some work. Did a couple passes over everything to make sure I actually addressed everything I hit done on (and didn't e.g. fix it before accidentally losing work), but I'm very sorry if I missed something!

Aside from comments, also tried to fix test failures (e.g. fixed my local copy of clang-format so that it runs -- my copy of Debian doesn't have the python binary that git clang-format expects, just python2 and python3; and disabled execution of the exception test on tests that disable exceptions)

libcxx/include/memory
856–857

Without reading the implementation, one doesn't know the difference between __construct_forward_with_exception_guarantees(a, x, y, z, true_type) and __construct_forward_with_exception_guarantees(a, x, y, z, false_type), so I figured documentation was appropriate to clarify. (Well, not that this function is incredibly well-documented right now.)

I can delete it if it's actually obvious and these overloads mean the same thing everywhere.

880

Ah, sorry. I should've gone through the whole PR and fixed these the first time around. Done.

The other instance I replaced was in __construct_backward_with_exception_guarantees.

881

Done!

libcxx/include/type_traits
3108–3117

Also gets rid of the memey #if inside of the integral_constant type.

libcxx/include/vector
325–330

This looks prettier than it has any right to. :) Done, and thank you!

718–720

Apologies, I had hoped arc diff would autoformat this correctly.

libcxx/test/libcxx/containers/sequences/vector/insert_trivially_relocatable.pass.cpp
14

Ah. I had copied this from the change I'm forward porting, but I believe it is trivial to support C++03 in this test, so I've done so and removed the comment.

libcxx/test/libcxx/utilities/meta/meta.unary.prop/is_trivially_relocatable.compile.pass.cpp
15

Sorry, I don't know what happened there.

libcxx/test/std/containers/sequences/vector/vector.modifiers/insert_exception_safety.pass.cpp
65

Sorry, I think phabricator was attributing your comment to the wrong line (I saw it on the for statement before. At the moment, it's now centered on a comment line.). Replaced with casts.

Oops, forgot C++03 doesn't do Template<Template2<T>>.

  1. actually make C++03 compatible: use __attribute__((trivial_abi)), a custom struct instead of std::conditional
  2. Add the static asserts for trivial relocatability to both tests that define a conditionally-trivially-relocatable type.

... sigh, apparently I made the static_assert change at the last minute and forgot to run check-cxx. Fixed.