Page MenuHomePhabricator

[libc++][ranges] Implement ranges::uninitialized_default_construct{,_n}.

Authored by var-const on Dec 7 2021, 8:07 PM.



Defined in [specialized.algorithms](


  • refactor the existing non-range implementation so that most of it can be shared between the range-based and non-range-based algorithms;
  • remove an existing test for the non-range version of uninitialized_default_construct{,_n} that likely triggered undefined behavior (it read the values of built-ins after default-initializing them, essentially reading uninitialized memory).

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ldionne added inline comments.Dec 8 2021, 11:49 AM

I am mostly neutral on this, however I don't believe the different algorithms are going to have significantly different header dependencies. In that case, I think it might be simpler to just cram them into the same header -- after all they are closely related.

Like I said, no strong opinion here anyway.


About final -- we have [[no_unique_address]] now so EBO shouldn't be an issue.

Regarding modules issues, basically it means you're going to need to export it from your private submodule, like so:

module ranges_uninitialized_algorithms {
    private header "__memory/ranges_uninitialized_algorithms.h"
    export __function_like

FWIW, I think there is some utility to removing things like addressability from our niebloids, the complexity around modules is just a bit annoying but it's still worth doing. Just my .02, this isn't anything close to "over my dead body".


When @Quuxplusone originally removed __voidify in D93153, I was fine with it. I just did some additional research and I suspect it might have been wrong:

I think we should figure out what the situation is and either confirm that we're conforming, or fix all the places where we use (void*)x but the standard says __voidify(x).


The only caveat to using // ranges::uninitialized_default_construct is that it's good to see the different overloads being tested. I really dislike seeing a test file that doesn't clearly say whether it tests all overloads of a function or not. In C++, different overloads often have different properties, and so they are really different functions entirely. They just happen to share the same "base name".

TLDR: IMO the current approach (copy-paste synopsis) is the best.


I actually buy @Quuxplusone 's reasoning as well.

Initially I thought we were verifying that we called the default constructor for trivially default constructible types, which does nothing (and hence doesn't overwrite the value). But I agree this looks like UB as well based on your reasoning.

IIUC, what we're *actually* trying to test here is that we are not calling any constructors. One way to do this would be to add a .sh.cpp test using FileCheck where we check the LLVM bitcode output, like they often do in Clang. I think it would be awesome to have that infrastructure available, however at the moment FileCheck isn't available while running the libc++ tests.

So IMO the pragmatic thing to do right now is to remove the test.

24–55 ↗(On Diff #392634)

[Using static_assert(std::is_class_v<...>)]

Interesting trick! It's true that if it's a class type, then ADL for sure won't trigger, so we'd have the same coverage. However, implementations are technically allowed to use functions (it would presumably be implemented with some sort of attribute) to inhibit ADL, so this would have to be LIBCPP_STATIC_ASSERT because std::is_class_v would be libc++ specific.

I don't have a strong opinion about this, but I agree this is pretty attractive from a simplicity perspective.

24–55 ↗(On Diff #392634)

implementations are technically allowed to use functions

Sure, technically, but until literally anyone does even a proof of concept patch to literally any compiler in the world to permit that, I think we can discount the possibility. ;) (This is closely adjacent to my logic against __function_like, btw. Everyone knows that niebloids are objects-not-functions; there's no engineering benefit gained by pretending things aren't as they are, and meanwhile there are engineering costs.)


(Argh, I keep forgetting to reply to things on the first go-round. Sorry.)
IIUC, STL's comment there indicates that we should add a test case for const iterators, e.g.

auto cit = forward_iterator<const Counted*>(as_array);
auto r = std::ranges::subrange(cit, sentinel_wrapper<forward_iterator<const Counted*>>(cit + 2));
std::ranges::uninitialized_default_construct(r.begin(), r.end());
assert(Counted::current_objects == 2);
assert(Counted::total_objects == 2);

That's missing test coverage today, but AFAIK it's not an argument against the simple cast to (void*). C-style casts are allowed to cast away constness.

var-const updated this revision to Diff 393014.Dec 8 2021, 8:00 PM
var-const marked 16 inline comments as done.
  • Address feedback;
  • Create Buffer class in tests.

Note: I have refactored the repetitive initialization code into a helper Buffer class that I hope can be reused across similar tests.


I'm slightly in favor of keeping these in a single file because they're closely related and I personally find it easier to read when related things are all in one place. I don't have a strong objection to separating these, though.

What I can do is to keep everything in a single file for the initial implementation and then have an NFC commit to split them. I hope at that point it would be clearer which approach is more readable here, and we can decide whether to merge or revert that patch.


I think the idea here is to make sure no users can end up accidentally depending on any properties that aren't explicitly guaranteed by the Standard (and yes, their code would be in error in that case, but it's still a possible problem that would require somebody to spend time fixing it, and we can prevent that problem from happening in the implementation). It does go beyond what is strictly necessary for conformance, but arguably it can result in slightly better experience for the users. Moreover, removing __function_like in the future should be a non-breaking change (I don't think ABI is a concern here, though I can easily be wrong on this), whereas going the other way round could realistically break somebody.

Also, we already have precedent in some of the iterator algorithms (e.g. advance and next). This isn't a particularly strong argument because there are very few instances, so changing them is quite simple. Basically, we would just need to make sure we are consistent.


Done, thanks!


Thanks! Very helpful article.


Sorry, I should have called it out in my comments. Yes, I did this deliberately because I couldn't think of a case where the two forms wouldn't be equivalent, and delegating directly to the internal helper _n function seemed simpler than creating a wrapped iterator then delegating to the other helper.


Added the test.

It looks like until C++20, the Standard defined voidify as performing a simple static_cast<void*> cast, but that would fail if the given type is cv-qualified because static_cast can't cast away cv qualifiers. The new definition works around it by doing two casts, a static_cast followed by a const_cast. However, it does seem like a simple C-style cast can achieve the same effect. I checked that the C-style cast succeeds when used with const iterators whereas the old static_cast<void*> approach fails as expected.

I still feel like we should have a dedicated __voidify function (that can use the C-style cast). For one, it would allow defining (and documenting) our approach in one place. It would also follow the way things are laid out in the Standard more closely, which I think is a nice bonus.




Done. I think I copied it from an unsubmitted patch which presumably was written at a time when GCC 10 was still supported.




Added counted_iterator, views::counted and reverse_view. If you have any other suggestions, please let me know.


Thank you for explaining your rationale. The main reason I generally prefer using auto when doing casts is that auto can be taken to mean "use the exact same type as returned by the expression" (omitting nuances about constness and references for the purposes of this discussion). In other words, auto guarantees that no implicit conversion can take place, which could be relevant in a case like:

struct Base { ... };
struct Derived : Base { ... };

Base* p = reinterpret_cast<Derived*>(buffer); // Implicitly converted from `Derived` to `Base`
auto* p = reinterpret_cast<Derived*>(buffer); // Reader can be sure no implicit conversion can take place

While this isn't relevant here, this usage of auto arguably simplifies things for the reader by removing one case to watch out for (if used consistently).

Having said that, I see the value of your point as well. I was somewhat on the fence, so I decided to use consistency as a tie-breaker. A simple grep shows a single case of using auto with casts in the existing code; most existing casts specify the type directly. For this reason, I decided to go with your suggestion; after all, the benefits of either approach are mostly relevant only if they are followed consistently.

EDIT: after writing this, I refactored this code so that it became a member function and this is no longer applicable.


Thanks! Removed it here and in uninitialized_default_construct{,_n}.pass.cpp.


Removed the alias. I took it from some existing code for consistency, but I don't have a strong preference.

24–55 ↗(On Diff #392634)

This approach is definitely very attractive due to its simplicity. These ADL tests are verbose, apply to every single range algorithm and can't really be simplified much (AFAICT). On the other hand, the current approach has the advantage of testing the effects whereas the new approach tests the implementation.

All in all, I'm inclined to go with the simpler approach because the boilerplate savings it provides are very significant. Thanks for suggesting this!

As you suggested, I moved these checks to the existing test files. I have also used LIBCPP_STATIC_ASSERT, though I don't feel super strongly about it (from a practical point of view, I presume it's true that no existing implementation uses or has any particular reason to use a different implementation method).

I've also added a TODO to rewrite the existing tests using the new approach.

var-const updated this revision to Diff 393020.Dec 8 2021, 8:30 PM

Run libcxx-generate-files.

ldionne accepted this revision.Dec 9 2021, 10:37 AM

Ship it. I suggest going back to add __voidify in the places that need it in a separate patch. It should be possible to figure out the places that need it by grepping for voidify in the standard.


FWIW, I suspect the difference might be that a C-style cast can't appear inside a constexpr context (because it ends up reinterpret_casting).
+1 for adding back __voidify -- we should probably make this one _LIBCPP_ALWAYS_INLINE to improve the debugging experience.

var-const updated this revision to Diff 393254.Dec 9 2021, 12:28 PM

Small fixes, should fix the CI.


I'm not sure if having a C-style cast affects constexpr in this case. From my experiments, it looks like the compiler allows C-style casts as long as the cast doesn't actually perform a reinterpret_cast:

template <typename T>
constexpr void* voidify1(T& from) {
  // This works fine
  return const_cast<void*>(static_cast<const volatile void*>(&from));

template <typename T>
constexpr void* voidify2(T& from) {
  // This also works fine, presumably because it ends up
  // doing a `static_cast` followed by a `const_cast`
  return (void*)&from;

template <typename T>
constexpr void* voidify3(T& from) {
  struct Bar{};
  Bar* from2 = (Bar*)&from;
  // This doesn't work because converting to `Bar` can only
  // be done via a `reinterpret_cast`
  return (void*)from2;

I still feel like having a dedicated __voidify function. Whichever approach we end up deciding upon, it should be in one place where it's easy to document and easy to change.

Quuxplusone added inline comments.Dec 9 2021, 1:12 PM

Right, reinterpret-casts are illegal in constexpr, but C-style-casts-that-are-not-reinterpret-casts are still perfectly legal.

I'm sticking to my rationale for not reintroducing __voidify: The more unnecessarily complicated you make the code, the more churn it kicks up, as people with different opinions repeatedly (over the course of several years) remove the complexity (because simplicity is good) and reintroduce the complexity (because ???). Simple is the local minimum, if we just have a general policy of "no additional complexity unless it serves a purpose."

My hypothesis for why the Standard uses voidify is that maybe they have a policy against ever using C-style casts (which means they need to use two new-style casts here, to mimic a fraction of its power). But that also seems far-fetched. Does the history of the cplusplus/draft repo give any hints?

var-const added inline comments.Dec 9 2021, 3:42 PM

To be clear, I don't necessarily propose changing the implementation. What I currently have in mind is something like:

template <typename T>
void* __voidify(T& from) {
  // Note: the C-style cast can remove cv-qualifiers, achieving the
  // same result as the sample implementation of `voidify` in the
  // Standard in a simpler manner.
  return (void*)_VSTD::addressof(from);


for (; __idx != __last; ++__idx)
        ::new (__voidify(*__idx)) _ValueType;

I see several advantages:

  • avoids duplication, which prevents potential divergence;
  • follows the Standard more closely -- it's obvious to the reader that we do implement voidify (with the current implementation, I was at first wondering if the implementation preceded the time voidify was added to the Standard and not updated);
  • explains our implementation choice in a single place -- this should particularly help prevent any potential confusion and discussions in the future (or at the very least, the discussions would start having the necessary context so that people don't have to spend time wondering why it's implemented the way it is).
var-const updated this revision to Diff 393311.Dec 9 2021, 3:49 PM

Fix the CI, add export __function_like to module.modulemap.

var-const updated this revision to Diff 393359.Dec 9 2021, 6:09 PM

Fix the CI, pt.2.

var-const updated this revision to Diff 393661.Dec 10 2021, 9:09 PM

Rebase on main.

Fix CI (?) and rebase on main.


Please include the granular <__concepts/foo.h> headers here [er, in a-z order above], not all of <concepts>.


This should be as in the blog post too:

namespace __uninitialized_default_construct {
struct __fn {
} // namespace __uninitialized_default_construct

AIUI, this keeps the CPO's own type from ADL'ing into the std::ranges namespace; e.g. foobar(std::ranges::uninitialized_default_construct) should not consider std::ranges::foobar a candidate even if std::ranges::foobar is not a CPO itself.
Also, of course, consistency (Chesterton's Fence, the economist's hundred-dollar bill): if it were safe to omit the namespace, we'd certainly want to do it everywhere, not just here.


Yay. :) IMHO a more helpful pattern for this comment would just be
// TODO: make consistent with std/iterato...compile.pass.cpp
and name one of the files that you consider to be correct. Then, even someone-not-you should be able to deduce exactly what this comment is asking them to do.


Btw, please commit this removal in a separate commit, and/or make sure there's something in your commit message about how we believe this was UB and that's why it's gone now.

var-const marked an inline comment as done.

Review feedback

var-const edited the summary of this revision. (Show Details)Dec 13 2021, 12:56 PM
var-const added inline comments.

Done (I added an _ns suffix to the namespace name to make sure it's different from the name of the helper functions). Do we actually need a separate namespace for each CPO, though?

By the way, can you please elaborate on how this could be a problem, given that the type of the CPO is an underscore-prefixed internal type?


I don't feel too strongly about this, but in general I'm a little wary about comments that (effectively) contain links to other code -- the linked code can change and the comment becomes stale (e.g. if in the future these checks were to be moved to a single file).

Perhaps it would be even better to combine both approaches (explain the intent _and_ link an example), but given that I plan to fix this in the next few days, it seems like overkill to me.


Updated the commit message.

ldionne added inline comments.Dec 13 2021, 1:10 PM

I *think* we could instead have a single namespace named something like __cpo_detail and then instead name our helpers __uninitialized_default_construct_fn. We could explore that in a separate PR, but for the time being let's be consistent.


This should be called __fn so that it shows up consistently to other CPOs in compiler diagnostics.


(FWIW, I'd still prefer not to expand the sphere of influence of __function_like in this PR; but we're all on the same page that it's going to be trivial to remove later. I hope this is also taken as another bit of evidence that having multiple competing conventions sucks, and that consistency saves brain cells.)


(I added an _ns suffix to the namespace name to make sure it's different from the name of the helper functions).

(1) please don't add that suffix; (2) what helper functions do you mean? Are you talking about std::__uninitialized_default_construct? It's not std::ranges::__uninitialized_default_construct, so we don't need to worry about this namespace colliding with it. Again, we have an established convention already and you're being asked to copy it so that we have one convention used everywhere, instead of multiple conventions.

We could (in a separate PR) change libc++'s convention to use names like std::ranges::__cpo_detail::__uninitialized_default_construct_fn instead of std::ranges::__uninitialized_default_construct::__fn, but that benefits us literally nothing AFAICT and costs us a few bytes per symbol name (which translates to potentially lots of .o file size on disk).

can you please elaborate on how this could be a problem

var-const updated this revision to Diff 394135.Dec 13 2021, 9:20 PM

Review feedback


Removed the suffix.

Are you talking about std::__uninitialized_default_construct? It's not std::ranges::__uninitialized_default_construct, so we don't need to worry about this namespace colliding with it.

Yes, I mean (potential) confusion for the reader, not for the compiler. Changed to follow the convention.

Thanks! When I first thought about it, I presumed that RandomFunction would have to be __RandomFunction in our case, but now I see that it's not necessarily true.


Done, thanks. Also added a TODO to change the other few places which don't follow the convention.

Quuxplusone accepted this revision.Dec 14 2021, 7:02 AM

Looks good to me at this point!


And put it in the __advance namespace, and put advance into inline namespace __cpo; and remove __function_like. :) If you're volunteering for this, I will be thrilled to review it. :)


Ack; OK if you leave this TODO comment as-is.
For the record, though, re "the linked code can change and the comment becomes stale": the benefit of wording the comment as "make consistent with [linked code]" is that even if the linked code does change, the comment remains correct! (Unless the linked code changes to match this code. ;)) A comment like "Here we should take approach X" might rot (if tomorrow we standardize on approach Y), but "Here we should be consistent and take the same approach as our neighbor" is evergreen.


Nit: Using the name capacity here instead of size is weird (compare to vector.size/capacity); but also can we just say 5 to sidestep the whole issue? (Applies throughout, I guess.)


FWIW, if you want, and if you rename Buffer.cbegin/cend to begin const/end const so that Buffer becomes more range-ish, then this can just be

This revision is now accepted and ready to land.Dec 14 2021, 7:02 AM
var-const marked 3 inline comments as done.

Address feedback, try to fix the CI.


Thanks, I plan to do this cleanup before the end of the week.


Sorry, I should have elaborated a bit more -- the kind of change I was thinking about is if the linked code was moved around (e.g. imagine in the future the code that checks CPO object properties is consolidated in a single file -- then the reader won't find anything in the test file that is "linked" here and would have to go dig through history to figure out what the comment was referring to).


Created a constant instead.

Regarding the naming issue: my initial thought was that what capacity returns essentially is just raw capacity, not size. The buffer allocates enough space for N elements, but it doesn't keep track of how many elements are actually created, which is what I would expect size to return.

OTOH, size in the vector sense arguably is inapplicable for Buffer because Buffer allows random access to all the N elements -- e.g. for a 5-element buffer, it's valid to have elements [1] and [3] initialized and everything else uninitialized, and there isn't really a way to capture that information as size.

Now that I think of it, it seems like std::array is a better model for Buffer than vector, and array uses size, not capacity.


Thanks! I'll experiment with this in the follow-up I'm working on.