This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Further improve the contiguous-iterator story, and fix some bugs in D94807.
ClosedPublic

Authored by Quuxplusone on Feb 3 2021, 3:35 PM.

Details

Summary
  • Quality-of-implementation: Avoid calling __unwrap_iter in constexpr contexts. The user might conceivably write a contiguous iterator where normal iterator arithmetic is constexpr-friendly but std::to_address(it) isn't.
  • Bugfix: When you pass contiguous iterators to std::copy, you should get back your contiguous iterator type, not a raw pointer. That means that libc++ can't __unwrap_iter unless it also does __rewrap_iter. Fortunately, this is implementable.
  • Improve test coverage of the new contiguous_iterator test iterator. This catches the bug described above.
  • Tests: Stop testing that we can std::copy into an input_iterator. Our test iterators may currently support that, but it seems nonsensical to me.

Diff Detail

Event Timeline

Quuxplusone created this revision.Feb 3 2021, 3:35 PM
Quuxplusone requested review of this revision.Feb 3 2021, 3:35 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptFeb 3 2021, 3:35 PM

Rebase on main (after landing D93661).

Quuxplusone added inline comments.Feb 4 2021, 6:31 AM
libcxx/include/algorithm
1649

It's been pointed out to me that this is fine; std::copy is required to have the constexpr keyword on the front, but it's never required to actually be constexpr. libc++ can say "std::copy is constexpr unless you give us a contiguous iterator with a non-constexpr to_address and/or operator->."

1653

I still lack any great way to work around this one, though. I think if we want to use __unwrap_iter(first) on arbitrary user-provided contiguous iterators in copy, then we'll have to separately compute the return value using something like

template<class _OrigIter, class _UnwrappedIter>
_Iter __rewrap_iter(_OrigIter __first, _UnwrappedIter __result)
{
    static_assert(__is_cpp17_contiguous_iterator<_OrigIter>, "");  // otherwise we won't get here
    return __first + (__result - _VSTD::__unwrap_iter(__first));
}

template<class _OrigIter>
_Iter __rewrap_iter(_OrigIter, _OrigIter __result)
{
    return __result;
}
[...]
return _VSTD::__rewrap_iter(__result,
     _VSTD::__copy(_VSTD::__unwrap_iter(__first),
                   _VSTD::__unwrap_iter(__last),
                   _VSTD::__unwrap_iter(__result)));

This mostly destroys our hopes of "reducing the number of template instantiations"; but it does permit us to keep using memmove for user-defined contiguous iterators.

ldionne added inline comments.Feb 4 2021, 7:33 AM
libcxx/include/algorithm
1656

I don't understand this change - didn't you work around the constexpr problem by *not* calling unwrap_iter on iterators when we're in a constexpr context?

Edit: Ah, I think you're trying to work around the *second* problem you pointed out, which is that you can't automatically re-create an arbitrary user-defined iterator from a raw pointer. Right? A solution for that could be to:

  1. Introduce an indirection like __rewrap_iter_impl with a nested __apply that creates an iterator from a raw pointer. That can be specialized as needed.
  2. Provide a default implementation for all iterators that are implicitly constructible from a pointer to their value_type.

That way, if the iterator isn't implicitly constructible from the pointer, a user like Chrome can still get the optimization by specializing the struct to tell libc++ how to get an iterator back from a pointer. And by default, it'll work for most iterators out of the box. Thoughts on that solution?

The only issue I can see is if implicit construction from a raw pointer is syntactically valid but doesn't do what we think it should do. I'm not sure how likely or dangerous that could be, but I would argue that if a contiguous iterator is constructible from a pointer but that constructor doesn't do the right thing, then you really voided your warranty with the Standard Library.

ldionne requested changes to this revision.Feb 4 2021, 7:34 AM

Also, if we go down the road of adding a __rewrap_iter struct, can you add a test exercising that extension point under libcxx/?

Requesting changes for the review queue.

This revision now requires changes to proceed.Feb 4 2021, 7:34 AM
Quuxplusone added inline comments.Feb 4 2021, 8:24 AM
libcxx/include/algorithm
1656

That way, if the iterator isn't implicitly constructible from the pointer, a user like Chrome can still get the optimization by specializing the struct to tell libc++ how to get an iterator back from a pointer.

I don't think we need __rewrap_iter to be a customization point at all. A contiguous iterator (even a user-defined one) is always a random-access iterator too, so we can use __first + (__result - _VSTD::__unwrap_iter(__first)) as I indicated in my Phab comment a few lines above this. (If __unwrap_iter is a no-op, meaning that _OrigIter wasn't a contiguous iterator, just return __result unmodified.)

I would argue that if a contiguous iterator is constructible from a pointer but that constructor doesn't do the right thing, then you really voided your warranty with the Standard Library.

I strongly agree in fantasy-everything-as-it-should-be-land, but I'm less sure in actual-C++-land. Remember this is the language that invented std::pointer_traits<It>::to_address(it) and std::pointer_traits<It>::pointer_to(ptr) instead of just using static_cast<T*>(it) and static_cast<It>(ptr).

Quuxplusone retitled this revision from [libc++] Essentially revert D94807, to avoid calling __unwrap_iter in constexpr contexts. to [libc++] Further improve the contiguous-iterator story, and fix some bugs in D94807..
Quuxplusone edited the summary of this revision. (Show Details)

Massive update and (happily) this is now less of a revert and more of a bugfix.

Quuxplusone edited the summary of this revision. (Show Details)Feb 4 2021, 9:57 AM
ldionne accepted this revision.Feb 5 2021, 7:36 AM
ldionne added inline comments.
libcxx/include/algorithm
1656

The issue with this approach is that it requires the unwrapped iterator to be reachable from the original iterator. That works for the algorithms we want to apply it to, but I could imagine other circumstances where that doesn't work, say if we're allocating a new range.

This revision is now accepted and ready to land.Feb 5 2021, 7:36 AM