This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Comment preconditions for __wrap_iter; make it work with C++20 to_address
AbandonedPublic

Authored by Quuxplusone on Apr 27 2021, 3:29 PM.

Details

Reviewers
cjdb
ldionne
EricWF
zoecarver
tcanens
glenjofe
Group Reviewers
Restricted Project
Summary
LWG3545 "std::pointer_traits should be SFINAE-friendly" is maybe related.
It's currently unclear what the difference is between "contiguous iterator"
(for which `std::to_address` is required to work) and "fancy pointer"
or "pointer-like type" (for which both `pointer_traits` and
`std::to_address` are required to work).

The precondition for __wrap_iter is that it's for use only with fancy pointers,
i.e., things that should in C++20 be unambiguously contiguous iterators...
but because of legacy code, we can't actually _enforce_ this precondition.
We sanity-check it only as far as verifying that the fancy-pointer type
we're given is at least a random-access iterator.
We explicitly do _not_ try to instantiate `pointer_traits<_Iter>`, because
legacy user-defined fancy pointer types might not support that. (The metaprogramming
around `pointer_traits` is vastly different and more stringent in C++20 from how it
was in C++11/14/17.) Instead, we compute `element_type` directly from `reference`.

Meanwhile, `__is_cpp17_contiguous_iterator` is used (only) for optimizations
in <algorithm>, so in C++17 we must be very careful not to apply those optimizations
to arbitrary user-defined types (in particular we mustn't assume that any old
random-access iterator with an `element_type` is contiguous).
In C++20 we can liberally assume that the optimizations should be applied to
any user-defined iterator types that advertise themselves as contiguous via
`iterator_category` and/or `iterator_concept` (which automatically includes
`__wrap_iter`).

Our `__wrap_iter` now correctly advertises itself as a contiguous iterator in C++20,
and produces the correct `pointer_traits<WI>::element_type`; this is now tested.

Refs D101396

Diff Detail

Event Timeline

zoecarver requested review of this revision.Apr 27 2021, 3:29 PM
zoecarver created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2021, 3:29 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
zoecarver edited the summary of this revision. (Show Details)Apr 27 2021, 3:30 PM
cjdb added a comment.Apr 27 2021, 3:30 PM
This comment was removed by cjdb.
cjdb accepted this revision.Apr 27 2021, 3:33 PM

Realised the test will be done by D101396. 🚢 it!

Should there be a test for string too?

Eh, I think it's fine. If you feel strongly I'll add it. This will implicitly be covered for "all" iterators by the tests in D101396.

Quuxplusone requested changes to this revision.Apr 27 2021, 6:21 PM

If this PR is the last thing needed to unblock landing something for C++20, then OK ship it, I don't think it's introducing any wrong behavior. However, if we have time, I would like to see further investigation of how this patch fits into the general scheme of D94807, and how much of that scheme is still relevant this month. In particular:

  • Can we make __is_cpp17_contiguous_iterator more similar to std::contiguous_iterator, perhaps by having __is_cpp17_contiguous_iterator look at whether element_type exists?
  • Can we now remove the specialization of __is_cpp17_contiguous_iterator for __wrap_iter?
  • Can we now remove the overload of _VSTD::__to_address for __wrap_iter? I think no, but I'm not sure (and it sure would be nice if we could).
  • You're providing element_type unconditionally, which means that __wrap_iter<It> will be treated as pointer-like even when It is not a (fancy) pointer type itself; e.g. __wrap_iter<std::list<int>::iterator>. My impression is that we actually only ever use __wrap_iter to wrap (fancy) pointers — is this accurate? If it is not accurate, then more digging is warranted. If it is accurate, then we should document that invariant, and maybe even search-and-replace _Iter to _Ptr throughout __wrap_iter.

D101003 is related. (I'm eliminating span::iterator's dependence on __wrap_iter<T*>.)

libcxx/include/iterator
1280

If my comments are correct, then this could+should use pointer_traits<_Iter> instead of pointer_traits<pointer>...
...and the _If on line 1284 is unconditionally true, and can be simplified.
Please try that simplification (at least locally) and see if it passes the test suite.

This revision now requires changes to proceed.Apr 27 2021, 6:21 PM

@Quuxplusone I didn't look at that patch, but now I have lots of thoughts 😁. First, after C++17 we should probably use std::contiguous_iterator to implement __is_cpp17_contiguous_iterator. Second, pre-C++20 AFAICT __is_cpp17_contiguous_iterator is the same as is_pointer_v + a specialization for __iter_wrap, right? I think we need to make this at least check for pointer_traits so that we can handle fancy pointers as well.

Can we now remove the specialization of is_cpp17_contiguous_iterator for wrap_iter?

Can we just remove __is_cpp17_contiguous_iterator all together? It's only used in two places in the whole codebase.

Can we now remove the overload of _VSTD::to_address for wrap_iter? I think no, but I'm not sure (and it sure would be nice if we could).

Not only can we, but we should. This is causing problems because it makes something work that really shouldn't. There's no reason __wrap_iter can't conform to contiguous_iterator via the standard method (by providing a template or element_type).

Side note: the reason we can't use the template to deduce the element type is that the template is a pointer (hence the "wrap" part of the name).

Anyway, simply providing element_type as this patch does is sufficient for to_address to start working.

You're providing element_type unconditionally, which means that wrap_iter<It> will be treated as pointer-like even when It is not a (fancy) pointer type itself; e.g. wrap_iter<std::list<int>::iterator>. My impression is that we actually only ever use wrap_iter to wrap (fancy) pointers — is this accurate? If it is not accurate, then more digging is warranted. If it is accurate, then we should document that invariant, and maybe even search-and-replace _Iter to _Ptr throughout wrap_iter.

I think we could make this guarantee and I don't see any reason not to, especially if it makes our implementation simpler/better.

Note: the only place static_assert(sizeof(typename pointer_traits<_Iter>::element_type), "") fails in the whole codebase is in the __is_cpp17_contiguous_iterator test: test/libcxx/iterators/contiguous_iterators.pass.cpp.

libcxx/include/iterator
1280

Alternatively we could add element_type with an _If (or maybe inheritance?) to make this work for non-pointer-like iterators.

Can we just remove __is_cpp17_contiguous_iterator altogether? It's only used in two places

Maybe, but I doubt it. The places it's used are specifically to implement the lowering of std::copy into memmove: https://godbolt.org/z/5P9TjGza4
For performance, we really want to be able to detect "this specific iterator is known contiguous" in C++17/14/11/03 mode.
Now, can we change __is_cpp17_contiguous_iterator to rely on std::contiguous_iterator in C++20 and later? I would say yes. But we can't just make it false in C++17-and-earlier; it's still got to do the right thing for known-contiguous iterators in all modes. Unless we decide we don't care about that optimization. There's definitely (and IMHO unfortunately) precedent for libc++ saying "optimization isn't our department, let's just ask Clang to optimize better," but at least I wouldn't want to unilaterally decide that we don't care about this specific optimization.

Note: the only place [...] fails in the whole codebase is in [...] test/libcxx/iterators/contiguous_iterators.pass.cpp.

This line, right?:

static_assert((!std::__is_cpp17_contiguous_iterator<std::__wrap_iter<std::reverse_iterator<T *> > >::value), "");

I'd be 100% cool with eliminating that line from the test, because I think it violates the unstated precondition on __wrap_iter's _Iter argument. (And then we can document that precondition and enforce it.)

Maybe, but I doubt it. The places it's used are specifically to implement the lowering of std::copy into memmove: https://godbolt.org/z/5P9TjGza4
For performance, we really want to be able to detect "this specific iterator is known contiguous" in C++17/14/11/03 mode.
Now, can we change __is_cpp17_contiguous_iterator to rely on std::contiguous_iterator in C++20 and later? I would say yes. But we can't just make it false in C++17-and-earlier; it's still got to do the right thing for known-contiguous iterators in all modes. Unless we decide we don't care about that optimization. There's definitely (and IMHO unfortunately) precedent for libc++ saying "optimization isn't our department, let's just ask Clang to optimize better," but at least I wouldn't want to unilaterally decide that we don't care about this specific optimization.

The thing I'm more annoyed about is that it is used in one place and spread across four files. Here's what I suggest: we "move" __is_cpp17_contiguous_iterator to algorithm and rename it to something like __is_pointer_like_iterator or __is_any_contiguous_iterator and define it as so (pseudo code):

#if C++20
template<class T>
constexpr inline bool __is_pointer_like_iterator = std::contiguous_iterator<T>;
#else
template<class T>
constexpr inline bool __is_pointer_like_iterator = requires { typename pointer_traits<T>::element_type };
#endif

WDYT?

This line, right?:

Yes, and looking more closely the only lines it fails on are similar assertions that check what types aren't contiguous iterators.

I'd be 100% cool with eliminating that line from the test, because I think it violates the unstated precondition on __wrap_iter's _Iter argument. (And then we can document that precondition and enforce it.)

:) Great, I agree.

tcanens added inline comments.
libcxx/include/iterator
1280

It's surprising to me that this is even needed. What breaks without this change?

Quuxplusone commandeered this revision.Apr 29 2021, 9:26 AM
Quuxplusone updated this revision to Diff 341546.
Quuxplusone edited reviewers, added: zoecarver; removed: Quuxplusone.
Quuxplusone retitled this revision from [libcxx] Fix __wrap_iter to work with to_address. to [libc++] Comment preconditions for __wrap_iter; make it work with C++20 to_address.
Quuxplusone edited the summary of this revision. (Show Details)
Quuxplusone added reviewers: tcanens, glenjofe.

Commandeer. Add tests for to_address-of-STL-iterator-types and pointer_traits-of-STL-iterator-types.

Quuxplusone added inline comments.
libcxx/include/iterator
1280

In the old code, pointer_traits<std::vector<int>::iterator>::element_type would be int*, not int. https://godbolt.org/z/PTPvn11bf
Providing a user-provided element_type fixes that (in both C++17 and C++20).

Interestingly, libstdc++ and MSVC STL also report pointer_traits<std::vector<int>::iterator>::element_type as something other than int. I don't know if this is a conscious decision ("contiguous iterators needn't support pointer_traits") or if they just haven't gotten around to that part of C++20 yet. @CaseyCarter ?

CaseyCarter added inline comments.Apr 29 2021, 9:55 AM
libcxx/include/iterator
1280

https://godbolt.org/z/ba6z9cEbe demonstrates fairly clearly that pointer_traits<vector<T>::iterator>::element_type is T for MSVC STL. It looks like GCC didn't need to specialize pointer_traits for the to_address requirement: presumably operator-> always works with their pointer-wrapper type. This is not the case for our _Vector_iterator which asserts in debug mode if you call operator-> on a non-dereferenceable iterator.

zoecarver added inline comments.Apr 29 2021, 10:21 AM
libcxx/include/__iterator/iterator_traits.h
431 ↗(On Diff #341546)

Heh, thanks for fixing this :P

467 ↗(On Diff #341546)

Can we make this use the concept? I guess we could make that a follow up nfc.

libcxx/include/iterator
469

Please keep this as per D101098:

All sub-headers should be included by their parent header. To use the example above, `<ranges> must include the sub-header __ranges/size.h, even if it doesn't use ranges::size itself. This inclusion is mandatory because the standard requires the top level header to provide these declarations. This rule falls out of the more general "Include What You Use." You should never rely on some other header to transitively include something that you need. If your header needs a definition of iterator_traits, then you should either #include <iterator> (exactly as user code does), or, to improve compile time, #include <__iterator/iterator_traits.h>. You should never expect iterator_traits` to be "pulled in" by any other header.

(Excuse the formatting).

1282

Who gets broken by making this __is_cpp17_contiguous_iterator? This is only used to wrap pointers and internal iterators. We should just make it work.

1285

If this is now unconditionally contiguous_iterator_tag we absolutely need to change the above assertion.

1289

Why can't we use pointer_traits with this? That would seem more correct to me. Especially if we're advertising this as a "known trivial wrapper around a pointer type."

The whole reason we want to have an element type is so that pointer_traits is correct for __warp_iter. If our wrapped type doesn't support pointer_traits then we really shouldn't either, because we'll be falsely "supporting" something that we don't actually "support." However, as I have said elsewhere, I don't think we actually wrap any types that don't support pointer_traits, so I think we are safe to use that (maybe with a static assert).

If nothing else, let's make this remove_pointer<iterator_traits::pointer>.

1469

Why is this only before C++20? Doesn't that mean we loose the optimization in C++20?

1474

Do we still need this? I think not.

Also the added tests are great, thank you!

tcanens added inline comments.Apr 29 2021, 11:15 AM
libcxx/include/iterator
1280

Except that nothing in the standard requires pointer_traits<std::vector<int>::iterator>::element_type to be meaningful, only that it doesn't explode. So what was broken?

If you want to claim that it is better QOI, that's fine (but since lots of - even most of - contiguous iterators are not even Cpp17NullablePointers I find using pointer_traits with them rather debatable), but that has nothing to do with to_address really.

zoecarver added inline comments.Apr 29 2021, 12:05 PM
libcxx/include/iterator
1280

That's a good point. Our implementation of to_address shouldn't care what element_type is so long as it exists. For __wrap_iter it should be using operator-> because pointer_traits<__wrap_iter>::to_address doesn't exist either way. Maybe this patch isn't the right fix (though I think the cleanups should be applied regardless).

We should investigate why we can't use to_address with the following type:

struct T {
  using element_type = int*;
  int *operator->();
};

https://godbolt.org/z/ed7Pqzd8n

@Quuxplusone thoughts?

libcxx/include/__iterator/iterator_traits.h
467 ↗(On Diff #341546)

Absolutely not, because that would break all C++17 fancy pointers when compiled in C++20 mode. We can't assume that everyone running the compiler in C++20 mode is actually able to use contiguous_iterator_tag.

libcxx/include/iterator
469

Yes, I know. Check alphabetization and duplication. (This trivial diff will disappear soon because I'll have landed it.)

1280

@CaseyCarter: I was looking at https://godbolt.org/z/f4zbjMMoz and seeing that pointer_traits<std::vector<int>::iterator>::element_type doesn't work at all.
Similarly with the static_assert you posted, https://godbolt.org/z/jGGEEaE5a , if compiled in any mode other than C++20 mode.
Ditto for GCC's libstdc++: https://godbolt.org/z/75sqxnKKa
However, I'm still inclining toward making everything Just Work for libc++. Otherwise, it's too confusing for my brain to keep track of what "works," what "doesn't work on purpose," what "doesn't work not on purpose," etc.

1282

All C++17 fancy pointers get broken by that. C++17 fancy pointers do not advertise themselves as contiguous iterators, because they cannot; there is no way to advertise yourself as a contiguous iterator in C++17-or-earlier.

1289

As the comment says, "Prior to C++20, there was no "standard" way to mark a type as a fancy pointer; after C++20, we would prefer all fancy pointers to have contiguous_iterator_tag, but we can't assume that they will, because of legacy code."
In particular, it is not safe to use pointer_traits<FancyPointer> in C++20 mode, because not all C++17 FancyPointers are safe to use with pointer_traits in C++20 mode. This is tested (sorta) via the test for __wrap_iter<my_random_access_iterator> in libcxx/test/libcxx/iterators/contiguous_iterators.pass.cpp.

1469

In C++20 and later, __is_cpp17_contiguous_iterator<__wrap_iter<_It>> is already true, because __wrap_iter<_It> advertises itself as a contiguous iterator in C++20. The specialization is needed only in C++17-and-earlier, because in those modes there is no established way for __wrap_iter to advertise itself as a contiguous iterator.

1474

We do.

Quuxplusone planned changes to this revision.Apr 29 2021, 2:54 PM
Quuxplusone added inline comments.
libcxx/include/iterator
1340–1347

https://reviews.llvm.org/harbormaster/unit/view/565840/
It turns out that __wrap_iter::operator->() assert-fails in situations where we need std::to_address not to assert-fail. So, since I want to_address to work — and since we can't implement to_address in terms of operator-> — the only remaining possibility is to implement to_address in terms of a manually specialized pointer_traits<__wrap_iter>::to_address.

(The specification of to_address should have done it as an ordinary ADL customization point, but they messed that up royally and now it's too late to fix it.)

So, since we have to specialize pointer_traits<__wrap_iter> anyway, we probably don't need any of the element_type stuff anymore. I'll look into this again soonish.

Quuxplusone abandoned this revision.May 11 2021, 9:16 AM
Quuxplusone marked 6 inline comments as done.

@zoecarver @ldionne: I've lost track of what's going on with contiguous/to_address anymore, but I'm pretty sure this PR was obsoleted by D101638. (It looks like we still need to make __wrap_iter properly C++20-contiguous in debug mode, so that we can enable "libcxx/test/std/utilities/memory/pointer.conversion/to_address_std_iterators.pass.cpp", but that's maybe all that's left.)