This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Refactor stride_counting_iterator
ClosedPublic

Authored by ldionne on Jan 4 2022, 11:36 AM.

Details

Reviewers
Quuxplusone
Group Reviewers
Restricted Project
Commits
rGa9bfb4c4f48d: [libc++] Refactor stride_counting_iterator
Summary

Instead of storing the wrapped iterator inside the stride_counting_iterator,
store its base so we can have e.g. a stride_counting_iterator of an
input_iterator (which was previously impossible because input_iterators
are not copyable).

As a fly-by fix, remove the member base() functions, which are super
confusing.

Diff Detail

Event Timeline

ldionne requested review of this revision.Jan 4 2022, 11:36 AM
ldionne created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2022, 11:36 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Quuxplusone requested changes to this revision.Jan 5 2022, 7:58 AM
Quuxplusone added a subscriber: Quuxplusone.
Quuxplusone added inline comments.
libcxx/test/support/test_iterators.h
667–676

I believe your intent here was (or should have been ;))

explicit stride_counting_iterator() requires std::default_initializable<It> = default;

because after all your efforts to make cpp17_input_iterator<int*> non-default-constructible, it wouldn't make sense if stride_counting_iterator<cpp17_input_iterator<int*>> were default-constructible again.

However, I think this also highlights an obstacle with this entire design (both old and new). You want stride_counting_iterator<It> to expose all the same API surface as It, but wrapped, and that's just physically very hard to achieve — requires all this metaprogramming. I can't help wondering if there's a simpler approach.
Have you tried simply writing a stride_counter<T*> that wraps a pointer and is always contiguous, and then using types like forward_iterator<stride_counter<T*>> in the tests (instead of stride_counter<forward_iterator<int*>>)? That violates our previously unwritten rule that forward_iterator<It> is only ever instantiated with T*; but it doesn't violate it too much, does it? Or do we end up with tons of cryptic errors?

668

This is the only place iterator_concept_t is used. Could we change this to something like

using iterator_concept = std::conditional_t<
    std::contiguous_iterator<It>, std::contiguous_iterator_tag, std::conditional_t<
    std::random_access_iterator<It>, std::random_access_iterator_tag, std::conditional_t<
    std::bidirectional_iterator<It>, std::bidirectional_iterator_tag, std::conditional_t<
    std::forward_iterator<It>, std::forward_iterator_tag, std::conditional_t<
    std::input_iterator<It>, std::input_iterator_tag, std::output_iterator_tag
>>>>>;

and then eliminate lines 649–689?
Alternatively, is it possible to replace the whole thing with just

using iterator_category = std::iterator_traits<It>::iterator_category;

or does that fail tests such as

static_assert(std::contiguous_iterator<stride_counting_iterator<int*>>);

? (This would all be much easier if ITER-CONCEPT were a real thing instead of being exposition-only.)

682

If you don't take my advice about flipping to forward_iterator<stride_counter<T*>>, so you keep the tests using stride_counting_iterator<forward_iterator<int*>>, then I suggest investigating whether base(sci<fi<int*>>) should simply return the int* itself instead of a fi<int*>. That would eliminate a lot of base(base( cruft in the tests.
I wonder if sentinel_wrapper and sized_sentinel should also be done that way. I must have considered it circa D115766 and decided not to, but I don't remember why; it might have just been to avoid churn in the places that used the old code.

690
774–776

This overload should go away. Well-formedness of It == Sent doesn't imply stride_counter<It> == Sent; those types are unrelated. In tests, we should only ever use stride_counter<It> == sentinel_wrapper<stride_counter<It>>.

(Notice that the old code failed to provide an overload for stride_counter<It> - Sent; that's good IMO, because if a test wants a sized sentinel they should be using stride_counter<It> - sized_sentinel<stride_counter<It>>.)

784–788

Scope creep, feel free to disregard or defer; but for the record, I'd love to see these relops rewritten to use a more copy-pastey style:

friend constexpr bool operator<(stride_counting_iterator const& x, stride_counting_iterator const& y)
    requires std::random_access_iterator<It> {
    return It(x.base_) < It(y.base_);
}

friend constexpr bool operator<=(stride_counting_iterator const& x, stride_counting_iterator const& y)
    requires std::random_access_iterator<It> {
    return It(x.base_) <= It(y.base_);
}

friend constexpr bool operator>(stride_counting_iterator const& x, stride_counting_iterator const& y)
    requires std::random_access_iterator<It> {
    return It(x.base_) > It(y.base_);
}

and so on, where the only token that changes from function to function is the spelling of the operator itself.
I now observe that I never pulled out and submitted a PR to change all the existing test iterators' relops to hidden friends. (contiguous_iterator uses this style already, but I mean to make the others do so also. I'll go submit that today.)

This revision now requires changes to proceed.Jan 5 2022, 7:58 AM
Quuxplusone added inline comments.Jan 9 2022, 7:36 AM
libcxx/test/support/test_iterators.h
682

I wonder if sentinel_wrapper and sized_sentinel should also be done that way.

This morning I found some concrete evidence that they should be done that way. With the current system, we have

struct Common {
    forward_iterator<int*> begin() const;
    forward_iterator<int*> end() const;
};
struct Uncommon {
    forward_iterator<int*> begin() const;
    sentinel_wrapper<forward_iterator<int*>> end() const;
};
Common c;
Uncommon uc;
assert(base(c.begin()) == base(c.end()));  // Common: Nice!
assert(base(uc.begin()) == base(base(uc.end())));  // Uncommon: Ugly!

If ADL base(sentinel) returned the iterator's base type, then we'd be able to use the "Nice!" code for both Common and Uncommon. Which would be nice, especially in (hypothetically) templated code and (pragmatically) copy-pasted code.

ldionne updated this revision to Diff 400846.Jan 18 2022, 7:41 AM
ldionne marked 5 inline comments as done.

Apply most review comments and rebase onto main.

I'll investigate calling base() recursively in another patch.

libcxx/test/support/test_iterators.h
667–676

I'm going to take the requires default_initializable change. I think trying to commute the stride counting and the forward-iteratorness (or other archetype) is a great idea. I've vaguely considered it in the past too, but it seemed like a lot of work upfront and I could hardly justify it to myself.

I still think this patch is an improvement as-is, so I'd like to land it without expanding the scope so much. I think we can get to where we both want incrementally. First, we can make base() call itself recursively so that it unwraps all levels of iterator-ness. Then, we can try to commute stride_counting_iterator.

668

I will use your conditional_t formulation.

Quuxplusone added inline comments.
libcxx/test/support/test_iterators.h
667–676

I still think this patch is an improvement as-is, so I'd like to land it without expanding the scope so much. I think we can get to where we both want incrementally.

Ack, and go ahead.

What follows is just preemptive nitpicking/punditry on the rest of your plan:

First, we can make base() call itself recursively so that it unwraps all levels of iterator-ness.

I don't think we want base() to call itself recursively. I've discussed an exactly isomorphic problem with the author of https://oleksandrkvl.github.io/2021/10/11/projected-ranges.html — he calls the single step .base() and the recursive version .root(). I maintain that .root() is simply useless in generic code: You get a gift. You put it inside a box, maybe several nested boxes. Then you call .root() to unwrap all the boxes — but you won't get back your original gift in the case that your original gift was itself wrapped in a box.
Instead, I think we want each test iterator to have a clear idea of how it's supposed to be used: e.g. what is its type parameter supposed to be (only T*? any test iterator from the prior level? any test iterator at all? any iterator at all?). Then, its ADL base(x) won't need to be recursive; it should have a clear idea of exactly how many layers (if any) need to be unwrapped and/or re-applied inside base.

I still think it makes sense to have base(sentinel_wrapper<forward_iterator<T*>>) return T* instead of forward_iterator<T*>. But it's not going to do that because "recursion"; it's going to do that because we want base(it) == base(sent) to be well-formed.

Then, we can try to commute stride_counting_iterator.

+1.

ldionne accepted this revision as: Restricted Project.Jan 18 2022, 8:12 AM

Will ship once CI is green, thanks for the discussion @Quuxplusone .

libcxx/test/support/test_iterators.h
667–676

I still think it makes sense to have base(sentinel_wrapper<forward_iterator<T*>>) return T* instead of forward_iterator<T*>. But it's not going to do that because "recursion"; it's going to do that because we want base(it) == base(sent) to be well-formed.

Interesting, but in that case I would argue that defining base(T*) as being the identity is breaking this philosophy. base(T*) is acting exactly as the "end of the recursion" in our unwrapping right now, and it's really convenient. Maybe defining base(T*) won't be necessary at all if we use your approach?

This revision is now accepted and ready to land.Jan 18 2022, 8:12 AM
This revision was automatically updated to reflect the committed changes.
libcxx/test/support/test_iterators.h
667–676

We definitely still need base(T*), because we want to be able to do

template<class It> void test() { ~~~ assert(base(it) == base(sent)); ~~~ }
int main() {
    test<std::forward_iterator<int*>>();
    test<std::random_access_iterator<int*>>();
    test<int*>();  // !!
}

so we need base(it) to be well-formed for plain pointers as well as test iterators.
(This was our only test coverage for contiguous iterators prior to C++20; but even in C++20 I think testing with non-class-type iterators i.e. pointers is still super important.)

Maybe we could get away with just a T *base(T *p) { return p; } instead of the full-blown T base(T it) { return it; } that we have today. I doubt it on principle, but certainly we can't expect base to do much of anything sensible if it's called on a completely arbitrary iterator (e.g. an std::istream_iterator<int*> or a std::list<int>::iterator).