This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Fix algorithms which use reverse_iterator
ClosedPublic

Authored by philnik on Jun 29 2022, 4:31 PM.

Details

Summary

This adds a C++20-version of reverse_iterator which doesn't SFINAE away the operators for use inside the classic STL algorithms. Pre-C++20 _AlgRevIter is just an alias for reverse_iterator.

Diff Detail

Event Timeline

philnik created this revision.Jun 29 2022, 4:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 4:31 PM
philnik requested review of this revision.Jun 29 2022, 4:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 4:31 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
philnik retitled this revision from [libc++] Fix algorihtms using reverse_iterator to [libc++] Fix algorihtms which use reverse_iterator.Jun 29 2022, 4:32 PM

It's quite a bit of code duplication and complexity for __unwrap_iter_impl, but I think it's the best solution. Using reverse_iterator and other wrappers allows us to simplify a lot of algorithms. Not having a C++20-compatible reverse_iterator would also mean that we'd have to duplicate __half_inplace_merge. I'm open to other suggestions though. I hope we can remove _AlgRevIter in a few versions, since it's clearly a language bug. (BTW did anybody file a CWG issue yet?)

libcxx/include/__iterator/iterator_traits.h
504–508

I'm planning to use these in other parts of the code base, but I think it would be better to do this in NFC patches.

philnik updated this revision to Diff 441316.Jun 30 2022, 2:34 AM
  • Try to fix C++03
Mordante added inline comments.Jul 3 2022, 4:47 AM
libcxx/include/__iterator/reverse_iterator.h
342

I'm not really happy with this name, it doesn't tell me anything. I also would like some comment what this iterator wrapper is.

342

How is this class tested?

362

In general we prefer not to use compiler deduced auto on return types.

philnik updated this revision to Diff 441944.Jul 3 2022, 5:17 AM

Rebased on top of D129039

libcxx/include/__iterator/reverse_iterator.h
342

Do you have any naming suggestion? Or would you just like to have it spelled out more like _AlgorihtmReverseIterator?

342

It's tested indirectly through copy_backward, inplace_merge and stable_sort. I'm planning to use it in more algorithms. Would you like have some explicit tests?

362

That makes sense to me with ABI-relevant parts of the code, but what's wrong with it in an implementation-detail class?

Mordante added inline comments.Jul 3 2022, 5:44 AM
libcxx/include/__iterator/reverse_iterator.h
342

Do you have any naming suggestion? Or would you just like to have it spelled out more like _AlgorihtmReverseIterator?

How do you about cpp17_reverse_iterator? I'm not sure whether we want to use algorithm in the name, this implies it should only be used in algorithms and not at other places.

At least with the comment it's easier to understand what the class is about. I had other associations with Alg which didn't make much sense.

342

It's tested indirectly through copy_backward, inplace_merge and stable_sort. I'm planning to use it in more algorithms. Would you like have some explicit tests?

I think it would be good to have some sanity checks for this class. That way when another test fails you can be sure it's the new code and not a bug in this class.

362

In my experience it makes it harder to understand the code. Instead of looking at a function signature I need to look in the function to see the returned type. It just saves typing a few characters for the writer, but puts a higher burden on the reader. I agree with Google in that regard; code should be optimized for the reader.

philnik updated this revision to Diff 441957.Jul 3 2022, 8:55 AM
philnik marked 2 inline comments as done.
  • Address comments
libcxx/include/__iterator/reverse_iterator.h
342

I think the implication you get from the name is correct. IMO we should only use _AlgRevIter (or whatever we will call it) in the places in which it is necessary. The only places I know of where that applies is in the classic STL algorithms. I think __cpp17_reverse_iterator sounds a bit too much like "this should be used pre-C++20 if you want a reverse_iterator", but it might be the better option if there are indeed other places where we want to use this reverse iterator.

342

I've copy-pasted the tests for reverse_iterator and removed a few tests (for iter_move, iter_swap, concepts and a few thing I haven't implemented like reverse_iterator<const char*> = reverse_oterator<char*> which should never be used in any algorithm anyways).

huixie90 added inline comments.
libcxx/include/__iterator/reverse_iterator.h
344

question: is this class only supposed to be used in the classic algorithms? (that is why it doesn't need iter_move and iter_swap)

354–355

why c++20 iterator_concept is defined in terms of c++17 trait?

370–376

could you please clarify why SFIANE is removed and how do you prevent error on instantiating the class template with an _Iter that does not support ->? (assuming all non-template member functions are generated when class template is instantiated)

philnik marked an inline comment as done.Jul 7 2022, 5:22 PM
philnik added inline comments.
libcxx/include/__iterator/reverse_iterator.h
344

Yes, that's correct.

354–355

This is how it's defined in reverse_iterator. (Although it looks like it's a bug there, but that's another PR)

Edit: I think it's actually safe to remove the iterator_concept entirely here. We shouldn't rely on it anywhere in the classic algorithms. We might want to optimize on it at some point, but hopefully at that point this code doesn't exist anymore.

370–376

The new SFINAE in C++20 is the whole reason this PR exists at all. That's the part that breaks the classic algorithms. Regarding operator->: that has to exist for the classic algorithms. See https://en.cppreference.com/w/cpp/named_req/InputIterator.

huixie90 added inline comments.Jul 7 2022, 5:46 PM
libcxx/include/__iterator/reverse_iterator.h
370–376

Then this class is incorrectly constrained (the static assert bit) the input _Iter has to model the c++17 bidi iterator, because c++20 bidi iterator wouldn’t work. It may not have -> , and it may have prvalue reference, which is not allowed in c++ 17 bidi

So the input Iter has to be c++17 and the result iterator is also c++17. Why not just call it c++17 reverse iterator.

Regarding random access operations not sfinae , when you pass in c++17 bidi only , wouldn’t it error on instantiating those member functions?

philnik marked an inline comment as done.Jul 7 2022, 5:59 PM
philnik added inline comments.
libcxx/include/__iterator/reverse_iterator.h
348

Fix to only allow C++17 bidirectional iterators.

370–376

Thanks, I made a note for myself at the correct place.

I think _Cpp17RevIter sounds too much like it should be use pre-C++20 if you want a reverse iterator, not that this is added for compatibility and should only be used in specific circumstances (specifically, that you get a user-defined iterator which may be C++20-hostile).

SFINAE only applies when you try to check if a function exists. If you never call the function the compiler doesn't instantiate it and you don't get an error.

huixie90 added inline comments.Jul 7 2022, 6:17 PM
libcxx/include/__iterator/reverse_iterator.h
370–376

I totally agree that cpp17_xxx is really confusing. The standard uses these names
https://eel.is/c++draft/iterator.traits#2
When the standard says cpp17-iterator, it really means the classic iterator that model the classic Iterator named requirement, which has nothing to do with c++ 17. I don’t usually debate about names and I am ok with names that make sense for most people. for me I found the name cpp17, probably just because I used to how the spec calls those things

Regarding SFINAE, IIUC, for non template member function of a class template, the function body is instantiated as soon as the class template is instantiated, and this could trigger hard error in your case even you never call those functions

philnik updated this revision to Diff 444484.Jul 13 2022, 6:40 PM
philnik marked 7 inline comments as done.
  • Address comments
  • Rebased
libcxx/include/__iterator/reverse_iterator.h
370–376

Every major compiler disagrees with you: https://godbolt.org/z/TxxsWxE9o.

huixie90 added inline comments.Jul 13 2022, 11:31 PM
libcxx/include/__iterator/reverse_iterator.h
370–376

I think you are right here. I was thinking more about explicit instantiation. https://godbolt.org/z/T7n6986rz but we shouldn't be doing that as most stl templates are not SFINAE friendly.

anyway, IIUC, the purpose of this class is, in C++20 mode, this class "reverses" a C++17 iterator, and results in a C++17 iterator. The reason to have this class is that the reverse_iterator's SFINAE broke some iterators in C++20 mode.
Could you please elaborate what exactly is broken? This sounds either a bug in the standard or in our implementation of reverse_iterator

philnik added inline comments.Jul 14 2022, 5:19 AM
libcxx/include/__iterator/reverse_iterator.h
370–376

If you want an example of what breaks you can look at robust_against_cpp20_hostile_iterators.compile.pass.cpp. Basically the problem is that the standard considers the operator== overloads ambiguous. Clang considers them not ambiguous and spits out a warning. The problem now is that overload resolution would be different if clang would also consider it not ambiguous in SFINAE contexts, so there it isn't ambiguous but an error. That means that overload resolution fails instead of selecting the correct overload.

huixie90 added inline comments.Jul 14 2022, 8:12 AM
libcxx/include/__iterator/reverse_iterator.h
370–376

Thanks for the explanation. I think it is worth to put your explanation in the comment in case in the future someone look at this and wonder why we need another reverse_iterator.

But IIUC, some iterators which are valid in C++17 standard, becomes invalid because of == starts being ambiguous in C++20 standard. This sounds a flag to me. Let me put this in sg9's reflector and see what people think.

philnik updated this revision to Diff 444826.Jul 14 2022, 4:04 PM
philnik marked 2 inline comments as done.
  • Address comment and try to fix CI
philnik updated this revision to Diff 444907.Jul 15 2022, 1:14 AM
  • Try to fix CI
var-const retitled this revision from [libc++] Fix algorihtms which use reverse_iterator to [libc++] Fix algorithms which use reverse_iterator.Jul 15 2022, 1:22 AM
philnik updated this revision to Diff 444935.Jul 15 2022, 3:51 AM
  • Extend copy_backward tests and try to fix CI
var-const requested changes to this revision.Jul 21 2022, 1:23 AM

@philnik My initial impression is that we shouldn't try to solve this at the library level. This looks like a legitimate issue with the core language.

If I understand it correctly (and please point out if I'm wrong), the issue is that there is no perfect match for operator==, and the only operator== available has essentially this form:

bool operator==(const BaseIter&, const DerivedIter&);

Which would work except that the C++20 rewriting rules also generate the same function with reversed arguments:

bool operator==(const DerivedIter&, const BaseIter&);

Which then becomes ambiguous.

I think it would be easier to rewrite our existing algorithms to avoid using reverse_iterator. We can add a TODO to revisit the implementation when (and if) the core language issue gets fixed.

This revision now requires changes to proceed.Jul 21 2022, 1:23 AM
var-const added inline comments.Jul 21 2022, 1:28 AM
libcxx/include/__iterator/reverse_iterator.h
211

If we're doing essentially a workaround, would it be possible/easier to instead define an internal comparator for reverse iterators and use it in the affected algorithms?

bool __reverse_iter_eq(const reverse_iterator<_Iter1>& __x, const reverse_iterator<_Iter2>& __y) {
  return __x.base() == __y.base();
}
// In the algorithm
for (; !std::__reverse_iter_eq(__first, __last); ++__first) {
429

Hmm, so this expression is not ambiguous, but when it's used as a constraint in a requires clause, it is considered ambiguous? Am I missing something?

@philnik My initial impression is that we shouldn't try to solve this at the library level. This looks like a legitimate issue with the core language.

I agree that this is a core issue, but not fixing it is also not an option IMO.

If I understand it correctly (and please point out if I'm wrong), the issue is that there is no perfect match for operator==, and the only operator== available has essentially this form:

bool operator==(const BaseIter&, const DerivedIter&);

Which would work except that the C++20 rewriting rules also generate the same function with reversed arguments:

bool operator==(const DerivedIter&, const BaseIter&);

Which then becomes ambiguous.

Yes.

I think it would be easier to rewrite our existing algorithms to avoid using reverse_iterator. We can add a TODO to revisit the implementation when (and if) the core language issue gets fixed.

I don't think that it would be easier two rewrite existing algorithms. This would lead to a lot of complicated duplicated code. That is always a recipe for chaos and divergence. Effectively duplicating reverse_iterator isn't perfect either, but the code really isn't complicated. Most member functions are a single line.

libcxx/include/__iterator/reverse_iterator.h
211

Not really. There would be no easy way to distinguish between an internal reverse_iterator and an external one.

429

No, that's exactly the case.

var-const added inline comments.Jul 21 2022, 2:34 AM
libcxx/include/__iterator/reverse_iterator.h
429

Interesting. Why is that?

philnik added inline comments.Jul 21 2022, 2:48 AM
libcxx/include/__iterator/reverse_iterator.h
429

I don't know exactly why, but I think it is because that would change the overload resolution. With the change in SFINAE contexts you could end up calling different functions, not just allow code that would otherwise be an error.

I am also not happy with this, but I don't see a better solution for fixing the underlying issue.

Other alternatives:

  1. Stop using reverse_iterator in the implementation of algorithms. That would mean reimplementing some of the logic we have in std::copy for lowering to memmove, unwrapping iterators and more. I'm ambivalent about this, but @philnik seems to be convinced that it would be worse than this patch, and I would tend to agree.
  2. Stop using reverse_iterator in the implementation of algorithms, and don't try to apply optimizations in those. That way we lose the complexity of this patch, but we also lose the perf benefits. I don't think we should do this.

@philnik, can you please file a CWG issue when the LLVM 15 branch is done? On a related note, I should create a space where we can share issue drafts for libc++ contributors to look at and chime in on. I'm taking this as an action item.

@var-const Can you think of other alternatives that would fix those algorithms with C++20 hostiles iterators?

libcxx/include/__iterator/reverse_iterator.h
338–339

What do you think about __unconstrained_reverse_iterator instead? It's a mouthful, but we don't plan on using it in too many places anyways.

@var-const Can you think of other alternatives that would fix those algorithms with C++20 hostiles iterators?

@philnik

@var-const and I just discussed and we agreed to move forward with this approach. We'll spend some time investigating whether another approach would be viable once the branch is cut, but this shouldn't prevent this patch from moving forward. Let's apply outstanding comments and ping me so we can merge this.

After offline discussions, I'm fine with this approach (I'd like to explore simplifying our implementation of copy* and move* and potentially removing the internal reverse iterator, but that's for the future and is not guaranteed to work out well). The most important comment I have is to have a lot of context in the class comment to explain the purpose of the class. It's a tricky issue, so we should help the reader of the code as much as we can.

libcxx/include/__algorithm/copy_backward.h
48

Question: why doesn't this use __copy? (I presume there's a reason, just checking)

libcxx/include/__iterator/reverse_iterator.h
211

Ah, good point.

338–339

+1 -- there might be a better name, but comparing these two, __unconstrained_reverse_iterator makes it easier to understand the purpose of the class.

341–481

Here's my attempt to improve the wording here. IMO, explaining context here is very important because otherwise it's not clear why we want to maintain essentially two versions of reverse_iterator:
"""
This class allows us to use reverse iterators in algorithms' implementation by working around a language issue in C++20.

In C++20, when a reverse iterator wraps certain C++20-"hostile" iterators, calling comparison operators on it will result in a compilation error on Clang. However, calling comparison operators on the pristine hostile iterator is not an error. Thus, we cannot use reverse iterators in the implementation of an algorithm that accepts a hostile iterator. This class is an internal workaround -- it is a copy of reverse_iterator with tweaks to make it support hostile iterators.

A "hostile" iterator is one that defines a comparison operator where one of the arguments is an exact match and the other requires an implicit conversion, for example:

friend bool operator==(const BaseIter&, const DerivedIter&);

C++20 rules for rewriting equality operators create another overload of this function with parameters reversed:

friend bool operator==(const DerivedIter&, const BaseIter&);

This creates an ambiguity in overload resolution.

However, the ambiguity is only a compilation error when used in a requires clause, not in the actual function body. This class simply removes the problematic constraints from comparison functions.
"""

402

Nit: there's one extra blank line here, please remove it.

428

Consider adding a comment like:

// Deliberately unconstrained unlike the comparison functions in `reverse_iterator` -- see the class comment for the rationale.
libcxx/test/libcxx/iterators/predef.iterators/_AlgRevIter/reverse.iter.cmp/equal.pass.cpp
9 ↗(On Diff #444935)

The new reverse iterator is only used in C++20 mode and above, right? Would it make sense to mark these tests as only supporting recent language versions and simplify them (replace TEST_CONSTEXPR_FOO with constexpr, etc.)?

var-const added inline comments.Jul 21 2022, 2:11 PM
libcxx/include/__iterator/reverse_iterator.h
429

It looks like this paper aims to fix the language issue: https://github.com/cplusplus/papers/issues/1127
Unfortunately, even though the paper was approved this February, it doesn't fix all the potential cases that create ambiguity, so it won't change anything for us even after it's implemented in the compiler (or at least that's how I read it).

huixie90 added inline comments.Jul 21 2022, 3:25 PM
libcxx/include/__iterator/reverse_iterator.h
429

First of all, I think the current clang behaviour is really confusing.
See this example https://godbolt.org/z/v767Kdd7K

  • If you write foo1 == foo2, it actually compiles fine with just a warning
  • If you write a concept that checks foo1 == foo2 is valid expression, it returns false

I think clang contradicts itself.

@var-const Thanks for posting this paper P2468R2, I actually think the paper fixes our issue. The example in the paper section 1.2.1
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2468r2.html#open-source-sampling
is exactly what we are facing in robust_against_cpp20_hostile_iterators.compile.pass.cpp
https://github.com/llvm/llvm-project/blob/main/libcxx/test/libcxx/algorithms/robust_against_cpp20_hostile_iterators.compile.pass.cpp#L50

They both have CRTP base with a member == that takes a derived class. And the usage of derived == derived is ambiguous due to the fact that both sides are equally convertible from derived to base.

The straw poll of this paper will happen next Monday. I am not sure what is the best action for us for llvm-15

var-const added inline comments.Jul 21 2022, 5:10 PM
libcxx/include/__iterator/reverse_iterator.h
429

@huixie90 Yes, this is the part that confuses me. I understand why Clang sees an ambiguity there, but I don't understand why it behaves differently in two different contexts. If Clang wants to be pedantic about this issue, it seems like it should reject foo1 == foo2, not issue a warning. If Clang wants to be permissive, it should evaluate the concept to true. Of course, it's possible that I'm missing something, but it looks like it might be unintentional.

Re. the proposal -- if I'm reading it correctly, it's only proposing the Second resolution, which is to avoid rewrites if the class has both a user-provided operator== and a user-provided operator!=. A valid iterator can still define a "hostile" operator== without defining an operator!=, which would still lead to the same issue we're dealing with. In the proposal, it says that 8 out of 110 projects they evaluated are still failing after the proposed fix. Unfortunately, from our perspective it means that the proposal doesn't really change anything. Even if the number of iterators considered hostile is reduced 10x, it's still a valid use case that we have to support.

var-const added inline comments.Jul 21 2022, 5:14 PM
libcxx/include/__iterator/reverse_iterator.h
341–481

However, the ambiguity is only a compilation error when used in a requires clause, not in the actual function body.

This sentence would be better as:
"""
Confusingly, Clang treats this ambiguity differently in different contexts. When operator== is actually called in the function body, the code is accepted with a warning. When a concept requires operator== to be a valid expression, however, it evaluates to false. Thus, the implementation of reverse_iterator::operator== can actually call operator== on its base iterators, but the constraints on reverse_iterator::operator== prevent it from being considered during overload resolution. This class simply removes the problematic constraints from comparison functions.
"""

philnik updated this revision to Diff 447129.Jul 24 2022, 6:28 AM
philnik marked 15 inline comments as done.
  • Address comments
libcxx/include/__algorithm/copy_backward.h
48

Just because it doesn't really make a difference and using the public interface leads to less refactorings when changing the copy internals.

libcxx/include/__iterator/reverse_iterator.h
429

I think clang does that, because allowing it in evaluated contexts only allows a program to compile that would otherwise not be a valid program. Having requires(T lhs, T rhs) { lhs == rhs; } return true would change the meaning of the program, not just let a program compile. IOW, I think Clang is actually the only conforming compiler in this example. I could be completely wrong though.

Regarding the proposal, I think it does solve our issue. I would guess that Clang, GCC and MSVC drop the extension after implementing the paper, meaning that it would be an error anyways and thus not be a problem for us anymore.

libcxx/test/libcxx/iterators/predef.iterators/_AlgRevIter/reverse.iter.cmp/equal.pass.cpp
9 ↗(On Diff #444935)

I think it's not a bad idea to check that the iterator works the same in all language versions. Only running the tests in C++20 and 23 would simplify them a bit, but not that much.

var-const accepted this revision.Jul 24 2022, 12:39 PM
This revision is now accepted and ready to land.Jul 24 2022, 12:39 PM
var-const added inline comments.Jul 24 2022, 12:40 PM
libcxx/include/__iterator/reverse_iterator.h
429

Regarding the proposal, I think it does solve our issue.

Wouldn't an iterator type that only defines bool operator==(const Iter& res); // no const and no corresponding operator!= still be broken?

philnik added inline comments.Jul 25 2022, 6:46 AM
libcxx/include/__iterator/reverse_iterator.h
429

I think it would be broken, but it would be broken either way. There is nothing we can do about it if the compiler also errors when calling it in a function (which would be my expectation when the paper is implemented).

This revision was automatically updated to reflect the committed changes.