This is an archive of the discontinued LLVM Phabricator instance.

[libc++][ranges] Implement `ranges::partial_sort`.
ClosedPublic

Authored by var-const on Jun 28 2022, 10:35 AM.

Diff Detail

Event Timeline

var-const created this revision.Jun 28 2022, 10:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2022, 10:35 AM
Herald added subscribers: mgrang, mgorny. · View Herald Transcript
var-const requested review of this revision.Jun 28 2022, 10:35 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript

Use the actual patch number.

Formatting.

This looks pretty good to me, with some suggestions. I'm especially curious to see what it looks like once we've addressed the issue of using the proper ranges:: vs std:: operations (iter_swap & friends), as we discussed offline.

libcxx/include/__algorithm/partial_sort.h
42–48

Instead, I think _LIBCPP_DEBUG_RANDOMIZE_RANGE should support (Iter first, Sentinel last). You can call std::ranges::next from within _LIBCPP_DEBUG_RANDOMIZE_RANGE when Iter != Sentinel.

You should wait for @philnik to land his patch about replacing the macro by a function. Also, we should probably move it to a separate header to avoid dragging parts of <ranges> in any header that requires __libcpp_debug_randomize_range (or whatever the name was).

53

I find the layering a bit confusing here. We've got: partial_sort -> __partial_sort_impl -> __partial_sort. I would expect either partial_sort -> __partial_sort -> __partial_sort_impl (so just swap the names of __partial_sort and __partial_sort_impl), or you could even inline __partial_sort_impl into __partial_sort and start calling the latter instead of the former from public functions, since __partial_sort is only used here.

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
68

This is not the expected sequence since you are only checking a subsequence of it. It's really the full sorted sequence. IMO this renaming (or similar) would make it easier, otherwise one might try to understand why the expected sequence has N elements when we're partially sorting mid_index <= N elements.

76–77

This makes it easier to understand what the std::equal is comparing.

89–90

Same reformatting.

96

Maybe this is closer to what it does? When I think of "cases", I think of what's tested in check() below.

var-const marked 5 inline comments as done.Jul 2 2022, 6:51 PM

@ldionne I took a stab at the _Tag approach we discussed, please let me know what you think. Among other things, do you think having algorithms default to _STLClassic be a good idea? It would avoid the need to update most callers.

var-const updated this revision to Diff 441915.Jul 2 2022, 6:51 PM

Address feedback.

var-const updated this revision to Diff 441918.Jul 2 2022, 7:06 PM

libcxx-generate-files

var-const added inline comments.
libcxx/include/__algorithm/sift_down.h
53 ↗(On Diff #441915)

@huixie90 This should call ranges::iter_move when the function is invoked from a range algorithm (and just std::move(*__start)) when invoked from classic algorithms). Let me know what you think.

philnik added inline comments.Jul 3 2022, 3:25 AM
libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

Isn't this pretty much exactly the same as _StdIterOps and _RangesIterOps (in __algorithm/iterator_operations.h), just with a bit more boiler-plate?

huixie90 requested changes to this revision.Jul 3 2022, 1:25 PM

Thanks for fixing the iter_swap and iter_move. It almost LGTM

libcxx/include/__algorithm/partial_sort.h
32

A general question on these __impl functions. There are lots of double underscore impl functions in the algorithms that are shared between different algorithms. e.g. __copy. these functions are not constrained and sometimes I am unsure what condition these functions can be used. I guess some of them are not meant to be used outside the file thus omitting any constraint checks. But it is not easy to find out if it is ok to use these impl functions. I am wondering if it is a good idea to add some comments or static_assert on these functions?

libcxx/include/__algorithm/ranges_basic_operations.h
22 ↗(On Diff #441918)

do we need to guard this header with c++ version?

33 ↗(On Diff #441918)

nit: could do ranges::iter_swap instead of std::ranges::iter_swap

libcxx/include/__algorithm/sift_down.h
42 ↗(On Diff #441918)

I had a look at the implementation of _make_projected_comp, if the new range's algorithm caller pass in a member function pointer and std::identity as projection, this __comp would be kept as a member function pointer and fail to compile here calling operator()

is there any tests covering this case?

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
285

is there any test case that tests zip-iterator-like behavior ?

I am working on a test iterator that does something similar with zip-iterator but much simiplied

This revision now requires changes to proceed.Jul 3 2022, 1:25 PM
var-const updated this revision to Diff 442434.Jul 5 2022, 8:40 PM
var-const marked 5 inline comments as done.

Address feedback, fix the CI.

libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

Sorry, I completely missed that patch. You're right, it's pretty much the same idea. The only difference of any significance is that a tag can be used to make arbitrary decisions -- or rather, passing a generic tag instead of a class containing a set of operations makes it clearer that this is intended to be a generic mechanism, not limited to just iterator operations. I could imagine something like:

template <class _Tag, class _Iter>
decltype(auto) __foo_impl(_Iter __b, _Iter __e) {
  // ...
  if constexpr (same_as<_Tag, _RangeAlgorithms>)
    return __e;
}

(Of course, this could be done as if constexpr(same_as<_IterOps, _RangesIterOps>) too, but I think it is a bit less readable)

In either case, there's definitely duplication here. Let's discuss whether being more generic with tags is worth the boilerplate, and I'll consolidate the approach in this patch (whichever we decide).

(By the way, the static constexpr auto advance = ranges::advance; idea is pretty neat)

libcxx/include/__algorithm/partial_sort.h
32

I think that these functions generally start their life being intended to use only within their file, but later end up being used from outside. The fact that they aren't constrained might be deliberate -- any checks are supposed to have already been performed by the public interface. Comments are definitely useful; static_asserts might be useful as well.

libcxx/include/__algorithm/sift_down.h
42 ↗(On Diff #441918)

This is a great catch, thanks! Do you plan to fix this in D129099 (since that patch already contains an overload of __make_projected_comp that avoids the issue)?

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
285

Not yet -- I plan to add a test once your patch lands.

huixie90 accepted this revision.Jul 6 2022, 2:41 AM
huixie90 added inline comments.
libcxx/include/__algorithm/sift_down.h
42 ↗(On Diff #441918)

I updated D129099 to fix the first overload. could you please double check if that is OK?

var-const updated this revision to Diff 443125.Jul 7 2022, 8:19 PM

Address feedback, fix the CI.

var-const updated this revision to Diff 443127.Jul 7 2022, 8:28 PM
var-const marked an inline comment as done.

Remove unnecessary underscores from names.

libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

I took a stab at consolidating the two -- the only change I did (other than adding new functions) is using the tag approach for reasons outlined above (more generic). Please let me know what you think.

huixie90 added inline comments.Jul 7 2022, 11:41 PM
libcxx/include/__algorithm/iterator_operations.h
82 ↗(On Diff #443127)

is it just return __last?

philnik requested changes to this revision.Jul 8 2022, 4:56 AM

Could you maybe move the _IterOps changes into it's own PR? There are quite a few changes unrelated to ranges::partial_sort in here.

libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

FWIW we can still change it later if we want to. But for now, I don't think you are using the _Tag for anything other than calling functions from _BasicOperations, right? In that case I would definitely go for less boiler-plate.

libcxx/include/__algorithm/iterator_operations.h
38 ↗(On Diff #443127)

I think you have to use __iter_move, since it doesn't exist pre-C++20.

68 ↗(On Diff #443127)

Why is the std::forward required?

73 ↗(On Diff #443127)

Why aren't this one and the next also marked _LIBCPP_CONSTEXPR_AFTER_CXX11?

82 ↗(On Diff #443127)

+1

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

The name _Tag looks quite opaque to me. How about _Namespace or something like that?

This revision now requires changes to proceed.Jul 8 2022, 4:56 AM
var-const marked 7 inline comments as done.Jul 8 2022, 1:45 PM

@philnik I have moved the changes to iterator_operations.h to https://reviews.llvm.org/D129390 and addressed the related feedback there.

It's true that we can go for the more generic version later, but it also gives me a certain peace of mind that no matter what kind of issues may come up from sharing the same implementation, they can be worked around (even if they turn out not to be related to iterator operations).

libcxx/include/__algorithm/iterator_operations.h
68 ↗(On Diff #443127)

This is to make the implementation as similar as possible to the actual C++20 iter_move. Perhaps because the iterator type could have overloads of operator* based on whether it's an lvalue or an rvalue? FWIW, the One Ranges Proposal doesn't have forward, so it's something that must have come up during subsequent discussions. Regardless of the original motivation, it seems easier to keep it for consistency with the actual function -- what do you think?

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

How about something like _AlgFamily or _InvocationContext?

libcxx/include/__algorithm/sift_down.h
42 ↗(On Diff #441918)

Thank you!

huixie90 requested changes to this revision.Jul 9 2022, 2:12 PM
huixie90 added inline comments.
libcxx/include/__algorithm/iterator_operations.h
66–68 ↗(On Diff #443127)

FWIW, auto return type and this explicit return type can be different. e.g for std::vector<bool>::iterator, std::move(*std::forward<_Iter>(__i)); is proxy&& and value_type&& is bool&&. However. in classic algorithm, you have to do write

value_type temp = std::move(*it);

because

auto temp = std::move(*it);

does the wrong thing for vector<bool> and friends.

So here this explicit return type does the conversion sometimes and probably IS what the caller wants

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

Or _Policy and we have _RangesPolicy and _ClassicPolicy

libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

For C++20 algorithms, we should use iter_difference_t .

the difference_type is coming from incrementable_traits when iterator_traits is not specialized

(Note this applies to all other usages of difference_type)

libcxx/include/__algorithm/pop_heap.h
28–29

For C++20 ranges algorithms, we should use iter_value_t.
because the value_type is coming from indirectly_readable_traits when iterator_traits is not specialized

This applies to all other places we use value_type

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
282

btw, you can rebase and try sorting the ProxyRange to verify if iter_swap is used properly used and sorting ProxyRange of move only types to see if iter_move is properly used

huixie90 added inline comments.Jul 9 2022, 7:51 PM
libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

Actually, it is fine for this algorithm to use iterator_traits

since the input are required to model c++20 random_access_iterator and C++ 20 random_access_iterator implies c++17 input iterator and that makes sure that iterator_traits exists and consistent with c++20’s iter_differnece_t (and iter_value_t for the value_type)

I think this can be a problem only in algorithms that take c++20 input iterators, because c++ 20 input_iterators can have empty iterator_traits

var-const marked 4 inline comments as done.Jul 12 2022, 2:36 AM
var-const added inline comments.
libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

I really like "policy" -- went with _RangeAlgPolicy and _ClassicAlgPolicy (in https://reviews.llvm.org/D129390). Please let me know what you think.

var-const marked 3 inline comments as done.Jul 12 2022, 2:39 AM
var-const added inline comments.
libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

Thanks a lot for spotting this. I've added checking algorithms that take input iterators to my backlog.

var-const updated this revision to Diff 444135.Jul 12 2022, 6:38 PM
var-const marked an inline comment as done.

Rebase on main (including the iterator_operations.h changes from
https://reviews.llvm.org/D129390).

var-const marked 2 inline comments as done.Jul 12 2022, 6:39 PM
var-const added inline comments.
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
282

I'd prefer to do this in a follow-up once this lands.

var-const marked an inline comment as done.Jul 12 2022, 6:44 PM

@philnik @huixie90 This is ready for another round of review.

huixie90 accepted this revision.Jul 13 2022, 1:36 AM
huixie90 added inline comments.
libcxx/include/__algorithm/partial_sort.h
32

Can we have a naming convention for this? It would be good to distinguish the purpose of __partial_sort and __partial_sort_impl at the first glance.

It would be good to distinguish

  1. helper function purely for implementation of the classic algorithm (perhaps with some assumptions that the iterators are classic iterators)
  2. helper function purely for the implementation of ranges algorithm (perhaps with assumptions that iterators are c++20 iterators)
  3. helper function that is meant to be shared between the classic algorithm and the ranges algorithm
  4. helper function can be reused any other algorithms

The reason why I was asking is that when I implemented other algorithms and I'd like to call copy, I was puzzled if I should call std::__copy or std::__copy_impl

37

question. why do we need inline here as it is already a template (we don't need to worry about ODR violation)?

philnik requested changes to this revision.Jul 13 2022, 4:36 AM

Mostly nits, but I'd like to see the portability trap addressed. Did you add any moves in other classic algorithms? If yes, please also update them.

libcxx/include/__algorithm/partial_sort.h
36

Could you add a TODO to remove this and update __debug_randomize_range to support ranges once we implement ranges::shuffle?

100–101

Could you either not move these or static_assert that they are copy-constructible and copy-assignable to avoid portability problems?

libcxx/include/__algorithm/pop_heap.h
62

Pre-existing: Please also remove the moves or add static_asserts.

libcxx/include/__algorithm/push_heap.h
66

Also here.

libcxx/include/__algorithm/sort_heap.h
50–51

Same here.

libcxx/test/libcxx/algorithms/partial_sort_stability.pass.cpp
37–38

Shouldn't this be _ClassicAlgPolicy?

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
71–73

Could you rename these to begin, middle and end? That would make it a lot easier to read IMO.

89

Why aren't you using b and m here?

106–136

Could you put this into a function? The calls below look really weird.

158–160

Why aren't these in a single line?

210–246

This should be unnecessary now, right?

248–284

And this.

This revision now requires changes to proceed.Jul 13 2022, 4:36 AM
var-const updated this revision to Diff 445382.Jul 17 2022, 8:44 PM
var-const marked 10 inline comments as done.

Address feedback (partial).

libcxx/include/__algorithm/partial_sort.h
32

My intention was for __foo to be the main internal entry point, i.e., __foo functions are the public interface for internal usage, and other algorithms should call into __foo. __foo_impl would be an implementation detail local to the file (ideally I'd put it into an internal namespace as well).

I'm not sure the finer-grained distinction is necessary -- from what I've seen, we're rewriting all the internal functions to be agnostic to whether they're invoked from a classic or a range algorithm. Please let me know if there's an example where it would be helpful.

Note that in this case, I can't easily combine these two functions -- __partial_sort does randomization before and after doing actual work, and the partial_sort_stability test relies on being able to call partial_sort with and without randomization. If combining the two, one straightforward alternative would be to pass this as a boolean template parameter.

37

It was unnecessary, thanks for spotting it.

42–48

Added a TODO -- this will be much easier to implement once ranges::shuffle is implemented.

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
71–73

Personally, I much prefer the shorter names in cases like this (where the meaning is easy to understand because the pattern of having begin/end is so ubiquitous, and shorter names help cut down on verbosity). However, it's not worth spending much time debating this, so I renamed like you suggested.

89

Because of std::equal -- perhaps ranges::equal was unavailable when I was first writing this. In any case, changed to use ranges::equal and begin/end.

106–136

This was suggested by @ldionne. The advantage is to have all the related code in the same scope, i.e. to avoid the details spilling out of the function.

158–160

I find it easier to read this way. Reverted.

Note: this was rebased on https://reviews.llvm.org/D129823 which significantly reduced the delta.

philnik added inline comments.Jul 18 2022, 2:20 AM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

By that logic, shouldn't everything be in nested lambdas in tests? You aren't capturing anything, so I really don't see how that's better than having it as an extra function. What exactly do you think is "spilling out of the function"?

var-const added inline comments.Jul 18 2022, 8:57 PM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

Let's leave this debate for another day, after the release. Changed.

var-const updated this revision to Diff 445686.Jul 18 2022, 8:57 PM

Fix the CI, address comment.

ldionne added inline comments.Jul 19 2022, 1:20 PM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

By that logic, shouldn't everything be in nested lambdas in tests? You aren't capturing anything, so I really don't see how that's better than having it as an extra function. What exactly do you think is "spilling out of the function"?

FWIW the idea is that a top-level function exposes a name, whereas a lambda doesn't because it's a local variable in another scope. It makes it clearer that it's only an implementation detail of e.g. test_iterators() used only for avoiding code duplication. I don't mind which approach we take given that we have more important stuff to focus on right now, though.

ldionne accepted this revision as: ldionne.Jul 19 2022, 1:21 PM

This LGTM at a glance, but I'll leave final approval to the other reviewers who had comments.

philnik accepted this revision.Jul 19 2022, 1:25 PM

I currently don't have a lot of time, so I trust @ldionne's approval.

This revision is now accepted and ready to land.Jul 19 2022, 1:25 PM
var-const updated this revision to Diff 445984.Jul 19 2022, 5:04 PM
var-const marked 4 inline comments as done.

Address the remaining feedback, rebase on main.

var-const updated this revision to Diff 445991.Jul 19 2022, 5:23 PM

Rebase on main.

var-const updated this revision to Diff 446000.Jul 19 2022, 6:16 PM

Rebase on main.

var-const added inline comments.Jul 19 2022, 6:21 PM
libcxx/include/__algorithm/partial_sort.h
100–101

Ah, this is a very good catch. Added static assertions and added auditing those to my backlog.

This revision was automatically updated to reflect the committed changes.