Page MenuHomePhabricator

[libc++][ranges] Implement `ranges::partial_sort`.
ClosedPublic

Authored by var-const on Jun 28 2022, 10:35 AM.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Address feedback, fix the CI.

var-const updated this revision to Diff 443127.Jul 7 2022, 8:28 PM
var-const marked an inline comment as done.

Remove unnecessary underscores from names.

libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

I took a stab at consolidating the two -- the only change I did (other than adding new functions) is using the tag approach for reasons outlined above (more generic). Please let me know what you think.

huixie90 added inline comments.Jul 7 2022, 11:41 PM
libcxx/include/__algorithm/iterator_operations.h
82 ↗(On Diff #443127)

is it just return __last?

philnik requested changes to this revision.Jul 8 2022, 4:56 AM

Could you maybe move the _IterOps changes into it's own PR? There are quite a few changes unrelated to ranges::partial_sort in here.

libcxx/include/__algorithm/basic_operations.h
30 ↗(On Diff #441918)

FWIW we can still change it later if we want to. But for now, I don't think you are using the _Tag for anything other than calling functions from _BasicOperations, right? In that case I would definitely go for less boiler-plate.

libcxx/include/__algorithm/iterator_operations.h
38 ↗(On Diff #443127)

I think you have to use __iter_move, since it doesn't exist pre-C++20.

68 ↗(On Diff #443127)

Why is the std::forward required?

73 ↗(On Diff #443127)

Why aren't this one and the next also marked _LIBCPP_CONSTEXPR_AFTER_CXX11?

82 ↗(On Diff #443127)

+1

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

The name _Tag looks quite opaque to me. How about _Namespace or something like that?

This revision now requires changes to proceed.Jul 8 2022, 4:56 AM
var-const marked 7 inline comments as done.Jul 8 2022, 1:45 PM

@philnik I have moved the changes to iterator_operations.h to https://reviews.llvm.org/D129390 and addressed the related feedback there.

It's true that we can go for the more generic version later, but it also gives me a certain peace of mind that no matter what kind of issues may come up from sharing the same implementation, they can be worked around (even if they turn out not to be related to iterator operations).

libcxx/include/__algorithm/iterator_operations.h
68 ↗(On Diff #443127)

This is to make the implementation as similar as possible to the actual C++20 iter_move. Perhaps because the iterator type could have overloads of operator* based on whether it's an lvalue or an rvalue? FWIW, the One Ranges Proposal doesn't have forward, so it's something that must have come up during subsequent discussions. Regardless of the original motivation, it seems easier to keep it for consistency with the actual function -- what do you think?

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

How about something like _AlgFamily or _InvocationContext?

libcxx/include/__algorithm/sift_down.h
42 ↗(On Diff #441918)

Thank you!

huixie90 requested changes to this revision.Jul 9 2022, 2:12 PM
huixie90 added inline comments.
libcxx/include/__algorithm/iterator_operations.h
66–68 ↗(On Diff #443127)

FWIW, auto return type and this explicit return type can be different. e.g for std::vector<bool>::iterator, std::move(*std::forward<_Iter>(__i)); is proxy&& and value_type&& is bool&&. However. in classic algorithm, you have to do write

value_type temp = std::move(*it);

because

auto temp = std::move(*it);

does the wrong thing for vector<bool> and friends.

So here this explicit return type does the conversion sometimes and probably IS what the caller wants

libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

Or _Policy and we have _RangesPolicy and _ClassicPolicy

libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

For C++20 algorithms, we should use iter_difference_t .

the difference_type is coming from incrementable_traits when iterator_traits is not specialized

(Note this applies to all other usages of difference_type)

libcxx/include/__algorithm/pop_heap.h
28–29

For C++20 ranges algorithms, we should use iter_value_t.
because the value_type is coming from indirectly_readable_traits when iterator_traits is not specialized

This applies to all other places we use value_type

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
282

btw, you can rebase and try sorting the ProxyRange to verify if iter_swap is used properly used and sorting ProxyRange of move only types to see if iter_move is properly used

huixie90 added inline comments.Jul 9 2022, 7:51 PM
libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

Actually, it is fine for this algorithm to use iterator_traits

since the input are required to model c++20 random_access_iterator and C++ 20 random_access_iterator implies c++17 input iterator and that makes sure that iterator_traits exists and consistent with c++20’s iter_differnece_t (and iter_value_t for the value_type)

I think this can be a problem only in algorithms that take c++20 input iterators, because c++ 20 input_iterators can have empty iterator_traits

var-const marked 4 inline comments as done.Jul 12 2022, 2:36 AM
var-const added inline comments.
libcxx/include/__algorithm/lower_bound.h
30 ↗(On Diff #443127)

I really like "policy" -- went with _RangeAlgPolicy and _ClassicAlgPolicy (in https://reviews.llvm.org/D129390). Please let me know what you think.

var-const marked 3 inline comments as done.Jul 12 2022, 2:39 AM
var-const added inline comments.
libcxx/include/__algorithm/make_heap.h
30 ↗(On Diff #443127)

Thanks a lot for spotting this. I've added checking algorithms that take input iterators to my backlog.

var-const updated this revision to Diff 444135.Jul 12 2022, 6:38 PM
var-const marked an inline comment as done.

Rebase on main (including the iterator_operations.h changes from
https://reviews.llvm.org/D129390).

var-const marked 2 inline comments as done.Jul 12 2022, 6:39 PM
var-const added inline comments.
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
282

I'd prefer to do this in a follow-up once this lands.

var-const marked an inline comment as done.Jul 12 2022, 6:44 PM

@philnik @huixie90 This is ready for another round of review.

huixie90 accepted this revision.Jul 13 2022, 1:36 AM
huixie90 added inline comments.
libcxx/include/__algorithm/partial_sort.h
32

Can we have a naming convention for this? It would be good to distinguish the purpose of __partial_sort and __partial_sort_impl at the first glance.

It would be good to distinguish

  1. helper function purely for implementation of the classic algorithm (perhaps with some assumptions that the iterators are classic iterators)
  2. helper function purely for the implementation of ranges algorithm (perhaps with assumptions that iterators are c++20 iterators)
  3. helper function that is meant to be shared between the classic algorithm and the ranges algorithm
  4. helper function can be reused any other algorithms

The reason why I was asking is that when I implemented other algorithms and I'd like to call copy, I was puzzled if I should call std::__copy or std::__copy_impl

37

question. why do we need inline here as it is already a template (we don't need to worry about ODR violation)?

philnik requested changes to this revision.Jul 13 2022, 4:36 AM

Mostly nits, but I'd like to see the portability trap addressed. Did you add any moves in other classic algorithms? If yes, please also update them.

libcxx/include/__algorithm/partial_sort.h
36

Could you add a TODO to remove this and update __debug_randomize_range to support ranges once we implement ranges::shuffle?

100–101

Could you either not move these or static_assert that they are copy-constructible and copy-assignable to avoid portability problems?

libcxx/include/__algorithm/pop_heap.h
62

Pre-existing: Please also remove the moves or add static_asserts.

libcxx/include/__algorithm/push_heap.h
66

Also here.

libcxx/include/__algorithm/sort_heap.h
50–51

Same here.

libcxx/test/libcxx/algorithms/partial_sort_stability.pass.cpp
37–38

Shouldn't this be _ClassicAlgPolicy?

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
71–73

Could you rename these to begin, middle and end? That would make it a lot easier to read IMO.

89

Why aren't you using b and m here?

106–136

Could you put this into a function? The calls below look really weird.

158–160

Why aren't these in a single line?

210–246

This should be unnecessary now, right?

248–284

And this.

This revision now requires changes to proceed.Jul 13 2022, 4:36 AM
var-const updated this revision to Diff 445382.Sun, Jul 17, 8:44 PM
var-const marked 10 inline comments as done.

Address feedback (partial).

libcxx/include/__algorithm/partial_sort.h
32

My intention was for __foo to be the main internal entry point, i.e., __foo functions are the public interface for internal usage, and other algorithms should call into __foo. __foo_impl would be an implementation detail local to the file (ideally I'd put it into an internal namespace as well).

I'm not sure the finer-grained distinction is necessary -- from what I've seen, we're rewriting all the internal functions to be agnostic to whether they're invoked from a classic or a range algorithm. Please let me know if there's an example where it would be helpful.

Note that in this case, I can't easily combine these two functions -- __partial_sort does randomization before and after doing actual work, and the partial_sort_stability test relies on being able to call partial_sort with and without randomization. If combining the two, one straightforward alternative would be to pass this as a boolean template parameter.

37

It was unnecessary, thanks for spotting it.

42–48

Added a TODO -- this will be much easier to implement once ranges::shuffle is implemented.

libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
71–73

Personally, I much prefer the shorter names in cases like this (where the meaning is easy to understand because the pattern of having begin/end is so ubiquitous, and shorter names help cut down on verbosity). However, it's not worth spending much time debating this, so I renamed like you suggested.

89

Because of std::equal -- perhaps ranges::equal was unavailable when I was first writing this. In any case, changed to use ranges::equal and begin/end.

106–136

This was suggested by @ldionne. The advantage is to have all the related code in the same scope, i.e. to avoid the details spilling out of the function.

158–160

I find it easier to read this way. Reverted.

Note: this was rebased on https://reviews.llvm.org/D129823 which significantly reduced the delta.

philnik added inline comments.Mon, Jul 18, 2:20 AM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

By that logic, shouldn't everything be in nested lambdas in tests? You aren't capturing anything, so I really don't see how that's better than having it as an extra function. What exactly do you think is "spilling out of the function"?

var-const added inline comments.Mon, Jul 18, 8:57 PM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

Let's leave this debate for another day, after the release. Changed.

var-const updated this revision to Diff 445686.Mon, Jul 18, 8:57 PM

Fix the CI, address comment.

ldionne added inline comments.Tue, Jul 19, 1:20 PM
libcxx/test/std/algorithms/alg.sorting/alg.sort/partial.sort/ranges_partial_sort.pass.cpp
106–136

By that logic, shouldn't everything be in nested lambdas in tests? You aren't capturing anything, so I really don't see how that's better than having it as an extra function. What exactly do you think is "spilling out of the function"?

FWIW the idea is that a top-level function exposes a name, whereas a lambda doesn't because it's a local variable in another scope. It makes it clearer that it's only an implementation detail of e.g. test_iterators() used only for avoiding code duplication. I don't mind which approach we take given that we have more important stuff to focus on right now, though.

ldionne accepted this revision as: ldionne.Tue, Jul 19, 1:21 PM

This LGTM at a glance, but I'll leave final approval to the other reviewers who had comments.

philnik accepted this revision.Tue, Jul 19, 1:25 PM

I currently don't have a lot of time, so I trust @ldionne's approval.

This revision is now accepted and ready to land.Tue, Jul 19, 1:25 PM
var-const updated this revision to Diff 445984.Tue, Jul 19, 5:04 PM
var-const marked 4 inline comments as done.

Address the remaining feedback, rebase on main.

var-const updated this revision to Diff 445991.Tue, Jul 19, 5:23 PM

Rebase on main.

var-const updated this revision to Diff 446000.Tue, Jul 19, 6:16 PM

Rebase on main.

var-const added inline comments.Tue, Jul 19, 6:21 PM
libcxx/include/__algorithm/partial_sort.h
100–101

Ah, this is a very good catch. Added static assertions and added auditing those to my backlog.

This revision was automatically updated to reflect the committed changes.