This is an archive of the discontinued LLVM Phabricator instance.

[libc++][ranges] Implement the changes to `deque` from P1206 (`ranges::to`):
ClosedPublic

Authored by var-const on May 4 2023, 2:19 AM.

Details

Summary
  • add the from_range_t constructors and the related deduction guides;
  • add the insert_range/assign_range/etc. member functions.

(Note: this patch is split from https://reviews.llvm.org/D142335)

Diff Detail

Event Timeline

var-const created this revision.May 4 2023, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2023, 2:19 AM
var-const requested review of this revision.May 4 2023, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2023, 2:19 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne added inline comments.
libcxx/include/__algorithm/copy_n.h
42

I think this needs a LWG issue.std::copy_n and ranges::copy_n have the exact same specification (http://eel.is/c++draft/alg.copy#lib:copy_n), but std::copy_n seems to be expected not to increment the input iterator if it can avoid it. ranges::copy_n does increment the iterator. For something like istream_iterator, this means we will be extracting different numbers of elements from the underlying stream.

This feels similar to the views::take issue we discussed in Issaquah, it might be worth checking whether that issue did talk about copy_n as another case of the same problem.

But for certain, I think we'll break users if we start increment this here, so we shouldn't do it unless we're blessed by the standard explicitly.

var-const updated this revision to Diff 520576.May 8 2023, 8:19 PM

Fix the CI and rebase.

var-const updated this revision to Diff 520586.May 8 2023, 9:50 PM

Sort includes

var-const updated this revision to Diff 520602.May 8 2023, 11:55 PM

Rebase on main (now that the vector patch is committed, the delta is significantly reduced).

var-const added inline comments.
libcxx/include/__algorithm/copy_n.h
42

Digging through the list of LWG issues, this has been raised before ranges as http://wg21.link/lwg2471. Ranges do make the issue more noticeable, though, because now the discrepancy is not just between copy_n and uninitialized_copy_n, but also copy_n and ranges::copy_n. It doesn't look like there's been any update on the issue, though.

@jwakely Were there any updates on this issue? I seem to remember this issue being brought up in Issaquah, but I couldn't find anything in LWG notes, so I might be confusing this with something. I can file a new LWG issue to talk specifically about the discrepancy between copy_n and ranges::copy_n -- do you think it makes sense?

var-const marked an inline comment as done.

Feedback

libcxx/include/__algorithm/copy_n.h
42

@ldionne I ended up doing two versions of assign_with_size -- one for random access iterators and one for everything else. The random-access iterator version is essentially the previous implementation that uses std::copy (with all the potential optimizations). The other version essentially inlines copy_n (thankfully it's trivial without optimizations, which don't apply in this case) but keeps track of the incremented input iterator to avoid going through the input range twice. It's still more efficient than assign_with_sentinel: only one comparison per iteration when copying input elements into the existing capacity (the sentinel version has to compare both the input iterator and the deque iterator to their respective sentinels), and it can also allocate the necessary additional capacity at once.

ldionne accepted this revision.May 12 2023, 12:26 PM

LGTM w/ comments. In particular it would be nice to confirm that this doesn't introduce a regression in the existing deque methods (I don't think it would, but it's simple to confirm). Thanks!

libcxx/include/deque
1286–1297

Feel free to clang-format this bit.

1290

I would either use the integer as a counter (since then we know comparison is cheap), or at least cache end() since it's kind of expensive and we don't know for sure whether the compiler is smart enough hoist it out.

libcxx/test/std/containers/sequences/deque/deque.cons/from_range.pass.cpp
2

I think we should also add benchmarks in libcxx/benchmarks/ContainerBenchmarks.h for the new construction methods we're adding.

This revision is now accepted and ready to land.May 12 2023, 12:26 PM
var-const marked 3 inline comments as done.May 16 2023, 12:07 AM
var-const added inline comments.
libcxx/test/std/containers/sequences/deque/deque.cons/from_range.pass.cpp
2

Done (also added for vector). This patch doesn't seem to make any difference for existing constructors (seems within fluctuation) and the difference between the (iter, iter) constructor and the from_range constructor also appears negligible as expected.

For completeness' sake, here are the results before this patch:

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_ConstructSize/deque_byte/5140480          232868 ns       220132 ns         3163
BM_ConstructSizeValue/deque_byte/5140480     189452 ns       172314 ns         4026
BM_ConstructIterIter/deque_char/1024           79.0 ns         78.9 ns      9187075
BM_ConstructIterIter/deque_size_t/1024          300 ns          300 ns      2223959
BM_ConstructIterIter/deque_string/1024       107435 ns       107390 ns         6744

And with this patch:

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_ConstructSize/deque_byte/5140480          231027 ns       219294 ns         3148
BM_ConstructSizeValue/deque_byte/5140480     176521 ns       170531 ns         4105
BM_ConstructIterIter/deque_char/1024           77.5 ns         76.0 ns      9186111
BM_ConstructIterIter/deque_size_t/1024          301 ns          301 ns      2327840
BM_ConstructIterIter/deque_string/1024       105603 ns       105535 ns         6881
BM_ConstructFromRange/deque_char/1024          74.3 ns         74.2 ns      9481112
BM_ConstructFromRange/deque_size_t/1024         301 ns          301 ns      2288232
BM_ConstructFromRange/deque_string/1024      100816 ns       100813 ns         6983

And also vector with this patch:

------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_ConstructSize/vector_byte/5140480          272017 ns       271861 ns         2394
BM_ConstructSizeValue/vector_byte/5140480     260968 ns       260866 ns         2566
BM_ConstructIterIter/vector_char/1024           39.5 ns         39.5 ns     17688158
BM_ConstructIterIter/vector_size_t/1024          139 ns          139 ns      5072096
BM_ConstructIterIter/vector_string/1024        94366 ns        94365 ns         7484
BM_ConstructFromRange/vector_char/1024          40.1 ns         40.1 ns     17496195
BM_ConstructFromRange/vector_size_t/1024         157 ns          148 ns      4767970
BM_ConstructFromRange/vector_string/1024      107067 ns       102741 ns         6937
var-const marked an inline comment as done.

Address feedback and rebase.

A note in case it every comes in handy later: while working on this patch, I noticed that deque tests are significantly (5-10x) slower compared to the equivalent vector tests when compiled by GCC; with Clang, the difference is almost negligible. This is a preexisting issue -- running the test suite for insert_range while using the old insert function that takes two iterators on main shows a similarly large performance difference when compiling with GCC. From a conversation with @ldionne, one explanation could be that when compiling under GCC, many functions get force-inlined which might affect deque more significantly (and negatively) than vector. In any case, this patch doesn't make the situation worse.

ldionne accepted this revision.May 16 2023, 11:01 AM

Thanks for adding benchmarks! LGTM once CI passes, which I think will require a CMake update on the Windows bots or some other workaround.

var-const updated this revision to Diff 522881.May 16 2023, 7:11 PM

Remove the benchmarks (moved to a separate patch: https://reviews.llvm.org/D150747).

Note: the benchmarks have been moved to https://reviews.llvm.org/D150747 (because they fail on the CI under Windows and I don't want to block this patch on that investigation).