This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Implement P1072R10 (std::basic_string::resize_and_overwrite)
ClosedPublic

Authored by philnik on Nov 2 2021, 6:37 AM.

Diff Detail

Event Timeline

philnik requested review of this revision.Nov 2 2021, 6:37 AM
philnik created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2021, 6:37 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
philnik updated this revision to Diff 384066.Nov 2 2021, 6:40 AM
  • Remove unused includes
Mordante requested changes to this revision.Nov 2 2021, 12:35 PM

Thanks for working on this!
Please mark the feature complete in libcxx/docs/Status/Cxx2bPapers.csv.

Please update libcxx/utils/generate_feature_test_macro_components.py with the new feature-test macro __cpp_lib_string_resize_and_overwrite; its values should be 202110L and run this script to update the appropriate tests.

I didn't have enough time to do a thorough review; I'll have a closer look after all issues have been addressed.

libcxx/include/string
160–163

Please update the synopsis.

template<class Operation>
constexpr void resize_and_overwrite(size_type n, Operation op); // since C++23
985

This looks wrong. "Implementations should avoid unnecessary copies and allocations" This may allocate more than required. The paper 3. Implementation refers to __resize_default_init as example.

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
14

Please use the entire signature of the function.

21

All tests are now done on an empty string. Please also test with non-empty strings.

65

It would be good to test with all string types. libcxx/test/support/make_string.h has a helper to create the same string for all supported string types. Then the test function can be templated on the string type.

77

Please use int main(int, char**).

80

Please return 0; Some of our supported platforms require this even for main.

This revision now requires changes to proceed.Nov 2 2021, 12:35 PM
philnik updated this revision to Diff 384241.Nov 2 2021, 2:43 PM
philnik marked 7 inline comments as done.
  • Using __resize_default_init now and extended tests
Quuxplusone requested changes to this revision.Nov 3 2021, 7:37 AM
Quuxplusone added inline comments.
libcxx/include/string
993

Line-break after constexpr, please.
Also, I certainly wouldn't complain if s/_Operation/_Op/.

994

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1072r10.html doesn't mention any Constraints, only a Mandates that std::move(__o)(declval<_CharT*&>(), __n) should have an integer-like type. I don't think "integer-like-ness" is expressible in C++, but prove me wrong. :)
Anyway, this function definitely should not be constrained, because (1) the standard says it shouldn't be, and (2) if we get our constraint wrong then we're needlessly preventing valid use-cases.

...Hm, I guess [p1072r10] has a minor wording bug, because obviously they intended to mandate the type of std::move(op)(std::move(p), std::move(n)), not std::move(op)(p, n). I think we should "do what they mean, not what they say."

998

Forgot the std::move here. Please add a test case that catches this (e.g. use an _Op with a &&-ref-qualified operator(), and/or one where operator()() && and operator()() & do different things).

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
77

I question the usefulness of all of the "nonsense" strings. I understand "" and "Banane" (i.e. "short") and "long long string so no SSO", but the other five seem calculated only to make the z/OS folks unhappy. ;) If they're redundant, please eliminate them.

This revision now requires changes to proceed.Nov 3 2021, 7:37 AM
philnik updated this revision to Diff 384540.Nov 3 2021, 11:35 AM
philnik marked 2 inline comments as done.
  • Added move only tests
libcxx/include/string
993

What do you mean by

Also, I certainly wouldn't complain if s/_Operation/_Op/.

?

994

I think the paper is right. What would be the point of moving a raw pointer or a size_type?

Quuxplusone requested changes to this revision.Nov 3 2021, 1:07 PM
Quuxplusone added a subscriber: ldionne.
Quuxplusone added inline comments.
libcxx/include/string
993

"I wouldn't complain if X" means "I'm not asking for X, but I'd be pleased if X were to happen anyway." Okay, usually it means I'm asking for X. ;)
s/foo/bar/ is awk/sed-speak for "replace foo with bar."
As usual, the further down in the review I got, the more strongly I liked _Op over _Operation.

994

The question is whether it is permissible for libc++ to reject

std::string s;
s.resize_and_overwrite(100, [](char *p, int& n) {
    return 0;
});

or, vice versa, whether it is permissible for us to accept

std::string s;
s.resize_and_overwrite(100, [](char *p, int&& n) {
    return 0;
});

That is, it's not so much about the machine code generated as it is about the value-categories of things.

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
23–39

For the ref-qualification/value-category issue, this simple non-template test will suffice; you can get rid of move_only_functor.

void test_value_categories() {
    std::string s;
    s.resize_and_overwrite(10, [](char*&, size_t&) { return 0; });
    struct RefQual {
        int operator()(char *, size_t) && { return 0; }
    };
    s.resize_and_overwrite(10, RefQual());
}

(Here I'm following libstdc++ in assuming that p, n are passed as lvalues, even though I said above that they really ought to be passed as rvalues. If we hear that rvalues were intended, then just change & to && above.)

90–95

Throughout: I see your reason for using the names begin, size for Op's parameters, but I think it will be clearer to use p, n the same way the standard does. Also, speaking of s.begin() and s.size(), it looks like you're assuming things about their values without asserting those things. Could we assert them instead?

constexpr void test_undersized(S src, S orig)
{
  S dst = orig;
  size_t new_size = 1.5 * src.size();
  auto func = [&](auto *p, size_t n) {
    assert(n == new_size);
    assert(dst.size() == new_size);
    p = std::copy_n(src.begin(), src.size(), p);
    std::copy_n(src.begin(), src.size() / 2, p);
    return n;
  };
  dst.resize_and_overwrite(new_size, func);
  assert(dst == src + src.substr(0, src.size() / 2);
}

But, writing it out this way, I just don't see the point of this test. What is orig doing there (your former s3)? Why does your func overwrite not just the new space but the original contents of the string too? What is the significance of the division in size() / 2 as opposed to using a constant like 42?
Could test_undersized and test_oversized both be replaced with something like...?

template<class S>
constexpr void test_appending(int k, int N) {
    assert(N > k);
    auto s = S(k, 'a');
    s.resize_and_overwrite(N, [&](char *p, int n) {
        assert(n == N);
        LIBCPP_ASSERT(s.size() == n);
        LIBCPP_ASSERT(s.begin() == p);
        assert(std::all_of(p, p+k, [](auto ch) { return ch == 'a'; }));
        LIBCPP_ASSERT(p[k] == '\0');
        std::fill(p+k, p+n, 'b');
        p[n] = 'c';  // will be overwritten
        return n;
    });
    auto expected = S(k, 'a') + S(N-k, 'b');
    assert(s == expected);
}

template<class S>
constexpr void test_truncating(int o, int N) {
    assert(N < o);
    auto s = S(o, 'a');
    s.resize_and_overwrite(N, [&](char *p, int n) {
        assert(n == N);
        LIBCPP_ASSERT(s.size() == n);
        LIBCPP_ASSERT(s.begin() == p);
        assert(std::all_of(p, p+n, [](auto ch) { return ch == 'a'; }));
        LIBCPP_ASSERT(p[n] == 'a');
        p[n-1] = 'b';
        p[n] = 'c';  // will be overwritten
        return n;
    });
    auto expected = S(N-1, 'a') + 'b';
    assert(s == expected);
}

However, this is still missing any coverage for what happens when op(p, n) < n. We should have coverage for that.

108–113

Throughout: IIUC, s1 means dest and s2 means src. (On line 87, s means dest and s3 means src. And so on.) Please consistent-ize the parameter names.

124

template<class CharT> — it's not a String.
Likewise on line 114 — class CharT, class S — although in that case, S is invariably basic_string<CharT>, so I don't see why it needs to be a parameter at all. Just using S = basic_string<CharT> in the body of the function is good enough.

133–135

(Moot) These lines are too long.
(1) Unless @ldionne asked for this (in which case his preference can overrule mine), I'd much much prefer to see the usual libc++ style of test_x(); static_assert(test_x()); written out longhand. It's only 10 lines of tests (as opposed to what you have here: 5 lines of tests plus 3 lines of macros).
(2) I don't understand why test_overwrite_str calls test_copy_strings calls test_all calls test_{exact_size, oversized, undersized}. These names don't seem to reflect any sort of hierarchy. I think it should just be that test<CharT> calls all the different tests, period.
(3) Only the one function that you're static_assert'ing on needs to return true; the rest should have return type void. (I suggest that that one bool-returning function should be named test<CharT>().)

This revision now requires changes to proceed.Nov 3 2021, 1:07 PM
philnik updated this revision to Diff 385456.Nov 8 2021, 4:37 AM
philnik marked 9 inline comments as done.

Rebased and updated test

LGTM at this point, mod minor comments.

libcxx/include/string
339–340

Please move this up to between resize and capacity, to match https://eel.is/c++draft/basic.string
...er, I guess we have some pre-existing issues in that ordering. Either just move this up to between resize and reserve, or else redistribute our synopses of reserve-thru-empty to match the Standard's synopsis ordering. (But I wouldn't recommend anything more drastic than that.)

2668

Here and line 3495, since we're changing it anyway, I'd prefer to see

_LIBCPP_CONSTEXPR_AFTER_CXX17 void

We shouldn't need the inline keyword at all, because this is a template; and constexpr void reads better than void constexpr.
https://quuxplusone.github.io/blog/2021/04/03/static-constexpr-whittling-knife/

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
24

Neat. Perhaps resizeSize should be named newCapacity? (The ultimate size of the final string will be N.)

62

I think you're missing an interesting case: something like test_appending<S>(10, 30, 15), where the original string fits in SSO, then we need to allocate for the lambda call, and then the final string would again fit in SSO (but IIUC we do not actually put it back into SSO — we just leave it with the extra allocated capacity, right? This is fine).

Mordante requested changes to this revision.Nov 8 2021, 9:09 AM

In general this looks good! I've some minor nit. It also seems you missed some of my "bookkeeping" remarks. I added these as comments.

libcxx/include/string
339–340

Basically where my original request for updating the synopsis is ;-)

340

In my original reply I mentioned some not yet done things. For clarity I add them here to it's easier to tick the box once done.

Please mark the feature complete in libcxx/docs/Status/Cxx2bPapers.csv.

Please update libcxx/utils/generate_feature_test_macro_components.py with the new feature-test macro __cpp_lib_string_resize_and_overwrite; its values should be 202110L and run this script to update the appropriate tests.

996

Is there a reason to use variable instead of calling data() in __erase_to_end.
The LLVM coding convention doesn't like auto when the type isn't clear.

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
24

Also please use snake_case.

37

Please don't use auto.

38

Can you test whether s.c_str() contains a proper NUL-terminated string. The same for the truncating test.

87

Next time it would be better to make a helper function that calls all 5 variants, just to keep main simpler.

This revision now requires changes to proceed.Nov 8 2021, 9:09 AM
Quuxplusone added inline comments.Nov 8 2021, 9:27 AM
libcxx/include/string
996

Sadly yes, OP's parameters are overspecified as lvalues, not prvalues. I've just requested an LWG issue number for this.
https://eel.is/c++draft/basic.string#string.capacity-7

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
38

Ah, +1; I'd been implicitly assuming that s == expected would also check the null byte, but I guess it doesn't necessarily have to.

const S expected = S(k, 'a') + S(N - k, 'b');
assert(s == expected);
assert(s.c_str()[N] == '\0');
87

I forget if I specifically requested the current status quo, but I very well might have — personally I like <10 LOC of code repetition in main better than a second layer of test functions. :)

Mordante added inline comments.Nov 8 2021, 9:48 AM
libcxx/include/string
996

In that case once you have a number, can you provide it here. Then we can add a comment why this is required. (And clean it up once the issue is resolved.)

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
87

I prefer the extra layer, especially due to the duplication for static_assert.
However I don't object against this version.

philnik updated this revision to Diff 385554.Nov 8 2021, 10:29 AM
philnik marked 14 inline comments as done.

Added feature-test macro and some minor changes

Mordante accepted this revision.Nov 8 2021, 11:45 AM

LGTM, assuming the CI passes.

This revision is now accepted and ready to land.Nov 8 2021, 11:45 AM
Quuxplusone added inline comments.
libcxx/include/string
996

@mzeren-vmw @ckennelly (paper authors), thoughts? Shall we even just go ahead and do the sane thing here, on the assumption that LWG will catch up to us? :)

philnik updated this revision to Diff 397650.Jan 5 2022, 11:39 AM
  • Remove _LIBCPP_CONSTEXPR_AFTER_CXX17
  • Disable constexpr test

@Quuxplusone, @Mordante do you want to take another look before I land it?

Quuxplusone accepted this revision.Jan 5 2022, 12:16 PM
Quuxplusone added inline comments.
libcxx/include/string
996

The LWG issue for this is https://timsong-cpp.github.io/lwg-issues/3645
@philnik, I suggest that you land D113013 as-is (with the lvalue __data); but then (if you don't mind the extra work) I think you should immediately submit another PR that changes the offending line to __erase_to_end(_VSTD::move(__op)(data(), +__n)); with a message something like "[libc++] resize_and_overwrite: Adopt the P/R for LWG3645." I can't vouch for @ldionne, but personally I'd accept such a PR right away. :)

libcxx/test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp
32

Nit: remove const, or add &, depending on your intent

I don't mind landing it now, but let's make sure we show the correct state. In the followup patch you can then update the state to fully implemented.

libcxx/docs/Status/Cxx2bPapers.csv
29

Please add a note the constexpr part isn't done yet.
The LLVM 14 branching will happen later this month, so it's not certain the complete version will be in LLVM 14. So let's make the state correct.

libcxx/utils/generate_feature_test_macro_components.py
630

Likewise show the correct state.

@Mordante I'd argue that it is fully implemented. The function signature is complete. It only doesn't work, because the rest of string isn't constexpr yet. And P1679R3 is also marked complete, including feature test macro.

Good point, then lets keep it as is.

philnik updated this revision to Diff 397967.Jan 6 2022, 12:45 PM
philnik marked 5 inline comments as done.
  • Address Arthur's nit
This revision was landed with ongoing or failed builds.Jan 6 2022, 3:10 PM
This revision was automatically updated to reflect the committed changes.