This is an archive of the discontinued LLVM Phabricator instance.

Implement http://wg21.link/P0254R1: "Integrating std::string_view and std::string"
ClosedPublic

Authored by mclow.lists on Jun 16 2016, 9:34 PM.

Details

Summary

This is not ready to be committed. However, it is (I believe) a full implementation of P0254R1, and I'm looking for feedback.

A couple of things to note:

  • This is a very large patch, but most of the changes are new tests.
  • I've moved a bunch of stuff around, and sunk common stuff in a new header <__string>.
  • Though string_view is a C++17 feature, I've made it available all the way back to C++03. Otherwise, the integration with <string> gets really painful.
  • Some of the routines taking a`string_view` are marked as inline where the corresponding string ones are not, because string and wstring are explicitly instantiated in the dylib, and if they aren't inline, you'll need a new dylib. Example: basic_string& assign(const basic_string& __str, size_type __pos, size_type __n=npos); is not inline, but basic_string& assign(basic_string_view __sv, size_type pos, size_type n=npos); is.

Diff Detail

Event Timeline

mclow.lists retitled this revision from to Implement http://wg21.link/P0254R1: "Integrating std::string_view and std::string".
mclow.lists updated this object.
mclow.lists added reviewers: EricWF, howard.hinnant.
mclow.lists added a subscriber: cfe-commits.

I know that there are more tests needed.

mclow.lists added inline comments.Jun 17 2016, 4:43 PM
test/std/strings/string.view/string.view.template/nothing_to_do.pass.cpp
11

This should be <string_view>

jbcoe added a subscriber: jbcoe.Jun 20 2016, 4:11 AM

More tests; removed the last mentions of basic_string from <string_view>

EricWF edited edge metadata.Jul 20 2016, 5:28 PM

FYI this patch adds 6 new symbols to the dylib:

Symbol added: _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE7replaceEmmNS_17basic_string_viewIcS2_EEmm
    {'type': 'FUNC', 'name': '_ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE7replaceEmmNS_17basic_string_viewIcS2_EEmm'}

Symbol added: _ZNKSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE7compareEmmNS_17basic_string_viewIwS2_EEmm
    {'type': 'FUNC', 'name': '_ZNKSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE7compareEmmNS_17basic_string_viewIwS2_EEmm'}

Symbol added: _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6insertEmNS_17basic_string_viewIcS2_EEmm
    {'type': 'FUNC', 'name': '_ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6insertEmNS_17basic_string_viewIcS2_EEmm'}

Symbol added: _ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertEmNS_17basic_string_viewIwS2_EEmm
    {'type': 'FUNC', 'name': '_ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertEmNS_17basic_string_viewIwS2_EEmm'}

Symbol added: _ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE7replaceEmmNS_17basic_string_viewIwS2_EEmm
    {'type': 'FUNC', 'name': '_ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE7replaceEmmNS_17basic_string_viewIwS2_EEmm'}

Symbol added: _ZNKSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE7compareEmmNS_17basic_string_viewIcS2_EEmm
    {'type': 'FUNC', 'name': '_ZNKSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE7compareEmmNS_17basic_string_viewIcS2_EEmm'}

More comments to come.

include/string
3206

Why is this call no longer qualified? It has UDT's that can force ADL.

3216

Why is this call no longer qualified? It has UDT's that can force ADL.

Why did you remove the _VSTD qualified on most of the __str_find function calls? I think that should be a qualified call because a UDT traits type can hijack overload resolution.
(Although __str_find* is a reserved name).

I'm happy with the changes to <string>. Moving onto <string_view>

include/string
797

indent.

997

Missing _LIBCPP_INLINE_VISIBILITY

1038

Missing _LIBCPP_INLINE_VISIBILITY

1157

Missing _LIBCPP_INLINE_VISIBILITY

2374

Missing _LIBCPP_INLINE_VISIBILITY

2384

Missing _LIBCPP_INLINE_VISIBILITY

2940

This call should be qualified.

2950

This call should be qualified.

2960

This call should be qualified.

2980

This call should be qualified.

2993

This call should be qualified.

3003

This call should be qualified.

The <string_view>` bits LGTM.

include/string_view
210

I think we have two choices for this diagnostic:

  1. Remove it, since it will never work with C++11 constexpr.
  2. Only expose the assertion in C++14 or newer.
  3. Only may this constructor constexpr in C++14, and keep the assertion.

(3) seems like a Good option to me since offering string_view in C++11 is an extension anyway.

315

These lines should be removed?

332

These lines should be removed.

356

Weird whitespace.

mclow.lists added inline comments.Jul 20 2016, 6:43 PM
include/string
3216

Because it has a reserved name. Users cannot provide a function with this name in their programs.

include/string_view
210

Either #2 or #3 is fine with me.

332

Yeah. Frackin' C++11 constexpr.

mclow.lists edited edge metadata.

Address most of Eric's concerns.

EricWF added inline comments.Jul 20 2016, 7:29 PM
include/__string
710

This declaration should be exposed in all dialects since parts of iomanip depend on it.

mclow.lists accepted this revision.Jul 20 2016, 10:39 PM
mclow.lists added a reviewer: mclow.lists.

Landed as revision 276238.

This revision is now accepted and ready to land.Jul 20 2016, 10:39 PM
mclow.lists closed this revision.Jul 20 2016, 10:39 PM
kimgr added a subscriber: kimgr.Jul 21 2016, 1:01 AM

Inline question on ctor+nullptr

include/string_view
216

I'm working from the paper at https://isocpp.org/files/papers/N3762.html, and I find it a little sketchy on the policy for nullptrs.

Since the ctor above accepts nullptr as long as the length is zero, would it make sense to do that here too? That is, only call _Traits::length for non-nullptr __s args?

mclow.lists added inline comments.Jul 21 2016, 6:33 AM
include/string_view
216

Reading from N4600: Requires: [str, str + traits::length(str)) is a valid range.

So, no - passing nullptr here is undefined.

kimgr added inline comments.Jul 21 2016, 8:57 AM
include/string_view
216

OK, that feels more principled to me, anyway.

But the (const char*, size_t) constructor has the same requirement and it *does* accept nullptr, provided the length is zero. I saw you had to get rid of the assertion, but I feel like the constructor should probably not silently accept nullptr for zero-sized strings. Or do you know if that's intentional? Many StringRef/StringPiece implementations seem to have the same behavior.

mclow.lists added inline comments.Jul 21 2016, 11:56 AM
include/string_view
216

It is absolutely intentional. [nullptr, nullptr+0) is a perfectly fine half-open range. It contains no characters.

However, the ctor that takes just a pointer has to calculate the length .. by dereferencing the pointer.

I had to get rid of the assertion because one of the bots (gcc 4.9) has a bug about constexpr ctors in c++11 mode. Even though the assertion was #ifdefed on C++ > 11, the compiler complained about it. I'll be putting the assertion back as soon as I can figure out how to keep gcc from complaining.

216

This was discussed (at length) in LEWG during the design of string_view.

kimgr added inline comments.Jul 21 2016, 1:58 PM
include/string_view
216

Ah, got it, thanks! It opens up for cases where data() for an empty string_view can sometimes return "" and sometimes nullptr, but I guess that problem extends beyond string_view's responsibilities.

Thanks for the thorough explanation.

mclow.lists added inline comments.Jul 21 2016, 10:17 PM
include/string_view
216

I think you're laboring under a misapprehension here.

An empty string_view points to *no characters*, not an empty null-terminated string.

Treating the pointer that you get back from string_view::data as a null-terminated string will lead to all sorts of problems. This is explicitly called out in [string.view.access]/19:

Note: Unlike basic_string::data() and string literals, data() may return a pointer to a buffer that is not null-terminated. Therefore it is typically a mistake to pass data() to a routine that takes just a const charT* and expects a null-terminated string.

kimgr added a comment.Jul 28 2016, 9:11 AM

This is probably not the right place for this discussion, but I thought I'd offer one more note.

include/string_view
216

Thanks, I think I'm free of that particular misapprehension :-)

The case I had in mind was:

void f(const string_view& s) {
  char buf[50] = {0};
  assert(s.size() < 50);
  memcpy(buf, s.data(), s.size());
}

There's no assumption of null-termination here, but as I understand it, memcpy has undefined behavior if source is nullptr, irrespective of whether num is zero or not.

So these would be valid:

f(string_view("", 0));
f(string_view(""));

but this wouldn't:

f(string_view(nullptr, 0));

which I find slightly asymmetric.

But again, maybe this is not string_view's fault, but memcpy's, in which case nullptr kludges could be moved to wherever memcpy is used instead of wherever string_view::string_view is used.

test/std/strings/string.view/string.view.ops/compare.size_size_sv_pointer_size.pass.cpp