This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Explicitly mark erroneous string_view ctors as deleted
AbandonedPublic

Authored by georgthegreat on May 5 2020, 9:15 AM.

Details

Summary

According to the standard, the behavior is undefined if [s, s+count) is not a valid range (even though the constructor may not access any of the elements of this range)[1].

At the time, libcxx gives runtime assertion in case of invokation of std::string_view ctor will nullptr passed.
This patch improves the diagnostic of the described case by making the compilation fail.

The same applies to std::string ctors. I will update the patch if the idea would proof useful.

[1]: https://en.cppreference.com/w/cpp/string/basic_string_view/basic_string_view

Diff Detail

Event Timeline

georgthegreat created this revision.May 5 2020, 9:15 AM
georgthegreat edited the summary of this revision. (Show Details)May 5 2020, 9:20 AM
georgthegreat edited the summary of this revision. (Show Details)May 5 2020, 10:11 AM

You're not really solving the problem you're talking about.

This patch *will* catch this (really really rare) code: string(nullptr, 3).
but not this (far, far more common) code:

const char *p = nullptr; // or more likely, p = functionThatCanReturnNull()
string s(p, 3);

That one can only be caught at run time.

georgthegreat added a comment.EditedMay 6 2020, 12:37 AM

Indeed, the proposed change does not allow to get rid of runtime checks — it just makes invalid cases easier to detect.
As for the rarity, I have applied these changes (both to std::string_view and std::string) and ran CI checks over a private monorepo.
It gave me 7 possible problematic places, one of which would actually lead to a segfault, if reached in runtime (the code was actually reachable). Most of the problems are caused by contrib code, including projects like

BTW, string(nullptr, 0) is a valid call. The range [nullptr, nullptr) is valid.

georgthegreat added a comment.EditedMay 6 2020, 7:31 AM

If you know that the size of your data is always zero, you should use default ctor, shouldn't you?

While I support this change, it should really be accompanied by a library working group issue or paper.

Delet

If you know that the size of your data is always zero, you should use default ctor, shouldn't you?

No, because string_view(nullptr) isn't valid code, but string_view(nullptr, 0) is.

Also, most cases of this bug I've encountered take the form:

if (sv == nullptr) { ... }

Where they really meant to write

if (sv.empty()) {}

So I think this i a big enough problem to want to clean it up.

libcxx/include/string_view
229

This part of the change is incorrect because std::string_view sv(nullptr, 0) is valid code.

georgthegreat added a comment.EditedMay 18 2020, 8:47 AM

I have just submitted P2166R0 to LEWG-I, LEWG, LWG.
While the proposal is not published, this link can be used to preview it.

The proposal does not suggest removing (nullptr, size_t) constructors, though I still think that there is no cases for such behaviour to be allowed.

georgthegreat added a subscriber: ldionne.EditedJun 8 2021, 3:16 AM

I did everything as @EricWF suggested.
P2166R1 was just approved and awaits being officially textualized into the working draft.

@EricWF @ldionne @mclow.lists, is there a chance to finally merge this?

Do you think if C++23 guards are required?

How about some tests?

Corresponding diff landed in D106801.

I am discarding this.

georgthegreat abandoned this revision.Dec 30 2021, 5:37 AM