Page MenuHomePhabricator

[libc++][P1679] add string contains
ClosedPublic

Authored by WimLeflere on Dec 29 2020, 12:11 PM.

Details

Summary

C++23 string contains implementation and tests

Paper: https://wg21.link/P1679R3
Standard (string): https://eel.is/c++draft/string.contains
Standard (string_view): https://eel.is/c++draft/string.view.ops#lib:contains,basic_string_view

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Where are the feature test macros defined?
__cpp_lib_string_contains should be added

The macro should be added to libcxx/utils/generate_feature_test_macro_components.py, then run the script to generate the proper macros and tests.

libcxx/include/string
1442

Wouldn't it be better to use constexpr and noexcept since the code is never compiled using C++98.

Where are the feature test macros defined?
__cpp_lib_string_contains should be added

The macro should be added to libcxx/utils/generate_feature_test_macro_components.py, then run the script to generate the proper macros and tests.

Thanks!
Should I check in the (changed) generated files?

The macro should be added to libcxx/utils/generate_feature_test_macro_components.py, then run the script to generate the proper macros and tests.

Thanks!
Should I check in the (changed) generated files?

Yes please.

WimLeflere updated this revision to Diff 315247.Jan 7 2021, 2:28 PM

Added feature test macro to script and updated generated files.

This looks like WAY too much code for "add string contains".

ldionne requested changes to this revision.Jan 7 2021, 2:49 PM

Added feature test macro to script and updated generated files.

Thanks for your contribution! I think you should try rebasing against main, as most of the changes @curdeius was discussing have been checked in now (thanks Marek!).

I believe that should remove the need for most of the boilerplate/generated stuff in this patch, which should end up being much simpler. I see you added tests and updated the synopsis, that's great.

This revision now requires changes to proceed.Jan 7 2021, 2:49 PM

This looks like WAY too much code for "add string contains".

The bulk of the changes are coming from introducing C++2b in the generate_feature_test_macro script.
This generates a lot of C++2b feature macro tests which are not directly related to string contains.

Should introducing C++2b in the generate_feature_test_macro script be done in a separate review?

Added feature test macro to script and updated generated files.

Thanks for your contribution! I think you should try rebasing against main, as most of the changes @curdeius was discussing have been checked in now (thanks Marek!).

I believe that should remove the need for most of the boilerplate/generated stuff in this patch, which should end up being much simpler. I see you added tests and updated the synopsis, that's great.

The patch was created after rebasing on 1a2eaeb.
Should I wait for other changes?

There is already D94227. You might want to wait for it and rebase when it's merged.

Merged main with changes from D94227 and D93830.

@Mordante 's question is still open

Wouldn't it be better to use constexpr and noexcept since the code is never compiled using C++98.

curdeius added inline comments.Jan 8 2021, 11:28 AM
libcxx/test/std/strings/basic.string/string.contains/contains.ptr.pass.cpp
49

I think you lack some test cases like:
assert(!s3.contains("ce"));
That is all characters exist in the string but not in this order.
Also, "ced" - permutation.

Added extra testcases

Please reupload the diff setting the repo to rG LLVM Monorepo so that the CI gets triggered. You might need to do it twice to make it work. Otherwise use arcanist and it should do the job.
Otherwise looks good!

WimLeflere set the repository for this revision to rG LLVM Github Monorepo.

Setting correct repo

curdeius accepted this revision.Jan 10 2021, 6:03 AM

Great.
LGTM if CI passes. But please wait for an approval from somebody from libc++ group, e.g. @ldionne.

curdeius requested changes to this revision.Jan 11 2021, 5:24 AM
curdeius added inline comments.
libcxx/include/string
1442

@WimLeflere, please use constexpr and noexcept directly as suggested above.

libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
22

You cannot just use std::string as it's not constexpr-friendly (yet), but need to use a constexpr-loving allocator or do some other hacks.

libcxx/test/std/strings/string.view/string.view.template/contains.string_view.pass.cpp
9

You should still use "c++2a" and not "c++20" as it's derived from the flag name (and we still use c++2a).
The same applies to all other tests.

This revision now requires changes to proceed.Jan 11 2021, 5:24 AM
curdeius added inline comments.Jan 11 2021, 5:27 AM
libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
22

Actually, I think we just need to comment out constexpr test and add a FIXME note.

  • replaced constexpr/noexcept macros with keywords
  • made const char* overload not noexcept
  • disabled constexpr std::string tests
  • used c++2a label for test support
WimLeflere added inline comments.Jan 11 2021, 10:42 AM
libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
22

I could add a #ifdef __cpp_lib_constexpr_string error message

Quuxplusone added inline comments.Jan 11 2021, 11:34 AM
libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
22

Just add a FIXME comment that contains the exact string P0980. That way, anybody implementing P0980 "Making std::string constexpr" will be able to find this FIXME comment, and fix it, as part of their grepping for P0980.

added paper nr to FIXME

  • removed constexpr from string test functions
  • removed noexcept assert for const char* overload

A few small comments. This is looking really good :)

libcxx/include/string
1450

Thoughts on making this one noexcept as well? (I know it's not noexcept in the standard, does anyone know the reason for that?)

libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
36

See my comment below... that means we don't need this either.

43

I think we should leave this out. Whoever implements that can add tests as part of their patch.

libcxx/test/std/strings/basic.string/string.contains/contains.ptr.pass.cpp
23

We can also use using in C++03 now, just FYI.

Quuxplusone added inline comments.Jan 11 2021, 6:56 PM
libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
43

On further consideration, I think @zoecarver is right. Whoever implements [P0980] is going to have to adjust the tests for all string member functions (not just contains) — we'll need to constexprize the tests for append and substr and operator[] and so on and so forth. No specially greppable reminder ought to be needed in this one test, because it lives right alongside all the other tests that will need just as much auditing.

Quuxplusone added inline comments.Jan 11 2021, 7:01 PM
libcxx/include/string
1450

Lakos Rule. If __s is null, or points to a non-null-terminated array, then this overload has UB; whereas the other two overloads never have UB.

No useful opinion from me on making this one noexcept. Personally I can never remember whether vendors are allowed to add noexcept, or constexpr, or neither. I also don't know if libc++ thinks it's OK to add either keyword if it's not mandated by the standard, because that might be a portability trap for users (they rely on its being noexcept and then their code breaks when they switch to libstdc++ or something).

WimLeflere added inline comments.Jan 12 2021, 12:17 AM
libcxx/include/string
1450

The implementation is based on find(const char*) which is not noexcept (string_view::find).
So I think it's best to keep it not noexcept.

I think a vendor can be more 'strict' than the standard and still be conformant.
In the MS STL they are more strict for example: https://github.com/microsoft/STL/pull/1478#issuecomment-730789583

removed constexpr string test FIXME

Any action required from my side?

zoecarver added inline comments.Jan 18 2021, 12:19 PM
libcxx/include/string
1450

Libc++ adds noexcept to some things that aren't required to have it in the standard. But I don't think we add constexpr. Maybe @mclow.lists or @ldionne can confirm this, though.

The implementation is based on find(const char*) which is not noexcept (string_view::find).
So I think it's best to keep it not noexcept.

Well... they're all implemented with __str_find which is unconditionally noexcept so, I think it might make sense. (Given that, it probably also makes sense to mark that find function as noexcept, but that's out of scope.) But, we should definitely also make the corresponding string_view overload noexpect if we make this one noexecept.

Separate note: I know the paper suggests using __self_view::find, but why not just use basic_string::find (which is noexecpt, against the standard specification, btw)?

ldionne accepted this revision.Jan 18 2021, 12:39 PM
ldionne added inline comments.
libcxx/include/string
1450

Indeed, we've historically added noexcept to some things as "extensions", and my understanding is that we're allowed by the Standard to do so: http://eel.is/c++draft/res.on.exception.handling#5. My personal preference would be to avoid adding these unless they add tremendous value, as they make us diverge from the Standard, and could potentially work against the goal of having a "safe" standard library mode (e.g. when you do something bad, we do something customizeable instead of having UB or necessarily terminating).

We haven't been adding constexpr to things not marked as such in the Standard, and the Standard explicitly forbids us from doing it: http://eel.is/c++draft/constexpr.functions. It would be a bad idea anyway due to portability and future-proofness of our implementation anyway.

@Quuxplusone Arthur, did you still have objections? If not, please accept and I'll commit this.

zoecarver added inline comments.Jan 18 2021, 12:51 PM
libcxx/include/string
1450

My personal preference would be to avoid adding these unless they add tremendous value, as they make us diverge from the Standard, and could potentially work against the goal of having a "safe" standard library mode (e.g. when you do something bad, we do something customizeable instead of having UB or necessarily terminating).

That's fair, but in this case, we're going to necessarily terminate either way. This function is a simple wrapper for string_view::find which is a simple wrapper for __str_find which is always noexcept. Hopefully, the compiler will inline this all away, but there's a fair possibility that we're making the codegen a bit worse, and there is absolutely no benefit afaict. I like the idea of a "safe" standard library mode, and while marking this as noexcept isn't moving towards that, I'd argue it's also not really moving away from it.

@ldionne: LGTM except that the non-constexpr tests no longer need to return true; from bool test() — they could do void test() for now.
Also, I don't think this will rebase as-is on generate_feature_test_macro_components.py, will it? You'll have to add the entry for __cpp_lib_string_contains, which will be easy — but I notice that it will be a divergence from what's literally in https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations as of this writing. I would bet that's fine; I'm just bringing it up for awareness.

The current direction on constexpr/noexcept sounds great to me.

libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp
36

Nit: you can remove the bool and return true for now.

Quuxplusone accepted this revision.Jan 18 2021, 1:07 PM
This revision is now accepted and ready to land.Jan 18 2021, 1:07 PM
ldionne requested changes to this revision.Jan 18 2021, 2:41 PM

Thanks Arthur. Indeed, this doesn't apply cleanly on top of main. Please rebase and address Arthur's comments. Requesting changes just so it shows up in the review queue, but this LGTM except for what I just said.

This revision now requires changes to proceed.Jan 18 2021, 2:41 PM
mclow.lists added inline comments.Jan 18 2021, 2:45 PM
libcxx/include/string
1450

My personal preference would be to add NOEXCEPT here. All this does is call a noexcept constructor, and then calls a noexcept member fn, and returns a bool (which is a noexcept operation as well.

libc++ has traditionally been quite liberal in marking things NOEXCEPT; it can help codegen tremendously in a few very specific circumstances.

ldionne added inline comments.Jan 18 2021, 2:48 PM
libcxx/include/string
1450

Yeah, okay, I'm swayed. I think I agree. Feel free to add noexcept here.

Ok, I take it back. __str_find calls Traits::find, which is NOT always noexcept. (It *is* for all the specializations of std::char_traits, but that's not enough). So we can't slap NOEXCEPT here; in fact, we should revisit __str_find, and remove some of the NOEXCEPTs that are already there.

@ldionne: LGTM except that the non-constexpr tests no longer need to return true; from bool test() — they could do void test() for now.
Also, I don't think this will rebase as-is on generate_feature_test_macro_components.py, will it? You'll have to add the entry for __cpp_lib_string_contains, which will be easy — but I notice that it will be a divergence from what's literally in https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations as of this writing. I would bet that's fine; I'm just bringing it up for awareness.

The current direction on constexpr/noexcept sounds great to me.

The entry for __cpp_lib_string_contains in generate_feature_test_macro_components.py was already added in D94227, but I can rebase anyway.

https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations seems out of date anyway, the draft standard is maybe a better source?
https://eel.is/c++draft/version.syn

https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations seems out of date anyway, the draft standard is maybe a better source? https://eel.is/c++draft/version.syn

Good call. I had not thought to check https://eel.is/c++draft/version.syn when I did D93830. On the one hand version.syn is more authoritative; on the downside version.syn fails to mention what P-numbered papers each macro refers to, and also thus fails to include a "history" of how each macro should be set in each different language mode. I continue to think that at some point we'll need to rejigger generate_feature_test_macro_components.py to include P-numbered papers (e.g. "p1679") in the big data table, as well as language modes (e.g. "c++14").

I wonder if WG21 considers https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations to be unmaintained/obsolete, now that version.syn is part of the formal standard.

Rebased and removed unused bool return from basic_string tests.

It's unclear to me if I should add noexcept to the const char* overloads.

It's unclear to me if I should add noexcept to the const char* overloads.

I think @mclow.lists is right. We shouldn't have noexcept here, and we also need to remove it from some of the __str_find overloads (as a separate patch, which I'm happy to make).

ldionne accepted this revision.Jan 19 2021, 11:33 AM

It's unclear to me if I should add noexcept to the const char* overloads.

I think @mclow.lists is right. We shouldn't have noexcept here, and we also need to remove it from some of the __str_find overloads (as a separate patch, which I'm happy to make).

@WimLeflere Don't add noexcept. Your patch is fine as-is -- we'll tackle the noexcept story separately.

Ok, I take it back. __str_find calls Traits::find, which is NOT always noexcept. (It *is* for all the specializations of std::char_traits, but that's not enough). So we can't slap NOEXCEPT here; in fact, we should revisit __str_find, and remove some of the NOEXCEPTs that are already there.

Hmm, indeed. std::char_traits::find is marked as constexpr since C++17, but technically it could still throw at runtime. @zoecarver Feel free to submit a patch to remove it if you want. You could also audit the other helper functions (like __str_rfind - I see there's a bunch) for the same problem.

This revision is now accepted and ready to land.Jan 19 2021, 11:33 AM
This revision was automatically updated to reflect the committed changes.

Thanks for your contribution!

@WimLeflere I just noticed that during review we didn't mention the https://libcxx.llvm.org/docs/Cxx2bStatus.html should be updated. Does this commit fully implement the paper?

Once you confirm whether it's completed or partly done I'll update the status page.

The paper is fully implemented.

With the caveat that std::string::contains is not constexpr yet.
std::string_view::contains is constexpr

The paper is fully implemented.

With the caveat that std::string::contains is not constexpr yet.
std::string_view::contains is constexpr

Looking at the code that's only because std::string itself can't be used as constexpr yet. The patch has the proper constexpr.

I'll update the status page. Thanks for the information.