Details
- Reviewers
• Quuxplusone ldionne - Group Reviewers
Restricted Project - Commits
- rG425620ccdd47: [libc++] Implement P0980R1 (constexpr std::string)
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
- Changed split non-/constexpr tests into one and used TEST_IS_CONSTANT_EVALUATED
- replaced main() with main(int, char**)
- Updated feature-test macro
Will do, but I was waiting for the build to be green. Can you ping me (on Discord) when you want me to take a look?
Please update the PR summary (i.e. the proposed commit message and general info about the patch). Particularly, I'd like to see some paragraph-form text explaining the general approach. I'm sure it's been explained and discussed several times, e.g. on Discord, probably a couple times in this PR's comments... but IIUC the details have also changed over time. An up-to-date description belongs in the PR summary, and ultimately in the commit message. It would even be appropriate to add something to libcxx/docs/DesignDocs/. All of this will benefit the next person to maintain this and/or to implement constexpr vector or constexpr whatever-they-do-in-C++23.
It would help if this were split into relatively orthogonal chunks, most of which could be "NFC." For example:
- The test_iterators.h diff, which is just adding constexpr to one type, can be landed today AFAIC.
- The test_macros.h diff is probably unnecessary?
- A huge number of diffs in <string> are just adding the word _LIBCPP_CONSTEXPR_AFTER_CXX17; since this is all template code where the compiler shouldn't complain about incorrect constexpr keywords, can we land those as a separate patch, before the behavioral changes?
- Vice versa, can we land the behavioral changes first and then the _LIBCPP_CONSTEXPR_AFTER_CXX17 markings second?
- Several new member functions: __begin_lifetime, __finish_replace, __free_memory_constexpr, __insert_not_in_range. Can we introduce those first (changing the "shape" of the control flow), and then in a second step change their behavior to include the constexpr-friendly codepaths?
- The constexpr-ified tests do need to be landed in the same commit as the behavior they test... but for review purposes they are orthogonal/distracting.
- Even if we land this PR as a monolith, it would still be vastly easier to review in small pieces like that. If you're trying to review the control flow, all the _LIBCPP_CONSTEXPR_AFTER_CXX17 markings are distracting; and vice versa.
I don't find Phab's interface very convenient for splitting up patches in these ways. You can make a bunch of Phab PRs and link them together with dependencies, but really it might be easier to just edit the issue summary to include [the design info, plus] a link to a branch on your own GitHub (I mean a URL like https://github.com/Quuxplusone/llvm-project/compare/main...trivially-relocatable ) and say "Hey reviewers, if you prefer to see this PR broken down into its orthogonal component parts, take a look at this series of commits in GitHub."
libcxx/include/string | ||
---|---|---|
1818–1820 | _VSTD::construct_at(_VSTD::addressof(__p[__i]), value_type())); for ADL. | |
4210–4218 | The PR summary (and possibly a design doc) should explain why the class invariants need to change in constexpr-land. size_type __sso_capacity = __libcpp_is_constant_evaluated() ? 0 : (__min_cap - 1); if (size() > capacity()) return false; if (capacity() < __sso_capacity) return false; if (data() == nullptr) return false; if (data()[size()] != value_type()) return false; return true; IIUC, you use this calculation several places in this PR. Consider introducing a function __sso_capacity() for this purpose... except that at first glance I'm worried about any function whose return value changes depending on constexprness. I'm not sure that such a function is safe to use. So maybe pulling it out into a function would be very bad, I'm not sure. |
Thanks for working on this huge task!
In general I'm happy, but I'm not familiar enough with the internals of our string implementation to validate you didn't miss any important parts. Therefore I leave approval to somebody more familiar with string.
(This patch basically touches every function in string and thus "breaks" other patches for string. Would it be an idea to make some cleanup patches after this patch lands? Then we can use _LIBCPP_HIDE_FROM_ABI and std::.)
libcxx/include/string | ||
---|---|---|
77 | Please update the synopsis. | |
1002–1003 | The test test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp can be updated by removing | |
1465–1466 | Nit: I would keep the order LIBCPP_INLINE_VISIBILITY constexpr for consistency with the other changes. | |
1520 | Should these new functions be private members? | |
libcxx/test/std/strings/basic.string/string.access/at.pass.cpp | ||
12 ↗ | (On Diff #416937) | Please update the synopsis in all tests. |
libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp | ||
82 ↗ | (On Diff #416937) | Why don't you test this in constexpr? The signature has changed in the paper. |
libcxx/test/std/strings/basic.string/string.modifiers/string_insert/size_pointer_size.pass.cpp | ||
53 ↗ | (On Diff #416937) | Why is this part removed? |
libcxx/test/std/strings/basic.string/string.starts_with/starts_with.char.pass.cpp | ||
19 ↗ | (On Diff #416937) | These functions are only available in C++20. (The same for ends_with and contains.) |
libcxx/include/string | ||
---|---|---|
1465–1466 | I'd keep it as-is for now. I don't want to make this patch unnecessarily larger. | |
libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp | ||
82 ↗ | (On Diff #416937) | This test makes extremely large allocations, like numeric_limits<size_t>::max() large. Because of this the constant evaluations fails. |
libcxx/test/std/strings/basic.string/string.modifiers/string_insert/size_pointer_size.pass.cpp | ||
---|---|---|
53 ↗ | (On Diff #416937) | This is duplicated for std::string and std::basic_string<char, std::char_traits<char>, min_allocator<char>>. I just made this a template and removed the duplicated code. |
SGTM modulo some small issues.
libcxx/test/std/strings/basic.string/string.capacity/max_size.pass.cpp | ||
---|---|---|
82 ↗ | (On Diff #416937) | Then please a test to validate the function can be called compile-time and add comment why we done the minimal testing for compile-time testing. |
libcxx/test/std/strings/basic.string/string.contains/contains.char.pass.cpp | ||
19 ↗ | (On Diff #418040) |
Thanks for the patch! Mostly small comments about forgetting stuff in the tests, however I think we'll need to figure out a better story for the string's invariant inside constexpr. I think we agreed on a direction to investigate during our live review, so let's try that and re-evaluate after. Thanks!
cross-project-tests/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp | ||
---|---|---|
12 ↗ | (On Diff #391005) | I don't think you intended to upload that. |
libcxx/include/string | ||
917–918 | Please consider not reformatting as part of this patch since it's good to keep it as mechanical as possible. | |
1946–1947 | What if instead of doing this, we made sure to have the invariants of the string hold before calling __init, and then we'd "simply" have to check whether we already have a long string allocated at the beginning of __init? | |
2096 | This line does not look correct to me. Indeed, if __str is a short string, after the change we are accessing __str's long representation instead of its raw representation, which means that we might not be copying some part of the small buffer, if the small buffer is larger than the long representation. This could happen if we had a very large character type where 2 * sizeof(CharType) is larger than sizeof(LongRepresentation). In that case, our short string buffer will contain two CharTypes at least, yet our long representation will be smaller than that. Please add a test to catch this, or explain why that doesn't work (you mentioned a bug related to large character types during live review). | |
2365–2370 | Why don't we call __default_init() on a moved-from string instead? Right now, after calling __zero() during constexpr evaluation, the string is invalid to access, thus we don't have the invariant __is_long(). Instead, I think we should make sure that the string satisfies __is_long() at all times during constexpr evaluation, which probably means replacing a few calls to __zero() by calls to __default_init(). This would allow removing the diff above. | |
2575 | Same here -- if the invariant __is_long() held during constexpr evaluation, this nullptr check could go away. | |
2593 | For instance, I think this is one place where we should call __default_init() instead. | |
2903–2909 | I'd like to understand why this diff is needed. What happens if you remove it? | |
3145–3148 | Same question here -- why is this required? | |
4212–4213 | If we never put the string in an invalid state inside constexpr, is it possible to remove this diff entirely? | |
4449–4450 | Could we investigate adding a private constructor that allows passing a capacity to pre-allocate, so this could be implemented like this: using _String = basic_string<_CharT, _Traits, _Allocator>; typename _String::size_type __lhs_sz = __lhs.size(); typename _String::size_type __rhs_sz = __rhs.size(); _String __r(_String::__alloc_traits::select_on_container_copy_construction(__lhs.get_allocator()), __capacity_tag{}, __lhs_sz + __rhs_sz); __r.append(__lhs.data(), __lhs_sz); __r.append(__rhs.data(), __rhs_sz); return __r; Can we please do this as a separate review? That should be easy to split out. | |
libcxx/test/std/strings/basic.string/string.cons/default_noexcept.pass.cpp | ||
26–37 ↗ | (On Diff #419393) | Unless I am mistaken, we don't seem to have any test for basic_string::basic_string() that runs at runtime. This is kind of crazy. As a different review, can you please move this test to default.pass.cpp, make it run the default constructor at runtime, and prepare it for constexpr-friendliness as well? That way, this diff can look like the other ones where you basically uncomment static_assert(test());. |
libcxx/test/std/strings/basic.string/string.cons/dtor_noexcept.pass.cpp | ||
13 ↗ | (On Diff #419393) | Same thing here -- we don't have a runtime test for the string destructor outside of the static variable below. Please add one -- this can be done in the same review as the default ctor. You could use a special allocator and check that deallocate() has been called after the string is destroyed. |
libcxx/test/std/strings/basic.string/string.iterators/iterators.pass.cpp | ||
81 ↗ | (On Diff #419393) | I think this needs to change to TEST_STD_VER > 17? Please also grep for __cpp_lib_constexpr_string after applying your patch just to make sure there aren't any stray ones left where they shouldn't be. |
libcxx/test/std/strings/basic.string/string.modifiers/string_insert/size_T_size_size.pass.cpp | ||
1867–1898 ↗ | (On Diff #419393) | I think you missed those. I'm also not a big fan of the 2k lines of test here, but this has nothing to do with this review. |
libcxx/test/std/strings/basic.string/string.modifiers/string_insert/size_string_size_size.pass.cpp | ||
1827–1857 ↗ | (On Diff #419393) | You forgot those too! Please grep for // static_assert in the test suite just to make sure you haven't missed other ones. |
libcxx/test/std/strings/basic.string/string.modifiers/string_replace/iter_iter_iter_iter.pass.cpp | ||
1060–1069 ↗ | (On Diff #419393) | Here! |
libcxx/test/std/strings/basic.string/string.modifiers/string_replace/iter_iter_pointer_size.pass.cpp | ||
995–1006 ↗ | (On Diff #419393) | Here! |
libcxx/test/std/strings/basic.string/string.modifiers/string_replace/size_size_T_size_size.pass.cpp | ||
6079 ↗ | (On Diff #419393) | Here! |
libcxx/test/std/strings/basic.string/string.modifiers/string_replace/size_size_pointer_size.pass.cpp | ||
1340 ↗ | (On Diff #419393) | Here! |
libcxx/test/std/strings/basic.string/string.modifiers/string_replace/size_size_string_size_size.pass.cpp | ||
6069 ↗ | (On Diff #419393) | Here! Wow, 6k lines of test. I'm sure they are all 100% relevant -_-. |
libcxx/test/std/strings/basic.string/string.nonmembers/string.io/get_line.pass.cpp | ||
14 ↗ | (On Diff #419393) | I don't think you meant to make this one constexpr. |
libcxx/test/std/strings/basic.string/string.nonmembers/string.io/get_line_delim.pass.cpp | ||
14 ↗ | (On Diff #419393) | Same, not constexpr. |
libcxx/test/std/strings/basic.string/string.nonmembers/string.io/get_line_delim_rv.pass.cpp | ||
14 ↗ | (On Diff #419393) | Same! |
libcxx/test/std/strings/basic.string/string.nonmembers/string.io/get_line_rv.pass.cpp | ||
14 ↗ | (On Diff #419393) | Same, and for all string.io probably. |
libcxx/test/std/strings/basic.string/string.ops/string_compare/size_size_T_size_size.pass.cpp | ||
5955 ↗ | (On Diff #419393) | We are missing some constexpr-ification here for sure. |
libcxx/test/std/strings/basic.string/string.ops/string_compare/size_size_pointer_size.pass.cpp | ||
1335 ↗ | (On Diff #419393) | Missed these ones! |
libcxx/test/std/strings/basic.string/string.ops/string_compare/size_size_string_size_size.pass.cpp | ||
6041 ↗ | (On Diff #419393) | Missed! |
libcxx/test/std/strings/basic.string/string.ops/string_find.last.not.of/string_view_size.pass.cpp | ||
154–157 ↗ | (On Diff #419393) | Can you try uncommenting this and seeing what happens? |
libcxx/test/support/constexpr_char_traits.h | ||
121 ↗ | (On Diff #419393) | // fails in constexpr because we might be comparing unrelated pointers |
- Rebased
- Address comments
libcxx/include/string | ||
---|---|---|
1946–1947 | I think we agreed that this is OK and we make sure that __init is only ever called inside constructors. | |
2903–2909 | We may compare unrelated pointers with __p + __pos <= __s && __s < __p + __sz. | |
3145–3148 | Same here - we may compare unrelated pointers. | |
4449–4450 | This is D123058 now. | |
libcxx/test/std/strings/basic.string/string.cons/dtor_noexcept.pass.cpp | ||
13 ↗ | (On Diff #419393) | This is now D123129. |
libcxx/test/std/strings/basic.string/string.ops/string_find.last.not.of/string_view_size.pass.cpp | ||
154–157 ↗ | (On Diff #419393) | Seems to just work. |
Thanks! Still a few comments and questions, but this is looking really good now.
libcxx/include/string | ||
---|---|---|
1590–1591 | I don't understand why this diff is required when running in constexpr. What happens if you remove it? | |
1620 | Is there any reason why we don't simply always do __r_.first() = __rep()? That would work both in constexpr and non-constexpr, and it seems to be pretty much equivalent: https://godbolt.org/z/T85f13ozT And when that's done, perhaps we can even remove __zero() altogether? | |
1793–1794 | This needs a comment! | |
3145–3147 | Note to future me: I looked at this and it seems OK, no need to spend another 15 minutes validating. | |
3522 | Can we repace this by a call to __fits_in_sso(__target_capacity)? (or maybe __target_capacity +/- 1, please double-check) | |
libcxx/test/support/constexpr_char_traits.h | ||
130 ↗ | (On Diff #421756) | @daltenty The CI is currently failing in pretty bad ways on AIX. The compiler is being reported as IBM Open XL C/C++ for AIX 17.1.0 (5725-C72, 5765-J18), clang version 13.0.0, which we claim to support on our page. I believe there might be some patches to the constant evaluator on top of Clang 13 that have not been cherry-picked to your internal branch. I suspect there are probably no reasonable workarounds to make std::string work inside constexpr on that compiler without fixing the compiler itself. Hence, I am unsure what's the best course of action here. I think what we should do is either:
Do you have thoughts on that? Note to @philnik: Since I'm about to go OOO, if there is no resolution for this by EOW, feel free to land this with option (1) above (which is simplest) and we can always do (2) afterwards if that's the preferred solution. |
libcxx/include/string | ||
---|---|---|
1640–1642 | Can you explain this? |
libcxx/include/string | ||
---|---|---|
1640–1642 | During runtime we never actually allocate __min_cap bytes, so it's OK that we return an even number in this case. It only exists for checking if the string fits in SSO or is in SSO mode AFAICT. (This should probably be changed at some point) During constant evaluation we have to return an odd number to make the allocation size even, otherwise we save garbage in the capacity member. That's why the other way to fix the bug was saving the actual capacity. |
Please rebase onto main to get a clean CI run, and ship it. Thanks!
libcxx/include/string | ||
---|---|---|
1640–1642 | Ok. As discussed offline, let's land this as-is, but let's also make a follow-up patch that tries to fix places that assume __min_cap - 1 is returned when the string fits in the SSO, so we can remove this if and always return static_cast<size_type>(__min_cap);. |
For internal implementation details such as __compressed_pair, please be aggressive in adding constexpr (anywhere it's actually needed, of course). On these two lines, you can use _LIBCPP_CONSTEXPR instead of _LIBCPP_CONSTEXPR_AFTER_CXX17. The benefit is that on that far-future day where we drop support for C++03, we can just mechanically change all _LIBCPP_CONSTEXPR to constexpr; whereas we can't mechanically change _LIBCPP_CONSTEXPR_AFTER_CXX17 to constexpr until we drop support for C++17!
For user-facing (public, standards-conforming) bits, of course we have to apply constexpr conservatively.