This is an archive of the discontinued LLVM Phabricator instance.

[String]: Inline clear_and_shrink and make it constexpr
AbandonedPublic

Authored by philnik on Sep 14 2022, 11:40 PM.

Details

Reviewers
bemeryesr
Group Reviewers
Restricted Project
Summary

Background: __clear_and_shrink() (added in https://reviews.llvm.org/D41976) is used in the copy assignment of allocators. It was added as an optimization to prevent calling __shrink_to_fit(), when we have just called clear() on the string as it makes an additional allocation. Instead, it deallocates the allocated string and then sets the short string size to 0, making the active member of the union in __rep, __s I.e. a short string. It changes the representation to a short string to satisfy the invariant data()[size()] != value_type() and to prevent deallocation of the heap allocated long string a second time in the string destructor.

Issue: Since C++20, it is legal to change the active member of a union within a constant expression. However, it is not legal to read from a non-active member of a union. During constant evaluation, short string optimisation is not used so the string is always a long string. After calling __clear_and_shrink, the active member is changed to a short string. Therefore, any operations which attempt to read from __rep will cause a compiler error. So this function has the requirement that no read operations are performed on __rep after it is called.

Reproducing: The following code demonstrates the issue:

#include <cassert>
#include <string>

constexpr bool test()
{
    std::string l = "Long string so that allocation definitely, for sure, absolutely happens. Probably.";
    l.__clear_and_shrink();

    return true;
}

int main(int, char**)
{
    static_assert(test());

    return 0;
}

This code fails with the error:

../src/clear_and_shrink_test.cpp:14:19: error: static assertion expression is not an integral constant expression
    static_assert(test());
                  ^~~~~~
/home/brendanemery/work_esr/llvm-project/build/include/c++/v1/string:1618:17: note: read of member '__l' of union with active member '__s' is not allowed in a constant expression
        {return __r_.first().__l.__data_;}
                ^
/home/brendanemery/work_esr/llvm-project/build/include/c++/v1/string:2347:47: note: in call to '&l->__get_long_pointer()'
        __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
                                              ^
../src/clear_and_shrink_test.cpp:6:17: note: in call to '&l->~basic_string()'
    std::string l = "Long string so that allocation definitely, for sure, absolutely happens. Probably.";
                ^
../src/clear_and_shrink_test.cpp:14:19: note: in call to 'test()'
    static_assert(test());

Solution: Since __clear_and_shrink() is only used in one place, my proposed solution is to inline the contents of the function in the calling code. This (1) maintains the intended optimisation i.e. not calling shrink_to_fit() in the copy assignment of allocators. (2) Prevents the string left being in a dangerous state after calling __clear_and_shrink() i.e. causing a compiler error if a read operation is performed on __rep.

Diff Detail

Event Timeline

bemeryesr created this revision.Sep 14 2022, 11:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2022, 11:40 PM
bemeryesr edited the summary of this revision. (Show Details)Sep 14 2022, 11:44 PM
bemeryesr edited the summary of this revision. (Show Details)Sep 14 2022, 11:54 PM
bemeryesr edited the summary of this revision. (Show Details)Sep 15 2022, 12:14 AM
bemeryesr edited the summary of this revision. (Show Details)
bemeryesr edited the summary of this revision. (Show Details)Sep 15 2022, 2:28 AM
bemeryesr edited the summary of this revision. (Show Details)Sep 15 2022, 2:34 AM
bemeryesr updated this revision to Diff 460339.Sep 15 2022, 2:42 AM
bemeryesr edited the summary of this revision. (Show Details)

Update the commit message

bemeryesr edited the summary of this revision. (Show Details)Sep 15 2022, 9:45 AM
bemeryesr published this revision for review.Sep 15 2022, 9:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 9:50 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

@philnik Can you take a first stab at this? I think this solves some important issues with constexpr that @jorgbrown was telling me about.

Do you have a reproducer that doesn't rely on calling internal functions?

libcxx/include/string
1726

Please run the string benchmarks to make sure this change doesn't cause performance regressions.

libcxx/test/std/strings/basic.string/string.cons/copy_assignment.pass.cpp
46

Why did these tests change?

philnik commandeered this revision.Aug 31 2023, 5:05 PM
philnik added a reviewer: bemeryesr.

[Github PR transition cleanup]

I'm commandeering and abandoning this, since there hasn't been any activity for months.

philnik abandoned this revision.Aug 31 2023, 5:05 PM
libcxx/include/string