This is an archive of the discontinued LLVM Phabricator instance.

Low-hanging fruit optimization in string::__move_assign().
ClosedPublic

Authored by tvanslyke on Jan 11 2018, 7:15 PM.

Details

Summary

shrink_to_fit() ends up doing a lot work to get information that we already know since we just called clear(). This change seems concise enough to be worth the couple extra lines and my benchmarks show that it is indeed a pretty decent win. It looks like the same thing is going on twice in __copy_assign_alloc(), but I didn't want to go overboard since this is my first contribution to llvm/libc++. Go easy on me!

Diff Detail

Event Timeline

tvanslyke created this revision.Jan 11 2018, 7:15 PM

I went ahead and just pulled it out to a small inline member function and added it to __copy_assign_alloc(). Removed some other redundancies.

vsk added a subscriber: vsk.

Adding some folks who may be interested.

Can you share your benchmark results, please?

Here are the results:

Comparing old-copymove.json to new-copymove.json
Benchmark                        Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------
BM_string_move_SSO            -0.5833         -0.5833            12             5            12             5
BM_string_copy_SSO            +0.0000         +0.0000            23            23            23            23
BM_string_move_no_SSO         -0.6250         -0.6250             8             3             8             3
BM_string_copy_no_SSO         +0.0000         +0.0000            11            11            11            11
BM_string_stable_sort         -0.0779         -0.0779       2772114       2556118       2772031       2556036

I put the benchmark on pastebin because it's a tad too long to paste here IMO: https://pastebin.com/mk1JXQme

I'll add that the reason I felt compelled to submit this change was because perf was saying most of my program, that only ever moved strings, was spending a majority of it's time in string::reserve() calls. If for no other reason, I think it's confusing for people to see string::reserve getting called when all they wanted was a move assignment.

Here are the results:

Comparing old-copymove.json to new-copymove.json
Benchmark                        Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------
BM_string_move_SSO            -0.5833         -0.5833            12             5            12             5
BM_string_copy_SSO            +0.0000         +0.0000            23            23            23            23
BM_string_move_no_SSO         -0.6250         -0.6250             8             3             8             3
BM_string_copy_no_SSO         +0.0000         +0.0000            11            11            11            11
BM_string_stable_sort         -0.0779         -0.0779       2772114       2556118       2772031       2556036

I put the benchmark on pastebin because it's a tad too long to paste here IMO: https://pastebin.com/mk1JXQme

Perhaps it should be added to the already-existing benchmarks/string.bench.cpp?
Not sure whether those are being used in test-suite/LNT.

I can adapt it to the style guide if it is decided that it should be added.

I'm a bit leery of this patch. Not because of what it's trying to do, but rather, the introduction of a method __clear_and_shrink that leaves the string in an invalid state. For all the uses that you put it to, I don't think that's a problem (though I'm still working through the failure possibilities), but I can see other people attempting to use this method - and not realizing that you have to "put the string back together" afterwards.

Maybe I'm just being paranoid.

tvanslyke added a comment.EditedJan 16 2018, 7:12 AM

I think the only thing we need is a call to __set_long_cap(0) in the body of the if() to make the string be in a valid state, but if you follow the logic for a call to reserve(0) (after having called clear()) it doesn't even end up doing that (unless I'm missing something). I definitely have no problem adding the __set_long_cap(0) call to the diff.

Just to elaborate, in reserve(0) after the deallocation there's a check if (__now_long) which takes the else branch and only ends up calling __set_short_size(__sz); // __sz = 0 here. So even without this diff the only thing reserve(0) does after deallocating is set the short size to zero, but it already is zero (because of the call to clear()).

tvanslyke added a comment.EditedJan 16 2018, 8:24 AM

Sorry if I'm spamming too much. Just to be clear, I agree completely with your sentiment but I'm pretty sure that __clear_and_shrink() has exactly the same effects as clear(); shrink_to_fit();. Maybe reserve(); should should be unconditionally calling __set_long_cap(0); immediately after deallocating? I could be missing something obvious though.

tvanslyke updated this revision to Diff 129969.Jan 16 2018, 8:55 AM

Implemented changes to ensure string state is valid after calling __clear_and_shrink(). Benchmark results are identical.

mclow.lists accepted this revision.Jan 19 2018, 12:08 PM

This looks good to me.
Please add a test in test/libcxx/strings/string.modifiers and commit.

Something like this:

#include <string>
#include <cassert>

int main () {
    std::string l = "This makes this a long string, I hope; just rambling on and on...";
    std::string s = "short";
    assert(l.__invariants());
    assert(s.__invariants());

    s.__clear_and_shrink();
    assert(s.__invariants());
    assert(s.size() == 0);

    l.__clear_and_shrink();
    assert(l.__invariants());
    assert(l.size() == 0);
}
This revision is now accepted and ready to land.Jan 19 2018, 12:08 PM

Since __clear_and_shrink() is private the test covers copy and move assignment. I also ran the libcxx/strings/basic.string and std/strings tests with a hard-coded assert(__invariants()); at the end of __clear_and_shrink() and saw no failures.

I don't have commit access myself so I've added the test to the diff.

I don't have commit access myself so I've added the test to the diff.

I'll commit it then.

Since __clear_and_shrink() is private

There's no reason it has to be private.
People aren't supposed to call anything with a reserved name.

Moved __clear_and_shrink() to live with the other public dunder methods and adapted the associated test accordingly.

Hello. I would just like to follow up on the status of this patch. Is there anything else that is needed from me? I just want to make sure that nobody is waiting on me for anything; I'm still not 100% familiar with the process. Thanks in advance.

vsk added a comment.Mar 8 2018, 10:32 AM

@mclow.lists is this still fine to commit? I can land it and watch the bots, if you'd like.

In D41976#1031674, @vsk wrote:

@mclow.lists is this still fine to commit? I can land it and watch the bots, if you'd like.

Sure. Please do so.

This revision was automatically updated to reflect the committed changes.
vsk added a comment.Mar 8 2018, 1:18 PM

Thanks @tvanslyke and @mclow.lists, landed as r327064.

Howard just pointed out to me that __clear_and_shrink should be noexcept - otherwise we get the generation of an exception table and a call to terminate in string's move assignment operator. Fixed in revision 333435.