[libc++] Optimize / partially inline basic_string copy constructor

Authored by EricWF on Jan 17 2020, 1:53 PM.

Description

Split the copy constructor in two: short-string initialization is inlined,
and long-string initialization is outlined into __init_long(), which is the
externally instantiated slow path.

With that split in place, making the copy constructor itself inline (no longer
externally instantiated) gives significant speedups for short-string
initialization.
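The split described above can be illustrated with a minimal sketch. ToyString, kShortCap, init_long and TOY_NOINLINE are hypothetical names made up for this example; the class does not reproduce libc++'s layout, its API, or the actual patch. It only shows the shape of the idea: an inline copy constructor handles the short case with a fixed-size byte copy and forwards everything else to an out-of-line function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper macro: ask the compiler not to inline the slow path.
#if defined(__GNUC__) || defined(__clang__)
#define TOY_NOINLINE __attribute__((noinline))
#else
#define TOY_NOINLINE
#endif

// Hypothetical toy type; it is NOT libc++'s basic_string.
class ToyString {
public:
    static constexpr std::size_t kShortCap = 22;  // longest inline string

    explicit ToyString(const char* s) : size_(std::strlen(s)) {
        if (size_ <= kShortCap) {
            std::memcpy(short_buf_, s, size_ + 1);  // includes the terminator
        } else {
            init_long(s, size_);
        }
    }

    // Fast path stays inline: a short source needs only a fixed-size byte
    // copy. Long sources are forwarded to the outlined slow path.
    ToyString(const ToyString& other) : size_(other.size_) {
        if (!other.is_long()) {
            std::memcpy(short_buf_, other.short_buf_, sizeof(short_buf_));
        } else {
            init_long(other.long_ptr_, other.size_);
        }
    }

    ToyString& operator=(const ToyString&) = delete;  // kept minimal on purpose
    ~ToyString() { delete[] long_ptr_; }

    bool is_long() const { return long_ptr_ != nullptr; }
    std::size_t size() const { return size_; }
    const char* c_str() const { return is_long() ? long_ptr_ : short_buf_; }

private:
    // Slow path: allocates and copies. Declared here, defined out of line.
    TOY_NOINLINE void init_long(const char* s, std::size_t n);

    std::size_t size_ = 0;
    char* long_ptr_ = nullptr;            // non-null only in long mode
    char short_buf_[kShortCap + 1] = {};  // inline storage for short strings
};

void ToyString::init_long(const char* s, std::size_t n) {
    assert(n > kShortCap);  // only long strings should reach the slow path
    long_ptr_ = new char[n + 1];
    std::memcpy(long_ptr_, s, n);
    long_ptr_[n] = '\0';
    size_ = n;
}
```

Copying a short ToyString never reaches init_long, so an optimizer can reduce that case to a compare plus a few moves at the call site. Copying a long one pays for one extra call, which is small next to the allocation.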

Generated code for:

#include <new>
#include <string>

void StringCopyCtor(void* mem, const std::string& s) {
  std::string* p = new (mem) std::string{s};
}

asm:

cmp     byte ptr [rsi + 23], 0
js      .LBB0_2
mov     rax, qword ptr [rsi + 16]
mov     qword ptr [rdi + 16], rax
movups  xmm0, xmmword ptr [rsi]
movups  xmmword ptr [rdi], xmm0
ret

.LBB0_2:
jmp     std::basic_string::__init_long # TAILCALL
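Read literally, the listing tests the sign bit of the byte at offset 23 and, when it is clear, copies 24 bytes (8 bytes at offset 16, then 16 bytes at offset 0) before returning. The sketch below restates that in C++ on raw bytes. The name copy_if_short, the constant kRepSize, and the reading of the top bit as "long mode" are assumptions inferred from the listing and its branch to __init_long. They are not taken from libc++ headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed for illustration: the object occupies 24 bytes and the top bit of
// its last byte is set when the string is in long (heap) mode.
constexpr std::size_t kRepSize = 24;

// Hypothetical helper: returns true if src was short and has been copied to
// dst; returns false (leaving dst untouched) if the slow path would be needed.
inline bool copy_if_short(unsigned char* dst, const unsigned char* src) {
    assert(dst != nullptr && src != nullptr);
    // Corresponds to `cmp byte ptr [rsi + 23], 0` followed by `js`: the jump
    // is taken when the byte, read as signed, is negative (top bit set).
    if (src[kRepSize - 1] & 0x80) {
        return false;
    }
    // Corresponds to the 8-byte and 16-byte moves: a flat 24-byte copy.
    std::memcpy(dst, src, kRepSize);
    return true;
}
```

The fixed copy size is what keeps the fast path short: the compiler does not need the string's length to emit the moves.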

Benchmark:

name                 old           new           delta
BM_StringCopy_Empty  5.19ns ± 6%   1.50ns ± 8%   -71.02%  (p=0.000 n=10+10)
BM_StringCopy_Small  5.14ns ± 8%   1.53ns ± 7%   -70.17%  (p=0.000 n=10+10)
BM_StringCopy_Large  18.9ns ± 0%   19.3ns ± 0%    +1.92%  (p=0.000 n=10+10)
BM_StringCopy_Huge    309ns ± 1%    316ns ± 5%      ~     (p=0.633 n=8+10)

Patch from Martijn Vels (mvels@google.com)
Reviewed as D72160.