A common pattern when using SmallString is to repeatedly call append to build a larger string.
The issue here is the optimizer can't see through this and often has to check there is enough space in the storage for each string you try to append.
This results in lots of conditional branches and potentially multiple calls to grow needing to be emitted if the buffer wasn't large enough.
By taking an initializer_list of StringRefs, SmallString can preallocate the storage it needs for all of the StringRefs which only need to grow one time at most, then use a fast path of copying all the strings into its storage knowing there is guaranteed to be enough capacity.
By using StringRefs, this also means you can append different string like types in one go as they will all be implicitly converted to a StringRef.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Could you refresh my memory on whether reserve has the same growth semantics as other size-increasing operations? I'm slightly concerned that if reserve only reserves the exact amount required (compared to an append or similar operation exceeding the capacity which would/should cause the usual multiplication factor growth) that this change as-is might hinder performance (eg: a loop that appends two things today would be getting the multiplicative growth - but then if it ends up with a reserve(size of two strings) call in it maybe it keeps only growing just enough on each loop iteration, incurring more growths?)?
Reserve and grow call the same function to actually grow the container, they both use the next power of 2 above the requested size. Reserve also won't allocate if it already has the capacity needed
I wonder, would it be valuable to do this for Twine? I'm thinking something like:
Optional<size_t> Twine::tryGetStorageLength() const;
which returns None if there is a leaf whose size isn't known in constant time (say, other than CStringKind, StdStringKind, and SmallStringKind). Then change Twine::toVector to use this to reserve space:
void Twine::toVector(SmallVectorImpl<char> &Out) const { if (auto Size = tryGetStorageLength()) Out.reserve(Out.size() + Size); raw_svector_ostream OS(Out); print(OS); }
If so, might that cover this use case as well?
I can't see there being as much of a win in that case, but its definitely worth exploring, not in this patch though.
There already is an assign test, should probably improve the append test so the string isn't empty before hand
Extend Append test so it actually has something to append to.
Demonstrate it can concatenate strings of different types.
Oh, soryr - not sure what I was looking at that I missed the other test. Thanks for improving the coverage a bit there in any case!