This is an archive of the discontinued LLVM Phabricator instance.

Add StringExtras join_items function
ClosedPublic

Authored by zturner on Sep 23 2016, 1:13 PM.

Details

Summary

There's already llvm::join which is useful for joining a container of items. But there is no way to join items of disparate types, or types who are not in a container for whatever reason.

For example, if you have a std::string, a llvm::StringRef, and a const char *, you would have to manually construct a container to put them in before joining them, which is inconvnient.

This adds a variadic join_items function for this purpose.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner updated this revision to Diff 72336.Sep 23 2016, 1:13 PM
zturner retitled this revision from to Add StringExtras join_items function.
zturner updated this object.
zturner added reviewers: majnemer, sanjoy.
zturner added a subscriber: llvm-commits.
zturner updated this revision to Diff 72337.Sep 23 2016, 1:15 PM

Added missing unit test file.

zturner updated this revision to Diff 72338.Sep 23 2016, 1:16 PM

Trying this again, for some reason StringExtrasTest.cpp isn't uploading.

zturner updated this revision to Diff 72340.Sep 23 2016, 1:32 PM

Added some tests to show that it works when you use operands of different types.

Also templatized the separator so that it can be a char (or anything else for that matter), and not just a StringRef.

Ping. Anyone have objections?

sanjoy edited edge metadata.Sep 26 2016, 11:55 AM
sanjoy added a subscriber: dblaikie.

Looks fine to me, but I don't grok the intricacies of std::forward and variadic templates well enough to review this. Perhaps @majnemer or @dblaikie can chime in?

How does it compare to Twine in practice?

How does it compare to Twine in practice?

I don't think it solves the same problem as a Twine. While twines are good for concatenating values as you descend into a call hierarchy, it doesn't work when you want to build a concatenation of the results of many different function return values and then use that -- perhaps multiple times -- in one function.

What about: `auto Result = (Twine(foo()) + bar() + baz()).str()`?

What about: `auto Result = (Twine(foo()) + bar() + baz()).str()`?

Hadn't thought of that, but that also doesn't work of foo(), bar(), and baz() are of type const char *. In that case you need to invoke operator += on the std::string

Also I feel like Twine(foo() + '/' bar() + '/' + baz()).str() is a big more difficult to visually grok than llvm::join_items('/', foo(), bar(), baz());

What about: `auto Result = (Twine(foo()) + bar() + baz()).str()`?

Hadn't thought of that, but that also doesn't work of foo(), bar(), and baz() are of type const char *. In that case you need to invoke operator += on the std::string

No: the twine capture the pointers, and create a string *at the end only* (the call to str()). That said it is not terribly efficient: it'll stream to a SmallVector and copy the result to the string. (You can instead create a small vector and supply it to Twine::toVector() to save a copy).
Anyway...

Also I feel like Twine(foo() + '/' bar() + '/' + baz()).str() is a big more difficult to visually grok than llvm::join_items('/', foo(), bar(), baz());

Right, I agree with that.

include/llvm/ADT/StringExtras.h
222 ↗(On Diff #72340)

Could you add a "reserve" stage here?

zturner added inline comments.Sep 26 2016, 8:48 PM
include/llvm/ADT/StringExtras.h
222 ↗(On Diff #72340)

Seems reasonable, but how big? (max(1, sizeof...(Items)) - 1) * Separator.size()) is a lower bound. Should I just go with that?

mehdi_amini added inline comments.Sep 26 2016, 9:10 PM
include/llvm/ADT/StringExtras.h
222 ↗(On Diff #72340)

I'd have a variadic template (with some overload) that sum the length of each elements. So that the exact length is computed. This is what join_impl is doing iteratively I believe.

zturner updated this revision to Diff 72663.Sep 27 2016, 8:46 AM
zturner edited edge metadata.

Updated to compute the length before doing the concatenation.

mehdi_amini added inline comments.Sep 27 2016, 8:54 AM
include/llvm/ADT/StringExtras.h
246 ↗(On Diff #72663)

Isn't there a missing + 1 for the trailing '\0' ?

zturner added inline comments.Sep 27 2016, 8:55 AM
include/llvm/ADT/StringExtras.h
246 ↗(On Diff #72663)

Good catch. I'll fix that locally, not worth uploading a new patch for though.

mehdi_amini accepted this revision.Sep 27 2016, 8:59 AM
mehdi_amini added a reviewer: mehdi_amini.

LGTM.

include/llvm/ADT/StringExtras.h
226 ↗(On Diff #72663)

Do you need the B in this function?

This revision is now accepted and ready to land.Sep 27 2016, 8:59 AM
zturner added inline comments.Sep 27 2016, 9:03 AM
include/llvm/ADT/StringExtras.h
226 ↗(On Diff #72663)

My understanding is that it is needed to disambiguate the two overloads. If you have these two:

template <typename A1>
size_t join_items_size(const A1 &A) {
}

template<typename A1, typename... As>
size_t join_items_size(const A1 &A, As &&... Items) {
}

Then if you call it as:

size_t x = join_items_size("Test");

it will be ambiguous, because the second overload could have a parameter pack of size 0. Correct me if I'm wrong though, I admit I am not an expert here :)

This revision was automatically updated to reflect the committed changes.