This is an archive of the discontinued LLVM Phabricator instance.

make SmallString::str() return std::string
AbandonedPublic

Authored by yaron.keren on Nov 20 2014, 12:53 AM.

Details

Reviewers
None
Summary

Note! this compiles but breaks tests, it's not ready. I must have made an error somewhere.
I'm keeping it in Phabricator hoping to continue fixing it time permitting.
If anyone else is interested in taking over, feel free.

This is an attempt in making SmallString::str() return std::string.
The first phase required making SmallString an equal-rights citizen in LLVM.

There were four use-cases for SmallString::str():

1 Input to a twine or StringRef accepting function.
2 << streaming to raw_stream.
3 Select an overloaded function accepting Twine and StringRef.
4 Really convert to std::string.

To solve 1 and 2, I extended Twine and raw_stream to accept SmallVectorImpl<char>.
This made most hundreds of existing .str() calls redundant which is very nice.

3 was solved by either adding a SmallVectorImpl<char>-accepting function or
specifying StringRef(SmallString).

For 4, where std::string was really required, I added std::string SmallString::to_str().
.str() can not be reused immediatly as all the existing .str() code would compile OK but
with performance regressions due to the creation of std::string instead of a StringRef.

All .str() users must be reviewed to be removed most of the time or modified into
to_str() before we can reuse .str().

Diff Detail

Event Timeline

yaron.keren retitled this revision from to make SmallString::str() return std::string.
yaron.keren updated this object.
yaron.keren edited the test plan for this revision. (Show Details)
yaron.keren added a subscriber: Unknown Object (MLST).

(posting this with full context would help make it easier to see/review in Phab)

include/llvm/ADT/SmallString.h
100

This doesn't look correct at all - that's comparing RHS to a default-constructed StringRef (the empty string). (similarly for the other functions here)

include/llvm/ADT/Twine.h
142

Might make more sense to represent many of these as a const StringRef value (then we wouldn't need so many variants - just build a StringRef from a std::string or a SmallVectorImpl<char>, etc... ) rather than pointers to the actual objects - just pointing straight into the buffer (& storing the length) would be simpler.

Thanks, I meant to call the StringRef conversion but ended up with empty StringRef.
I will first split the twine and raw_stream changes into their own patch so they will be easier to review.

Hi David,

I have looked into using a single StringRef for std::string, StringRef and SmallString. This is easily done but three problems emerge:

1 We lose efficient in .str(). It does not know to return a pointer to std::string even if the twine actually stores a single StringRef to the std::string data. A Twine-accepting funciton called with std::string input and then calls Twine::str() on its input will result in deep copy of the std::string.

  1. The prinout routine printOneChildRepr() output would be less informative than today, we no longer have the information about the original data type for std::string and SmallString nor their actual address, just their data and size.

3 The Twine could be twice as large.

I'm not sure it's worthwhile going this way compared with adding another pointer to SmallString member union, what do you think?

yaron.keren abandoned this revision.Mar 30 2017, 12:07 PM