Page MenuHomePhabricator

[libc++][format][3/6] Adds a __container_buffer.
Needs ReviewPublic

Authored by Mordante on Sun, Sep 26, 6:48 AM.


Group Reviewers
Restricted Project

Instead of writing every character directly into the container by using
a back_insert_iterator the data is buffered in an array. This buffer
is then inserted to the container by calling its insert member function.

Since there's no guarantee every container's insert behaves properly
containers need to opt-in to this behaviour. The appropriate standard
containers opt-in to this behaviour.

This change improves the performance of the format functions that use a

Depends on D110495

Diff Detail

Event Timeline

Mordante created this revision.Sun, Sep 26, 6:48 AM
Mordante requested review of this revision.Sun, Sep 26, 6:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptSun, Sep 26, 6:48 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Mordante updated this revision to Diff 375099.Sun, Sep 26, 7:11 AM

Attempt to fix Clang-11 build issue.

Quuxplusone added inline comments.

The machinery to __enable_insertable in a container right now is like 16 lines. You could get it down to 1 line, and no helper headers, and C++11-compatible, if you changed this to

is_same_v<_Container, typename _Container::__enable_format_insertable>

Then the containers (vector, deque, string, list, etc.) would just have to say e.g.

using __enable_format_insertable = vector<_Tp, _Alloc>;

without any extra headers or ifdef-guards or anything.


add_pointer_t<typename _Container::value_type> is invariably typename _Container::value_type*, right?


Rather than lines 129-140, I suggest that you add a member function container_type *back_insert_iterator::__get_container() const to libc++'s back_insert_iterator.
I suppose you still need a partial-specialization-based approach equivalent to lines 117-127, in order to tell you whether the user's iterator type is exactly back_insert_iterator (in which case this optimization is safe) or merely something that slices to back_insert_iterator (in which case this optimization is dubious). But once you know it's exactly back_insert_iterator, you can just ask for its container_type member typedef, and use your new member function to get a pointer to the container; you don't have to use tricks for that part.


Why not _CharT __buffer_[__buffer_size];? (What does the std::array dependency buy you? and the zero-initialization?)

442 ↗(On Diff #375099)


Mordante planned changes to this revision.Sun, Sep 26, 10:00 AM

Needs to be updated due to the removal of __put in D110495.

Mordante marked 5 inline comments as done.Sat, Oct 9, 5:25 AM
Mordante added inline comments.

Thanks for the suggestion, this indeed is a lot less intrusive.


See D110573.


I prefer an array just for convenience. I need to know the last element and I like it better to use end().
The zero-initialization isn't needed.

442 ↗(On Diff #375099)

This is no longer needed due to your earlier suggestions.

Mordante updated this revision to Diff 378439.Sat, Oct 9, 5:37 AM
Mordante marked 4 inline comments as done.

Addresses the review comments.
Temporary include the used part of D110573 to avoid possible Buildkite dependency issues.

I think it's up to @ldionne and/or @vitaut to say whether this PR is ultimately a good tradeoff, complexity-and-performance-wise. IMHO we've successfully reached the local optimum, code-wise. :)


Style nit: _CharT* __buffer_it_ = __buffer_.begin(); is easier to read, and equivalent. Could even say and then I wouldn't need to consciously avoid looking up on cppreference whether array::iterator is guaranteed to be _CharT*. ;)


Style nit: _Tp, _Alloc with a space


FWIW, I think it is not useful to test no_value_type, no_end, or no_insert, because we don't permit users to define such types and we don't expect to define any such types ourselves. Seeing a correct "specialization" of __enable_format_insertable should be enough to indicate that the type author intends the type to be insertable. If it's somehow not insertable, then that's a serious show-stopping bug in the type, not something that <format> should be silently working-around.


Also test

struct UserVector : public std::vector<char> {
    using std::vector<char>::vector;

This is important to test because users are allowed to do stupid things like this. (They shouldn't; it's bad style and has many downsides; but they are allowed to.)


What do you expect to happen for std::vector<int> or std::vector<std::string>? On the one hand, int and string are not "char types," right? But on the other hand, vec.insert(vec.end(), charptr, charendptr) will work as expected, so we do want to do the optimization, right?

Mordante planned changes to this revision.Sun, Oct 10, 10:27 AM
Mordante marked an inline comment as done.

Thanks for the feedback, I'll look at these issues when Buildkite is working properly again.


The format specific functions require a char or wchar_t. So I expect them not to pass this trait. When we want to use this more generic, we can add an std::__insertable and rename the typedef in the containers. I didn't restrain the typedef in the classes; I looked at it but it became quite ugly quite fast. Instead it's constrained in the std::__format::__insertable concept. So it would be quite trivial to add std::__insertable, but I don't have a usecase for it.

Mordante updated this revision to Diff 378701.Mon, Oct 11, 9:26 AM
Mordante marked 5 inline comments as done.

Addresses review comments.


I prefer the begin() since that conveys the intended usage better.


Interesting clang-format didn't catch that for me.


I not entirely convinced that removing them is a good idea. I'd like to hear @ldionne's opinion.

Mordante updated this revision to Diff 380179.Sat, Oct 16, 6:20 AM

Rebased to test with wchar_t changes.

Mordante updated this revision to Diff 380181.Sat, Oct 16, 7:11 AM

Fixes wide char build.