Page MenuHomePhabricator

[libc++][format][2/6] Adds a __output_iterator.
Needs ReviewPublic

Authored by Mordante on Sep 26 2021, 5:49 AM.

Details

Reviewers
ldionne
vitaut
Group Reviewers
Restricted Project
Summary

Instead of using a temporary string in __vformat_to_wrapped use a new
generic iterator. This aids to reduce the number of template instantions
and avoids using a string to buffer the entire formatted output.

This changes the type of format_context and wformat_context, this can
still be done since the code isn't ABI stable yet.

Several approaches have been evaluated:

  • Using a __output_buffer base class with:
    • a put function to store the buffer in its internal buffer
    • a virtual flush function to copy the internal buffer to the output
  • Using a function to forward the output operation to the output buffer, much like the next method.
  • Using a type erased function point to store the data in the buffer.

The last version resulted in the best performance. For some cases there's
still a loss of speed over the original method. This loss many becomes
apparent when large strings are copied to a pointer like iterator, before
the compiler optimized this using memcpy.

Depends on D110494

Diff Detail

Event Timeline

Mordante created this revision.Sep 26 2021, 5:49 AM
Mordante requested review of this revision.Sep 26 2021, 5:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 26 2021, 5:49 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Quuxplusone added inline comments.
libcxx/include/__format/buffer.h
29

/that too/it is too/?

38

Nit: type-erased hyphenated.

45–46

Generally I find the javadoc-style comments unhelpful; but also in this case it's a bit misleading, since of course __obj can't be any old class object containing a put; it must be the specific object, of the specific type, that you instantiated __put with (or that you instantiated __output_iterator's ctor with, if you take my refactor below). Fortunately, if you take my refactor below, then you won't need _Fun in multiple places and so you won't need a distinct name for it and so you won't need a comment about what the name means. :)

72–76

The comment is ungrammatical and unhelpful, and T needs uglification; but why just not make this a private detail of __output_iterator? Instead of

_LIBCPP_HIDE_FROM_ABI explicit __output_iterator(_Fun __fun, void* __obj)
    : __fun_(__fun), __obj_(__obj) {}

do

_LIBCPP_HIDE_FROM_ABI explicit __output_iterator(_Tp *__t)
    : __fun_([](_CharT __c, void *__o){ static_cast<_Tp*>(__o)->put(__c); }), __obj_(__t) {}

and construct it on line 504 as

_VSTD::__format::__output_iterator<_CharT>(_VSTD::addressof(__buffer))

(notice addressof, not operator&, for ADL-proofing and hijacking-avoidance)

84–85

Btw, I see some inconsistency throughout on whether we're doing 2-space tabs in this file or 4-space (lines 57, 88). Doesn't matter, though.

libcxx/test/libcxx/utilities/format/format.formatter/format.context/types.compile.pass.cpp
111

Your commit message implies that you did some performance testing benchmarks between this implementation and some other implementation. What were those benchmarks, and can you check them into libcxx/benchmarks/?

Mordante marked 4 inline comments as done.Sep 26 2021, 9:07 AM
Mordante added inline comments.
libcxx/include/__format/buffer.h
29

I'll leave this as is since the same block is copied several times. Still hope we soon only need to support compilers with proper concepts support.

84–85

This is what clang-format makes of it. I noticed clang-format-13 seems to have some support for concepts and I still want to reformat all format related headers once we switch to clang-format-13 and the number of active format patches is reduced.

libcxx/test/libcxx/utilities/format/format.formatter/format.context/types.compile.pass.cpp
111

See D110501.

Mordante marked 5 inline comments as done.Sep 26 2021, 10:05 AM
Mordante added inline comments.
libcxx/include/__format/buffer.h
72–76

Your suggestion indeed looks better than the current code. Thanks!

Mordante updated this revision to Diff 375123.Sep 26 2021, 10:09 AM
Mordante marked an inline comment as done.

Addresses review comments.
@Quuxplusone's suggestion to use a lambda instead of a helper function means the other patches in this series need to be adjusted in a similar fashion.

Mordante updated this revision to Diff 378435.Oct 9 2021, 4:40 AM

Rebased; changes the generated files.

Mordante updated this revision to Diff 380172.Oct 16 2021, 5:08 AM

Rebased to test with wchar_t changes.

vitaut requested changes to this revision.Oct 17 2021, 7:30 AM

I'm very surprised that this is the solution you went with. Indirect function call to write each character individually will likely be slow in practice compared to a contiguous buffer and a bulk copy. Please post benchmark results.

This revision now requires changes to proceed.Oct 17 2021, 7:30 AM

Thanks for the review. I'll try your suggestion and benchmark both versions.

I've created a proof-of-concept here with benchmark results D112361.
That revision is just to show what I did. It's intended to be a basis for the changes, it *is not* to be used as-is.

vitaut added a comment.EditedOct 24 2021, 9:46 AM

I've created a proof-of-concept here with benchmark results D112361.
That revision is just to show what I did. It's intended to be a basis for the changes, it *is not* to be used as-is.

I think D112361 is a step in the right direction. You can do even better and make it represent a contiguous (not necessarily fixed-size) buffer with a single type-erased reallocation/flush function. Then it can be used to write to contiguous containers like vector and string which is a common case directly avoiding copies. This is basically what {fmt} does.

Mordante updated this revision to Diff 387079.Sun, Nov 14, 6:38 AM

Rebased and make changes based on D112361.

vitaut added inline comments.Thu, Nov 25, 8:11 AM
libcxx/include/__format/buffer.h
58–60

As discussed in D112361, it would be good to avoid extra buffering for the common case of appending to a contiguous container such as string or vector.

Mordante added inline comments.Thu, Nov 25, 11:25 AM
libcxx/include/__format/buffer.h
58–60

What do you mean with "appending"?
std::back_inserter(std::string) or std::string::begin()

I think it makes sense to buffer the back_inserter. For example every push_back in a std::string NUL-terminates the string. The insert operation does one NUL-termination.

D110497 changes the behaviour for these containers to buffer in this object and then use insert in these containers.

But I can test with the benchmarks whether the extra buffering has a positive or negative effect.

vitaut added inline comments.Fri, Nov 26, 7:59 AM
libcxx/include/__format/buffer.h
58–60

What do you mean with "appending"?

std::back_inserter(std::string)

For example every push_back in a std::string NUL-terminates the string.

You should avoid character-by-character insertion in the first place. Also adding buffering for std::string is easy but you cannot remove it so the current approach is inherently less efficient than needed.

Mordante marked 2 inline comments as done.Wed, Dec 1, 10:02 AM
Mordante added inline comments.
libcxx/include/__format/buffer.h
58–60

I agree and I should improve this inefficiency. I added a todo on line 71 to that effect and have more todos for this in my personal notes.
For now my main focus is feature completeness, so I prefer to postpone these changes to a later time.
D110497 already has some improvements for containers, but for string and vector we can do better.

Mordante updated this revision to Diff 391073.Wed, Dec 1, 10:04 AM
Mordante marked an inline comment as done.

Rebased and some minor improvements.