Page MenuHomePhabricator

[libc++][format] Implement formatters.
ClosedPublic

Authored by Mordante on Feb 14 2021, 6:01 AM.

Details

Summary

This implements the initial version of the std::formatter class and its specializations. It also implements the following formatting functions:

  • format
  • vformat
  • format_to
  • vformat_to
  • format_to_n
  • formatted_size

All functions have a char and wchar_t version. Parsing the format-spec and
using the parsed format-spec hasn't been implemented. The code isn't optimized,
neither for speed, nor for size.

The goal is to have the rudimentary basics working, which can be used as a
basis to improve upon. The formatters used in this commit are simple stubs that
will be replaced by real formatters in later commits.

The formatters that are slated to be replaced in this patch series don't have
an availability macro to avoid merge conflicts.

Note the formatter for bool uses 0 and 1 instead of "false" and
"true". This will be fixed when the stub is replaced with a real
formatter.

Implements parts of:

  • P0645 Text Formatting

Completes:

  • LWG3539 format_to must not copy models of output_iterator<const charT&>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vitaut added inline comments.Jul 10 2021, 9:28 AM
libcxx/include/format
701

I don't think you need to pass __args.__size() here because it is for compile-time checks only.

815

same here

vitaut added inline comments.Jul 10 2021, 9:38 AM
libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.bool.pass.cpp
38 ↗(On Diff #357274)

I'd add a check that the return value points to the end of the format string here and elsewhere.

51–52 ↗(On Diff #357274)

I'd test that empty string is also parsed correctly which is an important corner case.

libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.floating_point.pass.cpp
58 ↗(On Diff #357274)

Why do you have "}" in other tests and "{}" here?

Mordante marked 7 inline comments as done.Jul 11 2021, 6:14 AM

Thanks a lot for your review!

libcxx/include/__format/format_string.h
100–101

Good catch!

libcxx/include/format
682

I like your reasoning, I've changed it.

701

But doesn't that mean I need to add it again when I start working on P2216?

libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.bool.pass.cpp
38 ↗(On Diff #357274)

AFAIK this doesn't need to point to the end of the format string. Based on earlier comments and wording of http://eel.is/c++draft/format#tab:formatter

Parses format-spec ([format.string]) for type T in the range [pc.begin(), pc.end()) until the first unmatched character.
Throws format_­error unless the whole range is parsed or the unmatched character is }.
[Note 1: This allows formatters to emit meaningful error messages.
— end note]
Stores the parsed format specifiers in *this and returns an iterator past the end of the parsed range.

I made changes to this function. It now either returns an iterator to fmt.end() or an iterator at the closing } of the replacement-field.

51–52 ↗(On Diff #357274)

In other patches, for example` D103433. I added an extra layer to test both "}" and "" from one tester. I applied the same method to the tests in this directory. I also changed the names of the test functions to not use the overload resolution to pick the proper function. When we're happy with that approach I'll use these improvements for later patches in the series.

libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.floating_point.pass.cpp
58 ↗(On Diff #357274)

This should also be "}", this was an older approach. The floats haven't had as much love as the other implemented types yet. (Mainly since for floats I still need to get std::to_chars working properly.)

Mordante updated this revision to Diff 357796.Jul 11 2021, 6:25 AM
Mordante marked 5 inline comments as done.

Addresses the review comments.

Mordante updated this revision to Diff 357801.Jul 11 2021, 7:51 AM

Rebase, hopefully fixes Arm builds.
Fixes UB in tests detected by the debug and UBSAN builds.

vitaut added inline comments.Jul 11 2021, 10:41 AM
libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.bool.pass.cpp
38 ↗(On Diff #357274)

Good point.

29–43 ↗(On Diff #349177)

is this really the most ergonomic way to fit all these puzzle pieces together?

Looks OK to me since it's only a test . These are rarely used in user code, mostly when reusing formatters which is much easier.

Mordante updated this revision to Diff 358844.Jul 14 2021, 11:16 PM

rebased
updated to changes in the std::basic_format_context
Mark some test libc++ only
Use private module headers

Mordante updated this revision to Diff 359639.Jul 18 2021, 11:13 AM

Guard unit tests for implementation details

Mordante added inline comments.Jul 20 2021, 10:13 AM
libcxx/include/__format/format_string.h
47

s/_LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI/ here and all other occurrences in this patch.

Mordante updated this revision to Diff 361450.Jul 24 2021, 5:41 AM

s/_LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI/
Guard better against ADL
Move internal unit tests
Generate private module tests
Disable the floating-point stub, since the series has no replacement

libcxx/include/__format/format_string.h
131

s/values/value/
and please don't line-break

libcxx/include/format
447

s/128 bit/128-bit/g

625–626

Style tip: When you break the line like this, it makes it harder for a maintainer to git grep __throw_format_error.*terminate when they're trying to track down where a given error was thrown. (Although line-breaking in the middle of a string is of course infinitely worse!)
In this case there's no reason to line-break at all; just join lines 630 and 631. Ideally, every single time you throw a format_error, it'll be on a line of the form

[ ]*__throw_format_error(".*");

(Tangent: I would not object if you made it _VSTD::__throw_format_error throughout. You're right that the string literal won't trigger ADL, so ADL-proofing isn't necessary; but it might help visually distinguish this free function from a member function.)

643

/misses the/is missing a/

653–654
  • Don't break in the middle of this error message.
  • This function is called in only two places; I recommend inlining it in those places. Then you can improve the message by customizing it to the specific situation being parsed.
  • If you do keep it as a free function (which I don't recommend), please _VSTD:: its calls, and consider giving it a name that (1) doesn't look so much like the uglified version of std::next, (2) involves the word format.
682

s/charater/character/

687–688

Lines 692, 677, 657: I question the usefulness of these [[unlikely]] attributes. Compilers are generally very good at determining that the throwing path is the unlikely path, already; and this seems like cluttering up our code for no practical purpose.
(Ditto [[nodiscard]] on non-user-visible functions, such as on lines 45, 63, 70, 80; ditto noexcept on non-user-visible functions that are never evaluated for noexceptness; ditto constexpr on non-user-visible-functions that are never called in a constexpr context.)

Mordante marked 7 inline comments as done.Jul 25 2021, 6:52 AM

Thanks for the review comments!

libcxx/include/format
625–626

This change is automatically done by clang-format since I've still limited it to 80 characters instead of 120 as our official standard is. This since parts of this series were written before the limit was adjusted. After all outstanding patches are merged I want to apply our real coding style with clang-format, this will fix it. I don't want to do that now, since it will result in merge conflicts.

I don't mind using _VSTD::__throw_format_error but currently all calls to __throw_xxx functions use this style. So when I change it here I want to create a separate patch to do the rest of our code base. @ldionne what's your preference?

643

I changed it to 'misses a', this is more in line with the other messages in this file.

653–654

Again this is what clang-format does. When we don't like it we should adjust its settings to the format we want.
Originally this function was used in more places, but addressing other review comments reduced the number of use cases.
With only two users it indeed seems better to inline them. This allows for a small code improvement at the second call site.

687–688

I wasn't aware compilers already optimize the throwing path as unlikely. I tested it on compiler explorer and it indeed seems to be the case.
I like [[nodiscard]] for two reasons:

  • It avoids accidental bugs by not evaluating.
  • Documents the intended usage.

Even if a function isn't used in constexpr context it can be useful. I've used it during testing and suddenly UB became a compilation error.

So I don't consider [[nodiscard]], noexcept and constexpr as clutter.

Mordante updated this revision to Diff 361503.Jul 25 2021, 7:33 AM
Mordante marked 4 inline comments as done.

Addresses review comments.

Mordante added inline comments.Jul 29 2021, 9:36 AM
libcxx/include/format
687–688

I had a talk with @ldionne and he also prefers to remove the extra [[nodiscard]]s. He mention they've also been removed in the ranges code, which previously used the same style as I did here. So we agreed to remove them.

Mordante updated this revision to Diff 362813.Jul 29 2021, 9:56 AM

Rebased on main and updated to work against main.
Remove [[nodiscard]]s

ldionne requested changes to this revision.Aug 31 2021, 11:58 AM

Got a few comments on the tests, but apart from that this mostly LGTM. I'm assuming that the several TODOs are handled in subsequent patches.

libcxx/include/__format/formatter.h
52

Why is this _LIBCPP_HIDDEN and not _LIBCPP_HIDE_FROM_ABI?

libcxx/test/std/utilities/format/format.functions/format_to_n.pass.cpp
12

You can remove mentions of gcc-10 everywhere now.

Also, it would be great to have a comment explaining why this is unsupported on gcc-11 (in all places where you do that).

libcxx/test/std/utilities/format/format.functions/tests.inc
18 ↗(On Diff #362813)

Instead of using this inclusion technique, I would rather see this done this way:

// TestFunction must be callable as check(expected-result, string-to-format, args-to-format...)
// ExceptionTest must be callable as check_exception(expected-exception, string-to-format, args-to-format...)
template <class CharT, class TestFunction, class ExceptionTest>
void format_tests(TestFunction check, ExceptionTest check_exception) {
  check(STR("hello 01"), STR("hello {0:}{1:}"), false, true);
  // ...
}

Then, from the various tests, you can do:

#include "format_tests.h"

auto test = []<class CharT, class... Args>(std::basic_string<CharT> expected, std::basic_string<CharT> fmt, const Args&... args) {
  // ...
};

auto test_exception = []<class CharT, class... Args>(std::string_view what, std::basic_string<CharT> fmt, const Args&... args) {
  // ...
};

int main(int, char**) {
  format_tests<char>(test, test_exception);
  format_tests<wchar_t>(test, test_exception);

  return 0;
}

IMO this is a less surprising way to reuse code and that makes it easier to understand what's happening when you look at the test in isolation. WDYT?

libcxx/test/std/utilities/format/format.functions/vformat.locale.pass.cpp
43

This should be TEST_HAS_NO_EXCEPTIONS (elsewhere too).

libcxx/test/std/utilities/format/format.functions/vformat.pass.cpp
25

Those comments are not especially useful IMO, we could remove them altogether.

This revision now requires changes to proceed.Aug 31 2021, 11:58 AM
Mordante marked 5 inline comments as done.Sep 1 2021, 8:48 AM

Got a few comments on the tests, but apart from that this mostly LGTM. I'm assuming that the several TODOs are handled in subsequent patches.

Part of them will be done in the patches in this series. Others will be done in not yet finished patches. But the intention is to solve them all before marking the <format> header as complete.

libcxx/include/__format/formatter.h
52

Not sure, but fixed it.

libcxx/test/std/utilities/format/format.functions/format_to_n.pass.cpp
12

I need to investigate why gcc-11 doesn't work. I've added a TODO to look at it later. I'll need to update one of my systems so I have gcc-11 locally which makes testing easier.

libcxx/test/std/utilities/format/format.functions/tests.inc
1 ↗(On Diff #362813)

This file is obsolete and should be removed. I prefer to keep this file a little bit longer. This will make it a lot easier to migrate the other patches in the series to the new format_tests.h. Once that migration is done this file can be removed.

18 ↗(On Diff #362813)

I like this solution a lot! Thanks for the suggestion.

libcxx/test/std/utilities/format/format.functions/vformat.pass.cpp
25

I feel they are useful due to the usage of the tests.inc. But since that file will be replaced with your suggested format_tests.h I agree that they are not needed.

Mordante updated this revision to Diff 369976.Sep 1 2021, 10:23 AM
Mordante marked 4 inline comments as done.

Rebased.
Addresses review comments.

Mordante updated this revision to Diff 369982.Sep 1 2021, 10:39 AM

Guard against min/max macros.

ldionne accepted this revision.Sep 3 2021, 12:04 PM
ldionne added inline comments.
libcxx/test/std/utilities/format/format.functions/format_to_n.pass.cpp
12

FWIW, you can use our Docker image to match the CI environment if you'd like:

./libcxx/utils/ci/run-buildbot-container # this will run libc++'s Docker image that we use in CI, which contains GCC 11 amongst other things
$ ./libcxx/utils/ci/run-buildbot generic-gcc # run this from the Docker image
libcxx/test/std/utilities/format/format.functions/tests.inc
1 ↗(On Diff #362813)

It's very unusual to approve patches with these sorts of artifacts, let's make this an exception.

This revision is now accepted and ready to land.Sep 3 2021, 12:04 PM
Mordante added inline comments.Sep 4 2021, 2:40 AM
libcxx/test/std/utilities/format/format.functions/tests.inc
1 ↗(On Diff #362813)

Thanks. That really helped to migrate the other patches to the new format_tests.h without introducing merge conflicts in the process.

This revision was automatically updated to reflect the committed changes.
vitaut added inline comments.Sep 5 2021, 7:19 AM
libcxx/include/format
734

I know this diff has already landed but you definitely don't want to inline the top-level "v" functions such as vformat to reduce per-call code bloat (see e.g. https://godbolt.org/z/f8c1P5bP5).

Mordante added inline comments.Sep 5 2021, 8:31 AM
libcxx/include/format
734

Correct. I haven't spend much time reducing the the code bloat, but it's on my TODO list. At the moment I'm focusing on implementing the papers and LWG-issues. (Actually I'm working on reducing the code bloat required by P2216 ;-))

I added this remark to my TODO list.