This is an archive of the discontinued LLVM Phabricator instance.

[libc++][format] Add __format_arg_store.
ClosedPublic

Authored by Mordante on May 29 2021, 1:29 AM.

Details

Summary

This implements the struct __format_arg_store and its dependencies:

  • the class basic_format_arg,
  • the class basic_format_args,
  • the class basic_format_context,
  • the function make_format_args,
  • the function wmake_format_args,
  • the function visit_format_arg,
  • several Standard required typedefs.

The following parts will be implemented in a later patch:

  • the child class basic_format_arg::handle,
  • the function basic_format_arg::basic_format_arg(const T* p).

The following extension has been implemented:

  • the class basic_format_arg supports __[u]int128_t on platform where libc++ supports 128 bit integrals.

Implements parts of:

  • P0645 Text Formatting

Completes:

  • LWG3371 visit_format_arg and make_format_args are not hidden friends
  • LWG3542 basic_format_arg mishandles basic_string_view with custom traits

Note https://mordante.github.io/blog/2021/06/05/format.html gives a bit more information about the goals and non-goals of this initial patch series.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Mordante updated this revision to Diff 349040.Jun 1 2021, 11:42 AM

Rebased on main
Polishing
implement a real output_iterator concept, main has the require prerequisite.

Mordante updated this revision to Diff 349056.Jun 1 2021, 12:20 PM

Attempt to fix the build.

Mordante added inline comments.Jun 1 2021, 12:48 PM
libcxx/include/module.modulemap
614

Work-around. Needs to be removed when the modules work properly in main.

Mordante edited the summary of this revision. (Show Details)Jun 5 2021, 3:59 AM
Mordante added reviewers: curdeius, ldionne, miscco, vitaut.
Mordante edited the summary of this revision. (Show Details)Jun 8 2021, 10:50 PM
Mordante updated this revision to Diff 350782.Jun 8 2021, 10:59 PM

Fixes an bug in the format_arg constructor for string_view.
Improve the format_arg constructor for string by addressing LWG3542.

Note the paper status will be updated once the C++2b LWG issues have been updated.

vitaut requested changes to this revision.Jun 15 2021, 7:30 AM

"the function make_format_args," appears twice in the summary. I guess the second one was supposed to be "make_wformat_args".

libcxx/include/__format/format_arg.h
18

Do we have to include all of functional for invoke?

70

Visiting an empty argument should pass monostate to the visitor, not throw an exception.

101

Why store a C string as void* and cast? I suggest storing as const char_type* instead.

117

Initializing __bool is unnecessary because the value is never accessed if type is __none. Probably doesn't matters in practice but there is no harm is doing less unnecessary work.

libcxx/include/__format/format_context.h
52–63

basic_format_context is not supposed to be constructed by users so these ctors better be private.

93

create -> created

94

allow -> allows

95–96

Lazy creation is fine but using optional is suboptimal since locale is basically a pointer.

libcxx/include/format
18

It is no longer considering, the paper has been accepted.

240–244

This is identical to make_format_args so it just adds unnecessary indirection and a ton of template instantiation. I suggest removing.

libcxx/test/std/utilities/format/format.arguments/format.arg.store/class.pass.cpp
29

This better be parameterized on char type, not context (and the same throughout the tests).

This revision now requires changes to proceed.Jun 15 2021, 7:30 AM
Mordante planned changes to this revision.Jun 16 2021, 11:31 AM
Mordante marked 11 inline comments as done and an inline comment as not done.

Thanks for the review!

"the function make_format_args," appears twice in the summary. I guess the second one was supposed to be "make_wformat_args".

Good catch.

libcxx/include/__format/format_arg.h
18

It seems we do.

70

I thought this was an allowed optimization, but that seems not to be the case.
Adding monostate will require including <variant>. At the moment we're busy modularizing our headers, but that unveils some clang issues. So I'll include the entire header with a TODO.
This change will break some other patches; I'll update them later.

117

I recall I originally had it like that, but that caused some issues. However testing again it seems not to be required (anymore).

libcxx/include/__format/format_context.h
52–63

Not being able construct it makes it hard to test the implementation. So I prefer to keep them public.
MSVC STL also implements their constructors as public.

95–96

True, want to look at a better solution later, hence the TODO. I initially wanted to focus on getting something working and optimize later.

libcxx/include/format
18

:-) (It was valid at the time of writing, I probably need to update the status of P2216 at more places.)

libcxx/test/std/utilities/format/format.arguments/format.arg.store/class.pass.cpp
29

Nice suggestion, that makes the callers a bit less verbose.

Mordante edited the summary of this revision. (Show Details)Jun 16 2021, 11:31 AM
Mordante updated this revision to Diff 352779.Jun 17 2021, 10:11 AM
Mordante marked 7 inline comments as done.

Addresses review comments.
Update module map.
Update P2216 status; it will be part of C++23 and backported to C++20.

Mordante updated this revision to Diff 352788.Jun 17 2021, 10:50 AM

Tabs -> space.

vitaut accepted this revision.Jun 20 2021, 6:44 AM

LGTM but this will need to be reviewed by someone familiar with libc++ of course =).

libcxx/include/__format/format_arg.h
70

Microsoft STL pulled out monostate into a separate header to avoid including variant.

Mordante updated this revision to Diff 353419.Jun 21 2021, 10:47 AM

Tab -> space.

Mordante marked an inline comment as done.Jun 21 2021, 10:57 AM

LGTM but this will need to be reviewed by someone familiar with libc++ of course =).

Thanks for the review. Yeah the final approval will be done by someone more familiar with libc++.

libcxx/include/__format/format_arg.h
70

I plan to do the same. I added a todo at line 20.

Mordante updated this revision to Diff 353430.Jun 21 2021, 11:11 AM
Mordante marked an inline comment as done.

Forward declare visit_format_arg. Test whether this fixes the modular build.

Mordante updated this revision to Diff 353444.Jun 21 2021, 11:44 AM

Revert last change.
Included more headers and the test passes locally.

Mordante updated this revision to Diff 353669.Jun 22 2021, 9:11 AM

Adds some missing includes. Hopefully that fixes the GCC builds.

Mordante updated this revision to Diff 354005.Jun 23 2021, 9:54 AM

Disable the failing test on GCC.

Mordante updated this revision to Diff 354160.Jun 23 2021, 11:10 PM

Disable test gcc-10.
Attempt to fix the modular build.

Mordante updated this revision to Diff 354312.Jun 24 2021, 11:16 AM
Mordante added a subscriber: cjdb.

Fixes for the build. The test pass now locally both with a normal and a modular build.
Thanks to @ldionne and @cjdb for their input.

Mordante updated this revision to Diff 354432.Jun 24 2021, 11:02 PM

Rebase to trigger CI, the previous build failed due to a flacky test.
Use <__utility/forward.h> instead of <utility> in <__format/format_fwd.h>.

Mordante updated this revision to Diff 357273.Jul 8 2021, 10:41 AM

Rebased.
Use new monostate header.

vitaut requested changes to this revision.Jul 11 2021, 7:25 AM
vitaut added inline comments.
libcxx/include/__format/format_arg.h
233–235

An overload for [const] void* is missing.

239–240

In general basic_format_args are not comparable because they can refer to type-erased user-defined objects/formatters that can't be compared. If this is for testing I suggest moving to tests.

libcxx/include/__format/format_context.h
64

You could expose __uglified public "factory" functions for testing. MSVC is not a great example, they were doing a bunch of things wrong, I would reported this one if I had time.

libcxx/include/__format/format_fwd.h
2

What is the purpose of the format_fwd.h header?

libcxx/include/format
237

This could be a built-in array and the dependency on <array> removed. There is no advantage to using std::array here.

libcxx/test/std/utilities/format/format.arguments/format.arg.store/class.pass.cpp
36–39

These checks seem overspecified especially considering that the representation will have to change to implement one of the TODOs. I'd recommend doing checks by converting __format_arg_store into basic_format_args and using the public API. Same elsewhere.

Another advantage of going through basic_format_args is that it will mimic the actual use.

libcxx/test/std/utilities/format/format.arguments/format.arg/visit_format_arg.pass.cpp
88–109

All of this will have to change when we add char*_t support. I think we should static_assert in basic_format_context or elsewhere that CharT is char or wchar_t for now since other char type are not usable with <format> anyway and these conversions are morally wrong.

132

Create an alias for std::conditional_t<sizeof(long) == sizeof(int), int, long long> to avoid copy-paste?

198–217

ditto

281–291

Would be nice to add a test for basic_string_view with traits and basic_string with traits and an allocator.

libcxx/test/std/utilities/format/format.formatter/format.context/arg.pass.cpp
30

nit: I'd call it output or result to distinguish from string above.

This revision now requires changes to proceed.Jul 11 2021, 7:25 AM
Mordante updated this revision to Diff 358369.Jul 13 2021, 11:29 AM
Mordante marked 9 inline comments as done.Jul 13 2021, 11:29 AM

Rebased
Addresses most review comments
Uses new XFAIL regex for apple platforms
Minor cleanups

Mordante planned changes to this revision.Jul 13 2021, 11:31 AM

Thanks for the review. I'll some more time to look at your last suggestion.

libcxx/include/__format/format_arg.h
233–235

That's the TODO below. I haven't added proper pointer formatting yet. (In hindsight I should also have removed the nullptr constructor for now.

239–240

It's for testing purposes. However when I originally added this I hadn't implemented the visitor yet. So I can used the visitor in the tests and remove this altogether.

libcxx/include/__format/format_fwd.h
2

Adding some declarations for types that need to know the other types exist.
(And slightly abused to declare some parts of the not yet implemented iterator library. That will be removed once libc++ implements these parts of the C++20 iterator library.)

libcxx/include/format
237

I did some investigation, but changing this to a built-in array was not trivial. Especially since sizeof...(_Args) can be zero. I still need to implement the required size optimizations for basic_format_args. So I prefer to tackle these two at the same time.
For now I added a TODO in the code.

libcxx/test/std/utilities/format/format.arguments/format.arg.store/class.pass.cpp
36–39

I think that's a valid point. Most of this test is "identical" to the test libcxx/test/std/utilities/format/format.arguments/format.args/ctor.pass.cpp. For now I changed the test to use libc++ specific asserts.

@ldionne What's your opinion of testing exposition only classes? I think the tests were great to have during development, but I agree with @vitaut that it would be better to test this class indirectly by using the public interface.

libcxx/test/std/utilities/format/format.arguments/format.arg/visit_format_arg.pass.cpp
88–109

For some place where it's easy I already test for char*_t support. I think it would make more sense to restrict that in the formatters. There the Standard is explicit about the supported types. I'll add a note to my todo list to update the formatters.

281–291

I really like this suggestion!

Mordante updated this revision to Diff 358384.Jul 13 2021, 12:10 PM

Now really rebase.

Mordante planned changes to this revision.Jul 13 2021, 12:11 PM
Mordante added inline comments.Jul 13 2021, 1:30 PM
libcxx/include/__format/format_context.h
64

I thought MSVC was heavily based on libfmt.

I did some investigations and your suggestion seems to work.
When I make the constructor private I'll need a "factory" or a friend for the internal parts of <format>.
So I'll go for a "factory" so the unit tests and internal implementation can share the same function.

Mordante updated this revision to Diff 358654.Jul 14 2021, 10:03 AM

Make basic_format_context's constructor private and add a helper for its users.
Mark unit tests using the helper as libc++ specific.
Disabled a failing unit tests that fails on GCC.
Make the module headers private.

Mordante marked an inline comment as done.Jul 14 2021, 10:05 AM
Mordante updated this revision to Diff 358682.Jul 14 2021, 11:46 AM

Fixes localization build.
Rebased, hopefully fixes Apple builds.

Mordante updated this revision to Diff 358695.Jul 14 2021, 12:19 PM

Rebase to fix the GCC build.

vitaut accepted this revision.Jul 17 2021, 6:18 AM

LGTM

vitaut added inline comments.Jul 17 2021, 6:28 AM
libcxx/include/__format/format_context.h
64

I thought MSVC was heavily based on libfmt.

It was loosely based on {fmt} with important pieces removed for some reason (but partly reintroduced now via PRs and issues).

Mordante updated this revision to Diff 359638.Jul 18 2021, 11:07 AM

Guard unit tests for implementation details

Mordante added inline comments.Jul 20 2021, 10:13 AM
libcxx/include/__format/format_arg.h
67

s/_LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI/ here and all other occurrences in this patch.

Quuxplusone added inline comments.
libcxx/include/__format/format_context.h
60

Re your comment on D106567 — It would be cleaner here to write "one function declaration per function signature," i.e. get rid of the defaulted function argument ( https://quuxplusone.github.io/blog/2020/04/18/default-function-arguments-are-the-devil ), and then I bet you wouldn't run into any modules issues with the 2-argument overload (because it wouldn't mention optional in its signature) nor the 3-argument overload (because it wouldn't mention optional in its signature, either). The next step would be to give the two things not-the-same name.
I actually talk about the "ooh, I'll take an optional that defaults to nullopt!" antipattern in Back to Basics: Algebraic Data Types (CppCon 2020), and obviously I've talked about When to Give Two Things the Same Name (CppCon 2021) as well. :)

Mordante updated this revision to Diff 361302.Jul 23 2021, 12:28 PM

etarget for LLVM 14.
s/_LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI/
Guard better against ADL
Move internal unit tests
Generate private module tests

Mordante updated this revision to Diff 361307.Jul 23 2021, 12:46 PM

Update the private module tests.
These failed due to local merge conflict files.

Mordante marked 3 inline comments as done.Jul 25 2021, 7:10 AM
Mordante added inline comments.
libcxx/include/__format/format_context.h
60

I prefer the current way. This function used to be a constructor before the class' constructor was made private. This was done to prevent users from directly creating basic_format_context objects. I read your blog post before and I partly agree with it, however default arguments are a tool. In this case I feel the code looks better with a default argument instead of having two functions.

I agree this would be bad when the function would take several bool arguments with a defaulted value. IMO a function with multiple bool arguments often is already a code smell by itself.

ldionne requested changes to this revision.Jul 28 2021, 11:02 AM
ldionne added inline comments.
libcxx/docs/Status/Format.rst
42–44 ↗(On Diff #361307)

I'm not sure this is worth mentioning at all, since we're going to implement https://wg21.link/P2216 and users will never notice. Also, it'll have been implemented by the time we ship <format> for the first time.

libcxx/include/__format/format_arg.h
38

We can remove this now. LLVM 13 has branched, and the oldest compiler we support should support concepts unless I'm mistaken. There will be a similar cleanup in the <ranges> code we merged already.

Note: I need to validate that the oldest compiler we support has support for concepts - I think that would be AppleClang 12, but I'm not sure all versions of it support concepts.

145

Why aren't we using a variant here?

@vitaut I'm being told you originally requested that Mark moves from variant to this union, was there a reason? I'd like to avoid going back and forth on this issue.

175–178

Suggestion to reduce the number of overloads:

template <class _Tp>
  requires (__libcpp_signed_integer<_Tp> || __libcpp_unsigned_integer<_Tp>) // might be a better way to do this
_LIBCPP_HIDE_FROM_ABI explicit basic_format_arg(_Tp __v) noexcept
{
  if constexpr (sizeof(_Tp) <= sizeof(int)) { }
  else if constexpr (...) { }
  // etc...
}

Basically, this single constructor should handle all the branches in http://eel.is/c++draft/format#arg-5.

221

IMO I'd rather have a longer line than a funky line break here.

Suggestion: Put _LIBCPP_HIDE_FROM_ABI on a line above, like so:

_LIBCPP_HIDE_FROM_ABI
explicit basic_format_arg(...);

This is what we do in most places. As we just talked about, if this creates pain because of clang-format and rebasing, don't waste your time on this. It's just a suggestion.

libcxx/include/__format/format_args.h
57

Is this [[nodiscard]] really adding anything? Non blocking.

IMO, using nodiscard everywhere because C++ has the wrong default here just adds syntactic overhead and reduces the impact of seeing [[nodiscard]] on a method that could truly be used incorrectly. I think we should stick to putting it on key methods where we really want to diagnose if the user discards the result, because that would most likely be a bug (e.g. anything called empty, something like lock_guard, etc).

libcxx/include/__format/format_context.h
123

Maybe those constructors should all be explicit?

libcxx/include/__format/format_fwd.h
40

This has been implemented now.

libcxx/include/format
13–20

This is not required. We're not shipping the library until it's stable anyway, with the mechanism you added.

libcxx/test/libcxx/utilities/format/format.formatter/format.context/locale.pass.cpp
40 ↗(On Diff #361307)

Would it be possible to move those tests into libcxx/test/std if we added a function that creates a basic_format_context to the test suite? Different implementations could hook into that function (in the test suite) to tell the test suite how to create a basic_format_context. Do you think that would make sense and be feasible?

I'm trying to see how we can test those parts of the API which are kind of implementation details of the library, but at the same time they're defined in the spec.

libcxx/test/support/test_basic_format_arg.h
15 ↗(On Diff #361307)

Please drop [[nodiscard]] in the tests unless it adds a clear benefit (e.g. a method that could meaningfully be misused).

This revision now requires changes to proceed.Jul 28 2021, 11:02 AM
vitaut added inline comments.Jul 28 2021, 11:36 AM
libcxx/include/__format/format_arg.h
145

Not using variant allows optimizing the representation for the common case of a small number of formatting arguments: http://eel.is/c++draft/format#args-note-1

Implementations should optimize the representation of basic_­format_­args for a small number of formatting arguments.
[Note 1: For example, by storing indices of type alternatives separately from values and packing the former. — end note]

libcxx/include/__format/format_arg.h
145

This open-coded __arg_t __type_ member takes only 1 byte; libc++'s <variant> would use a field __variant_index_t __index_ which takes 4 bytes unless _LIBCPP_ABI_VARIANT_INDEX_TYPE_OPTIMIZATION is set.
See D40210.
The Right Answer here might depend on how close we think we are to "ABI v2 by default". Currently that looks like "never". ;)

However however, my understanding of @vitaut's comment is that this code still is not implementing the recommended practice. The recommended practice is that basic_format_args shouldn't store an array of basic_format_arg {union, char} structs at all; it should store a struct of arrays (one array of unions and one array of chars).

I don't think there's any motivation to optimize the layout of basic_format_arg singular. Using variant would be fine, except that it would introduce an extra dependency on <variant> and probably harm compile times.

libcxx/include/__format/format_args.h
63

This is the array-of-structs that @vitaut pointed to. It should be rewritten as a struct-of-arrays.

vitaut added inline comments.Jul 28 2021, 1:52 PM
libcxx/include/__format/format_arg.h
145

My understanding is that the optimization is planned for a follow-up diff in which case switch to variant would have to be undone. cc @Mordante

Mordante marked 15 inline comments as done.Jul 29 2021, 7:00 AM
Mordante added inline comments.
libcxx/docs/Status/Format.rst
42–44 ↗(On Diff #361307)

This was based on the original plan to have an incomplete version in LLVM 13. But after our discussion last week I agree this is no longer useful. I'll remove the entire remark.

libcxx/include/__format/format_arg.h
38

I just tested with a quick hack (D107066) and it seems not all CI nodes support concepts.
https://buildkite.com/llvm-project/libcxx-ci/builds/4579

145

Yes the goal is to first have a working implementation where the observable behaviour matches the Standard.
Then this working implementation can be improved. This makes it a lot easier to detect regressions and measure the improvements.
There is a TODO FMT so I won't forget about it.

175–178

I experimented with this approach and I wasn't happy with the result. If feels odd to first require __libcpp_signed_integer || __libcpp_unsigned_integer and then in the code test which of the two has been used. Adding char_type and bool made to code even less readable. Especially when char_type is wchar_t it accepts char, but it doesn't allow narrowing the other way around.
So I reduced this set to 4 constructors

  • bool unmodified
  • char_type unmodified
  • __libcpp_signed_integer all in one overload and then use the if constexpr(sizeof(_Tp)...
  • __libcpp_unsigned_integer like above

That IMO leads to a better balance between removing overloads and having readable code.

221

I tested it. Making this change manually will be reverted by clang-format.

libcxx/test/libcxx/utilities/format/format.formatter/format.context/locale.pass.cpp
40 ↗(On Diff #361307)

It's very feasible I just need 2 wrapper functions that call std::__format_context_create; one with a std::locale and one without that argument not; basically the same as the std::format* functions.

Whether it makes sense, that depends on whether or not other vendors will supply patches for their hooks.

Mordante updated this revision to Diff 362782.Jul 29 2021, 7:55 AM
Mordante marked 6 inline comments as done.

Rebased and updated to work against main.
Addresses review comments.

ldionne accepted this revision.Aug 31 2021, 11:31 AM
This revision is now accepted and ready to land.Aug 31 2021, 11:31 AM

I read the comments and while I am not a big fan of reimplementing std::variant, I understand why it's needed (unless we want to break std::variant's ABI, which we don't). Also, this needs to make progress so I'm approving it -- worst case I can request some post-commit changes.

Mordante updated this revision to Diff 369930.Sep 1 2021, 8:04 AM

Rebased to give it a CI run test all still is fine.
Remove tests for gcc-10 since it's no longer supported.

Mordante updated this revision to Diff 369931.Sep 1 2021, 8:08 AM

Retry to start CI.

I read the comments and while I am not a big fan of reimplementing std::variant, I understand why it's needed (unless we want to break std::variant's ABI, which we don't). Also, this needs to make progress so I'm approving it -- worst case I can request some post-commit changes.

Thanks!

A note on the std::variant; breaking the ABI won't help. The tags and unions should be stored separately, something like:

std::array<tag, 4> tags;
std::array<unions, 4> unions;

And the note mentions packing the unions, so something like:

std::array<tag, 4> tags;
std::tuple<t0, t1, t2, t3> packed_unions;

(This is an example, I haven't investigated the solution I want to use.)

Mordante updated this revision to Diff 369950.Sep 1 2021, 9:06 AM

Seems Buildkite became unstuck, retry run the CI.

This revision was automatically updated to reflect the committed changes.