This is an archive of the discontinued LLVM Phabricator instance.

[libc++][format] Implement format-string.
ClosedPublic

Authored by Mordante on Mar 12 2022, 10:55 AM.

Details

Reviewers
EricWF
vitaut
ldionne
Group Reviewers
Restricted Project
Commits
rGaed5ddf8d097: [libc++][format] Implement format-string.
Summary

Implements the compile-time checking of the formatting arguments.

Completes:

  • P2216 std::format improvements

Diff Detail

Event Timeline

Mordante created this revision.Mar 12 2022, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 12 2022, 10:55 AM
Mordante requested review of this revision.Mar 12 2022, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 12 2022, 10:55 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
EricWF requested changes to this revision.Mar 12 2022, 3:30 PM
EricWF added a subscriber: EricWF.

The patch to the headers looks phenomenal. I only reviewed the code. I haven't looked at the spec yet. But the code was gorgeous. Well done.

The test needs to be entirely rewritten. The macro stuff is complicated and hard to read. Tests need to be correct upon inspection. That means repeating yourself rather than adding in direction.

Also that test file is way too big. Come up with a nice way to split it into different files.

Thank you for working on this.

libcxx/include/__format/parser_std_format_spec.h
330

Why comment it out?

libcxx/include/format
88

There's some delicious irony in that name.

215–389

I think you've done a really good job of minimizing the number of instantiations required to do all this compile time work. Well done!

230

Isn't it undefined behavior to have a constexpr function that is never constexpr?

234

I have a weak preference for using an alias to define the type to simplify the declaration

297

This is beautiful

332

Why does this need to be if constexpr?
It doesn't look like it instantiates anything

502

Why not just take a string view as the parameter? The overload won't be selected unless the argument can be implicitly converted to string view.

522

Can you do the type identity thing once on the top level type rather than a bunch on each of the arguments?

559–560

Why are we always in lining this?

561

Is that move required by the standard? Moving an iterator seems weird.

libcxx/test/std/utilities/format/format.functions/format.pass.cpp
50–51

Please don't write giant macro test like this.

Macros are hard for readers. They're hard for tooling, they're hard for IDEs. Their diagnostics are awful.

Repetitive code in tests is fine. What's important is that it be correct upon inspection. We don't have tests for our tests.

Optimized tests for readability. That will get you correctness.

libcxx/test/std/utilities/format/format.functions/format_tests.h
33 ↗(On Diff #414864)

No. Absolutely not.

We need to find a simpler way to write this in normal code. They can be reused reasonably. But this is way too complex to be at the root of correct tests. By the time these macros are used, who can have any idea what's going on.

I apologize for being terse

This revision now requires changes to proceed.Mar 12 2022, 3:30 PM
jloser added a subscriber: jloser.Mar 12 2022, 6:33 PM
jloser added inline comments.
libcxx/include/format
383

Nit: typo. s/shoule/should.

Mordante added inline comments.Mar 14 2022, 9:43 AM
libcxx/test/std/utilities/format/format.functions/format_tests.h
33 ↗(On Diff #414864)

@EricWF I agree, I'm not happy with this approach either. I posted it with [WIP] in the title for that reason. I wanted to it in the weekend since then a known bad patch doesn't take up too long on the CI.

I'll will post on Discord to see what a better approach will might be. (There are some caveats with this code.)

Mordante updated this revision to Diff 416711.Mar 19 2022, 11:05 AM

Updated test based on a discussion on Discord.
These new tests still need further discussion.

Mordante planned changes to this revision.Mar 19 2022, 11:06 AM

The code still isn't ready for a real review.

ldionne added inline comments.
libcxx/test/std/utilities/format/format.functions/format.pass.cpp
55

Is it not possible to call std::format with a non-constant format-string, in which case I would assume that we still get an exception? If so, we should test it too.

Mordante marked an inline comment as done.Mar 23 2022, 1:00 PM
Mordante added inline comments.
libcxx/test/std/utilities/format/format.functions/format.pass.cpp
55

No that's not possible. The new helper class basic-format-string (http://eel.is/c++draft/format#fmt.string) has a consteval constructor. If you have a non-constant format-string you can use std::vformat instead of std::format. Something like std::vformat(fmt, std::make_format_args('a', 42'));

The test for vformat will test all exceptions and format won't test the exceptions. A generic test to see whether these invalid format-strings are ill-formed has been added. But that tests only tests a few corner cases.

Mordante updated this revision to Diff 418415.Mar 26 2022, 1:13 PM
Mordante marked an inline comment as done.

Minor improvements and depend on the new patch with the NFC test changes.

Mordante planned changes to this revision.Mar 26 2022, 1:17 PM
Mordante marked 14 inline comments as done.Mar 27 2022, 6:11 AM
Mordante added inline comments.
libcxx/include/__format/parser_std_format_spec.h
330

For the constexpr version the code needs some of these internals . For prototypes I love to make everything public. In the next revision I've restored protected and only made the required parts public.

libcxx/include/format
230

The constructor is constexpr, the __parse_ won't be when called. The code should first set the proper handler. I've added comment to this constructor.

234

Since the typedef would only be used once I prefer to keep it as is.

332

Not every __formatter type has a __precision_needs_substitution() member function. So this avoids ill-formed code when the formatter doesn't have this member. Note that every formatter has a __width_needs_substitution.

502

I agree, however this matches the wording in the Standard http://eel.is/c++draft/format.fmt.string
In general I prefer to follow the wording of the Standard.

522

This using matches the wording in the Standard http://eel.is/c++draft/format.syn.

559–560

This is a temporary solution to improve the debug code generation.
FYI The formatting code is still work in progress. It's quite large and so some improvements will be done later. However we'll not mark this header API/ABI stable until we're happy with it.

561

It's required. The output_iterator concept only requires movable.
See https://cplusplus.github.io/LWG/issue3539 for more details.

libcxx/test/std/utilities/format/format.functions/format.pass.cpp
50–51

As discussed on Discord this has been improved.

Mordante updated this revision to Diff 418447.Mar 27 2022, 6:52 AM
Mordante marked 9 inline comments as done.

Address review comments.
Code cleanups.
Attempt to fix the CI.

Mordante updated this revision to Diff 420003.Apr 2 2022, 11:11 AM

Adjusted to the changes in D121514.

Mordante updated this revision to Diff 421724.Apr 9 2022, 6:22 AM

Rebased and some minor polishing.

Mordante updated this revision to Diff 421726.Apr 9 2022, 6:31 AM

fix a tab, which clang-format missed.

Mordante updated this revision to Diff 421737.Apr 9 2022, 8:19 AM

Minor cleanup and test to fix Windows builds.

Mordante added inline comments.Apr 9 2022, 9:32 AM
libcxx/include/format
518

This code is commented out, due to an issue with the MinGW/Clang-cl compiler (https://buildkite.com/llvm-project/libcxx-ci/builds/10114). It issues the diagnostic:

C:/ws/w32-1/llvm-project/libcxx-ci/build/clang-cl-dll/include/c++/v1\format(518,23): error: expected unqualified-id
using __format_string = __basic_format_string<char, type_identity_t<_Args>...>;
                      ^

This code is the exact wording in the Standard.

Since this is exposition only it's acceptable to remove this code, but https://wg21.link/P2508 intends to make this exposition only part non-exposition only. (This paper is ready for the next plenary and might be backported to C++20.)

I'll have to investigate why this fails.

641

This overly long line has been formatted locally and will be part of the next iteration of this patch. Similar for other overly long lines with __basic_format_string. (No need to rerun the CI for formatting changes.)

Mordante retitled this revision from [WIP][libc++][format] Implement format-string. to [libc++][format] Implement format-string..Apr 9 2022, 9:32 AM
Mordante edited the summary of this revision. (Show Details)
Mordante added reviewers: vitaut, ldionne.
Mordante updated this revision to Diff 424224.Apr 21 2022, 9:07 AM

Updated to changes D121514.

Mordante updated this revision to Diff 427833.May 7 2022, 2:24 AM

Rebased to test CI.

Mordante updated this revision to Diff 427846.May 7 2022, 4:33 AM

Test without Windows work-arounds.

Mordante updated this revision to Diff 429421.May 14 2022, 2:05 AM

Test WIN32 work-around.

Mordante added inline comments.May 14 2022, 12:35 PM
libcxx/include/format
516

It seems __format_string is Microsoft annotation which caused the compilation failures. So by naming them different would solve the issue. This means the name becomes a bit awkward for the libc++ naming scheme. I expect P2508 to be accepted soon, which changes the names to format_string and wformat_string. For now I prefer to keep the work-around and fix it once P2508 is voted in. (This paper aims to be applied to C++20.)

Mordante updated this revision to Diff 430462.May 18 2022, 11:29 AM

Rebased and resolved merge conflicts.

Should probably be mentioned in the release notes...?

libcxx/docs/Status/Cxx20Papers.csv
199

There's several items on the Format status page that are (positively) affected by this as well.

Mordante updated this revision to Diff 432710.May 28 2022, 4:02 AM

Addresses review comments.

Mordante marked an inline comment as done.May 28 2022, 4:06 AM

Should probably be mentioned in the release notes...?

Good point, thanks!

libcxx/docs/Status/Cxx20Papers.csv
199

I tend to update page separately, that works better for me. Especially to mark things I started working on; several of these features are rather big.

philnik added inline comments.
libcxx/include/format
236

It's not necessary, but I find it a lot easier to read than finding the variable name inside a function pointer declaration.

Mordante updated this revision to Diff 432716.May 28 2022, 5:08 AM
Mordante marked an inline comment as done.

Rebased to fix CI.

Mordante updated this revision to Diff 432737.May 28 2022, 9:56 AM

Fixes due to upstream changes.

Mordante updated this revision to Diff 432750.May 28 2022, 1:08 PM

Try to fix CI.

Mordante updated this revision to Diff 432771.May 29 2022, 2:14 AM

Rebased on main.

vitaut requested changes to this revision.Jun 4 2022, 8:35 AM

In general looks good but I would strongly recommend not to instantiate all of formatting path just to validate width/precision (see an inline comment).

libcxx/benchmarks/formatter_float.bench.cpp
219–220

Since all the components are known at compile time why not make format string construction constexpr and make the benchmark more realistic by replacing vformat_to with format_to?

libcxx/docs/ReleaseNotes.rst
115 ↗(On Diff #432771)

typo: std:;format_to -> std::format_to

117 ↗(On Diff #432771)

compile-time -> at compile time
compile time literal -> compile-time literal

119 ↗(On Diff #432771)

run-time -> at run time.

libcxx/include/__format/parser_std_format_spec.h
260

Why make it public? Isn't it only called by subclasses?

libcxx/include/format
217

What is this class for? A comment would be nice.

259

Does it give a user-friendly error? Perhaps it would be better to generate an error ourselves with a better message because this is a pretty common case.

302

Is this ifdef needed? In principle you could always define __i128 and __u128 but only use it if _LIBCPP_HAS_NO_INT128 is set. This might be beneficial for ABI compatibility and simplify some code like the one here.

313

Should be either "isn't integral" or "isn't an integral type"

503–504

Using __vformat_to to do validation is an interesting idea but instantiation all of the formatting code just to check dynamic width/precision will likely have negative impact on compile times. I would recommend only doing parsing here and a small dynamic width/precision validation step. This should also make __compile_time_basic_format_context unnecessary.

561–565

Can we avoid these ugly ifdefs? Why not rename __format_string to __format_string_t or similar?

This revision now requires changes to proceed.Jun 4 2022, 8:35 AM
Mordante marked 8 inline comments as done.Jun 4 2022, 10:48 AM

Thanks for the review! I've addressed some of the smaller comments, will look at the others tomorrow.

libcxx/include/__format/parser_std_format_spec.h
260

Not anymore, __compile_time_validate_argument calls the function too, there it needs to be public.

libcxx/include/format
217

I added comment. Basically when a handle is used the code needs to use the parse function of the formatter associated with the type _Tp.

302

Good point. Here it doesn't have an ABI impact. Based on earlier review comments by you in another patch I made these field unconditionally in the enum. So I can use them here unconditionally.

561–565

I originally added them due to odd errors on Windows. Later I figured out it wasn't a bug of the Windows compilers but that __format_string is a SAL macro on Windows. (Not the first time I've been bitten by SAL macros.)

For now I prefer to keep them since I expect these become format_string during the plenary in July.

vitaut added inline comments.Jun 4 2022, 11:38 AM
libcxx/include/format
561–565

Up to you but if you rename it to something other than __format_string now it will be a small change to switch to format_string as well.

Mordante updated this revision to Diff 434315.Jun 5 2022, 3:31 AM
Mordante marked 6 inline comments as done.

Addresses review comments.

Mordante marked 3 inline comments as done.Jun 5 2022, 3:38 AM
Mordante added inline comments.
libcxx/benchmarks/formatter_float.bench.cpp
219–220

Good suggestion! When I initially wrote this patch libc++ didn't support constexpr std::string, but now it does.

libcxx/include/format
236

As said above I prefer not to add a typedef for one instance.

259

It doesn't give a nice error. I've changed it to throwing an exception; still not a great error message but now the compiler output shows "Argument index out of bounds" instead of a null pointer error. I did the same for line 264.

503–504

I agree that this implementation may have more compile-time overhead than strictly needed. When I originally went an approach you suggested I was unhappy with the amount of code duplication. So for now I prefer to keep the current solution. I will probably look at it another time, then I have a baseline to measure the improvements against.

561–565

Fair point, I've changed it.

ldionne accepted this revision.Jun 7 2022, 9:25 AM

This LGTM. I think most comments by @vitaut have been applied, and while I have a couple of comments, I think this is good to go.

libcxx/docs/ReleaseNotes.rst
116 ↗(On Diff #434315)

I think we should move these "API Changes" to "New features" above, but without mentioning that this can break code. Indeed, these release notes are aimed at LLVM releases, however <format> has not been enabled yet in LLVM releases, so calling this an API change is going to be misleading. I think we should only explain that format strings are now checked at compile-time.

libcxx/include/format
486–488

Would this work too? If so, it may even become unnecessary to have a __get_types function altogether, perhaps we can just inline it in the body of __basic_format_string below.

libcxx/test/std/utilities/format/format.functions/format.locale.pass.cpp
45
Mordante marked 6 inline comments as done.Jun 7 2022, 10:50 AM

Thanks for the review!

libcxx/include/format
486–488

Thanks for the suggestion, that removes quite some unneeded lines, especially for __get_handle.

Mordante updated this revision to Diff 434890.Jun 7 2022, 10:53 AM
Mordante marked an inline comment as done.

Addresses review comments.

Mordante updated this revision to Diff 435050.Jun 7 2022, 11:08 PM

This should fix the AIX build.

vitaut added inline comments.Jun 8 2022, 9:11 AM
libcxx/benchmarks/formatter_float.bench.cpp
239

nit: no need to call make_fmt twice. In fact you could probably merge make_fmt into this lambda.

Mordante updated this revision to Diff 435289.Jun 8 2022, 12:15 PM

Test to see whether the CI works again.

Mordante marked an inline comment as done.Jun 8 2022, 12:20 PM
Mordante added inline comments.
libcxx/benchmarks/formatter_float.bench.cpp
239

I tried that, but that doesn't seem to work. The std::string isn't a constexpr then.

This revision was not accepted when it landed; it landed in state Needs Review.Jun 11 2022, 6:26 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Mordante marked an inline comment as done.
vitaut added inline comments.Jun 12 2022, 9:26 AM
libcxx/include/format
503–504

You can avoid duplication with some refactoring of parsing code but I guess it doesn't have to be done in the current diff.

But does the current approach work with user-defined formatters that don't have templated format methods and therefore won't accept __compile_time_basic_format_context? Is it covered in tests?

Mordante marked an inline comment as done.Jun 12 2022, 1:16 PM
Mordante added inline comments.
libcxx/include/format
503–504

Good question. The current handle test is templated (using auto) I'll change that formatter to match the BasicFormatter requirements. I'll make a separate review for that.

Mordante marked an inline comment as done.Jun 14 2022, 10:31 AM
Mordante added inline comments.
libcxx/include/format
423

@vitaut The handle only parses the format string, it doesn't try to "format" it.

503–504

I've validated it.

  • I always use a basic_format_parse_context<CharT> as parse context must work due to the BasicFormatter requirements.
  • I only instantiate a _compile_time_basic_format_context if the type isn't a handle class. See my new comment.

So this is work, but there's no test. I've updated the handle test in D127767.