This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Make chars_format a bitmask type.
ClosedPublic

Authored by Mordante on Feb 20 2021, 2:08 AM.

Details

Summary

Some of Microsoft's unit tests in D70631 fail because libc++'s
implementation of std::chars_format isn't a proper bitmask type. Adding
the required functions to make std::chars_format a proper bitmask type.

Implements parts of P0067: Elementary string conversions

Diff Detail

Event Timeline

Mordante requested review of this revision.Feb 20 2021, 2:08 AM
Mordante created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2021, 2:08 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
curdeius added inline comments.
libcxx/include/charconv
112

Would adding to_underlying helper be cleaner?
Something along the lines of the proposed http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1682r1.html. WDYT?

116

Maybe add a constant for 0x7 to make it clear that it's a bitmask for all possible enumerators.

116

http://eel.is/c++draft/bitmask.types lacks the bitand so that the result is a correct enum value. Would it be an issue in the wording? Or should we stick closely to the standard given that the usage of ~ should be X &= ~Y?

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
49

Given that the exact values of enumerators are unspecified, this test is technically libc++-specific.

Mordante planned changes to this revision.Feb 20 2021, 7:55 AM

Thanks for your feedback! I'll wait for @ldionne's view on to_underlying before addressing the comments.

libcxx/include/charconv
112

I hadn't seen that paper yet, but it seems very useful. According to to https://github.com/cplusplus/papers/issues/460 it seems to be on track to land in C++23. I'm somewhat tempted to implement __to_underlying in utility for C++11 for now and also adjust the other bitmask types. Then we can switch to to_underlying or add a C++17 to_underlying once the paper lands. @ldionne What's your opinion?

116

Maybe add a constant for 0x7 to make it clear that it's a bitmask for all possible enumerators.

I think that would be useful.

http://eel.is/c++draft/bitmask.types lacks the bitand so that the result is a correct enum value. Would it be an issue in the wording? Or should we stick closely to the standard given that the usage of ~ should be X &= ~Y?

Not entirely sure what you mean. Do you mean it's possible to get a value of 0 for the format? That seems possible when using [bitmask.types]/1 and 4.3 implies this is wanted. The caller of std::to_chars should ensure this doesn't happen http://eel.is/c++draft/charconv#to.chars-10.

116

http://eel.is/c++draft/bitmask.types lacks the bitand so that the result is a correct enum value. Would it be an issue in the wording? Or should we stick closely to the standard given that the usage of ~ should be X &= ~Y?

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
49

IMO opinion http://eel.is/c++draft/bitmask.types strongly implies we need to use these values. However I don't mind to make the tests not depending on the exact value.

curdeius added inline comments.Feb 20 2021, 1:23 PM
libcxx/include/charconv
116

I meant that http://eel.is/c++draft/bitmask.types#2 defines:

constexpr bitmask operator~(bitmask X){
  return static_cast<bitmask>(~static_cast<int_type>(X));
}

whereas you define it as more or less:

constexpr bitmask operator~(bitmask X){
  return static_cast<bitmask>(~static_cast<int_type>(X) & all_bits_of_bitmask);
}

So, without your change, the result of ~X, where X is a chars_format, would not have a value being one of enumerators of X. Example ~fixed = ~2 = 0xFFFFFFFD (if 32-bit).

Personally I'd say that it's not important given the intended usage (X&=~Y, http://eel.is/c++draft/bitmask.types#4.2).

On a second thought, I'd remove & 0x7 altogether. Anyway, WDYT about it?

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
49

I'm not a standardese expert but I read values in http://eel.is/c++draft/bitmask.types#2 as an example (exposition only), and that's http://eel.is/c++draft/bitmask.types#3 that defines these values, which should be nonzero, distinct and have no common set bits. So, theoretically, the values could be e.g. 3, 4, 24.
But anyway, that would be silly.
I'm ok if you keep it as is.

Mordante added inline comments.Feb 21 2021, 6:42 AM
libcxx/include/charconv
116

Personally I'd say that it's not important given the intended usage (X&=~Y, http://eel.is/c++draft/bitmask.types#4.2).

On a second thought, I'd remove & 0x7 altogether. Anyway, WDYT about it?

That would solve the issue that 0x7 could change in the future. The code I used as template has similar code.
The bitmasks using ~static_cast<T>(__x) are in

  • <filesystem>

The bitmask using ~static_cast<T(__x) & V

  • <future>
  • <regex>

I looked at some other bitmask types, but they don't use scoped enums, so don't need their own operators. Of course when using an unsigned the value can also be out of the 'valid' bitmasks.

MSVC STL uses a macro for their bitmask types using ~static_cast<T>(__x).
libstdc++ also uses ~static_cast<T>(__x), only looked at chars_format.

So I'm in favour of removing the & 0x7 and also create a new patch changing <future> and <regex>.

Mordante added inline comments.Feb 22 2021, 10:59 PM
libcxx/include/charconv
112

This comment aged fast; according to Herb P1682 has been adopted [1]. I'll create a patch to implement it. I think it makes sense to allow its usage in C++11 or do we prefer C++23 only? In that case I'll add a private function for C++11 which is used by the C++23 public version.

[1] https://herbsutter.com/2021/02/22/trip-report-winter-2021-iso-c-standards-meeting-virtual/

curdeius added inline comments.Feb 23 2021, 12:17 AM
libcxx/include/charconv
112

@Mordante, I have already an almost ready patch for to_underlying. I'll finish it up this evening or tomorrow and send for a review, so don't bother for the tests of your private function.

Mordante marked an inline comment as done.Feb 23 2021, 8:41 AM
Mordante added inline comments.
libcxx/include/charconv
112

Thanks for the information. Then I'll wait for your patch and use that in this patch.

Mordante updated this revision to Diff 326901.Feb 27 2021, 6:32 AM
Mordante marked an inline comment as done.
  • Use __to_underlying
  • Don't depend on the exact value of the mask
  • Use bitwise NOT according to the definition in the standard
Mordante marked 6 inline comments as done.Feb 27 2021, 6:34 AM
curdeius accepted this revision.Feb 27 2021, 9:19 AM

LGTM.

libcxx/include/charconv
84

Hmm, if adding this include is too much, we might move __to_underlying to <type_traits> (which <utility> includes anyway). But let's see what others thinks.

Mordante added inline comments.Mar 1 2021, 9:59 AM
libcxx/include/charconv
84

Agreed.

Mordante updated this revision to Diff 332727.Mar 23 2021, 10:51 AM

Rebased to update to main and a friendly ping.

LGTM % test changes, but I'd like to see the test changes.

libcxx/include/charconv
116

Pedantically, isn't this sketchy as heck? ~scientific would be chars_format(-1) which I think isn't technically defined behavior? Does the standard actually require this operator to work?
If I'm right (which I may not be), then would giving chars_format a proper underlying type help matters? e.g. enum class chars_format : unsigned int { ... }
(I wrote out this comment on line 114 before seeing that it duplicated a lot of this comment thread.)

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
70

Please rewrite this function to use assert(x); assert(y); assert(z); return true;
instead of return x && y && z;

Mordante marked 2 inline comments as done.Apr 4 2021, 5:04 AM
Mordante added inline comments.
libcxx/include/charconv
116

Pedantically, isn't this sketchy as heck? ~scientific would be chars_format(-1) which I think isn't technically defined behavior? Does the standard actually require this operator to work?

http://eel.is/c++draft/bitmask.types#4.2:
"To clear a value Y in an object X is to evaluate the expression X &= ~Y."

If I'm right (which I may not be), then would giving chars_format a proper underlying type help matters? e.g. enum class chars_format : unsigned int { ... }

So ~scientific should be well defined. So I prefer not to change the underlying type of enumerate to a fixed type.

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
70

I personally prefer to write the code as you suggested, in fact I originally wrote it like that. Unfortunately assert can't be used in a constexpr function in C++11. Hence this comment at line 48: "This construction is used to make the function constexpr in C++11."

Quuxplusone added inline comments.Apr 4 2021, 6:28 AM
libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
70

<charconv> was introduced in C++17. Anyone still using C++11 constexpr in the year of our lord 2021 deserves what they get.

If you need to add UNSUPPORTED: c++11 to this test in order to use the preferred style, I think that's a good tradeoff. @ldionne, thoughts?

Mordante marked 2 inline comments as done.Apr 4 2021, 6:45 AM
Mordante added inline comments.
libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
70

The reason I use C++11 in this test is the fact that libc++ has enabled std::to_chars in C++11 as an extension.

ldionne added inline comments.Apr 8 2021, 11:11 AM
libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
70

I think we're better off assuming C++17.

It looks like they discussed the issue of back-porting in https://reviews.llvm.org/D41458 (which added to_chars), but I'm not satisfied by that discussion. If I had commented on that review, I would have pushed back on supporting this pre-C++17 with all I have.

Now it's done, but at least I think it's clear that IMO it's reasonable not to bend backwards for the tests to support < C++17.

Mordante updated this revision to Diff 336662.Apr 11 2021, 6:30 AM

Improve test by using assert and dropping C++11 support. As discussed during the review.

Mordante marked 3 inline comments as done.Apr 11 2021, 6:30 AM

LGTM mod nits; but I think the "operator@= should be constexpr" nit is a BIG nit.

libcxx/include/charconv
112

FWIW, you could use constexpr instead of _LIBCPP_CONSTEXPR throughout this C++11-and-later file. I have no opinion on the matter, but I don't know if others might have a preference.

147–151

This should also be constexpr, unless I'm mistaken. Even if it's not mandated, certainly it would be nice QoI to be able to use chars_format x, y; x ^= y in a constexpr function the same way we can use int a, b; a ^= b today.

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
11

...allowed in a constexpr function in C++. To keep the code readable, C++11...

30

See my comment above about constexpr @= operators, and if so, make this function also constexpr.

59–61

Would this test be easier to read as...?

assert((std::chars_format::hex & (std::chars_format::scientific | std::chars_format::fixed)) == std::chars_format(0));

(or {} instead of (0), I have no strong preference)

76

It might be worth adding these tests:

std::chars_format x;
static_assert(std::is_same<std::chars_format, decltype(~x)>::value, "");
static_assert(std::is_same<std::chars_format, decltype(x&x)>::value, "");
static_assert(std::is_same<std::chars_format, decltype(x|x)>::value, "");
static_assert(std::is_same<std::chars_format, decltype(x^x)>::value, "");
static_assert(std::is_same<std::chars_format&, decltype(x&=x)>::value, "");
static_assert(std::is_same<std::chars_format&, decltype(x|=x)>::value, "");
static_assert(std::is_same<std::chars_format&, decltype(x^=x)>::value, "");
Mordante marked 6 inline comments as done.Apr 13 2021, 10:35 AM
Mordante added inline comments.
libcxx/include/charconv
112

Good point. I prefer contexpr.

147–151

They shouldn't be constexpr per http://eel.is/c++draft/bitmask.types#2.
My guess is the bitmask type was added/updated in C++11. IIRC in C++11 a constexpr could only have a single return and they have never been updated. Guess it would be nice to write a proposal to update this type.

I agree it would be nice to use constexpr, but that's not allowed per http://eel.is/c++draft/constexpr.functions.

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
59–61

I don't like that, I tried it and our formatting rules still make it not that great. Instead I renamed chars_format_underlying_type to ut and added using cf = std::chars_format;. With these changes all asserts are less than 80 characters.

Mordante updated this revision to Diff 337200.Apr 13 2021, 10:38 AM
Mordante marked 3 inline comments as done.

Addresses review comments.

libcxx/include/charconv
147–151

The code you quoted is just "exposition-only"; I think the normative text in http://eel.is/c++draft/bitmask.types#1 takes precedence, and that normative text clearly states that integer types are bitmask types. Integer types have constexpr assignment ops. Thus, bitmask types can have constexpr assignment ops, even if the exposition-only code does depict a specific example type without constexpr support.

Right now, libc++ is the only one of the Big Three to provide compound operators that don't work in constexpr functions. Let's fix that bug and help users write more portable code. https://godbolt.org/z/r7fzo4oKh

Mordante added inline comments.Apr 13 2021, 11:22 AM
libcxx/include/charconv
147–151

The code you quoted is just "exposition-only"; I think the normative text in http://eel.is/c++draft/bitmask.types#1 takes precedence, and that normative text clearly states that integer types are bitmask types. Integer types have constexpr assignment ops. Thus, bitmask types can have constexpr assignment ops, even if the exposition-only code does depict a specific example type without constexpr support.

Good point.

Right now, libc++ is the only one of the Big Three to provide compound operators that don't work in constexpr functions. Let's fix that bug and help users write more portable code. https://godbolt.org/z/r7fzo4oKh

Indeed, it would be good for the users if the big 3 behave in the same manner. Thanks for this additional information.

I'll adjust the code and update the unit tests. (Note since we allow this code in C++11, I'll only make it constexpr in C++14 and newer.)

Mordante updated this revision to Diff 337216.Apr 13 2021, 11:32 AM

Addresses review comments and make all new functions constexpr after C++11.

Quuxplusone accepted this revision.Apr 13 2021, 11:36 AM

LGTM % one last clang-format-introduced whitespace typo!

libcxx/test/std/utilities/charconv/charconv.syn/chars_format.pass.cpp
67

/x&/x &/

This revision is now accepted and ready to land.Apr 13 2021, 11:36 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the review!
Interesting change by clang-format, nice catch.