This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Reject month 0 in get_date/__get_month
ClosedPublic

Authored by DavidSpickett on Apr 21 2022, 7:27 AM.

Details

Reviewers
Mordante
Group Reviewers
Restricted Project
Commits
rGab07d06e7b86: [libcxx] Reject month 0 in get_date/__get_month
Summary

[libcxx] Reject month 0 in get_date/__get_month

This fixes #47663.

Months in dates should be >= 1 and <= 12.
We parse up to two digits then minus one, because
we want to store this as "months since January"
(0-11).

However we didn't check that the result of that
was not -1. For example if you had (MM/DD/YYYY)
00/21/2022.

Added tests for:

  • Failing if month is 0
  • Failing if month is 13
  • Allowing a leading zero in month e.g. "01"

Note that libc++ and libstdc++ return different
values on parsing failure, and MSVC STL returns
end of stream instead.

Handle the first two by checking for defines, MSVC STL
expects these tests to fail for other reasons already:
https://github.com/microsoft/STL/blob/main/tests/libcxx/expected_results.txt#L372
so not handling that case here.

Diff Detail

Event Timeline

DavidSpickett created this revision.Apr 21 2022, 7:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 7:27 AM
DavidSpickett requested review of this revision.Apr 21 2022, 7:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 7:27 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

For https://github.com/llvm/llvm-project/issues/47663.

That's a lot of boilerplate for the tests but I figured I'd stay focused on the problem at hand. (and there are probably reasons for that style)

I did notice that libstdc++ differs in its return value for the failures. It will consume just one character, even if the month is "13" where libcxx consumes both.

$ /tmp/test_libcxx.o
Parsing the date out of '13/01/2013' in the locale en_US.utf8
Parse failed. Unparsed string: /01/2013
$ g++ /tmp/test.cpp -o /tmp/test.o && /tmp/test.o
Parsing the date out of '13/01/2013' in the locale en_US.utf8
Parse failed. Unparsed string: 3/01/2013

From https://en.cppreference.com/w/cpp/locale/time_get/get_date (not sure if I should be getting the actual standard document for this) this seems like it could be ok.

I read in a draft standard for do_get_date specifically "Returns: An iterator pointing immediately beyond the last character recognized as possibly part of a valid date.". So "possibly" seems like it would allow for both behaviours.

DavidSpickett added inline comments.Apr 21 2022, 7:39 AM
libcxx/include/locale
1928

If this way of writing the comparison seems strange, I'm matching __get_day above.

Thanks for working on this!

For https://github.com/llvm/llvm-project/issues/47663.

From https://en.cppreference.com/w/cpp/locale/time_get/get_date (not sure if I should be getting the actual standard document for this) this seems like it could be ok.

Nothing wrong with cppreference, but for the Standard wording we usually use http://eel.is/c++draft/locale.time.get.virtuals#lib:do_get_date,time_get. (Not sure whether you know this resource.)

I read in a draft standard for do_get_date specifically "Returns: An iterator pointing immediately beyond the last character recognized as possibly part of a valid date.". So "possibly" seems like it would allow for both behaviours.

I think libstdc++'s behaviour is the correct behaviour. But fixing that should be a separate review.

libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
18

:-( This is not your change, but it basically means these changes aren't tested in most of our CI. I'll create a fix.

114

This should be guarded with

#if _LIBCPP_VERSION
        assert(base(i) == in+2);
#else
        assert(base(i) == in+2);
#endif

Our tests should be portable and MSVC STL and GCC use them.

Nothing wrong with cppreference, but for the Standard wording we usually use http://eel.is/c++draft/locale.time.get.virtuals#lib:do_get_date,time_get. (Not sure whether you know this resource.)

I didn't, that's much better since cppreference has some of the descriptions merged.

I think libstdc++'s behaviour is the correct behaviour. But fixing that should be a separate review.

I think I see the logic there now. For MM/DD/YYYY "13/01/2022", the "1" could be part of a valid date ("1/01/2022"), but "13" could not. For the "00/" case, you can have one leading zero so "0" could possibly be part of a valid date ("01/01/2022") but "00" cannot be.

libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
114

I'll confirm what MSVC does and add these guards.

DavidSpickett added inline comments.Apr 22 2022, 2:20 AM
libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
18

Yeah I actually tested this on Linux by commenting out that one test case. So I can at least confirm everything else works.

DavidSpickett updated this revision to Diff 424432.EditedApr 22 2022, 4:25 AM

Added defines for the iterator checks to support libstdc++.
(MSVC STL has these disabled already:
https://github.com/microsoft/STL/blob/main/tests/libcxx/expected_results.txt#L372
I do know what it does though so I could add that if needed?)

DavidSpickett edited the summary of this revision. (Show Details)Apr 22 2022, 4:25 AM

Rebase, might pick up the AIX xfails in the process.

I opened https://github.com/llvm/llvm-project/issues/55036 for the difference in the return value.

I agree with you that libstdc++ is correct, made sense once I thought about the use case where you want to know which bit of the date caused the problem.

Format step is failing in CI because I'm adding to files that aren't formatted to begin with. https://buildkite.com/llvm-project/libcxx-ci/builds/10345#813d83ad-2ec6-4919-9769-a226d56a6da3

Format step is failing in CI because I'm adding to files that aren't formatted to begin with. https://buildkite.com/llvm-project/libcxx-ci/builds/10345#813d83ad-2ec6-4919-9769-a226d56a6da3

You can ignore this. It's only soft failing currently.

I'll have another look after the open review comments have been addressed, but the patch looks already nice.

libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
18

I just landed D124331, which fixes the tests on Linux. Can you rebase this patch on top of that to see whether the CI is happy too?

DavidSpickett marked 3 inline comments as done.Apr 26 2022, 8:34 AM
DavidSpickett added inline comments.
libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
114

I'm assuming no MSVC guards is ok.

You can ignore this. It's only soft failing currently.

Cool, thanks for confirming that.

Sorry I missed your update, but LGTM! Please rebase to test the CI is green before landing.

libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
18

Odd this XFAIL should be gone upstream.

Mordante accepted this revision.May 5 2022, 12:50 PM
This revision is now accepted and ready to land.May 5 2022, 12:50 PM
This revision was automatically updated to reflect the committed changes.
DavidSpickett marked an inline comment as done.May 6 2022, 3:12 AM
DavidSpickett added inline comments.
libcxx/test/std/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp
18

Must have rebased too soon, it's gone now.