This PR is to fix a few UTF16 and UTF32 related test cases that are testing member functions for https://en.cppreference.com/w/cpp/locale/codecvt class , functions are converting from UTF16, UTF32 to UTF8 or vise visa. Test cases need to explicitly specify it is UNICODE character for UTF16/32 type in order to be valid tests to match type in documentation. it assumes it will be ASCII or UTF8 type for 1 byte character ( value range from 1 to 127 ), which is not true on z/OS in EBCDIC mode. For information related to UTF16/32 , please see https://naveenr.net/unicode-character-set-and-utf-8-utf-16-utf-32-encoding/ , and EBCDIC/ASCII character value can be found in http://www.simotime.com/asc2ebc1.htm
Details
- Reviewers
uweigand hubert.reinterpretcast abhina.sreeskantharajan fanbo-meng muiez • Quuxplusone ldionne - Group Reviewers
Restricted Project - Commits
- rGf3cb8d6e2520: [SystemZ][z/OS][libcxx]: fix libcxx test cases related to codecvt class UTF16/32
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM (if icky) for the char8_t-based tests.
For the other two tests that are currently failing buildkite in C++03 mode (because u's' is a syntax error in C++03), personally I would recommend leaving the code alone and just UNSUPPORTED'ing them on zOS. They're already marked with
// This test runs in C++20, but we have deprecated codecvt<char(16|32), char, mbstate_t> in C++20.
which indicates that it's not too important to make this stuff work on "new" platforms — in fact we're actively deprecating/breaking it on "old" platforms and it might be completely moot by the time C++23 ships.
OK I can disable those 2 tests on z/OS since we already have utf8_t test for running with c++20.
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp | ||
---|---|---|
17–18 | @NancyWang2222 wrote:
The comment means, "This test runs [in C++03 through C++17 but also even] in C++20." It works fine (on ASCII platforms) in C++03 through C++17; in C++20 mode it requires this extra deprecation macro. Your patch basically suggests to run this test on both ASCII and EBCDIC platforms, but only in C++11 through C++20. |
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp | ||
---|---|---|
17–18 | I think your counterproposal is good. I do have one question, given that char16_t was added in C++11 why is this test being run with C++03? I'm mostly curious, just want to understand the background. |
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp | ||
---|---|---|
17–18 | sure. sounds good to me . I will modify test to run ASCII mode only for z/OS. |
yeah. I also have same question as Sean mentioned. codecvt class for UTF16/32_t starts from c++11, utf8_t started from c++20. we shouldn't run with c++03. the error is expected.
hi Arthur O'Dwyer, I am thinking if we can add unsupported cxx-03 for both test cases with current resolution. adding u'some test' or u'a' is right syntax for uchar16_t , same for utf8_t type. any thoughts about this ?
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp | ||
---|---|---|
17–18 | @NancyWang2222 wrote:
I don't think we should remove coverage for this c++03 extension just because z/OS can't handle it. Skipping the test for z/OS makes more sense. At least historically, I'd say it's more important to test extensions, because they must have been important to customers or else we wouldn't have done them. In modern times, I personally have done some patches that increased libc++'s c++03 test coverage just because "hey, it happens to work, we might as well test it"; but my impression is that this particular test case dates back to when "making codecvt work in c++03 mode" was an important feature for some real reason, and so I assume we shouldn't stop testing it yet. |
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp | ||
---|---|---|
17–18 | hi hi Arthur O'Dwyer , I modified test cases for both patches. can you review it? Thanks |
LGTM if it looks good to @ldionne. (The regex, for detecting z/OS, is unusual; but IMHO is probably the appropriate tool in this case.)
LGTM, but please add short comments (1-2 lines) explaining why the tests are unsupported. I don't need to see this again.
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_out.pass.cpp | ||
---|---|---|
20 | Please add a quick comment explaining why it's unsupported (here and in char16_t_out.pass.cpp too). I try to always add a comment when marking tests as UNSUPPORTED because it makes the archeology a lot easier in the future if you want to understand why X is failing, and you end up looking at the test for X and seeing it's always been skipped on your platform. This has been invaluable for me at Apple and I assume you folks will appreciate it just as much down the line. |
@NancyWang2222 wrote:
The comment means, "This test runs [in C++03 through C++17 but also even] in C++20." It works fine (on ASCII platforms) in C++03 through C++17; in C++20 mode it requires this extra deprecation macro.
Your patch basically suggests to run this test on both ASCII and EBCDIC platforms, but only in C++11 through C++20.
My counterproposal is to run this test in C++03 through C++20, but only on ASCII platforms. Which means we're not getting test coverage of the feature on EBCDIC platforms... but the feature is already deprecated, so I'm saying it may not really matter that it's not covered on EBCDIC, as long as we keep the coverage we've got (for ASCII C++03 platforms).