This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ][z/OS][libcxx]: fix libcxx test cases related to codecvt class UTF16/32
ClosedPublic

Authored by NancyWang2222 on Jul 16 2021, 7:19 AM.

Details

Summary

This PR is to fix a few UTF16 and UTF32 related test cases that are testing member functions for https://en.cppreference.com/w/cpp/locale/codecvt class , functions are converting from UTF16, UTF32 to UTF8 or vise visa. Test cases need to explicitly specify it is UNICODE character for UTF16/32 type in order to be valid tests to match type in documentation. it assumes it will be ASCII or UTF8 type for 1 byte character ( value range from 1 to 127 ), which is not true on z/OS in EBCDIC mode. For information related to UTF16/32 , please see https://naveenr.net/unicode-character-set-and-utf-8-utf-16-utf-32-encoding/ , and EBCDIC/ASCII character value can be found in http://www.simotime.com/asc2ebc1.htm

Diff Detail

Event Timeline

NancyWang2222 requested review of this revision.Jul 16 2021, 7:19 AM
NancyWang2222 created this revision.

LGTM (if icky) for the char8_t-based tests.
For the other two tests that are currently failing buildkite in C++03 mode (because u's' is a syntax error in C++03), personally I would recommend leaving the code alone and just UNSUPPORTED'ing them on zOS. They're already marked with

// This test runs in C++20, but we have deprecated codecvt<char(16|32), char, mbstate_t> in C++20.

which indicates that it's not too important to make this stuff work on "new" platforms — in fact we're actively deprecating/breaking it on "old" platforms and it might be completely moot by the time C++23 ships.

NancyWang2222 added a comment.EditedJul 16 2021, 9:48 AM

LGTM (if icky) for the char8_t-based tests.
For the other two tests that are currently failing buildkite in C++03 mode (because u's' is a syntax error in C++03), personally I would recommend leaving the code alone and just UNSUPPORTED'ing them on zOS. They're already marked with

// This test runs in C++20, but we have deprecated codecvt<char(16|32), char, mbstate_t> in C++20.

which indicates that it's not too important to make this stuff work on "new" platforms — in fact we're actively deprecating/breaking it on "old" platforms and it might be completely moot by the time C++23 ships.

OK I can disable those 2 tests on z/OS since we already have utf8_t test for running with c++20.

libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp
17–18

@NancyWang2222 wrote:

if test cases are for C++20, should we add unsupported for c++03 - c++17

The comment means, "This test runs [in C++03 through C++17 but also even] in C++20." It works fine (on ASCII platforms) in C++03 through C++17; in C++20 mode it requires this extra deprecation macro.

Your patch basically suggests to run this test on both ASCII and EBCDIC platforms, but only in C++11 through C++20.
My counterproposal is to run this test in C++03 through C++20, but only on ASCII platforms. Which means we're not getting test coverage of the feature on EBCDIC platforms... but the feature is already deprecated, so I'm saying it may not really matter that it's not covered on EBCDIC, as long as we keep the coverage we've got (for ASCII C++03 platforms).

SeanP added a subscriber: SeanP.Jul 16 2021, 10:39 AM
SeanP added inline comments.
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp
17–18

I think your counterproposal is good. I do have one question, given that char16_t was added in C++11 why is this test being run with C++03? I'm mostly curious, just want to understand the background.

NancyWang2222 added inline comments.Jul 16 2021, 10:40 AM
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp
17–18

sure. sounds good to me . I will modify test to run ASCII mode only for z/OS.

yeah. I also have same question as Sean mentioned. codecvt class for UTF16/32_t starts from c++11, utf8_t started from c++20. we shouldn't run with c++03. the error is expected.

hi Arthur O'Dwyer, I am thinking if we can add unsupported cxx-03 for both test cases with current resolution. adding u'some test' or u'a' is right syntax for uchar16_t , same for utf8_t type. any thoughts about this ?

libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp
17–18

@NancyWang2222 wrote:

yeah. I also have same question as Sean mentioned. [...] I am thinking if we can add unsupported cxx-03 for both test cases with current resolution.

I don't think we should remove coverage for this c++03 extension just because z/OS can't handle it. Skipping the test for z/OS makes more sense.

At least historically, I'd say it's more important to test extensions, because they must have been important to customers or else we wouldn't have done them. In modern times, I personally have done some patches that increased libc++'s c++03 test coverage just because "hey, it happens to work, we might as well test it"; but my impression is that this particular test case dates back to when "making codecvt work in c++03 mode" was an important feature for some real reason, and so I assume we shouldn't stop testing it yet.

add unsupported on zOS for 2 failing test cases

NancyWang2222 marked an inline comment as done.Jul 20 2021, 7:31 AM
NancyWang2222 added inline comments.
libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_out.pass.cpp
17–18

hi hi Arthur O'Dwyer , I modified test cases for both patches. can you review it? Thanks

Quuxplusone added a subscriber: ldionne.

LGTM if it looks good to @ldionne. (The regex, for detecting z/OS, is unusual; but IMHO is probably the appropriate tool in this case.)

This revision is now accepted and ready to land.Jul 20 2021, 8:08 AM
ldionne accepted this revision.Jul 20 2021, 8:55 AM

LGTM, but please add short comments (1-2 lines) explaining why the tests are unsupported. I don't need to see this again.

libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_out.pass.cpp
20

Please add a quick comment explaining why it's unsupported (here and in char16_t_out.pass.cpp too). I try to always add a comment when marking tests as UNSUPPORTED because it makes the archeology a lot easier in the future if you want to understand why X is failing, and you end up looking at the test for X and seeing it's always been skipped on your platform. This has been invaluable for me at Apple and I assume you folks will appreciate it just as much down the line.

Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2021, 9:57 AM

Ditto as https://reviews.llvm.org/D106153#2891424. Please fix this post-commit.

NancyWang2222 marked an inline comment as done.Jul 20 2021, 1:50 PM

LGTM, but please add short comments (1-2 lines) explaining why the tests are unsupported. I don't need to see this again.

Thanks will add comments