This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] [test] Use proper UTF-8 locales on Windows
ClosedPublic

Authored by mstorsjo on Feb 16 2022, 3:41 AM.

Details

Reviewers
Mordante
ldionne
Group Reviewers
Restricted Project
Commits
rG38d25aecdf72: [libcxx] [test] Use proper UTF-8 locales on Windows
Summary

Since Windows 10 version 1803 (10.0.17134.0) (or Windows Server 2019),
the Windows Universal C Runtime (UCRT) actually does support UTF-8
locales - they're available e.g. via the same names as commonly on Unices,
e.g. "en_US.UTF-8".

The UTF-8 locale support unfortunately has a bug which breaks a couple
tests that were passing previously. That bug is fixed in the very
latest version of the UCRT (in UCRT 10.0.20348.0, available in Windows
11 or Windows Server 2022), so it will get resolved at some point
eventually, provided that the CI environment does get upgraded to a
newer version of Windows Server.

While the net number of xfailed/passing tests in this patch is a loss,
this does allow fixing a lot more locale tests properly for Windows
in later patches.

Intentionally not touching the ISO-8859-1/2 locales used for testing;
they're not detected and tested/used right now, and fixing that up
is another project.

Diff Detail

Event Timeline

mstorsjo created this revision.Feb 16 2022, 3:41 AM
mstorsjo requested review of this revision.Feb 16 2022, 3:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2022, 3:41 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Mordante added inline comments.Feb 16 2022, 9:59 AM
libcxx/test/std/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp
23

Would it be easy to detect the UCRT version in CMake and add an appropriate flag libcpp-no-vcruntime? Then we can use UNSUPPORTED: libcpp-ucrt-has-utf8-bug. That way others who want to test on Windows don't need to use the exact same version as the CI has.

When that's not possible I would prefer UNSUPPORTED over XFAIL. Again to avoid failing tests depending on which Windows platform they're executed on.

mstorsjo added inline comments.Feb 16 2022, 10:20 AM
libcxx/test/std/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp
23

Hmm, unsure if it's possible to check the version of UCRT. But I guess we could do a proper runtime check when starting testing - like we already do to see if the system supports e.g. the locales we test. I'll give that a shot. (That can be used for other temporary platform bugs too.)

Yeah UNSUPPORTED is good if there's going to be variability - but with the drawback that it's easy to forget that in place for too long, when the issue is resolved in the CI environment. (While it's nice if the tests pass on any version of Windows, once the CI environment no longer require workarounds, I'd move towards removing them.)

ldionne added inline comments.
libcxx/test/std/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp
23

I think it's definitely better to use XFAIL than UNSUPPORTED, especially in these circumstances. XFAIL allows forgetting about this, since it'll start XPASSing when the issue gets fixed eventually. If we use UNSUPPORTED, we are basically dropping this test coverage forever, unless someone comes back and notices that.

I agree even better would be to have a version number -- then we can use either UNSUPPORTED or XFAIL, it doesn't matter, since it won't apply once we update to something newer.

mstorsjo updated this revision to Diff 409410.Feb 16 2022, 2:29 PM

Add a feature flag broken-utf8-wchar-ctype, check for this bug by building and running a code snippet (just like the locale detection); only test for that bug if building for Windows.

Refactored hasAnyLocale to use an intermediate helper programSucceeds instead of calling programOutput directly. (Also added a testcase for this new DSL function.)

mstorsjo updated this revision to Diff 409519.Feb 17 2022, 12:56 AM

Rebased, reordered local patches to make the diff apply cleanly on upstream.

Mordante accepted this revision as: Mordante.Feb 17 2022, 10:50 AM

Great that we can test it in CMake, thanks!
LGTM modulo one small issue.

libcxx/test/std/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp
23

My reason for UNSUPPORTED over XFAIL in this case, is that it depends on the version of the Windows system the user runs. So XFAIL would be nice for our CI, but will be annoying when a user tests it on a newer Windows than our CI.

In general I agree that an XFAIL is better or a UNSUPPORTED on a specific version.

But it seems that @mstorsjo found a way to detect it in CMake :-)

libcxx/utils/libcxx/test/features.py
81

This header provides towlower.

ldionne accepted this revision.Feb 17 2022, 12:45 PM

LGTM with test added. Also @Mordante had some comments, let's address those.

libcxx/test/libcxx/selftest/dsl/dsl.sh.py
221

Can you add a test checking that we raise ConfigurationCompilationError when we fail to compile the program? I.e. it's good to test that we don't swallow the error and return False.

This revision is now accepted and ready to land.Feb 17 2022, 12:45 PM
mstorsjo marked 2 inline comments as done.Feb 17 2022, 1:59 PM
mstorsjo added inline comments.
libcxx/test/libcxx/selftest/dsl/dsl.sh.py
221

Sure, will do!

libcxx/utils/libcxx/test/features.py
81

Thanks, will fix before pushing.

This revision was automatically updated to reflect the committed changes.
mstorsjo marked 2 inline comments as done.