This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Do not hard-code locale data in unit tests: get it from the OS instead
AcceptedPublic

Authored by vangyzen on Nov 22 2016, 9:53 AM.

Details

Reviewers
EricWF
Summary

Locale data can vary across platforms. Rather than hard-code the expected
data in the units tests, get it from the OS via the C API.

This fixes these unit tests on FreeBSD. It might fix them on Linux, too.
I have not tested other platforms, but I will if I get agreement
on the approach.

Event Timeline

vangyzen updated this revision to Diff 78886.Nov 22 2016, 9:53 AM
vangyzen retitled this revision from to Do not hard-code locale data in unit tests: get it from the OS instead.
vangyzen updated this object.
vangyzen added a subscriber: cfe-commits.
EricWF accepted this revision.Nov 22 2016, 11:12 AM
EricWF added a reviewer: EricWF.
EricWF added a subscriber: EricWF.

LGTM minus inline comments.

Thanks for working on this.

test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
51

I think we should reset the global locale so it's different than the one we pass to std::locale l(...). That way we're testing that we don't accidental use the global locale.

This revision is now accepted and ready to land.Nov 22 2016, 11:12 AM
vangyzen added inline comments.Nov 22 2016, 12:11 PM
test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
51

Good idea.

vangyzen updated this revision to Diff 78916.Nov 22 2016, 12:12 PM
vangyzen edited edge metadata.

Reset the global locale so it's different than the one we pass
to std::locale l(...). That way we're testing that we don't accidentally
use the global locale.

vangyzen updated this revision to Diff 78919.Nov 22 2016, 12:15 PM

Fix comment

vim + Caps Lock == :(

vangyzen updated this revision to Diff 78921.Nov 22 2016, 12:20 PM

Remove XFAIL on linux-gnu

These tests now pass on Fedora 24.

@EricWF, would you like to see anything more from me? I could set up an LLVM development environment on my Mac if you want me to run the unit tests on Mac.

@EricWF, would you like to see anything more from me? I could set up an LLVM development environment on my Mac if you want me to run the unit tests on Mac.

Nope. Feel free to commit after adding the requested assertions.

test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
68

We should probably assert that thousands_sep isn't a multibyte character.

Nope. Feel free to commit after adding the requested assertions.

I would gladly commit it, but I don't have commit access. I'm quite new here.

test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
68

I don't quite follow. C defines it as char*. Are you concerned about an implementation defining it as wchar_t*?

vangyzen added inline comments.Nov 23 2016, 2:41 PM
test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
68

Oh, I understand now. Sorry. I've never really done much with locales, so I'm not so fluent with them...especially early on a holiday morning. :)

vangyzen updated this revision to Diff 79322.Nov 25 2016, 2:10 PM

Handle multibyte thousands_sep; do not reference possibly stale locale data

I'm glad you mentioned a multibyte thousands_sep, because it is multibyte in fr_FR.UTF-8 on FreeBSD 11. Specifically, it's a no-break space (U+00A0).

vangyzen added inline comments.Nov 25 2016, 2:23 PM
projects/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/grouping.pass.cpp
42 ↗(On Diff #79322)

localeconv()->grouping can become invalid after we reset the locale to "C", so duplicate the string into a local.

projects/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
79 ↗(On Diff #79322)

This assertion now fails for me. wexpected is 0xa0 (NBSP), which is correct. However, np.thousands_sep() is -62, which is 0xc2, which is the first byte of a UTF-8-encoded NBSP. It looks like the library isn't correctly handling multibyte thousands separators.

vangyzen added inline comments.Nov 28 2016, 9:13 AM
projects/libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
79 ↗(On Diff #79322)

D27167 adds support for multibyte strings in decimal_point and thousands_sep. With that change, this unit test passes.

@EricWF Do you have time and interest to review this again?

@EricWF Do you have time and interest to review this again?

I should ask, do you have time and interest to commit this?

vangyzen retitled this revision from Do not hard-code locale data in unit tests: get it from the OS instead to [libc++] Do not hard-code locale data in unit tests: get it from the OS instead.Nov 30 2016, 7:56 AM