This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Get categories.numeric and facet.numpunct localization tests passing on linux
Needs ReviewPublic

Authored by EricWF on Aug 20 2014, 8:15 PM.

Details

Summary

This fixes the tests under

  • test/localization/locale.categories/categories.numeric
  • test/localization/locale.categories/facet.numpunct

The GLIBC localization data for these test differs from what is currently expected. Since GLIBC's localization data seems reasonable it is used when GLIBC is defined.

Diff Detail

Event Timeline

EricWF updated this revision to Diff 12729.Aug 20 2014, 8:15 PM
EricWF retitled this revision from to [libcxx] Get categories.numeric and facet.numpunct localization tests passing on linux.
EricWF updated this object.
EricWF edited the test plan for this revision. (Show Details)
EricWF added reviewers: mclow.lists, danalbert.
EricWF added a subscriber: Unknown Object (MLST).
mclow.lists edited edge metadata.Aug 21 2014, 8:33 AM

This all looks OK to me, but I wonder - is there a *right* answer for these routines. If so, what is the answer we should be expecting?

test/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_long_double.pass.cpp
10722

Are these values mandated by a standard somewhere - or are they just "implementation defined"?

danalbert added inline comments.Aug 21 2014, 11:05 AM
test/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_long_double.pass.cpp
10722

I had previously misread this part of the diff. This looks wrong to me now that I look closer (and with all the context). Specifically, [ul]nan4. That can't be right.

test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
58

Interesting. According to that doc (and a bit of Googling), Mac's implementation is actually wrong here. I've found information claiming that the correct thousands separator for fr_FR.UTF-8 would be either '.', ' ', or, perhaps most accurately, '\u2009' (a thin space). Certainly not ',' though. My vote is that we lose the #ifdefs in this test and mark xfail for Mac.

emaste added a subscriber: emaste.Aug 21 2014, 11:35 AM

This all looks OK to me, but I wonder - is there a *right* answer for these routines. If so, what is the answer we should be expecting?

@mclow.lists There might be a right answer for the NAN changes. Last I looked into this there was a change to support positive and negative NAN. I'll come back with a real answer tomorrow. However the separator and grouping test data seems to make more sense with GLIBC.

test/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_long_double.pass.cpp
10722

I think it follows 22.4.2.2.2. It seems weird but my reading of the standard does not make it incorrect.
I'll look into it some more. The rest of the cases seem reasonable though.

10722

The disagreement seems to be just about if NAN should have the positive sign. I didn't see anything in the standard but I'll take another look.

test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
58

My vote is that we lose the #ifdefs in this test and mark xfail for Mac.

Although I'm very tempted to do this I'm not sure we should since there is no golden standard for locale data. I think we should just live with the fact different platforms are going to give use different results.
As you mentioned fr_FR can have a bunch of possible separators. This workaround isn't too ugly so my vote goes to leaving it in.

Can we get a tie-breaker from @mclow.lists or @emaste?

danalbert added inline comments.Aug 21 2014, 1:50 PM
test/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_long_double.pass.cpp
10722

Nerver mind. Had misread and didn't realize the fill character was being set to '*'.

Wikipedia says (so it must be true) that France uses a space as the separator. http://en.wikipedia.org/wiki/Decimal_mark

See my comment on line #59 about a possible solution.

Rewrote my #59 comment.

test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
59

My suggestion here is to change the #if from !defined(__GLIBC__) to defined(_APPLE_) and move on.

See if that breaks any other systems (FREEBSD, etc).

EricWF added inline comments.Aug 25 2014, 4:37 PM
test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
59

This test is currently passing on FreeBSD. So I think we should stick with GLIBC in this specific case.

http://llvm-amd64.freebsd.your.org/b/builders/libcxx-amd64-freebsd/builds/507/steps/check-libcxx/logs/stdio

I don't seem to see anything about what "nan" should actually print in the standard.