This patch gets all localization tests passing on linux.
It doesn't actually change any code in locale. Only tests.
Most of the failures were due to different locale behavior between apple and linux.
Other failures seem to be related to bugs in GLIBCs handling of some characters.
Details
- Reviewers
mclow.lists danalbert
Diff Detail
Event Timeline
I think I may have misled you when I said we should #ifdef the differences between glibc and Mac. If there are legitimate differences, we should #ifdef them. If glibc is wrong (it looks like it often is), we should just XFAIL the test and file a bug against glibc (or does that data come from an OS package?).
I usually only commented on one test of a given type, but the comments apply to all similar.
test/localization/locale.categories/category.ctype/locale.ctype.byname/narrow_1.pass.cpp | ||
---|---|---|
11 ↗ | (On Diff #12383) | Doh! My mistake. |
test/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp | ||
38 | I think you meant to #ifdef this one as you did in the next test. Either way, this doesn't look right to me. Do we know why this is happening? Is this a legitimate difference between Mac and glibc, or is glibc broken? If it's just a difference, then we should still check against the expected glibc behavior. If Linux is broken, we should XFAIL the test rather than skipping the broken part. | |
test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp | ||
11 | We're XFAILing these as a TODO, right? I wish LIT had a way to distinguish between intended failures and known failures. | |
test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_zh_CN.pass.cpp | ||
311 | I think glibc's locale data is wrong here. We should mark these tests as XFAIL rather than hiding the failure. A bug will need to be filed against glibc (or wherever glibc gets their locale data from). | |
test/localization/locale.categories/category.monetary/locale.moneypunct.byname/decimal_point.pass.cpp | ||
133 | The test is right, glibc's locale data is wrong. Russia uses ',' for a decimal separator (according to wikipedia). This test (and those like it) should be XFAIL. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp | ||
63 | Looks like more bad locale data to me. | |
79 | What? | |
92 | Looks like it wouldn't be the first time :) | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date_wide.pass.cpp | ||
86 | Then let's XFAIL it for now. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_one.pass.cpp | ||
47 | There should be an #else case that tests the Linux behavior. | |
70 | Pretty sure this is more bad locale data. | |
117 | Phabricator: "Context not available" *checks the source* Well, that didn't make it any clearer. I presume that mess of hex is encoding cyrillic characters. Are these tests like the other Russian ones where the separators are wrong? If so, XFAIL, not #ifdef. | |
test/localization/locale.categories/category.time/locale.time.put.byname/put1.pass.cpp | ||
73 | What's with the case differences? I see that was actually already the case, but why? | |
test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp | ||
60 | Looks like more glibc wrongness. |
Hi Dan,
Thanks for the review. I'll follow your suggestions. However I want to ask a question. Since the glibc data is frequently wrong we sacrifice a lot of test coverage on linux by marking the tests
XFAIL. Is there a point at which test coverage should trump glibc wrongness?
Hi Dan,
I've addressed most of your comments. Thanks for the input.
I'll post a revised patch closely after.
test/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp | ||
---|---|---|
38 | FreeBSD also fails on these tests as well for the same reason and I suspect darwin does as well. This needs more investigation. | |
test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp | ||
11 | Yep. I'll add a comment saying its TODO and explain the likely cause (translation of U00A0). | |
test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_zh_CN.pass.cpp | ||
311 | GLIBC's locale definition doesn't seem that unreasonable. I'll look more into it but I don't know if we can say its test data is for sure wrong. | |
test/localization/locale.categories/category.monetary/locale.moneypunct.byname/decimal_point.pass.cpp | ||
133 | Yep. That seems wrong. I'll back out the GLIBC changes and just mark it as XFAIL. I'll also file a bug with GLIBC. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp | ||
63 | I think this might be bad locale data on apples part as opposed to GLIBCs. I'm basing that off | |
79 | GLIBC's locale data says that U002E is used as the separator. I don't think that seems unreasonable. More investigation needed though. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date_wide.pass.cpp | ||
86 | Ok. Done. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_one.pass.cpp | ||
47 | Not quite sure how to write one. The output depends on the timezone your in. | |
70 | Probably, but I think GLIBC is correct. Having 12 hour time for en_US seems to make more sense than 24 hour. | |
117 | Done. Changed backed out. I suspect the difference is between upper and lower case cyrillic characters. However more investigation is needed. | |
test/localization/locale.categories/category.time/locale.time.put.byname/put1.pass.cpp | ||
73 | Not sure. glibc uses lower case. I've seperated the glibc change and only check the lower case version. | |
test/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp | ||
60 | Perhaps. Some of the sources I've found online suggest that ' ' is valid. |
Implement most of Dan's comments. I've also backed out any changes to ctype tests. They need more investigation. They are just marked xfail now.
test/localization/locale.categories/category.time/locale.time.get.byname/get_one.pass.cpp | ||
---|---|---|
117 | @danalbert: the "Context not available" bit isn't phabricator's fault. This happens when a patch is made without '-U999'. | |
test/localization/locale.categories/category.time/locale.time.put.byname/put1.pass.cpp | ||
73 | The lowercase versions of that abbreviation are suspect to me. IIRC French folks capitalize it always. |
test/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp | ||
---|---|---|
38 | Okay. Still, if it needs more investigation then it needs more investigation, but we definitely shouldn't be commenting out anything. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp | ||
79 | But 0x2e == '.' | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_one.pass.cpp | ||
70 | Yeah, that's strange. Does this test fail on Mac? If it doesn't, it probably should. |
Thanks for the comments. It seems not all of the changes made it into the last patch. I'll implement your advice and back out all of the categories.ctype tests and just mark them XFAIL.
test/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_1.pass.cpp | ||
---|---|---|
38 | Yeah, Sorry I thought I took that out and just marked it XFAIL. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_date.pass.cpp | ||
79 | I must have been tired. Thanks for the correction. I'll see what up with these tests. | |
test/localization/locale.categories/category.time/locale.time.get.byname/get_one.pass.cpp | ||
70 | I'll double check on this. I think this test passes on mac. |
It might be worth splitting this up into a couple patches. I think all the XFAILs can probably be submitted (just the XFAIL line, not the rest of the changes to the file). It would clean up a lot of the test results so you can have less noise tracking down the others, as well as making the next update easier to review.
I think you meant to #ifdef this one as you did in the next test.
Either way, this doesn't look right to me. Do we know why this is happening? Is this a legitimate difference between Mac and glibc, or is glibc broken? If it's just a difference, then we should still check against the expected glibc behavior. If Linux is broken, we should XFAIL the test rather than skipping the broken part.