Page MenuHomePhabricator

[libc++] Keep __regex_word in sync with ctype_base
ClosedPublic

Authored by miyuki on Jun 13 2019, 10:16 AM.

Details

Summary

The class ctype_base in the header <locale> contains masks for
character classification functions, which are kept in sync with
platform's C library, hence it contains many special cases.
The value of the bit mask
regex_word in the header <regex> must not
clash with those bit masks.

Currently the default case (i.e. unknown platform/C library) is
handled incorrectly: the __regex_word clashes with ctype_base::punct.

To avoid replicating the whole list of platforms in <regex> this patch
defines regex_word in <locale>, so that it is always kept in sync
with other masks.

Diff Detail

Repository
rL LLVM

Event Timeline

miyuki created this revision.Jun 13 2019, 10:16 AM
Herald added a project: Restricted Project. · View Herald Transcript
ldionne accepted this revision.Jun 13 2019, 10:58 AM

I looked at it and I don't think this is an ABI break. Because of the way __regex_word is used (it's never passed through an ABI boundary, instead we use "w". Also this doesn't impact Apple platforms, so I'm personally fine with it.

This revision is now accepted and ready to land.Jun 13 2019, 10:58 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2019, 2:01 AM
atanasyan added inline comments.
libcxx/trunk/include/__locale
412

This change breaks building on mips: https://bugs.llvm.org/show_bug.cgi?id=43011
Maybe this change fix the problem, but I'm not an expert in libc++:

static const mask __regex_word = static_cast<mask>(_ISbit(15));
mclow.lists added inline comments.Aug 16 2019, 7:51 AM
libcxx/trunk/include/__locale
412

Personally, I would lose the _ISbit call as well; just say 1 << 15 or 0x8000
We don't use _ISbit for the other constants, and we don't use it anywhere else in libc++.

miyuki marked an inline comment as done.Aug 20 2019, 3:03 AM
miyuki added inline comments.
libcxx/trunk/include/__locale
412

_ISbit is endian-dependent, see https://reviews.llvm.org/D17132