This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Add missing header <cuchar>
ClosedPublic

Authored by ldionne on Mar 3 2021, 9:48 AM.

Details

Reviewers
None
Group Reviewers
Restricted Project
Commits
rG311ff3917827: [libc++] Add missing header <cuchar>
Summary

Diff Detail

Event Timeline

ldionne created this revision.Mar 3 2021, 9:48 AM
ldionne requested review of this revision.Mar 3 2021, 9:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2021, 9:48 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

@curdeius I don't have a clear picture of what's the status of our implementation of P0482, do you?

@curdeius I don't have a clear picture of what's the status of our implementation of P0482, do you?

Just a few days ago I started looking at it. I'll try to review the whole status of it and create a patch if needed. It's good that you are adding <cuchar>. That was one of missing parts.

I see from the build failures that char16_t typedefs should be guarded not to conflict with C++ built-in types. At least for glibc the solution will be to define __USE_ISOCXX11, cf. https://sourceware.org/git/?p=glibc.git;a=blob;f=wcsmbs/uchar.h;h=6020f66cf6d27d86d78213c3efdf17a9e52f45e7;hb=HEAD.

Other thing, with this patch you add cuchar header, but there is no intermediate uchar.h in libc++ as it's done for other headers (which then #include_next their C library equivalent). I guess that it should be added here and it will probably be the place to define the aforementioned macro guard.

Correct me if I'm wrong.

curdeius added inline comments.Mar 19 2021, 2:13 AM
libcxx/test/std/strings/c.strings/cuchar.pass.cpp
22

Shouldn't this be guarded on support of char8_t as well?

#if TEST_STD_VER > 17 && defined(__cpp_char8_t) ...
// or
#if defined(__cpp_lib_char8_t) && __cpp_lib_char8_t >= 201811L ...

@ldionne, regarding P0482, I'll send a small patch updating the synopses and adding missing tests very soon.
Apart from that, the missing things are only: <cuchar> (mbrtoc8, c8rtomb) and <memory_resource> (pmr::u8string, tests for hash<pmr::*string>).

ldionne marked an inline comment as done.Jul 5 2021, 12:12 PM

I see from the build failures that char16_t typedefs should be guarded not to conflict with C++ built-in types. At least for glibc the solution will be to define __USE_ISOCXX11, cf. https://sourceware.org/git/?p=glibc.git;a=blob;f=wcsmbs/uchar.h;h=6020f66cf6d27d86d78213c3efdf17a9e52f45e7;hb=HEAD.

So the root cause of the issue here is that libc++ tries to define its own char16_t type inside __config in C++03 mode (those types were added in C++11). If we didn't try to provide those as extensions in C++03 mode, we wouldn't hit this issue.

Since I don't think it's going to be easy to stop providing those in C++03 mode, instead what I'll do is mark <cuchar> and <uchar.h> as unsupported in C++03 mode. Those headers were added in C++11, so it seems reasonable if they don't work as intended in C++03.

Other thing, with this patch you add cuchar header, but there is no intermediate uchar.h in libc++ as it's done for other headers (which then #include_next their C library equivalent). I guess that it should be added here and it will probably be the place to define the aforementioned macro guard.

I added uchar.h, thanks for the catch. TBH, I'm not a big fan of defining headers that should be provided by the underlying C library, however we do it for all other libc++ headers, so let's do it for consistency if nothing else.

ldionne updated this revision to Diff 356553.Jul 5 2021, 12:15 PM

Rebase onto main, implement review feedback, add uchar.h and fix tests.

ldionne updated this revision to Diff 356695.Jul 6 2021, 5:55 AM

Mark test as UNSUPPORTED in C++03 and fail clearly when <cuchar> is included in C++03 mode

I did something that we usually don't do here - I made the <cuchar> and <uchar.h> headers fail with an explicit error when included in C++03 mode. The rationale for doing that is that the headers won't work in C++03 mode (they require the char16_t type, which is in C++11), and I thought getting a clear error was better than getting a cryptic error about missing char16_t (or even worse, clashing with our broken attempt to provide char16_t in C++03 mode in <__config>).

Another approach would have been to render the header empty when used in C++03 mode. This is what we do elsewhere. However, I quite dislike this as a general approach because it hides the fact that the header shouldn't be used pre C++11. Users could rightfully file bugs against us for providing an empty <cuchar> header in C++03 mode. Instead, I thought providing an error message was more helpful and more in line with the "fail fast and in obvious ways" line of thinking.

I did something that we usually don't do here - I made the <cuchar> and <uchar.h> headers fail with an explicit error when included in C++03 mode. The rationale for doing that is that the headers won't work in C++03 mode (they require the char16_t type, which is in C++11)

In the spirit of my "remove _LIBCPP_CXX03_LANG" patch series: We care about compiling in C++03 mode only with Clang, right? So how about doing some macro magic in <__config> that makes our char16_t an alias for Clang's uglified built-in __char16_t on-Clang-in-C++03-mode? And then on every other compiler we just use char16_t directly.

In the spirit of my "remove _LIBCPP_CXX03_LANG" patch series: We care about compiling in C++03 mode only with Clang, right? So how about doing some macro magic in <__config> that makes our char16_t an alias for Clang's uglified built-in __char16_t on-Clang-in-C++03-mode? And then on every other compiler we just use char16_t directly.

We already do, and it's broken! https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L441

It's broken because in some cases, even in C++03 mode, the underlying C library tries to do the exact same thing and provide a typedef for char16_t. We end up trying to redefine that typedef in <__config>, and that doesn't work. I think that what we want to do here is instead not try to backport <uchar.h> to C++03 (as I do in the patch right now).

We already do, and it's broken! https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L441

Well, it's definitely broken, because we do it again later on:
https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L825
(er, I guess redundant typedef aliases actually became legal in C++17 or thereabouts? okay...)

I think the original sin there is that <__config> is trying to define typedefs in the global namespace. What if we defined char16_t etc. inside namespace _VSTD instead, so that they didn't conflict with the C library's aliases in the global namespace?

I mean, you've looked at this more than I have, and I'm not trying to block this if you just want to get it in; but it sure would be nice not to have <cuchar> be the one special header you can't include in C++03 mode, so I'm trying to suggest ways to get around that.

ldionne updated this revision to Diff 405762.Feb 3 2022, 1:13 PM

Rebase onto main and define the header like we do for other headers (i.e. don't hard-fail).

Continues to look reasonable to me, mod comment about consistency with cwctype.

libcxx/include/cuchar
47–57

Looking at how libc++ does its cwctype and wctype.h, I suspect you ought to do something isomorphic over here and wrap all this in #if defined(_LIBCPP_INCLUDED_C_LIBRARY_UCHAR_H). (Or else explain in the commit summary why uchar.h's situation is different.)

libcxx/test/std/depr/depr.c.headers/uchar_h.compile.pass.cpp
28

Weird mix of 0-to-mean-nullptr and {}-to-mean-zero, but okay. ;)

ldionne updated this revision to Diff 406561.Feb 7 2022, 12:18 PM
ldionne marked 2 inline comments as done.

Address comments.

libcxx/include/cuchar
47–57

Excellent point. I think we have the same problem, however the only declaration that can cause problems is size_t. So instead, if the platform doesn't provide <uchar.h>, I'll include <stddef.h> so we get the definition of size_t` and we don't cause problems when we do using ::size_t _LIBCPP_USING_IF_EXISTS;.

ldionne updated this revision to Diff 407638.Feb 10 2022, 12:21 PM

Fix pragma indentation

ldionne updated this revision to Diff 409635.Feb 17 2022, 6:55 AM

Disable <uchar.h> in C++03 mode.

ldionne updated this revision to Diff 413070.Mar 4 2022, 10:46 AM
ldionne added a subscriber: daltenty.

Rebase and mark a few tests as XFAIL on AIX. CC @daltenty

I'll ship this if CI is green.

Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2022, 10:46 AM
ldionne accepted this revision as: Restricted Project.Mar 4 2022, 10:46 AM
This revision was not accepted when it landed; it landed in state Needs Review.Mar 7 2022, 5:49 AM
This revision was automatically updated to reflect the committed changes.
libcxx/test/std/depr/depr.c.headers/uchar_h.compile.pass.cpp