Details
- Reviewers
- None
- Group Reviewers
Restricted Project - Commits
- rG311ff3917827: [libc++] Add missing header <cuchar>
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
@curdeius I don't have a clear picture of what's the status of our implementation of P0482, do you?
Just a few days ago I started looking at it. I'll try to review the whole status of it and create a patch if needed. It's good that you are adding <cuchar>. That was one of missing parts.
I see from the build failures that char16_t typedefs should be guarded not to conflict with C++ built-in types. At least for glibc the solution will be to define __USE_ISOCXX11, cf. https://sourceware.org/git/?p=glibc.git;a=blob;f=wcsmbs/uchar.h;h=6020f66cf6d27d86d78213c3efdf17a9e52f45e7;hb=HEAD.
Other thing, with this patch you add cuchar header, but there is no intermediate uchar.h in libc++ as it's done for other headers (which then #include_next their C library equivalent). I guess that it should be added here and it will probably be the place to define the aforementioned macro guard.
Correct me if I'm wrong.
libcxx/test/std/strings/c.strings/cuchar.pass.cpp | ||
---|---|---|
22 | Shouldn't this be guarded on support of char8_t as well? #if TEST_STD_VER > 17 && defined(__cpp_char8_t) ... // or #if defined(__cpp_lib_char8_t) && __cpp_lib_char8_t >= 201811L ... |
@ldionne, regarding P0482, I'll send a small patch updating the synopses and adding missing tests very soon.
Apart from that, the missing things are only: <cuchar> (mbrtoc8, c8rtomb) and <memory_resource> (pmr::u8string, tests for hash<pmr::*string>).
So the root cause of the issue here is that libc++ tries to define its own char16_t type inside __config in C++03 mode (those types were added in C++11). If we didn't try to provide those as extensions in C++03 mode, we wouldn't hit this issue.
Since I don't think it's going to be easy to stop providing those in C++03 mode, instead what I'll do is mark <cuchar> and <uchar.h> as unsupported in C++03 mode. Those headers were added in C++11, so it seems reasonable if they don't work as intended in C++03.
Other thing, with this patch you add cuchar header, but there is no intermediate uchar.h in libc++ as it's done for other headers (which then #include_next their C library equivalent). I guess that it should be added here and it will probably be the place to define the aforementioned macro guard.
I added uchar.h, thanks for the catch. TBH, I'm not a big fan of defining headers that should be provided by the underlying C library, however we do it for all other libc++ headers, so let's do it for consistency if nothing else.
Mark test as UNSUPPORTED in C++03 and fail clearly when <cuchar> is included in C++03 mode
I did something that we usually don't do here - I made the <cuchar> and <uchar.h> headers fail with an explicit error when included in C++03 mode. The rationale for doing that is that the headers won't work in C++03 mode (they require the char16_t type, which is in C++11), and I thought getting a clear error was better than getting a cryptic error about missing char16_t (or even worse, clashing with our broken attempt to provide char16_t in C++03 mode in <__config>).
Another approach would have been to render the header empty when used in C++03 mode. This is what we do elsewhere. However, I quite dislike this as a general approach because it hides the fact that the header shouldn't be used pre C++11. Users could rightfully file bugs against us for providing an empty <cuchar> header in C++03 mode. Instead, I thought providing an error message was more helpful and more in line with the "fail fast and in obvious ways" line of thinking.
I did something that we usually don't do here - I made the <cuchar> and <uchar.h> headers fail with an explicit error when included in C++03 mode. The rationale for doing that is that the headers won't work in C++03 mode (they require the char16_t type, which is in C++11)
In the spirit of my "remove _LIBCPP_CXX03_LANG" patch series: We care about compiling in C++03 mode only with Clang, right? So how about doing some macro magic in <__config> that makes our char16_t an alias for Clang's uglified built-in __char16_t on-Clang-in-C++03-mode? And then on every other compiler we just use char16_t directly.
We already do, and it's broken! https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L441
It's broken because in some cases, even in C++03 mode, the underlying C library tries to do the exact same thing and provide a typedef for char16_t. We end up trying to redefine that typedef in <__config>, and that doesn't work. I think that what we want to do here is instead not try to backport <uchar.h> to C++03 (as I do in the patch right now).
We already do, and it's broken! https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L441
Well, it's definitely broken, because we do it again later on:
https://github.com/llvm/llvm-project/blob/main/libcxx/include/__config#L825
(er, I guess redundant typedef aliases actually became legal in C++17 or thereabouts? okay...)
I think the original sin there is that <__config> is trying to define typedefs in the global namespace. What if we defined char16_t etc. inside namespace _VSTD instead, so that they didn't conflict with the C library's aliases in the global namespace?
I mean, you've looked at this more than I have, and I'm not trying to block this if you just want to get it in; but it sure would be nice not to have <cuchar> be the one special header you can't include in C++03 mode, so I'm trying to suggest ways to get around that.
Rebase onto main and define the header like we do for other headers (i.e. don't hard-fail).
Continues to look reasonable to me, mod comment about consistency with cwctype.
libcxx/include/cuchar | ||
---|---|---|
47–57 | Looking at how libc++ does its cwctype and wctype.h, I suspect you ought to do something isomorphic over here and wrap all this in #if defined(_LIBCPP_INCLUDED_C_LIBRARY_UCHAR_H). (Or else explain in the commit summary why uchar.h's situation is different.) | |
libcxx/test/std/depr/depr.c.headers/uchar_h.compile.pass.cpp | ||
28 | Weird mix of 0-to-mean-nullptr and {}-to-mean-zero, but okay. ;) |
Address comments.
libcxx/include/cuchar | ||
---|---|---|
47–57 | Excellent point. I think we have the same problem, however the only declaration that can cause problems is size_t. So instead, if the platform doesn't provide <uchar.h>, I'll include <stddef.h> so we get the definition of size_t` and we don't cause problems when we do using ::size_t _LIBCPP_USING_IF_EXISTS;. |
Rebase and mark a few tests as XFAIL on AIX. CC @daltenty
I'll ship this if CI is green.
Looking at how libc++ does its cwctype and wctype.h, I suspect you ought to do something isomorphic over here and wrap all this in #if defined(_LIBCPP_INCLUDED_C_LIBRARY_UCHAR_H). (Or else explain in the commit summary why uchar.h's situation is different.)