Implement N2418 for C2x.
Can you also add a release note for the new feature, and update the clang/www/c_status.html page as well?
clang/test/Lexer/utf8-char-literal.cpp | ||
---|---|---|
23 | One more test I'd like to see added, just to make sure we're covering 6.4.4.4p9 properly: _Static_assert( _Generic(u8'a', default: 0, unsigned char : 1), "Surprise!"); We expect the type of a u8 character literal to be unsigned char at the moment, which is different from a u8 string literal, which uses char. However, WG14 is also going to be considering http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm for C2x at our meeting next week. |
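For comparison, the same kind of compile-time type check can be written for the C++ modes this test file also covers. This is only a sketch: the function names `u8_literal_kind` and `check_u8_literal` are made up for illustration and are not part of the patch. It relies on the standard rules that `u8'a'` is a plain `char` in C++17 and a `char8_t` once `char8_t` is enabled (C++20 or `-fchar8_t`).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper, not part of the patch.
// Returns 8 if u8'a' has type char8_t, 1 if it has type char,
// and 0 if this language mode predates u8 character literals (before C++17).
int u8_literal_kind() {
#if __cplusplus >= 201703L
  using LiteralType = decltype(u8'a');
#  if defined(__cpp_char8_t)
  static_assert(std::is_same<LiteralType, char8_t>::value,
                "with char8_t enabled, u8'a' is a char8_t");
  return 8;
#  else
  static_assert(std::is_same<LiteralType, char>::value,
                "in C++17 without char8_t, u8'a' is a plain char");
  return 1;
#  endif
#else
  return 0;
#endif
}

// Runtime sanity check: the code unit value does not depend on the literal's type.
void check_u8_literal() {
#if __cplusplus >= 201703L
  assert(static_cast<unsigned>(u8'a') == 0x61u);
#endif
  int kind = u8_literal_kind();
  assert(kind == 0 || kind == 1 || kind == 8);
}
```

The C2x check requested above plays the same role for C, where the expected type is `unsigned char` rather than `char` or a distinct `char8_t`.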
clang/lib/Lex/Lexer.cpp | ||
---|---|---|
3462–3465 | The comment is slightly misleading both before and after this change. Assuming this level of detail is desired, I suggest: // Identifier (e.g., uber), or // UTF-8 (C2x/C++17) or UTF-16 (C11/C++11) character literal, or // UTF-8 or UTF-16 string literal (C11/C++11). case 'u': | |
clang/test/Lexer/utf8-char-literal.cpp | ||
23 | Good suggestion. I believe the following update will be needed to `Sema::ActOnCharacterConstant()` in `clang/lib/Sema/SemaExpr.cpp`: ... else if (Literal.isUTF8() && getLangOpts().C2x) Ty = Context.UnsignedCharTy; // u8'x' -> unsigned char in C2x. else if (Literal.isUTF8() && getLangOpts().Char8) Ty = Context.Char8Ty; // u8'x' -> char8_t when it exists. ... | |
28 | We should also exercise the preprocessor with something like this: #if u8'\xff' != 0xff #error uh oh #endif Hmm, this currently fails for C++20 for both Clang and gcc unless -funsigned-char is passed. That seems wrong. https://godbolt.org/z/Tb7z85ToG. MSVC gets this wrong too, but I think for a different reason; see the implementation impact section of P2029 if curious. |
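The fact that `-funsigned-char` makes the check pass suggests the literal is being evaluated as a signed character in the preprocessor. Assuming that reading is right (it is an inference from the observation above, not something confirmed in this thread), the arithmetic looks like the sketch below. The helper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helpers that mimic how the code unit 0xff compares against 0xff
// depending on the signedness assumed for the character literal.

// Value seen if the code unit is treated as a signed 8-bit character.
// (The narrowing conversion is modular on mainstream compilers and
// guaranteed to be so since C++20.)
int value_as_signed(unsigned char code_unit) {
  return static_cast<int>(static_cast<signed char>(code_unit));
}

// Value seen if the code unit is treated as an unsigned 8-bit character.
int value_as_unsigned(unsigned char code_unit) {
  return static_cast<int>(code_unit);
}

// Mirrors the condition in `#if u8'\xff' != 0xff`: true means #error would fire.
bool error_fires(int literal_value) {
  return literal_value != 0xff;
}

void check_signedness_effect() {
  assert(value_as_signed(0xff) == -1);     // sign-extended, so != 0xff
  assert(value_as_unsigned(0xff) == 255);  // matches 0xff
  assert(error_fires(value_as_signed(0xff)));
  assert(!error_fires(value_as_unsigned(0xff)));
}
```

Under this reading, a u8 character literal whose type is `unsigned char` (C2x) or `char8_t` (C++20) should always take the unsigned path, which is why the current result looks wrong.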
clang/test/Lexer/utf8-char-literal.cpp | ||
---|---|---|
28 | This also fails in C2x. |
clang/test/Lexer/utf8-char-literal.cpp | ||
3 | no need for the new -D | |
16 | You can test with __STDC_VERSION__ > 202000L. | |
23 | I have an update on this. We discussed the paper and took a straw poll: "Does WG14 wish to adopt N2653 in C23?" 18/0/2 (consensus). So we should make sure that we all agree this patch is in line with the changes from that paper. I believe your changes agree, but it'd be nice for @tahonermann to confirm. | |
28 | I don't think we need to fix the preprocessor behavior as part of this patch, but it would be good to file an issue for this so we know to track it down at some point. |
The changes LGTM (thanks for filing the issue). Please wait a day or so for Tom to sign off though (I've pinged him off-list).
Looks good to me! Thank you for filing the separate issue.
clang/test/Lexer/utf8-char-literal.cpp | ||
---|---|---|
23 | Confirmed. N2653 technically changes the type of u8 character literals to char8_t, but since that is just a typedef of unsigned char, these changes still align with the semantic intent. Ideally, we would maybe try to reflect the typedef, but 1) the typedef isn't necessarily available, 2) Clang doesn't do similarly for any of the other character (or string) literals, and 3) no one is likely to care anyway. |
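To make the agreed-upon behavior concrete, here is a small standalone model of the type selection proposed earlier for `Sema::ActOnCharacterConstant()`. It is not Clang code: `LangOptsModel`, `U8CharType`, and `typeOfU8CharLiteral` are hypothetical names, and only the two branches quoted in this review are modeled. Everything else is deliberately left as `NotModeled`.

```cpp
#include <cassert>

// Hypothetical stand-in for the two language options the suggested snippet consults.
struct LangOptsModel {
  bool C2x;    // compiling as C2x
  bool Char8;  // char8_t exists as a distinct type
};

// Hypothetical result type; NotModeled covers every case this review did not discuss.
enum class U8CharType { UnsignedChar, Char8, NotModeled };

// Mirrors the order of the two suggested branches: C2x is checked first,
// then the availability of char8_t.
U8CharType typeOfU8CharLiteral(const LangOptsModel &opts) {
  if (opts.C2x)
    return U8CharType::UnsignedChar;  // u8'x' -> unsigned char in C2x
  if (opts.Char8)
    return U8CharType::Char8;         // u8'x' -> char8_t when it exists
  return U8CharType::NotModeled;
}

void check_model() {
  assert(typeOfU8CharLiteral({true, false}) == U8CharType::UnsignedChar);
  assert(typeOfU8CharLiteral({false, true}) == U8CharType::Char8);
  assert(typeOfU8CharLiteral({false, false}) == U8CharType::NotModeled);
}
```

Since N2653 defines `char8_t` in C as a typedef of `unsigned char`, the `UnsignedChar` branch yields the same underlying type the paper asks for; only the typedef name is not reflected.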