This is an archive of the discontinued LLVM Phabricator instance.

Avoid u8"" literals in tests, their type changes in C++20
ClosedPublic

Authored by massberg on Jan 10 2023, 8:14 AM.

Details

Summary

Just specify the encoded bytes instead.

Diff Detail

Event Timeline

massberg created this revision.Jan 10 2023, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2023, 8:14 AM
massberg requested review of this revision.Jan 10 2023, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2023, 8:14 AM

Are there any uses of u8 string literals with formatted_raw_ostream outside tests? What was the effect on the final behavior here?

I just wanted to make sure we are not doing the wrong thing here. If the code used to compile and it now changes behavior, we may run into runtime failures in real code.
If that's the case, we could potentially consider adding overloads for u8 string literals to either support it or with =delete to make sure the code using them will be rewritten.

Are there any uses of u8 string literals with formatted_raw_ostream outside tests? What was the effect on the final behavior here?

I just wanted to make sure we are not doing the wrong thing here. If the code used to compile and it now changes behavior, we may run into runtime failures in real code.
If that's the case, we could potentially consider adding overloads for u8 string literals to either support it or with =delete to make sure the code using them will be rewritten.

u8"" literals (respectively char8_t*) used with ostreams changed behavior in C++20 in an unexpected way, see e.g.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r1.html#option7
So the right way might be to ensure that they cannot be used with osteams.

An alternative is to cast the char8_t* to char_t* in the << operator of the ostream of to get back the old behavior, e.g. something like

raw_ostream &operator<<(const char8_t *Str) {
    return this->operator<<(reinterpret_cast<const char *>(Str));
}
massberg updated this revision to Diff 488199.Jan 11 2023, 7:10 AM

Delete insertion operator of raw_ostream for char8_t.

ilya-biryukov accepted this revision.Jan 11 2023, 9:28 AM

LGTM, but please address the NITs before submitting.

llvm/include/llvm/Support/raw_ostream.h
231

NIT: is there a typo? should this be 'from C++20 on' or 'from C++20 onward'?

236

NIT: there's a typo, should be reinterpret_cast

llvm/unittests/Support/formatted_raw_ostream_test.cpp
102

NIT: reinterpret_cast seems better there as the character code is directly written above

109

NIT: reinterpret_cast also seems better here

115

NIT: reinterpret_cast also seems better here

122

NIT: reinterpret_cast also seems better here

140

NIT: reinterpret_cast also seems better here

148

NIT: reinterpret_cast also seems better here

This revision is now accepted and ready to land.Jan 11 2023, 9:28 AM
massberg updated this revision to Diff 488522.Jan 12 2023, 1:09 AM

Fixed nits. Thanks for the comments!

massberg marked 8 inline comments as done.Jan 12 2023, 1:10 AM

Thanks for the comments!

This revision was landed with ongoing or failed builds.Jan 12 2023, 1:22 AM
This revision was automatically updated to reflect the committed changes.