Just specify the encoded bytes instead.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Are there any uses of u8 string literals with formatted_raw_ostream outside tests? What was the effect on the final behavior here?
I just wanted to make sure we are not doing the wrong thing here. If the code used to compile and it now changes behavior, we may run into runtime failures in real code.
If that's the case, we could potentially consider adding overloads for u8 string literals to either support it or with =delete to make sure the code using them will be rewritten.
u8"" literals (respectively char8_t*) used with ostreams changed behavior in C++20 in an unexpected way, see e.g.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r1.html#option7
So the right way might be to ensure that they cannot be used with ostreams.
An alternative is to cast the char8_t* to char* in the << operator of the ostream to get back the old behavior, e.g. something like
raw_ostream &operator<<(const char8_t *Str) {
    return this->operator<<(reinterpret_cast<const char *>(Str));
}
LGTM, but please address the NITs before submitting.
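For reference, here is a minimal self-contained sketch of the two options discussed above: forwarding char8_t* to the const char* overload, and writing the encoded bytes directly as the patch summary suggests. ByteSink, euroTwice and checkEuroTwice are hypothetical names made up for this illustration; ByteSink is only a stand-in and not the real raw_ostream. The feature-test guard is an assumption so that the snippet also builds before C++20, where u8"" literals are still plain char arrays.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for raw_ostream; it only collects bytes.
class ByteSink {
  std::string Buffer;

public:
  ByteSink &operator<<(const char *Str) {
    Buffer += Str;
    return *this;
  }

#if defined(__cpp_char8_t)
  // From C++20 onward u8"" literals are const char8_t[], so forward their
  // bytes to the const char * overload to keep the pre-C++20 behavior.
  ByteSink &operator<<(const char8_t *Str) {
    return *this << reinterpret_cast<const char *>(Str);
  }
#endif

  const std::string &str() const { return Buffer; }
};

// Writes U+20AC twice: once as a u8 literal, once as explicitly encoded
// UTF-8 bytes (the approach taken by this patch in the tests).
inline std::string euroTwice() {
  ByteSink OS;
  OS << u8"\u20AC" << "\xE2\x82\xAC";
  return OS.str();
}

// Both spellings should produce the same three bytes.
inline void checkEuroTwice() {
  assert(euroTwice() == "\xE2\x82\xAC\xE2\x82\xAC");
}
```

If the char8_t overload were declared with = delete instead, the u8 literal line would stop compiling under C++20, which is the other option raised in the thread.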
| llvm/include/llvm/Support/raw_ostream.h | ||
|---|---|---|
| 231 ↗ | (On Diff #488199) | NIT: is there a typo? should this be 'from C++20 on' or 'from C++20 onward'? | 
| 236 ↗ | (On Diff #488199) | NIT: there's a typo, should be reinterpret_cast | 
| llvm/unittests/Support/formatted_raw_ostream_test.cpp | ||
| 102 | NIT: reinterpret_cast seems better there as the character code is directly written above | |
| 109 | NIT: reinterpret_cast also seems better here | |
| 115 | NIT: reinterpret_cast also seems better here | |
| 122 | NIT: reinterpret_cast also seems better here | |
| 140 | NIT: reinterpret_cast also seems better here | |
| 148 | NIT: reinterpret_cast also seems better here | |