This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Formatter does not handle c++11 string literal prefix with stringize #
ClosedPublic

Authored by MyDeveloperDay on Dec 17 2021, 6:13 AM.

Details

Summary

https://github.com/llvm/llvm-project/issues/27740

Ensure

#define _u(str) u#str
#define _u(str) u8#str
#define _u(str) U#str

behave the same as

#define _u(str) L#str

when formatted, so that clang-format follows the same conventions for the L, u, U and u8 prefixes.

https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?redirectedfrom=MSDN&view=msvc-170

Fixes #27740
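
For context, the prefix-plus-stringize spelling in the summary (u#str and friends) may not be accepted by every compiler. The sketch below shows what such macros are meant to produce, using the portable two-step "stringize, then paste" idiom instead. It is an illustration only and not part of this patch; the macro names PFX_STR, PFX_CAT_, PFX_CAT, WIDE_STR, U16_STR and U32_STR are hypothetical. Note one difference: this variant macro-expands its argument before stringizing, whereas #str applied directly does not.

```cpp
#include <type_traits>

// Hypothetical helper names, chosen for illustration only.
// Step 1: turn the argument into a narrow string literal.
#define PFX_STR(x) #x
// Step 2: paste an encoding prefix onto an already-formed literal.
// The extra level of indirection makes sure PFX_STR(x) is expanded
// before the ## operator sees it.
#define PFX_CAT_(prefix, lit) prefix##lit
#define PFX_CAT(prefix, lit) PFX_CAT_(prefix, lit)

#define WIDE_STR(x) PFX_CAT(L, PFX_STR(x))  // L"..."  -> const wchar_t[]
#define U16_STR(x)  PFX_CAT(u, PFX_STR(x))  // u"..."  -> const char16_t[]
#define U32_STR(x)  PFX_CAT(U, PFX_STR(x))  // U"..."  -> const char32_t[]

// Each macro yields a literal of the matching character type
// ("abc" plus the terminating null is 4 elements).
static_assert(std::is_same<decltype(WIDE_STR(abc)), const wchar_t (&)[4]>::value,
              "WIDE_STR should produce a wide string literal");
static_assert(std::is_same<decltype(U16_STR(abc)), const char16_t (&)[4]>::value,
              "U16_STR should produce a UTF-16 string literal");
static_assert(std::is_same<decltype(U32_STR(abc)), const char32_t (&)[4]>::value,
              "U32_STR should produce a UTF-32 string literal");
```

The u8 prefix is left out of the sketch because its element type differs between language versions (char before C++20, char8_t from C++20 on), which would complicate the type checks.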

Diff Detail

Event Timeline

MyDeveloperDay requested review of this revision. Dec 17 2021, 6:13 AM
MyDeveloperDay created this revision.
curdeius added a comment. Edited Dec 17 2021, 6:28 AM

While at it, should we also take care of LR"(string)", R, uR, u8R and UR? Cf. https://en.cppreference.com/w/cpp/language/string_literal
From the MS docs:

// Raw string literals containing unescaped \ and "
auto R0 =   R"("Hello \ world")"; // const char*
auto R1 = u8R"("Hello \ world")"; // const char* before C++20, encoded as UTF-8,
                                  // const char8_t* in C++20
auto R2 =  LR"("Hello \ world")"; // const wchar_t*
auto R3 =  uR"("Hello \ world")"; // const char16_t*, encoded as UTF-16
auto R4 =  UR"("Hello \ world")"; // const char32_t*, encoded as UTF-32

I did think about that, but I wasn't sure how the calling side would work:

#define MyRawString(str) R#str


void foo()
{
  const char *s = MyRawString("(" Hello \ world ")")
}

I think trying to pass the raw string into the macro confuses it, no? This is why I left them out for now, unless someone can give me an example of how it might work.

Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30137 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

test12.cpp
test12.cpp(5): error C3513: '\': unsupported raw string literal delimiter character
test12.cpp(7): error C3516: unexpected end-of-file found while processing the raw string literal; delimiter sequence '")' was not matched
test12.cpp(5): note: start of raw string literal
test12.cpp(7): fatal error C1903: unable to recover from previous error(s); stopping compilation
MyDeveloperDay edited the summary of this revision. Dec 17 2021, 6:39 AM
MyDeveloperDay edited the summary of this revision.

You shouldn't have added the outer quotes, because #str adds them.
This works:

// clang-format off
#define MyRawString(str) R#str

void foo()
{
    const auto * s1 = MyRawString((" Hello \ world "));
    const auto * s2 = MyRawString(abc(" Hello \ world ")abc);
}

Add the raw string literal cases

curdeius accepted this revision. Dec 17 2021, 8:07 AM

LGTM.

This revision is now accepted and ready to land. Dec 17 2021, 8:07 AM
owenpan accepted this revision. Dec 17 2021, 9:36 AM
This revision was landed with ongoing or failed builds. Dec 17 2021, 10:29 AM
This revision was automatically updated to reflect the committed changes.