In contrast to Clang-17, we treat an invalid ud-suffix as if whitespace preceded
it only if it can be seen as a macro and the preceding string literal is non-empty.
    #define E "!"
    const char
      *operator""E(const char*), // ""E is a single token, as it should be
      *s = "not empty"E;         // treated as if whitespace precedes E, hence a string concat:
                                 // = "not empty!"
This addresses comments in D153156.
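For comparison, here is a minimal sketch of the well-formed spellings that the recovery path above emulates. It contains only valid code, so it should behave the same before and after this patch. The names `spaced`, `check_concat` and the `_len` suffix are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#define E "!"

// Explicit whitespace: E is an ordinary macro invocation, so the two
// adjacent string literals are concatenated into "not empty!".
const char *spaced = "not empty" E;

// A well-formed literal operator, declared with no space between "" and
// the suffix. The suffix _len is hypothetical.
constexpr std::size_t operator""_len(const char *, std::size_t n) { return n; }

// "abc"_len is a single user-defined-string-literal token; the macro E
// plays no role here.
static_assert("abc"_len == 3, "suffix _len yields the literal's length");

// Checks that the spaced form produced the concatenated string.
void check_concat() { assert(std::strcmp(spaced, "not empty!") == 0); }
```

The invalid spelling `"not empty"E` is meant to end up equivalent to `spaced` only because `E` is a macro name and the literal is non-empty.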
I don't think this is accurate. Clang supported CWG1473 before these changes, as far as I can see: all valid code under CWG1473 was accepted, and invalid code was diagnosed (by default). Rather, what has changed is the behavior for invalid code: instead of treating an invalid ""blah as two tokens always, in order to accept as much old code as possible, we now treat it as two tokens only when blah is defined as a macro name.
This is still a breaking change in some cases for users of -Wno-deprecated-literal-operator, e.g.:
... now will be lexed as a single invalid token rather than two tokens.
I'm not sure what the motivation for making changes here was, and D153156's description doesn't really help me understand it. Is the goal to improve the diagnostic quality for these kinds of errors on invalid code? Is there some example for which Clang's behavior with regard to CWG1473 was non-conforming? Something else?