This is an archive of the discontinued LLVM Phabricator instance.

Differential D105759

Implement P2361 Unevaluated string literals
ClosedPublic

Authored by cor3ntin on Jul 10 2021, 6:53 AM.

Details

Reviewers

aaron.ballman
rsmith
erichkeane
rjmccall
hubert.reinterpretcast
shafik

Commits

rG95f50964fbf5: Implement P2361 Unevaluated string literals

Summary

This patch proposes to handle in a uniform fashion
the parsing of strings that are never evaluated,
in asm statements, static assert, attributes, extern,
etc.

Unevaluated strings are UTF-8 internally and so currently
behave as narrow strings, but these things will diverge with
D93031.

The big question both for this patch and the P2361 paper
is whether we risk breaking code by disallowing
encoding prefixes in this context.
I hope this patch may allow us to gather some data on that.

Future work:
Improve the rendering of Unicode characters, line breaks,
and so forth in static-assert messages
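
For readers unfamiliar with the term, the following is a minimal sketch of some contexts the summary lists, next to an ordinary evaluated literal. All function names here are made up for illustration; an asm statement is left out only to keep the snippet portable.

```cpp
#include <cassert>

// static_assert message: the string only ever appears in a diagnostic.
static_assert(sizeof(char) == 1, "char is always one byte");

// Attribute argument: likewise only used when the compiler emits a warning.
[[deprecated("use new_api() instead")]] inline int old_api() { return 1; }
inline int new_api() { return 2; }

// Language linkage specification: the string names a linkage, not data.
extern "C" inline int c_linkage_fn() { return 3; }

// By contrast, this literal is evaluated: it becomes an array object in
// the program and is encoded for the target.
inline const char *evaluated_literal() { return "ordinary literal"; }
```

None of the first three strings can be observed by the running program, which is why an encoding prefix on them carries no meaning.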

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

In general, I think this is shaping up nicely and is almost complete. I'm adding some additional reviewers though, as this is a somewhat experimental patch for a WG21 proposal that has not been accepted yet and I want to make sure that I'm not missing something. That may also solve the few open questions that still remain.

clang/lib/AST/Expr.cpp
1187–1188	I'd recommend running the entire patch through clang-format though: https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting
clang/lib/Lex/LiteralSupport.cpp
1897	Looks like this comment is still missing punctuation.

Formatting and missing punctuation

erichkeane added inline comments. Sep 27 2021, 8:33 AM

clang/include/clang/Basic/DiagnosticLexKinds.td
280	Is there value to combining these two diagnostics with a %select?
clang/lib/Lex/LiteralSupport.cpp
96–97	I might consider rejecting ANY character escape in the less-than-32 part of the table. For consistency at least, I don't see value in allowing \a if we're rejecting layout things like \t.
119	This is like the 3rd time we're using 'Unevaluated' as a bool parameter. I have a pretty strong preference for making it a scoped-enum in 'Basic' somewhere.
2005	Is this OK? It looks like we're passing a ton of parameters to a diag type that doesn't have any wildcards?

Harbormaster completed remote builds in B125884: Diff 375278. Sep 27 2021, 9:33 AM

Replace Unevaluated by an enum.
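
A minimal sketch of the change requested above, replacing a `bool Unevaluated` parameter with a scoped enum. The enum and function names are hypothetical; the thread does not show the names the patch ended up using.

```cpp
#include <cassert>

// Hypothetical name: the review only says the bool parameter became an enum.
enum class EvalMethodSketch { Evaluated, Unevaluated };

// With a bool, a call site reads f(Tok, true) and the reader has to look up
// what "true" means. With the enum, the call site documents itself.
inline bool allowsEncodingPrefixSketch(EvalMethodSketch Method) {
  // Per the diagnostics discussed in this review, an unevaluated string
  // literal cannot have an encoding prefix.
  return Method == EvalMethodSketch::Evaluated;
}
```

A call such as `allowsEncodingPrefixSketch(EvalMethodSketch::Unevaluated)` is harder to misread than a bare `true`, which is the point of the request.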

aaron.ballman added inline comments. Sep 27 2021, 10:13 AM

clang/include/clang/Basic/DiagnosticLexKinds.td
280	I waffled when doing this review, so it's funny you mention it. :-D We could do: `an unevaluated string literal cannot %select{have an encoding prefix|be a user-defined literal}0` but there was just enough text in the `select` that I felt it wasn't critical to combine. But I don't feel strongly either way.
clang/lib/Lex/LiteralSupport.cpp
96–97	But that's just it, we're accepting `\t` and `\n` with this code.
2005	Good catch! The first two are not helpful (the diag engine will silently ignore them), but the second two are for underlines in the diagnostic and are useful.

erichkeane added inline comments. Sep 27 2021, 10:20 AM

clang/lib/Lex/LiteralSupport.cpp
96–97	Ah! I missed that this is an allow-list instead of a deny-list. That makes me way more comfortable with this code. IMO, I'd suggest we allow '\r' (since wouldn't we have problems on Windows at that point, being unable to accept a printable newline for Windows?), but disallow `\a` for now unless someone comes up with a really good reason to allow it.

Harbormaster completed remote builds in B125916: Diff 375319. Sep 27 2021, 11:02 AM

Accept \r as an escape sequence in an unevaluated string literal

Rename commit

Harbormaster completed remote builds in B126038: Diff 375476. Sep 27 2021, 11:39 PM

A couple of small things, otherwise I'm happy; but Aaron has some bigger opens above, plus clang-format, plus the modules from Richard.

clang/include/clang/Basic/DiagnosticLexKinds.td
280	I was waffly on this too, so your waffling + my waffling I think is sufficient reason to not deal with this now.
clang/lib/AST/Expr.cpp
1160	minor preference (perhaps 'nit' level) to move this whole CharByteWidth + IsPascal calculation into its own function. This constructor is absurdly long as it is.
clang/lib/Lex/LiteralSupport.cpp
99	For future clarification, the ones from the 'simple' list here: https://en.cppreference.com/w/cpp/language/escape that we are missing are: `\a` `\b` `\f` and `\v`. I personally think I'm ok with that until someone else says they care.
1896	Hrm.... this is unfortunate. Is there no way to combine the loops? I guess (hope?) that the list of tokens is at least going to be short...
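
To make the allow-list discussion concrete, here is a rough sketch of what the check looks like at this point in the review. The accepted set is inferred from the comments above (the simple escapes minus `\a`, `\b`, `\f` and `\v`, with `\r` accepted); the function name is made up and this is not the code from the patch.

```cpp
#include <cassert>

// Sketch of an allow-list over the character that follows a backslash.
// Anything not listed is rejected, so new escapes are opt-in.
inline bool isEscapeAllowedSketch(char Escape) {
  switch (Escape) {
  case '\'':
  case '"':
  case '?':
  case '\\':
  case 'n':
  case 'r':
  case 't':
    return true;
  default:
    // Includes \a, \b, \f, \v and all numeric escapes (\x..., \0...).
    return false;
  }
}
```

Because it is an allow-list, widening the set later (as is requested further down the thread) only means adding cases.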

Formatting

Harbormaster completed remote builds in B126102: Diff 375569. Sep 28 2021, 7:45 AM

Get rid of the extra loop by using a lambda

Harbormaster completed remote builds in B126105: Diff 375576. Sep 28 2021, 8:15 AM

Some naming nits. There are two open questions also: one about module behavior and one about a TODO comment in the patch. If we don't hear back about the modules question, I think that can be handled in a follow-up.

clang/include/clang/Lex/LiteralSupport.h
215	Slight renaming so nobody thinks this is going to be about wide vs narrow vs u8, etc.
243	We should rename anything mentioning `StringKind` similarly -- this will also help avoid confusion with the `StringKind` type in Expr.h.
257	Can we make this private now rather than letting callers access it directly?
clang/lib/Lex/LiteralSupport.cpp
1897	When I hear "check" I think it'll return a value; I think this name is a bit more clear.
2007–2009

Address Aaron's comments

clang/lib/Lex/LiteralSupport.cpp
119	Any suggestion for where to
2007–2009	These are actually used by `err_string_concat_mixed_suffix`

cor3ntin added inline comments. Sep 28 2021, 10:58 AM

clang/lib/Lex/LiteralSupport.cpp
119	NVM

erichkeane added inline comments. Sep 28 2021, 10:59 AM

clang/lib/Lex/LiteralSupport.cpp
2007–2009	Right, I guess it is just super awkward to have unused parameters passed like this. I know we only check the other direction, but it seems awkward. Aaron, thoughts?

Harbormaster completed remote builds in B126144: Diff 375645. Sep 28 2021, 11:11 AM

aaron.ballman added inline comments. Sep 28 2021, 11:56 AM

clang/lib/Lex/LiteralSupport.cpp
2007–2009	I'd split it into two calls at this point. e.g., `if (UnevaluatedStringHasUDL) Diags->Report(TokLoc, diag::err_unevaluated_string_udl) << ...; else Diags->Report(TokLoc, diag::err_string_concat_mixed_suffix) << ...;`
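
The suggestion above can be illustrated with a small mock. This is only a sketch: `ReportSketch` and `reportSuffixErrorSketch` are invented stand-ins and do not model clang's real diagnostics engine. The idea is that each diagnostic is handed only the arguments its format string consumes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Invented stand-ins for the two diagnostics named in the comment above.
enum class DiagIdSketch { UnevaluatedStringUDL, StringConcatMixedSuffix };

// Records which diagnostic was chosen and which arguments were streamed in.
struct ReportSketch {
  DiagIdSketch Id;
  std::vector<std::string> Args;
};

inline ReportSketch reportSuffixErrorSketch(bool UnevaluatedStringHasUDL,
                                            const std::string &FirstSuffix,
                                            const std::string &SecondSuffix) {
  // One call per diagnostic: the UDL error takes no suffix arguments,
  // so none are passed; the mixed-suffix error takes both.
  if (UnevaluatedStringHasUDL)
    return {DiagIdSketch::UnevaluatedStringUDL, {}};
  return {DiagIdSketch::StringConcatMixedSuffix, {FirstSuffix, SecondSuffix}};
}
```

Compared with one shared call that streams every argument regardless of the diagnostic, this avoids passing values that would be silently ignored.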

Cleanup Diagnostics In LiteralSupport

LGTM aside from two small nits. As for the modules question, if @rsmith doesn't get back to us, I think it's fine to address that post-commit.

clang/include/clang/Lex/LiteralSupport.h
243	Did this one get missed?
clang/test/CXX/dcl.dcl/p4-0x.cpp
25	Can you add the newline back to the end of the file?

This revision is now accepted and ready to land. Sep 29 2021, 10:35 AM

Harbormaster completed remote builds in B126371: Diff 375940. Sep 29 2021, 10:45 AM

Fix EOF & unrenamed StringKind

Harbormaster completed remote builds in B126388: Diff 375964. Sep 29 2021, 11:57 AM

cor3ntin retitled this revision from [WIP] Implement P2361 Unevaluated string literals to Implement P2361 Unevaluated string literals. Sep 30 2021, 5:07 AM

cor3ntin closed this revision. Oct 1 2021, 12:00 PM

cor3ntin added a reviewer: hubert.reinterpretcast. Oct 28 2021, 8:55 AM

aaron.ballman removed a child revision: D108469: Improve handling of static assert messages. Jun 28 2022, 8:48 AM

cor3ntin reopened this revision. Jun 22 2023, 1:01 PM

This revision is now accepted and ready to land. Jun 22 2023, 1:01 PM

Herald added a project: Restricted Project. Jun 22 2023, 1:01 PM

Herald added subscribers: PiotrZSL, carlosgalvezp.

As approved by CWG
Updates to cxx_status / doc will come later :)

Herald added a subscriber: jdoerfert. Jun 22 2023, 1:01 PM

Harbormaster completed remote builds in B240597: Diff 533743. Jun 22 2023, 3:18 PM

shafik added a subscriber: shafik. Jun 23 2023, 10:09 PM

shafik added inline comments.

clang/lib/Parse/ParseExpr.cpp
3268

Address Shafik's comment

Harbormaster completed remote builds in B240956: Diff 534201. Jun 24 2023, 6:54 AM

LGTM but I don't see asm covered in the tests.

clang/lib/AST/Expr.cpp
1140	Why not grouped w/ `Ordinary` above?
1164	Isn't this the same as `Length`?
1201	Isn't `Str.size()` the same as `ByteLength`?
clang/lib/Lex/LiteralSupport.cpp
90	Should we use `Is` as a prefix here? Right now it sounds like we are modifying something.

Rename /EscapeValidInUnevaluatedStringLiteral/IsEscapeValidInUnevaluatedStringLiteral

nigelp-xmos removed a subscriber: nigelp-xmos. Jun 26 2023, 9:50 AM

cor3ntin marked 43 inline comments as done. Jun 26 2023, 9:52 AM

cor3ntin added inline comments.

clang/lib/AST/Expr.cpp
1164	Only when CharByteWidth == 1
1201	ByteLength isn't defined in this scope; I guess I could move it.

This also should update the cxx_status page and have a release note.

clang/include/clang/Basic/Attr.td
1411 ↗	(On Diff #534201)	What is the plan for non-standard attributes? Are you planning to handle those in a follow-up, or should we be investigating those right now?
clang/include/clang/Parse/Parser.h
1857–1859	Two default `bool` params is a bad thing but three default `bool` params seems like we should fix the interface at this point. WDYT? Also, it's not clear what the new parameter will do, the function could use comments unless fixing the interface makes it sufficiently clear.
clang/lib/AST/Expr.cpp
1140	Specifically because we want the host encoding, not the target encoding.
1164	It is -- I think we can get rid of `ByteLength`, but it's possible that this exists because of the optimization comment below. I don't insist, but it would be nice to know if we can replace the switch with `Length /= CharByteWidth` these days.
1189	Add `assert(!Pascal && "Can't make an unevaluated Pascal string");` ?
1201	I think it's more clear to use `Str.size()` because we're copying from `Str.data()`.
1238
clang/lib/Lex/LiteralSupport.cpp
90	+1, I think `Is` would be an improvement.
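
The `Length /= CharByteWidth` idea raised for line 1164 can be sketched as below. This is an illustration only, with a hypothetical function name, and it does not answer the open question of whether the existing switch is there for optimization reasons.

```cpp
#include <cassert>

// Convert a length in bytes into a length in code units by dividing by the
// width of one code unit, instead of switching over the possible widths.
inline unsigned lengthInCodeUnitsSketch(unsigned ByteLength,
                                        unsigned CharByteWidth) {
  assert((CharByteWidth == 1 || CharByteWidth == 2 || CharByteWidth == 4) &&
         "unsupported code unit width");
  assert(ByteLength % CharByteWidth == 0 && "partial code unit");
  return ByteLength / CharByteWidth;
}
```

When `CharByteWidth == 1`, which is always the case for an unevaluated string per the discussion, the two lengths coincide.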

aaron.ballman added inline comments. Jun 26 2023, 9:54 AM

clang/lib/Lex/LiteralSupport.cpp
98	We're still missing support for some escape characters from: http://eel.is/c++draft/lex#nt:simple-escape-sequence-char Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?
1917	Doesn't returning here leave the object in a partially-initialized state? That seems bad.
2083–2086	Is there test coverage that we diagnose this properly?
clang/lib/Lex/PPMacroExpansion.cpp
1872–1874	Test coverage for this change?
clang/lib/Lex/Pragma.cpp
776	Pinging @ChuanqiXu for opinions.
clang/lib/Parse/ParseExpr.cpp
3497–3498	I'm surprised we need special logic in `ParseExpressionList()` for handling unevaluated string literals; I would have expected that to be needed when parsing a string literal. Nothing changed in the grammar for http://eel.is/c++draft/expr.post.general#nt:expression-list (or initializer-list), so these changes seem wrong. Can you explain the changes a bit more?
clang/lib/Sema/SemaDeclAttr.cpp
878–879 ↗	(On Diff #534201)	Test coverage for these changes?
clang/lib/Sema/SemaDeclCXX.cpp
16474	Test coverage for changes?

cor3ntin marked 2 inline comments as done. Jun 26 2023, 9:57 AM

cor3ntin added inline comments.

clang/lib/AST/Expr.cpp
1140	An unevaluated string is a sequence of 1-byte elements even on platforms where `sizeof(char)` would be 2 or 4. It's never influenced by the target's properties.
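
A simplified sketch of why `Unevaluated` is handled separately from `Ordinary`: the ordinary kinds take their element width from the target, while the unevaluated kind is fixed at one byte. The enum, the function and the bit-width parameters are hypothetical and only loosely modeled on the discussion; fixed widths for UTF-16 and UTF-32 are a simplification.

```cpp
#include <cassert>

// Hypothetical literal kinds for illustration.
enum class LiteralKindSketch { Ordinary, Wide, UTF8, UTF16, UTF32, Unevaluated };

// Width in bytes of one element of a literal of the given kind.
// TargetCharBits and TargetWCharBits stand in for target properties.
inline unsigned charByteWidthSketch(LiteralKindSketch Kind,
                                    unsigned TargetCharBits,
                                    unsigned TargetWCharBits) {
  switch (Kind) {
  case LiteralKindSketch::Ordinary:
  case LiteralKindSketch::UTF8:
    return TargetCharBits / 8; // depends on the target
  case LiteralKindSketch::Wide:
    return TargetWCharBits / 8; // depends on the target
  case LiteralKindSketch::UTF16:
    return 2; // simplification: assume 16-bit char16_t
  case LiteralKindSketch::UTF32:
    return 4; // simplification: assume 32-bit char32_t
  case LiteralKindSketch::Unevaluated:
    return 1; // never influenced by the target
  }
  return 1;
}
```

On a hypothetical target with 16-bit `char`, `Ordinary` yields 2 while `Unevaluated` still yields 1, which is the distinction made in the comment above.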

cor3ntin marked 6 inline comments as done. Jun 26 2023, 10:30 AM

cor3ntin added inline comments.

clang/include/clang/Basic/Attr.td
1411 ↗	(On Diff #534201)	I don't feel I'm qualified to answer that. Ideally, attributes that expect string literals that are not evaluated should follow suit.
clang/include/clang/Parse/Parser.h
1857–1859	I'm still not sure that's the best solution. `AllowEvaluatedString` would only ever be false for attributes. I considered duplicating the function, except it does quite a bit for variadics, which attributes apparently support. Maybe we could have `ParseAttributeArgumentList`, `ParseExpressionList`, and `ParseExpressionListImpl`?
clang/lib/AST/Expr.cpp
1164	I think we should.
clang/lib/Lex/LiteralSupport.cpp
98	"Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?" They are dealt with elsewhere, yes (and supported).
2083–2086	What sort of test would you like to see?
clang/lib/Parse/ParseExpr.cpp
3497–3498	We use `ParseExpressionList` when parsing attribute arguments, and some attributes have unevaluated strings as arguments. I agree with you that I'd rather find a better solution for attributes, but I came up empty. There is no further reason for this change, and you are right that it does not match the grammar.
clang/lib/Sema/SemaDeclAttr.cpp
878–879 ↗	(On Diff #534201)	There is one somewhere, I don't remember where. The reason we need to do that is that unevaluated StringLiterals don't have types.
clang/lib/Sema/SemaDeclCXX.cpp
16474	There are some in dcl.link/p2.cpp

Address some of Aaron's comments

Harbormaster completed remote builds in B241235: Diff 534640. Jun 26 2023, 1:44 PM

ChuanqiXu added inline comments. Jun 26 2023, 7:42 PM

clang/lib/Lex/Pragma.cpp
776	I think both options (to modify it or not) are acceptable, because the input here should be the output of clang itself. See https://github.com/llvm/llvm-project/blob/ebd0b8a0472b865b7eb6e1a32af97ae31d829033/clang/lib/Basic/Module.cpp#L229-L231 and https://github.com/llvm/llvm-project/blob/ebd0b8a0472b865b7eb6e1a32af97ae31d829033/clang/lib/Frontend/Rewrite/FrontendActions.cpp#L238-L240. We can see there is no deprecated prefix. So while it is acceptable to modify this since its pattern matches the paper, it doesn't really matter since we can control the input completely. Personally, I prefer not to touch it, since I feel this use case hasn't been used a lot, so the effort here may not be worth it.

aaron.ballman added inline comments. Jun 27 2023, 7:41 AM

clang/include/clang/Basic/Attr.td
1411 ↗	(On Diff #534201)	Let's do them in a follow-up. Normally I'd suggest working with @erichkeane on which attributes to apply that to, but he's about to go on a sabbatical and might not have time to help with that. So maybe you can take a first pass at it as best you can and then rope me in to help finalize it, if that'd work for you?
clang/lib/Lex/LiteralSupport.cpp
2083–2086	Pascal strings enabled and using something like `[[deprecated("\pOh no, a Pascal string!")]]` (or some other unevaluated uses).
clang/lib/Parse/ParseExpr.cpp
3497–3498	I was thinking we'd use a new kind of evaluation context for this. We'd enter the evaluation context when we know we need to parse an expression that is an unevaluated string literal which the string literal parser would pay attention to. This would require knowing up-front when we want to parse an unevaluated string literal, but we should have that information available to us at parse time (I think).
clang/lib/Sema/SemaDeclAttr.cpp
878–879 ↗	(On Diff #534201)	Let's try to track that down, but... an unevaluated string literal still has a type, surely? It'd be `const char[]` for C++?

cor3ntin added inline comments. Jun 27 2023, 8:14 AM

clang/lib/Parse/ParseExpr.cpp
3497–3498	After offline discussion, I think what we want is a `ParseAttributeArgumentList` function that is aware of whether the Nth argument is an unevaluated string (by means of modifying tablegen) and does the right parsing accordingly. It would take care of all attributes automatically. Alas, that's a tad more involved.

aaron.ballman added inline comments. Jun 27 2023, 8:16 AM

clang/lib/Parse/ParseExpr.cpp
3497–3498	+1 I agree it's more involved, but it's also a more general solution that fits nicely in the parser design (we do this sort of thing for other parts of attribute parsing).
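
One way to picture the tablegen-driven approach agreed on above: a generated table maps each attribute to a bitmask of argument positions that are unevaluated strings, and the attribute argument parser consults it per argument. Everything below is a hypothetical sketch; the function names, the table contents and the `example_attr` attribute are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for a table that tablegen could emit from Attr.td.
// Bit N set means argument N should be parsed as an unevaluated string.
inline std::uint32_t stringLiteralArgMaskSketch(const std::string &AttrName) {
  if (AttrName == "deprecated")
    return 0x1; // argument 0 is the message
  if (AttrName == "example_attr")
    return 0x2; // invented attribute: only argument 1 is a string
  return 0;     // unknown attributes keep ordinary expression parsing
}

// What the argument-list parser would ask before parsing argument ArgIndex.
inline bool isUnevaluatedStringArgSketch(const std::string &AttrName,
                                         unsigned ArgIndex) {
  if (ArgIndex >= 32)
    return false; // beyond what the bitmask can describe
  return ((stringLiteralArgMaskSketch(AttrName) >> ArgIndex) & 1u) != 0;
}
```

With the decision driven by data generated from Attr.td, `ParseExpressionList` itself needs no special case, which matches the concern that the grammar for expression lists did not change.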

cor3ntin added inline comments. Jun 27 2023, 8:29 AM

clang/lib/Sema/SemaDeclAttr.cpp
878–879 ↗	(On Diff #534201)	It doesn't, because it doesn't exist past phase 6. It's not unevaluated as in decltype; it's more unevaluated in the sense that it's a weird token that never participates in the program, the same way a pragma or an attribute doesn't have a type. Note that we can revert that change if we do the whole tablegen thing. The relevant test is in test/SemaCXX/warn-thread-safety-parsing.cpp, L17

Add tests for pascal strings (which are not a thing in C++ apparently)

Harbormaster completed remote builds in B241499: Diff 534999. Jun 27 2023, 10:33 AM

Parse attribute arguments as unevaluated strings if they
are declared as StringLiteralArgument in the Attr.td file.

WIP

@aaron.ballman Do we agree on direction before I
fix the remaining broken tests?

There are a few limitations, which I'm hoping not to fix there

It doesn't support variadic string arguments
checking the type of argument ahead of time seems like a good idea overall, maybe we want to expand that system?

Herald added a project: Restricted Project. Jun 27 2023, 11:19 AM

Herald added a subscriber: llvm-commits.

In D105759#4453440, @cor3ntin wrote:

Parse attribute arguments as unevaluated strings if they
are declared as StringLiteralArgument in the Attr.td file.

WIP

@aaron.ballman Do we agree on direction before I
fix the remaining broken tests?

Mostly agreed, though I left a comment where I think the direction should change slightly.

There are a few limitations, which I'm hoping not to fix there

It doesn't support variadic string arguments

That's reasonable; let's leave them as evaluated strings for now so there's no behavioral change.

checking the type of argument ahead of time seems like a good idea overall, maybe we want to expand that system?

I agree; we currently have the common handler checking argument counts (https://github.com/llvm/llvm-project/blob/1e010c5c4fae43c52d6f5f1c8e8920c26bcc6cc7/clang/lib/Sema/SemaAttr.cpp#L1419), but we don't generate any code for checking argument types. But that certainly doesn't need to be done as part of your work.

clang/include/clang/Basic/Attr.td
3048 ↗	(On Diff #535073)	I don't think we should reuse this flag this way. This flag is for the traditional sense of "unevaluated", but unevaluated string literals are a different kind of beast. I think that should be tracked on the argument level. We can either adjust: class StringArgument<string name, bit opt = 0> : Argument<name, opt>; so that it takes another bit for whether the string is unevaluated or not, or we could add a new subclass for `UnevaluatedStringArgument`. Then ClangAttrEmitter.cpp would look at this information when emitting the switch cases.
llvm/cmake/modules/HandleLLVMOptions.cmake
608 ↗	(On Diff #535073)	Spurious change. ;-)

Clearing the "accepted" status so it's not confusing as to the state of things.

This revision now requires changes to proceed. Jun 27 2023, 12:21 PM

cor3ntin added inline comments. Jun 27 2023, 12:40 PM

clang/include/clang/Basic/Attr.td
3048 ↗	(On Diff #535073)	This is the previous approach, which I forgot to fix up everywhere. My current approach is to always consider StringArgument unevaluated. I don't think it makes sense to have both StringArgument and UnevaluatedStringArgument. Currently, in all the places we accept StringArgument, we check that it's a possibly parenthesized StringLiteral. If you want an evaluated string literal, an expression that produces a const char* or something should work.
llvm/cmake/modules/HandleLLVMOptions.cmake
608 ↗	(On Diff #535073)	I've been battling with that for weeks; that flag completely breaks my IDE, not sure why. It was inevitable that it ended up in a commit :|

aaron.ballman added inline comments. Jun 27 2023, 1:32 PM

clang/include/clang/Basic/Attr.td
3048 ↗	(On Diff #535073)	"My current approach is to always consider StringArgument unevaluated. I don't think it makes sense to have both StringArgument and UnevaluatedStringArgument." I think that's potentially a pretty significant change in behavior until we actually evaluate (ahahaha, puns!) all the vendor attributes using a `StringArgument`. Also, I thought you mentioned you planned to leave variadic string arguments as evaluated strings, so there would be a pretty surprising inconsistency to the behavior there. I would feel more comfortable not changing the behavior of attributes we've not validated are still correct when using unevaluated strings.

Harbormaster completed remote builds in B241554: Diff 535073. Jun 27 2023, 1:51 PM

Fix tests and handle variadic attributes.

With that, all normal attributes are handled. Only attributes with custom parsing code and those specified as an enum are left untouched.

Note that I have confirmed that all the changed attributes require a StringLiteral and go through checkStringLiteralArgumentAttr

revert accidental changes to cmake

Just 2 small nits, otherwise this all LGTM.

clang/lib/Parse/ParseDecl.cpp
430 ↗	(On Diff #535356)	Please put a newline between unchained 'if' statements... it makes these really hard to read without it. It happens a few times here.
clang/lib/Sema/SemaDeclAttr.cpp
878 ↗	(On Diff #535356)	Unrelated change here? What is this for?

Address Erich's comments

cor3ntin added inline comments. Jun 28 2023, 7:02 AM

clang/lib/Sema/SemaDeclAttr.cpp
878 ↗	(On Diff #535356)	Some test I failed to fully revert. Good catch!

Harbormaster completed remote builds in B241770: Diff 535373. Jun 28 2023, 7:58 AM

I don't think it's correct to assume that all string arguments to attributes are unevaluated, but it is hard to tell where to draw the line sometimes. Backing up a step, as I understand P2361, an unevaluated string is one which is not converted into the execution character set (effectively). Is that correct? If so, then as an example, [[clang::annotate()]] should almost certainly be using an evaluated string because the argument is passed down to LLVM IR and is used in ways we cannot predict. What's more, an unevaluated string cannot have some kinds of escape characters (numeric and conditional escape sequences) and those are currently allowed by clang::annotate and could potentially be used by a backend plugin.

I think other attributes may have similar issues. For example, the alias attribute is a bit of a question mark for me -- that takes a string literal representing an external identifier that is looked up. I'm not certain whether that should be in the execution character set or not, but we do support escape sequences for it: https://godbolt.org/z/v65Yd7a68

I think we need to track evaluated vs not on the argument level so that the attributes in Attr.td can decide which form to use. I think we should default to "evaluated" for any attribute we're on the fence about because that's the current behavior they get today (so we should avoid regressions).
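
To make the escape-sequence concern concrete, here is a small sketch of a scan over the spelling of a string literal body that flags numeric escapes (`\x...` and octal `\0` through `\7`), which an unevaluated string would reject. The function name is hypothetical, and conditional escape sequences, which are also mentioned above, are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if the spelling contains a hexadecimal or octal escape.
// Spelling is the source text between the quotes, backslashes included.
inline bool hasNumericEscapeSketch(const std::string &Spelling) {
  for (std::size_t I = 0; I + 1 < Spelling.size(); ++I) {
    if (Spelling[I] != '\\')
      continue;
    char Next = Spelling[I + 1];
    if (Next == 'x' || (Next >= '0' && Next <= '7'))
      return true;
    ++I; // skip the escaped character so an escaped backslash followed by 'x' is not misread
  }
  return false;
}
```

An attribute argument spelled with such an escape is the kind of code that would go from accepted to rejected if the attribute switched to unevaluated strings.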

clang/include/clang/Sema/ParsedAttr.h
919 ↗	(On Diff #535373)
clang/lib/Parse/ParseDecl.cpp
291 ↗	(On Diff #535373)	Comment doesn't match the function name. ;-)
453–454 ↗	(On Diff #535373)	What are these lines intended to do? We assign to `E` but nothing ever reads from it after this assignment and we reset it on the next iteration through the loop.
clang/lib/Parse/ParseExpr.cpp
3497	Can revert these two changes now.

In D105759#4456864, @aaron.ballman wrote:

I don't think it's correct to assume that all string arguments to attributes are unevaluated, but it is hard to tell where to draw the line sometimes. Backing up a step, as I understand P2361, an unevaluated string is one which is not converted into the execution character set (effectively). Is that correct? If so, then as an example, [[clang::annotate()]] should almost certainly be using an evaluated string because the argument is passed down to LLVM IR and is used in ways we cannot predict. What's more, an unevaluated string cannot have some kinds of escape characters (numeric and conditional escape sequences) and those are currently allowed by clang::annotate and could potentially be used by a backend plugin.

I think other attributes may have similar issues. For example, the alias attribute is a bit of a question mark for me -- that takes a string literal representing an external identifier that is looked up. I'm not certain whether that should be in the execution character set or not, but we do support escape sequences for it: https://godbolt.org/z/v65Yd7a68

I took a quick pass over our existing attributes, and here's my intuition on them regarding encoding of the literal:

Unevaluated Strings are Fine:
AbiTag
TLSModel
Availability
Deprecated
EnableIf/DiagnoseIf
ObjCRuntimeName
PragmaClangBSSSection/PragmaClangDataSection/PragmaClangRodataSection/PragmaClangRelroSection/PragmaClangTextSection (only created implicitly)
Suppress
Target/TargetVersion/TargetClones
Unavailable
Uuid
WarnUnusedResult
NoSanitize
Capability
Assumption
NoBuiltin (it names a builtin name, so this is probably fine to leave unevaluated?)
AcquireHandle/UseHandle/ReleaseHandle
Error
HLSLResourceBinding

Unevaluated Strings are Potentially Bad:
Annotate
AnnotateType

Unevaluated String Needs More Thinking (common thread is that they survive to LLVM IR):
Alias
AsmLabel
IFunc
BTFDeclTag/BTFTypeTag (is emitted to DWARF with -g so probably evaluated?)
WebAssemblyExportName/WebAssemblyImportModule/WebAssemblyImportName
ExternalSourceSymbol
SwiftAsyncName/SwiftAttr/SwiftBridge/SwiftName
Section/CodeSeg/InitSeg
WeakRef
EnforceTCB/EnforceTCBLeaf

There's also the escape sequences issue where use of an escape sequence will go from accepted to rejected in these contexts. I did some hunting to see if I could find uses of numeric escape sequences in asm labels or alias attributes, to see if there's some signs this is done in practice:

Testing we can find numeric escape sequences at all:
https://sourcegraph.com/search?q=context:global+%5C%28%5C%22%5B%5B:alpha:%5D%5D*%28%5C%5C%5B%5B:digit:%5D%5D%2B%29%2B%5B%5B:alpha:%5D%5D*%5C%22%5C%29+lang:C+lang:C%2B%2B&patternType=regexp&case=yes&sm=1&groupBy=repo

Testing we can find asm labels at all:
https://sourcegraph.com/search?q=context:global+asm%5C%28%5C%22%5B%5B:alpha:%5D%5D*%5B%5B:alpha:%5D%5D*%5C%22%5C%29+lang:C+lang:C%2B%2B&patternType=regexp&case=yes&sm=1&groupBy=repo

Testing we can find asm labels with numeric escapes:
https://sourcegraph.com/search?q=context:global+asm%5C%28%5C%22%5B%5B:alpha:%5D%5D*%28%5C%5C%5B%5B:digit:%5D%5D%2B%29%2B%5B%5B:alpha:%5D%5D*%5C%22%5C%29+lang:C+lang:C%2B%2B&patternType=regexp&case=yes&sm=1&groupBy=repo

Testing we can find alias attributes at all:
https://sourcegraph.com/search?q=context:global+alias%5C%28%5C%22%5B%5B:alpha:%5D%5D*%5B%5B:alpha:%5D%5D*%5C%22%5C%29+lang:C+lang:C%2B%2B&patternType=regexp&case=yes&sm=1&groupBy=repo

Testing we can find alias attributes with numeric escapes:
https://sourcegraph.com/search?q=context:global+alias%5C%28%5C%22%5B%5B:alpha:%5D%5D*%28%5C%5C%5B%5B:digit:%5D%5D%2B%29%2B%5B%5B:alpha:%5D%5D*%5C%22%5C%29+lang:C+lang:C%2B%2B&patternType=regexp&case=yes&sm=1&groupBy=repo

I think this leaves me with three open questions:

Do we know of any uses of the annotate attribute that rely on the string literal being in the execution character set? I do not know of any but I know this is used by plugins quite often.
Do we know of any attributes in the "needs more thinking" list that should have the string literal encoded in the execution character set? I think most of these are for referring to identifiers in source and I expect those would want source character set and not execution character set strings.
Do we know of any significant body of code using numeric escape sequences in these string literals that could not be relatively easily modified to compile again? I would be surprised, but I think someone should probably run more of the attributes on the "needs more thinking" list through similar searches on source graph and we can use that as an approximation.

If all these answers come back "no" as best we can figure, then I think we can punt on argument-level handling of this until we add an attribute that really does need an execution-encoded (or numeric escape sequence-using) string literal. I think we've got enough time before the Clang 17 branch to hear if the changes cause problems after we've done this due diligence. WDYT?

In D105759#4456864, @aaron.ballman wrote:

I don't think it's correct to assume that all string arguments to attributes are unevaluated, but it is hard to tell where to draw the line sometimes. Backing up a step, as I understand P2361, an unevaluated string is one which is not converted into the execution character set (effectively). Is that correct? If so, then as an example, [[clang::annotate()]] should almost certainly be using an evaluated string because the argument is passed down to LLVM IR and is used in ways we cannot predict. What's more, an unevaluated string cannot have some kinds of escape characters (numeric and conditional escape sequences) and those are currently allowed by clang::annotate and could potentially be used by a backend plugin.

I think other attributes may have similar issues. For example, the alias attribute is a bit of a question mark for me -- that takes a string literal representing an external identifier that is looked up. I'm not certain whether that should be in the execution character set or not, but we do support escape sequences for it: https://godbolt.org/z/v65Yd7a68

I think we need to track evaluated vs not on the argument level so that the attributes in Attr.td can decide which form to use. I think we should default to "evaluated" for any attribute we're on the fence about because that's the current behavior they get today (so we should avoid regressions).

I really don't think it makes sense to have both "unevaluated" and "evaluated" arguments.
We chatted offline and we struggled to find places where escape sequences are used, or examples of attributes intended to be in the execution character set.

My suggestion would be to land the non-attribute changes now, and the attribute bits early in Clang 18.
If we find a clear example of attributes expecting the execution character set, they should be able to be described as an expression, which will be checked as a string literal anyway, hopefully?

In the case of annotate, if these are fed, for example, to a debugger, they may need to be converted to whatever encoding the debugger expects, which is not necessarily the execution charset.
Same for plugins: they certainly do not expect EBCDIC data, for example.
I would expect, for example, static analyzers and code generators to keep working after the introduction of -fexec-charset.
So it's important that the string remains unevaluated in the front end so that it can be correctly converted to the appropriate encoding for the various consumers, which doesn't have a single answer.

Do we know of any attributes in the "needs more thinking" list that should have the string literal encoded in the execution character set? I think most of these are for referring to identifiers in source and I expect those would want source character set and not execution character set strings.

Identifiers and symbol names are in UTF-8 and may get mangled, for example by replacing non-ASCII code points with UCNs. The source character set is never relevant.
This addresses the WebAsm attributes.

BTFDeclTag/BTFTypeTag (is emitted to DWARF with -g so probably evaluated?)

Is it correct to assume the debugger file encoding is always the same as the program's? Probably not!
If need be, we can then transcode the strings when doing codegen for these things
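To make the evaluated/unevaluated distinction in this thread concrete, here is a minimal sketch of contexts that P2361 treats as unevaluated strings: the text is only read by the compiler and never encoded into the execution character set. The function names (`add_one`, `old_add_one`) are hypothetical and exist only for illustration. Note that the attribute part of the paper was deferred in this revision, so the `[[deprecated]]` line illustrates the paper rather than this patch.

```cpp
#include <cassert>

// Language linkage: the string names a linkage, it is never turned into
// an array of execution-character-set code units.
extern "C" int add_one(int x) { return x + 1; }

// Attribute message: diagnostic text only (covered by P2361, but the
// attribute changes were split out of this revision).
[[deprecated("use add_one instead")]] inline int old_add_one(int x) {
  return x + 1;
}

// static_assert message: also only ever shown in a diagnostic.
static_assert(sizeof(char) == 1, "char is one byte by definition");

// With this patch, an encoding prefix in such a context is diagnosed,
// so the following lines are intentionally left commented out:
// static_assert(true, L"wide message");
// extern u8"C" void prefixed_linkage();
```

By contrast, a string literal used as an ordinary expression (for example an initializer of a `const char *`) is evaluated and keeps its encoding prefix semantics.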

cor3ntin mentioned this in D154290: [Clang] Implement P2741R3 - user-generated static_assert messages. (Jul 1 2023, 3:39 PM)

In D105759#4457041, @cor3ntin wrote:

In D105759#4456864, @aaron.ballman wrote:

I don't think it's correct to assume that all string arguments to attributes are unevaluated, but it is hard to tell where to draw the line sometimes. Backing up a step, as I understand P2361, an unevaluated string is one which is not converted into the execution character set (effectively). Is that correct? If so, then as an example, [[clang::annotate()]] should almost certainly be using an evaluated string because the argument is passed down to LLVM IR and is used in ways we cannot predict. What's more, an unevaluated string cannot have some kinds of escape characters (numeric and conditional escape sequences) and those are currently allowed by clang::annotate and could potentially be used by a backend plugin.

I think other attributes may have similar issues. For example, the alias attribute is a bit of a question mark for me -- that takes a string literal representing an external identifier that is looked up. I'm not certain whether that should be in the execution character set or not, but we do support escape sequences for it: https://godbolt.org/z/v65Yd7a68

I think we need to track evaluated vs not on the argument level so that the attributes in Attr.td can decide which form to use. I think we should default to "evaluated" for any attribute we're on the fence about because that's the current behavior they get today (so we should avoid regressions).

I really don't think it makes sense to have both "unevaluated" and "evaluated" arguments.
We chatted offline and we struggle to find places where escape sequences are used, or examples of attributes intended to be in the execution character set.

In general I agree, but the one scenario that I keep coming back to are attributes like diagnose_if where they take an expression we're evaluating at compile time (condition expression) and a string literal that's not evaluated (warning vs error, diagnostic message itself). But I think the "evaluating at compile time" is part of why I don't think we intend the attribute to be considering the execution character set.

My suggestion would be to land the non-attributes changes now, and the attributes bits in early clang 18.

I think we're almost safe enough to make the attribute changes in Clang 17 so that no attribute uses an evaluated argument, but given that there's less than a month before we make the 17 branch, I think it's probably a good idea to make these changes after the branch point so folks have longer to react. Adding clang-vendors to the review for awareness of the potential for a breaking change.

I removed the changes to attributes.
Nothing else changes except cxx_status/ReleaseNotes.

Unevaluated strings in attributes will be back (in a separate PR)

LGTM with a minor tweak to the wording on the status page, thank you!

clang/www/cxx_status.html
118–124

This revision is now accepted and ready to land. (Jul 7 2023, 4:21 AM)

This revision was landed with ongoing or failed builds. (Jul 7 2023, 4:30 AM)

Closed by commit rG95f50964fbf5: Implement P2361 Unevaluated string literals (authored by cor3ntin).

This revision was automatically updated to reflect the committed changes.

cor3ntin added a commit: rG95f50964fbf5: Implement P2361 Unevaluated string literals.

Harbormaster completed remote builds in B243730: Diff 538073. (Jul 7 2023, 5:00 AM)

barannikov88 added a subscriber: barannikov88. (Jul 7 2023, 7:20 AM)

barannikov88 added inline comments. (Jul 7 2023, 7:37 AM)

clang/docs/ReleaseNotes.rst
138	Looks like a copy&paste bug.

cor3ntin added inline comments. (Jul 7 2023, 7:40 AM)

clang/docs/ReleaseNotes.rst
138	Nice catch, thanks

@cor3ntin
I've been working on pretty much the same functionality in our downstream fork. I was not aware of the paper, nor of the ongoing work in this direction, and so I unfortunately missed the review.
Thanks for this patch, it significantly reduces the number of changes downstream and makes it easier to merge with upstream in the future.

I have a couple of questions about future work:

IIUC the paper initially addressed this issue with #line directive, but the changes were reverted(?). Is there any chance they can get back?
Are there any plans for making similar changes to asm statement parsing?

In D105759#4482543, @barannikov88 wrote:

@cor3ntin
I've been working on pretty much the same functionality in our downstream fork. I was not aware of the paper, nor of the ongoing work in this direction, and so I unfortunately missed the review.
Thanks for this patch, it significantly reduces the number of changes downstream and makes it easier to merge with upstream in the future.

I have a couple of questions about future work:

IIUC the paper initially addressed this issue with #line directive, but the changes were reverted(?). Is there any chance they can get back?

There is a core issue tracking that, i.e., the C++ committee was concerned about escape sequences in header names:
https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2693

I'd be happy to bring that back to clang though, as the concern is unlikely to be warranted for us.

Are there any plans for making similar changes to asm statement parsing?

The direction of the C++ committee is that what's in asm() is now strictly implementation-defined, so we could, but last time there were concerns about escape sequences in there too.

uabelho added a subscriber: uabelho. (Jul 10 2023, 3:55 AM)

I hope this patch may allow to gather some data on that.

@cor3ntin, I have reports that applications having encoding prefixes in static_assert are failing to build. The committee did not adopt the subject paper as a "DR resolution". Is it possible to downgrade to a warning?

In D105759#4540716, @hubert.reinterpretcast wrote:

I hope this patch may allow to gather some data on that.

@cor3ntin, I have reports that applications having encoding prefixes in static_assert are failing to build. The committee did not adopt the subject paper as a "DR resolution". Is it possible to downgrade to a warning?

Do you know how frequent it is?

Making it a warning is possible but not straightforward.

Previously, with a prefix, it was parsed as a wide (for example) string, and we relied on the fact that L was UTF-16/32 to sometimes print a reasonable diagnostic, and sometimes not: https://godbolt.org/z/f3Pj4T5aj
This is going to be worse when we add -fexec-charset:

static_assert(true, L"やあ") is going to be ill-formed when encoded as, e.g., EBCDIC because it's not representable, and if it is representable we need to either output mojibake or convert the string back to UTF-8, which we are currently not doing.

Another solution may be to lexically ignore prefixes by replacing the string literal token on the fly such that they are still parsed as unevaluated strings and not encoded. I could look into that.

In D105759#4541813, @cor3ntin wrote:

In D105759#4540716, @hubert.reinterpretcast wrote:

I hope this patch may allow to gather some data on that.

@cor3ntin, I have reports that applications having encoding prefixes in static_assert are failing to build. The committee did not adopt the subject paper as a "DR resolution". Is it possible to downgrade to a warning?

Do you know how frequent it is?

No, but I am concerned that this came up even before we deployed an LLVM 17-based solution (in pre-release testing). I believe that reverting for LLVM 17 is the prudent course of action.

Previously, with a prefix, it was parsed as a wide (for example) string, and we relied on the fact that L was UTF-16/32 to sometimes print a reasonable diagnostic, and sometimes not: https://godbolt.org/z/f3Pj4T5aj
This is going to be worse when we add -fexec-charset:

static_assert(true, L"やあ") is going to be ill-formed when encoded as, e.g., EBCDIC because it's not representable, and if it is representable we need to either output mojibake or convert the string back to UTF-8, which we are currently not doing.

This may be the motivation for the prefixes in the applications in the first place in the context of other compilers: They may have needed the prefix to avoid unrepresentable character issues (e.g., if the other compiler rejects the unprefixed string, but manages to emit the error to the terminal because both the terminal and the compiler use the source encoding).

Another solution may be to lexically ignore prefixes by replacing the string literal token on the fly such that they are still parsed as unevaluated strings and not encoded. I could look into that.

This sounds good.

In D105759#4543184, @hubert.reinterpretcast wrote:

In D105759#4541813, @cor3ntin wrote:

In D105759#4540716, @hubert.reinterpretcast wrote:

I hope this patch may allow to gather some data on that.

@cor3ntin, I have reports that applications having encoding prefixes in static_assert are failing to build. The committee did not adopt the subject paper as a "DR resolution". Is it possible to downgrade to a warning?

Do you know how frequent it is?

No, but I am concerned that this came up even before we deployed an LLVM 17-based solution (in pre-release testing). I believe that reverting for LLVM 17 is the prudent course of action.

Early reports of user code getting tripped up on this is something we should react to while we still can; I'd recommend we change the diagnostic to be a warning that defaults to an error so that users who are caught by the changes can still disable the diagnostic rather than be stuck; for Clang 18, we can explore other solutions to the issue. Would this work for you @hubert.reinterpretcast?

In D105759#4543246, @aaron.ballman wrote:

I'd recommend we change the diagnostic to be a warning that defaults to an error so that users who are caught by the changes can still disable the diagnostic rather than be stuck; for Clang 18, we can explore other solutions to the issue. Would this work for you @hubert.reinterpretcast?

I think there are questions about whether an error (or even warning) by default is appropriate. This seems to be a change for C++2c that does not have "DR" treatment from the committee. Considering this a warning controlled by c++2c-compat is a potential direction. Indeed, if we are going to accept the code, we might as well allow it as an extension in C++2c modes. With this line of logic, I don't see why we would want user-side churn of making a migration effort.

In D105759#4543685, @hubert.reinterpretcast wrote:

In D105759#4543246, @aaron.ballman wrote:

I'd recommend we change the diagnostic to be a warning that defaults to an error so that users who are caught by the changes can still disable the diagnostic rather than be stuck; for Clang 18, we can explore other solutions to the issue. Would this work for you @hubert.reinterpretcast?

I think there are questions about whether an error (or even warning) by default is appropriate. This seems to be a change for C++2c that does not have "DR" treatment from the committee. Considering this a warning controlled by c++2c-compat is a potential direction. Indeed, if we are going to accept the code, we might as well allow it as an extension in C++2c modes. With this line of logic, I don't see why we would want user-side churn of making a migration effort.

I will endeavor to have a patch by the beginning of the week.

I think the implementation effort is going to be the same whether it is an error by default or not so we can discuss that. I don't have a strong opinion.
Ideally, that would depend on how many users are affected.

However, I don't think doing nothing at all is a reasonable expectation here: an L prefix in a static_assert message either does not work or is ignored. In no case does it do what the user wants: https://godbolt.org/z/fYnMqT38P
Text encodings are sufficiently confusing that we should not add to the confusion by not telling users that their encoding prefixes have no effect.

And, given that prior to c++20 the standard simply ignores encoding prefixes, we could also discuss whether it was ever intended for prefixes to be supported or whether it was an oversight to begin with.

cor3ntin mentioned this in D156596: [Clang] Produce a warning instead of an error in unevaluated strings before C++26. (Jul 31 2023, 6:53 AM)

Revision Contents

Path	Size
clang-tools-extra/test/clang-tidy/checkers/modernize/unary-static-assert.cpp	3 lines
clang/docs/ReleaseNotes.rst	2 lines
clang/include/clang/AST/Expr.h	5 lines
clang/include/clang/Basic/DiagnosticLexKinds.td	7 lines
clang/include/clang/Basic/DiagnosticSemaKinds.td	3 lines
clang/include/clang/Lex/LiteralSupport.h	29 lines
clang/include/clang/Parse/Parser.h	4 lines
clang/include/clang/Sema/Sema.h	2 lines
clang/lib/AST/Expr.cpp	70 lines
clang/lib/Lex/LiteralSupport.cpp	96 lines
clang/lib/Lex/PPMacroExpansion.cpp	3 lines
clang/lib/Lex/Pragma.cpp	3 lines
clang/lib/Parse/ParseDeclCXX.cpp	4 lines
clang/lib/Parse/ParseExpr.cpp	16 lines
clang/lib/Sema/SemaDeclCXX.cpp	11 lines
clang/lib/Sema/SemaExpr.cpp	24 lines
clang/lib/Sema/SemaExprCXX.cpp	3 lines
clang/lib/Sema/SemaInit.cpp	3 lines
clang/test/CXX/dcl.dcl/dcl.link/p2.cpp	8 lines
clang/test/CXX/dcl.dcl/p4-0x.cpp	5 lines
clang/test/FixIt/fixit-static-assert.cpp	2 lines
clang/test/SemaCXX/static-assert.cpp	24 lines
clang/www/cxx_status.html	7 lines

Diff 538078

clang-tools-extra/test/clang-tidy/checkers/modernize/unary-static-assert.cpp

  // RUN: %check_clang_tidy -std=c++17-or-later %s modernize-unary-static-assert %t

  #define FOO static_assert(sizeof(a) <= 15, "");
  #define MSG ""

  void f_textless(int a) {
  static_assert(sizeof(a) <= 10, "");
  // CHECK-MESSAGES: :[[@LINE-1]]:3: warning: use unary 'static_assert' when the string literal is an empty string [modernize-unary-static-assert]
  // CHECK-FIXES: {{^}} static_assert(sizeof(a) <= 10 );{{$}}
- static_assert(sizeof(a) <= 12, L"");
- // CHECK-MESSAGES: :[[@LINE-1]]:3: warning: use unary 'static_assert' when
- // CHECK-FIXES: {{^}} static_assert(sizeof(a) <= 12 );{{$}}
  FOO
  // CHECK-FIXES: {{^}} FOO{{$}}
  static_assert(sizeof(a) <= 17, MSG);
  // CHECK-FIXES: {{^}} static_assert(sizeof(a) <= 17, MSG);{{$}}
  }

  void f_with_tex(int a) {
  static_assert(sizeof(a) <= 10, "Size of variable a is out of range!");
  }

  void f_unary(int a) { static_assert(sizeof(a) <= 10); }

  void f_incorrect_assert() { static_assert(""); }
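For readers unfamiliar with the check exercised by this test, the following is a small sketch of the three `static_assert` spellings it distinguishes. The helper `fits_in` is a hypothetical name introduced only for this example, and C++17 or later is assumed for the unary form.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: does T fit in max_bytes bytes?
template <typename T>
constexpr bool fits_in(std::size_t max_bytes) {
  return sizeof(T) <= max_bytes;
}

// Binary form with an empty message: the spelling that
// modernize-unary-static-assert reports.
static_assert(fits_in<char>(1), "");

// Unary form (C++17): what the fix-it rewrites the line above to.
static_assert(fits_in<char>(1));

// A non-empty message carries information, so the check leaves it alone.
static_assert(fits_in<char>(1), "char must fit in one byte");
```

The removed test case used `L""` as the empty message; once encoding prefixes are rejected in unevaluated strings, that spelling no longer compiles, which is presumably why the case was dropped.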

clang/docs/ReleaseNotes.rst


- Implemented partial support for `P2448R2: Relaxing some constexpr restrictions <https://wg21.link/p2448r2>`_

non-constexpr functions and constructors.

- Clang now supports `requires cplusplus23` for module maps.

- Implemented `P2564R3: consteval needs to propagate up <https://wg21.link/P2564R3>`_.

C++2c Feature Support

^^^^^^^^^^^^^^^^^^^^^

- Compiler flags ``-std=c++2c`` and ``-std=gnu++2c`` have been added for experimental C++2c implementation work.

- Implemented `P2738R1: constexpr cast from void* <https://wg21.link/P2738R1>`_.

- Partially implemented `P2361R6: constexpr cast from void* <https://wg21.link/P2361R6>`_.

barannikov88 · Unsubmitted · Not Done

  - Implemented `P2738R1: constexpr cast from void* <https://wg21.link/P2738R1>`_.
- - Partially implemented `P2361R6: constexpr cast from void* <https://wg21.link/P2361R6>`_.
+ - Partially implemented `P2361R6: Unevaluated strings <https://wg21.link/P2361R6>`_.
  The changes to attributes declarations are not part of this release.

Looks like a copy&paste bug.

cor3ntin · Author · Unsubmitted · Done

Nice catch, thanks

The changes to attributes declarations are not part of this release.

Resolutions to C++ Defect Reports

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- Implemented `DR2397 <https://wg21.link/CWG2397>`_ which allows ``auto`` specifier for pointers

and reference to arrays.

C Language Changes

------------------


clang/include/clang/AST/Expr.h

@@ class StringLiteral final
  /// consider moving it inside StringLiteral.
  ///
  /// * An array of getNumConcatenated() SourceLocation, one for each of the
  /// token this string is made of.
  ///
  /// * An array of getByteLength() char used to store the string data.
  public:
- enum StringKind { Ordinary, Wide, UTF8, UTF16, UTF32 };
+ enum StringKind { Ordinary, Wide, UTF8, UTF16, UTF32, Unevaluated };
  private:
  unsigned numTrailingObjects(OverloadToken<unsigned>) const { return 1; }
  unsigned numTrailingObjects(OverloadToken<SourceLocation>) const {
  return getNumConcatenated();
  }
  unsigned numTrailingObjects(OverloadToken<char>) const {
@@ public:
  }
  /// Construct an empty string literal.
  static StringLiteral *CreateEmpty(const ASTContext &Ctx,
  unsigned NumConcatenated, unsigned Length,
  unsigned CharByteWidth);
  StringRef getString() const {
- assert(getCharByteWidth() == 1 &&
+ assert((isUnevaluated() || getCharByteWidth() == 1) &&

aaron.ballman · Unsubmitted · Done

Do we also want to assert that if it is unevaluated, its char byte width *is* one byte? (No such thing as a multibyte unevaluated string literal.)

cor3ntin · Author · Unsubmitted · Done

This test is there because unevaluated strings don't have bytes at all! (trying to call getCharByteWidth() on them would assert)

aaron.ballman · Unsubmitted · Done

Ah, good point!

  "This function is used in places that assume strings use char");
  return StringRef(getStrDataAsChar(), getByteLength());

aaron.ballman · Unsubmitted · Done

  StringRef getString() const {
- assert(isUnevaluated() ||
- getCharByteWidth() == 1 &&
+ assert((isUnevaluated() ||
+ getCharByteWidth() == 1) &&
  "This function is used in places that assume strings use char");

This should silence some diagnostics about mixed && and || in the same expression.

  }
  /// Allow access to clients that need the byte representation, such as
  /// ASTWriterStmt::VisitStringLiteral().
  StringRef getBytes() const {
  // FIXME: StringRef may not be the right type to use as a result for this.
  return StringRef(getStrDataAsChar(), getByteLength());
  }
@@ StringKind getKind() const {
  return static_cast<StringKind>(StringLiteralBits.Kind);
  }
  bool isOrdinary() const { return getKind() == Ordinary; }
  bool isWide() const { return getKind() == Wide; }
  bool isUTF8() const { return getKind() == UTF8; }
  bool isUTF16() const { return getKind() == UTF16; }
  bool isUTF32() const { return getKind() == UTF32; }
+ bool isUnevaluated() const { return getKind() == Unevaluated; }
  bool isPascal() const { return StringLiteralBits.IsPascal; }
  bool containsNonAscii() const {
  for (auto c : getString())
  if (!isASCII(c))
  return true;
  return false;
  }
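The exchange about the `getString()` assertion can be illustrated with a simplified model. This is only a sketch: `MiniStringLiteral` is a hypothetical stand-in, not Clang's `StringLiteral`, and it assumes (as stated in the inline comment) that asking an unevaluated string for its char byte width asserts. Because `||` short-circuits, `isUnevaluated()` is tested first and `getCharByteWidth()` is never reached for unevaluated strings; the extra parentheses only make the grouping explicit for `-Wlogical-op-parentheses`.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class StringKind { Ordinary, Wide, UTF8, UTF16, UTF32, Unevaluated };

// Hypothetical, heavily simplified model of a string literal node.
class MiniStringLiteral {
  StringKind Kind;
  unsigned CharByteWidth; // meaningless when Kind is Unevaluated
  std::string Data;

public:
  MiniStringLiteral(StringKind K, unsigned Width, std::string D)
      : Kind(K), CharByteWidth(Width), Data(std::move(D)) {}

  bool isUnevaluated() const { return Kind == StringKind::Unevaluated; }

  unsigned getCharByteWidth() const {
    // Assumption taken from the review thread: no code unit width exists
    // for an unevaluated string, so asking for one is a programming error.
    assert(!isUnevaluated() && "unevaluated strings have no char byte width");
    return CharByteWidth;
  }

  const std::string &getString() const {
    // isUnevaluated() is checked first, so getCharByteWidth() is only
    // called for evaluated strings.
    assert((isUnevaluated() || getCharByteWidth() == 1) &&
           "This function is used in places that assume strings use char");
    return Data;
  }
};
```

Swapping the two operands of `||` in this model would trip the inner assertion for every unevaluated string, which is the reason the review kept the check in this order.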

clang/include/clang/Basic/DiagnosticLexKinds.td

  def ext_reserved_user_defined_literal : ExtWarn<
  "invalid suffix on literal; C++11 requires a space between literal and "
  "identifier">, InGroup<ReservedUserDefinedLiteral>, DefaultError;
  def ext_ms_reserved_user_defined_literal : ExtWarn<
  "invalid suffix on literal; C++11 requires a space between literal and "
  "identifier">, InGroup<ReservedUserDefinedLiteral>;
  def err_unsupported_string_concat : Error<
  "unsupported non-standard concatenation of string literals">;
+ def err_unevaluated_string_prefix : Error<

erichkeane · Unsubmitted · Done

Is there value to combining these two diagnostics with a %select?

aaron.ballman · Unsubmitted · Done

I waffled when doing this review, so it's funny you mention it. :-D

We could do: an unevaluated string literal cannot %select{have an encoding prefix|be a user-defined literal}0 but there was just enough text in the select that I felt it wasn't critical to combine. But I don't feel strongly either way.

erichkeane · Unsubmitted · Done

I was waffly on this too, so your waffling + my waffling I think is sufficient reason to not deal with this now.

+ "an unevaluated string literal cannot have an encoding prefix">;
+ def err_unevaluated_string_udl : Error<
+ "an unevaluated string literal cannot be a user-defined literal">;

aaron.ballman · Unsubmitted · Done

  def err_unevaluated_string_udl : Error<
- "an unevaluated string literal cannot be a user defined literal">;
+ "an unevaluated string literal cannot be a user-defined literal">;
  def err_unevaluated_string_invalid_escape_sequence : Error<

+ def err_unevaluated_string_invalid_escape_sequence : Error<
+ "invalid escape sequence '%0' in an unevaluated string literal">;

aaron.ballman · Unsubmitted · Done

  def err_unevaluated_string_invalid_escape_sequence : Error<
- "Invalid escape sequence '%0' in an unevaluated string literal">;
+ "invalid escape sequence '%0' in an unevaluated string literal">;
  def err_string_concat_mixed_suffix : Error<

  def err_string_concat_mixed_suffix : Error<
  "differing user-defined suffixes ('%0' and '%1') in string literal "
  "concatenation">;
  def err_pp_invalid_udl : Error<
  "%select{character|integer}0 literal with user-defined suffix "
  "cannot be used in preprocessor constant expression">;
  def err_bad_string_encoding : Error<
  "illegal character encoding in string literal">;
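As a rough illustration of the "invalid escape sequence" diagnostic, the sketch below scans the body of a string literal and reports the first escape that would not be acceptable in an unevaluated string. It is not the code in LiteralSupport.cpp: the function names are hypothetical, and the accepted set (simple escapes plus the introducers of universal-character-names, with numeric escapes such as `\x41` or `\101` rejected) is an assumption based on the description of P2361 in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: is the character following a backslash allowed?
bool isEscapeAllowedInUnevaluatedString(char c) {
  switch (c) {
  case '\'': case '"': case '?': case '\\':           // simple escapes
  case 'a': case 'b': case 'f': case 'n':
  case 'r': case 't': case 'v':
  case 'u': case 'U': case 'N':                       // UCN introducers
    return true;
  default:                                            // octal, \x, others
    return false;
  }
}

// Returns the first offending escape (backslash plus one character),
// or std::nullopt when every escape in the body is acceptable.
std::optional<std::string> firstInvalidEscape(std::string_view body) {
  for (std::size_t i = 0; i + 1 < body.size(); ++i) {
    if (body[i] != '\\')
      continue;
    char next = body[i + 1];
    if (!isEscapeAllowedInUnevaluatedString(next))
      return std::string{'\\', next};
    ++i; // skip the escaped character so "\\\\x" is not misread as "\x"
  }
  return std::nullopt;
}
```

The returned two-character string plays the role of the `%0` argument in the diagnostic text above.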

clang/include/clang/Basic/DiagnosticSemaKinds.td


@@ def ext_implicit_function_decl_c99 : ExtWarn<
  "call to undeclared function %0; ISO C99 and later do not support implicit "
  "function declarations">, InGroup<ImplicitFunctionDeclare>, DefaultError;
  def note_function_suggestion : Note<"did you mean %0?">;

  def err_ellipsis_first_param : Error<
  "ISO C requires a named parameter before '...'">;
  def err_declarator_need_ident : Error<"declarator requires an identifier">;
  def err_language_linkage_spec_unknown : Error<"unknown linkage language">;
- def err_language_linkage_spec_not_ascii : Error<
- "string literal in language linkage specifier cannot have an "
- "encoding-prefix">;
  def ext_use_out_of_scope_declaration : ExtWarn<
  "use of out-of-scope declaration of %0%select{| whose type is not "
  "compatible with that of an implicit declaration}1">,
  InGroup<DiagGroup<"out-of-scope-function">>;
  def err_inline_non_function : Error<
  "'inline' can only appear on functions%select{| and non-local variables}0">;
  def err_noreturn_non_function : Error<
  "'_Noreturn' can only appear on functions">;

clang/include/clang/Lex/LiteralSupport.h

@@ public:
  uint64_t getValue() const { return Value; }
  StringRef getUDSuffix() const { return UDSuffixBuf; }
  unsigned getUDSuffixOffset() const {
  assert(!UDSuffixBuf.empty() && "no ud-suffix");
  return UDSuffixOffset;
  }
  };
+ enum class StringLiteralEvalMethod {

aaron.ballman · Unsubmitted · Done

  }
  };
- enum class StringLiteralKind {
+ enum class StringLiteralEvalMethod {
  Evaluated,

Slight renaming so nobody thinks this is going to be about wide vs narrow vs u8, etc.

+ Evaluated,
+ Unevaluated,
+ };
  /// StringLiteralParser - This decodes string escape characters and performs
  /// wide string analysis and Translation Phase #6 (concatenation of string
  /// literals) (C99 5.1.1.2p1).
  class StringLiteralParser {
  const SourceManager &SM;
  const LangOptions &Features;
  const TargetInfo &Target;
  DiagnosticsEngine *Diags;
  unsigned MaxTokenLength;
  unsigned SizeBound;
  unsigned CharByteWidth;
  tok::TokenKind Kind;
  SmallString<512> ResultBuf;

aaron.ballman · Unsubmitted · Done

This seems to be unused.

  char *ResultPtr; // cursor
  SmallString<32> UDSuffixBuf;
  unsigned UDSuffixToken;
  unsigned UDSuffixOffset;
+ StringLiteralEvalMethod EvalMethod;
  public:
- StringLiteralParser(ArrayRef<Token> StringToks,
- Preprocessor &PP);
+ StringLiteralParser(ArrayRef<Token> StringToks, Preprocessor &PP,
+ StringLiteralEvalMethod StringMethod =
+ StringLiteralEvalMethod::Evaluated);

aaron.ballman · Unsubmitted · Done

We should rename anything mentioning StringKind similarly -- this will also help avoid confusion with the StringKind type in Expr.h.

aaron.ballman · Unsubmitted · Done

Did this one get missed?

- StringLiteralParser(ArrayRef<Token> StringToks,
- const SourceManager &sm, const LangOptions &features,
- const TargetInfo &target,
+ StringLiteralParser(ArrayRef<Token> StringToks, const SourceManager &sm,
+ const LangOptions &features, const TargetInfo &target,
  DiagnosticsEngine *diags = nullptr)
  : SM(sm), Features(features), Target(target), Diags(diags),
  MaxTokenLength(0), SizeBound(0), CharByteWidth(0), Kind(tok::unknown),
- ResultPtr(ResultBuf.data()), hadError(false), Pascal(false) {
+ ResultPtr(ResultBuf.data()),
+ EvalMethod(StringLiteralEvalMethod::Evaluated), hadError(false),
+ Pascal(false) {

aaron.ballman · Unsubmitted · Done

  MaxTokenLength(0), SizeBound(0), CharByteWidth(0), Kind(tok::unknown),
- ResultPtr(ResultBuf.data()), hadError(false), Pascal(false) {
+ ResultPtr(ResultBuf.data()), hadError(false), Pascal(false), Unevaluated(false) {
  init(StringToks);

Alternatively, you could use an in-class initializer and drop the changes to both ctor init lists.

  init(StringToks);
  }
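The two ways of defaulting the evaluation method discussed here (a defaulted constructor parameter, and the in-class initializer suggested in the inline comment) can be sketched as follows. `MiniLiteralParser` is a hypothetical reduction used only for illustration; it is not the real `StringLiteralParser` interface.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class StringLiteralEvalMethod { Evaluated, Unevaluated };

// Hypothetical reduction of a literal parser with two constructors.
class MiniLiteralParser {
  std::string Spelling;
  // In-class initializer: constructors that do not mention EvalMethod
  // get this value without touching their init lists.
  StringLiteralEvalMethod EvalMethod = StringLiteralEvalMethod::Evaluated;

public:
  // Mirrors the second constructor in the diff: no eval-method parameter,
  // so the in-class default applies.
  explicit MiniLiteralParser(std::string S) : Spelling(std::move(S)) {}

  // Callers opt in to unevaluated parsing explicitly.
  MiniLiteralParser(std::string S, StringLiteralEvalMethod M)
      : Spelling(std::move(S)), EvalMethod(M) {}

  bool isUnevaluated() const {
    return EvalMethod == StringLiteralEvalMethod::Unevaluated;
  }
};
```

Either approach keeps existing call sites on the evaluated path; the in-class initializer has the small advantage that a newly added constructor cannot forget to initialize the member.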

bool hadError; bool hadError;

bool Pascal; bool Pascal;

aaron.ballmanUnsubmitted

Done

bool Pascal;

- StringLiteralKind StringKind;

+ StringLiteralEvalMethod EvalMethod;

StringRef GetString() const {

Can we make this private now rather than letting callers access it directly?

StringRef GetString() const { StringRef GetString() const {

return StringRef(ResultBuf.data(), GetStringLength()); return StringRef(ResultBuf.data(), GetStringLength());

} }

unsigned GetStringLength() const { return ResultPtr-ResultBuf.data(); } unsigned GetStringLength() const { return ResultPtr-ResultBuf.data(); }

unsigned GetNumStringChars() const { unsigned GetNumStringChars() const {

return GetStringLength() / CharByteWidth; return GetStringLength() / CharByteWidth;

} }

/// getOffsetOfStringByte - This function returns the offset of the /// getOffsetOfStringByte - This function returns the offset of the

/// specified byte of the string data represented by Token. This handles /// specified byte of the string data represented by Token. This handles

/// advancing over escape sequences in the string. /// advancing over escape sequences in the string.

/// ///

/// If the Diagnostics pointer is non-null, then this will do semantic /// If the Diagnostics pointer is non-null, then this will do semantic

/// checking of the string literal and emit errors and warnings. /// checking of the string literal and emit errors and warnings.

unsigned getOffsetOfStringByte(const Token &TheTok, unsigned ByteNo) const; unsigned getOffsetOfStringByte(const Token &TheTok, unsigned ByteNo) const;

bool isOrdinary() const { return Kind == tok::string_literal; } bool isOrdinary() const { return Kind == tok::string_literal; }

bool isWide() const { return Kind == tok::wide_string_literal; } bool isWide() const { return Kind == tok::wide_string_literal; }

bool isUTF8() const { return Kind == tok::utf8_string_literal; } bool isUTF8() const { return Kind == tok::utf8_string_literal; }

bool isUTF16() const { return Kind == tok::utf16_string_literal; } bool isUTF16() const { return Kind == tok::utf16_string_literal; }

bool isUTF32() const { return Kind == tok::utf32_string_literal; } bool isUTF32() const { return Kind == tok::utf32_string_literal; }

bool isPascal() const { return Pascal; } bool isPascal() const { return Pascal; }

bool isUnevaluated() const {

return EvalMethod == StringLiteralEvalMethod::Unevaluated;

}

StringRef getUDSuffix() const { return UDSuffixBuf; } StringRef getUDSuffix() const { return UDSuffixBuf; }

/// Get the index of a token containing a ud-suffix. /// Get the index of a token containing a ud-suffix.

unsigned getUDSuffixToken() const { unsigned getUDSuffixToken() const {

assert(!UDSuffixBuf.empty() && "no ud-suffix"); assert(!UDSuffixBuf.empty() && "no ud-suffix");

return UDSuffixToken; return UDSuffixToken;

} }

Show All 18 Lines

clang/include/clang/Parse/Parser.h

Show First 20 Lines • Show All 1,782 Lines • ▼ Show 20 Lines	public:
// Expr that doesn't include commas.		// Expr that doesn't include commas.
ExprResult ParseAssignmentExpression(TypeCastState isTypeCast = NotTypeCast);		ExprResult ParseAssignmentExpression(TypeCastState isTypeCast = NotTypeCast);

ExprResult ParseMSAsmIdentifier(llvm::SmallVectorImpl<Token> &LineToks,		ExprResult ParseMSAsmIdentifier(llvm::SmallVectorImpl<Token> &LineToks,
unsigned &NumLineToksConsumed,		unsigned &NumLineToksConsumed,
bool IsUnevaluated);		bool IsUnevaluated);

ExprResult ParseStringLiteralExpression(bool AllowUserDefinedLiteral = false);		ExprResult ParseStringLiteralExpression(bool AllowUserDefinedLiteral = false);
		ExprResult ParseUnevaluatedStringLiteralExpression();

private:		private:
		ExprResult ParseStringLiteralExpression(bool AllowUserDefinedLiteral,
		bool Unevaluated);

ExprResult ParseExpressionWithLeadingAt(SourceLocation AtLoc);		ExprResult ParseExpressionWithLeadingAt(SourceLocation AtLoc);

ExprResult ParseExpressionWithLeadingExtension(SourceLocation ExtLoc);		ExprResult ParseExpressionWithLeadingExtension(SourceLocation ExtLoc);

ExprResult ParseRHSOfBinaryExpression(ExprResult LHS,		ExprResult ParseRHSOfBinaryExpression(ExprResult LHS,
prec::Level MinPrec);		prec::Level MinPrec);
/// Control what ParseCastExpression will parse.		/// Control what ParseCastExpression will parse.
enum CastParseKind {		enum CastParseKind {
▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines	ExprResult ParseExprAfterUnaryExprOrTypeTrait(const Token &OpTok,
bool &isCastExpr,		bool &isCastExpr,
ParsedType &CastTy,		ParsedType &CastTy,
SourceRange &CastRange);		SourceRange &CastRange);

/// ParseExpressionList - Used for C/C++ (argument-)expression-list.		/// ParseExpressionList - Used for C/C++ (argument-)expression-list.
bool ParseExpressionList(SmallVectorImpl<Expr *> &Exprs,		bool ParseExpressionList(SmallVectorImpl<Expr *> &Exprs,
llvm::function_ref<void()> ExpressionStarts =		llvm::function_ref<void()> ExpressionStarts =
llvm::function_ref<void()>(),		llvm::function_ref<void()>(),
bool FailImmediatelyOnInvalidExpr = false,		bool FailImmediatelyOnInvalidExpr = false,
bool EarlyTypoCorrection = false);		bool EarlyTypoCorrection = false);

aaron.ballmanUnsubmitted

Not Done

Two default `bool` params is a bad thing, but three default `bool` params seems like we should fix the interface at this point. WDYT? Also, it's not clear what the new parameter will do; the function could use comments unless fixing the interface makes it sufficiently clear.

cor3ntinAuthorUnsubmitted

Done

I'm still not sure that's the best solution. `AllowEvaluatedString` would only ever be false for attributes. I considered duplicating the function, except it does quite a bit for variadics, which attributes apparently support. Maybe we could have `ParseAttributeArgumentList`, `ParseExpressionList`, and `ParseExpressionListImpl`?
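The three-function split floated in this thread could look roughly like the sketch below. It is not Clang's `Parser`: `ListParserSketch`, its string-based "tokens", and the return value (a count of items treated as unevaluated string literals) are assumptions made to keep the example self-contained. Only the three function names and `AllowEvaluatedString` come from the comments above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of the proposed split; this is not Clang's Parser.
class ListParserSketch {
public:
  // Entry point for ordinary argument lists: string literals are evaluated.
  unsigned ParseExpressionList(const std::vector<std::string> &Items) const {
    return ParseExpressionListImpl(Items, /*AllowEvaluatedString=*/true);
  }

  // Entry point for attribute arguments: string literals stay unevaluated.
  unsigned
  ParseAttributeArgumentList(const std::vector<std::string> &Items) const {
    return ParseExpressionListImpl(Items, /*AllowEvaluatedString=*/false);
  }

private:
  // Shared implementation. Returns how many items were treated as
  // unevaluated string literals. In this sketch an item counts as a string
  // literal if it starts with a double quote.
  unsigned ParseExpressionListImpl(const std::vector<std::string> &Items,
                                   bool AllowEvaluatedString) const {
    unsigned NumUnevaluated = 0;
    for (const std::string &Item : Items) {
      bool IsStringLiteral = !Item.empty() && Item.front() == '"';
      if (IsStringLiteral && !AllowEvaluatedString)
        ++NumUnevaluated;
    }
    return NumUnevaluated;
  }
};
```

The `bool` survives only inside the private helper, so callers pick a behavior by function name rather than by a positional flag.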
/// ParseSimpleExpressionList - A simple comma-separated list of expressions,		/// ParseSimpleExpressionList - A simple comma-separated list of expressions,
/// used for misc language extensions.		/// used for misc language extensions.
bool ParseSimpleExpressionList(SmallVectorImpl<Expr *> &Exprs);		bool ParseSimpleExpressionList(SmallVectorImpl<Expr *> &Exprs);

/// ParenParseOption - Control what ParseParenExpression will parse.		/// ParenParseOption - Control what ParseParenExpression will parse.
enum ParenParseOption {		enum ParenParseOption {
SimpleExpr, // Only parse '(' expression ')'		SimpleExpr, // Only parse '(' expression ')'
FoldExpr, // Also allow fold-expression <anything>		FoldExpr, // Also allow fold-expression <anything>
▲ Show 20 Lines • Show All 1,801 Lines • Show Last 20 Lines

clang/include/clang/Sema/Sema.h

Show First 20 Lines • Show All 5,697 Lines • ▼ Show 20 Lines	ExprResult ActOnParenListExpr(SourceLocation L,
SourceLocation R,		SourceLocation R,
MultiExprArg Val);		MultiExprArg Val);

/// ActOnStringLiteral - The specified tokens were lexed as pasted string		/// ActOnStringLiteral - The specified tokens were lexed as pasted string
/// fragments (e.g. "foo" "bar" L"baz").		/// fragments (e.g. "foo" "bar" L"baz").
ExprResult ActOnStringLiteral(ArrayRef<Token> StringToks,		ExprResult ActOnStringLiteral(ArrayRef<Token> StringToks,
Scope *UDLScope = nullptr);		Scope *UDLScope = nullptr);

		ExprResult ActOnUnevaluatedStringLiteral(ArrayRef<Token> StringToks);

/// ControllingExprOrType is either an opaque pointer coming out of a		/// ControllingExprOrType is either an opaque pointer coming out of a
/// ParsedType or an Expr *. FIXME: it'd be better to split this interface		/// ParsedType or an Expr *. FIXME: it'd be better to split this interface
/// into two so we don't take a void *, but that's awkward because one of		/// into two so we don't take a void *, but that's awkward because one of
/// the operands is either a ParsedType or an Expr *, which doesn't lend		/// the operands is either a ParsedType or an Expr *, which doesn't lend
/// itself to generic code very well.		/// itself to generic code very well.
ExprResult ActOnGenericSelectionExpr(SourceLocation KeyLoc,		ExprResult ActOnGenericSelectionExpr(SourceLocation KeyLoc,
SourceLocation DefaultLoc,		SourceLocation DefaultLoc,
SourceLocation RParenLoc,		SourceLocation RParenLoc,
▲ Show 20 Lines • Show All 8,366 Lines • Show Last 20 Lines

clang/lib/AST/Expr.cpp

Show First 20 Lines • Show All 1,130 Lines • ▼ Show 20 Lines case Wide:

CharByteWidth = Target.getWCharWidth(); CharByteWidth = Target.getWCharWidth();

break; break;

case UTF16: case UTF16:

CharByteWidth = Target.getChar16Width(); CharByteWidth = Target.getChar16Width();

break; break;

case UTF32: case UTF32:

CharByteWidth = Target.getChar32Width(); CharByteWidth = Target.getChar32Width();

break; break;

case Unevaluated:

return sizeof(char); // Host;

shafikUnsubmitted

Done

Why not grouped w/ Ordinary above?

aaron.ballmanUnsubmitted

Not Done

Specifically because we want the host encoding, not the target encoding.

cor3ntinAuthorUnsubmitted

Done

An unevaluated string is a sequence of 1-byte units even on platforms where `sizeof(char)` would be 2 or 4. It's never influenced by the target's properties.
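The distinction drawn here (target-dependent widths for evaluated kinds, a fixed one-byte width for unevaluated strings) can be sketched as below. `TargetWidthsSketch`, `StringKindSketch`, and `mapCharByteWidthSketch` are hypothetical stand-ins for `TargetInfo`, `StringKind`, and `mapCharByteWidth`; the grouping of `Ordinary` with `UTF8` is an assumption of this sketch.

```cpp
#include <cassert>

enum class StringKindSketch { Ordinary, Wide, UTF8, UTF16, UTF32, Unevaluated };

// Hypothetical stand-in for TargetInfo: widths are in bits and vary by target.
struct TargetWidthsSketch {
  unsigned CharWidth;
  unsigned WCharWidth;
  unsigned Char16Width;
  unsigned Char32Width;
};

// Returns the width of one code unit in bytes for the given string kind.
unsigned mapCharByteWidthSketch(const TargetWidthsSketch &Target,
                                StringKindSketch Kind) {
  unsigned Bits = 0;
  switch (Kind) {
  case StringKindSketch::Ordinary:
  case StringKindSketch::UTF8:
    Bits = Target.CharWidth;
    break;
  case StringKindSketch::Wide:
    Bits = Target.WCharWidth;
    break;
  case StringKindSketch::UTF16:
    Bits = Target.Char16Width;
    break;
  case StringKindSketch::UTF32:
    Bits = Target.Char32Width;
    break;
  case StringKindSketch::Unevaluated:
    // Never consults the target: always one byte per code unit.
    return 1;
  }
  assert((Bits & 7) == 0 && "character width must be a whole number of bytes");
  return Bits / 8;
}
```

On a made-up target with 16-bit `char`, the `Ordinary` case yields 2 while `Unevaluated` still yields 1, which is why the two cases are not grouped.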

} }

assert((CharByteWidth & 7) == 0 && "Assumes character size is byte multiple"); assert((CharByteWidth & 7) == 0 && "Assumes character size is byte multiple");

CharByteWidth /= 8; CharByteWidth /= 8;

assert((CharByteWidth == 1 || CharByteWidth == 2 || CharByteWidth == 4) && assert((CharByteWidth == 1 || CharByteWidth == 2 || CharByteWidth == 4) &&

"The only supported character byte widths are 1,2 and 4!"); "The only supported character byte widths are 1,2 and 4!");

return CharByteWidth; return CharByteWidth;

} }

StringLiteral::StringLiteral(const ASTContext &Ctx, StringRef Str, StringLiteral::StringLiteral(const ASTContext &Ctx, StringRef Str,

StringKind Kind, bool Pascal, QualType Ty, StringKind Kind, bool Pascal, QualType Ty,

const SourceLocation *Loc, const SourceLocation *Loc,

unsigned NumConcatenated) unsigned NumConcatenated)

: Expr(StringLiteralClass, Ty, VK_LValue, OK_Ordinary) { : Expr(StringLiteralClass, Ty, VK_LValue, OK_Ordinary) {

aaron.ballmanUnsubmitted

Done

Basically unused and is shadowed by a declaration below (on line 1087).

unsigned Length = Str.size();

StringLiteralBits.Kind = Kind;

StringLiteralBits.NumConcatenated = NumConcatenated;

aaron.ballmanUnsubmitted

Done

This should be in an else clause along with StringLiteralBits.IsPascal = false;.

if (Kind != StringKind::Unevaluated) {

erichkeaneUnsubmitted

Done

minor preference (perhaps 'nit' level) to move this whole CharByteWidth + IsPascal calculation into its own function. This constructor is absurdly long as it is.

assert(Ctx.getAsConstantArrayType(Ty) && assert(Ctx.getAsConstantArrayType(Ty) &&

"StringLiteral must be of constant array type!"); "StringLiteral must be of constant array type!");

unsigned CharByteWidth = mapCharByteWidth(Ctx.getTargetInfo(), Kind); unsigned CharByteWidth = mapCharByteWidth(Ctx.getTargetInfo(), Kind);

unsigned ByteLength = Str.size(); unsigned ByteLength = Str.size();

shafikUnsubmitted

Done

Isn't this the same as Length?

aaron.ballmanUnsubmitted

Not Done

It is -- I think we can get rid of ByteLength, but it's possible that this exists because of the optimization comment below. I don't insist, but it would be nice to know if we can replace the switch with Length /= CharByteWidth these days.

cor3ntinAuthorUnsubmitted

Done

I think we should.

cor3ntinAuthorUnsubmitted

Done

Only when CharByteWidth == 1
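A minimal sketch of the relationship discussed in this thread: the length in code units equals the byte length only when `CharByteWidth` is 1, and a plain division could stand in for the `switch`. `lengthInCodeUnits` is a hypothetical helper, and whether current optimizers make the division as cheap as the `switch` is not checked here.

```cpp
#include <cassert>
#include <cstddef>

// Number of code units held in a buffer of ByteLength bytes.
unsigned lengthInCodeUnits(std::size_t ByteLength, unsigned CharByteWidth) {
  assert((CharByteWidth == 1 || CharByteWidth == 2 || CharByteWidth == 4) &&
         "The only supported character byte widths are 1, 2 and 4!");
  assert(ByteLength % CharByteWidth == 0 &&
         "The size of the data must be a multiple of CharByteWidth!");
  // For CharByteWidth == 1 this is the identity, which is the only case
  // where Length and ByteLength coincide.
  return static_cast<unsigned>(ByteLength / CharByteWidth);
}
```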

assert((ByteLength % CharByteWidth == 0) && assert((ByteLength % CharByteWidth == 0) &&

"The size of the data must be a multiple of CharByteWidth!"); "The size of the data must be a multiple of CharByteWidth!");

// Avoid the expensive division. The compiler should be able to figure it // Avoid the expensive division. The compiler should be able to figure it

// out by itself. However as of clang 7, even with the appropriate // out by itself. However as of clang 7, even with the appropriate

// llvm_unreachable added just here, it is not able to do so. // llvm_unreachable added just here, it is not able to do so.

unsigned Length;

switch (CharByteWidth) { switch (CharByteWidth) {

case 1: case 1:

Length = ByteLength; Length = ByteLength;

break; break;

case 2: case 2:

Length = ByteLength / 2; Length = ByteLength / 2;

break; break;

case 4: case 4:

Length = ByteLength / 4; Length = ByteLength / 4;

break; break;

default: default:

llvm_unreachable("Unsupported character width!"); llvm_unreachable("Unsupported character width!");

} }

StringLiteralBits.Kind = Kind;

StringLiteralBits.CharByteWidth = CharByteWidth; StringLiteralBits.CharByteWidth = CharByteWidth;

StringLiteralBits.IsPascal = Pascal; StringLiteralBits.IsPascal = Pascal;

StringLiteralBits.NumConcatenated = NumConcatenated; } else {

assert(!Pascal && "Can't make an unevaluated Pascal string");

aaron.ballmanUnsubmitted

Done

StringLiteralBits.IsPascal = Pascal;

- }

- else {

+ } else {

StringLiteralBits.CharByteWidth = 1;

I'd recommend running the entire patch through clang-format though: https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting

StringLiteralBits.CharByteWidth = 1;

aaron.ballmanUnsubmitted

Done

Add assert(!Pascal && "Can't make an unevaluated Pascal string"); ?

StringLiteralBits.IsPascal = false;

}

*getTrailingObjects<unsigned>() = Length; *getTrailingObjects<unsigned>() = Length;

// Initialize the trailing array of SourceLocation. // Initialize the trailing array of SourceLocation.

// This is safe since SourceLocation is POD-like. // This is safe since SourceLocation is POD-like.

std::memcpy(getTrailingObjects<SourceLocation>(), Loc, std::memcpy(getTrailingObjects<SourceLocation>(), Loc,

NumConcatenated * sizeof(SourceLocation)); NumConcatenated * sizeof(SourceLocation));

// Initialize the trailing array of char holding the string data. // Initialize the trailing array of char holding the string data.

std::memcpy(getTrailingObjects<char>(), Str.data(), ByteLength); std::memcpy(getTrailingObjects<char>(), Str.data(), Str.size());

shafikUnsubmitted

Not Done

Isn't Str.size() the same as ByteLength?

aaron.ballmanUnsubmitted

Not Done

I think it's more clear to use Str.size() because we're copying from Str.data().

cor3ntinAuthorUnsubmitted

Done

`ByteLength` isn't defined in this scope; I guess I could move it.

setDependence(ExprDependence::None); setDependence(ExprDependence::None);

} }

StringLiteral::StringLiteral(EmptyShell Empty, unsigned NumConcatenated, StringLiteral::StringLiteral(EmptyShell Empty, unsigned NumConcatenated,

unsigned Length, unsigned CharByteWidth) unsigned Length, unsigned CharByteWidth)

: Expr(StringLiteralClass, Empty) { : Expr(StringLiteralClass, Empty) {

StringLiteralBits.CharByteWidth = CharByteWidth; StringLiteralBits.CharByteWidth = CharByteWidth;

Show All 20 Lines void *Mem = Ctx.Allocate(totalSizeToAlloc<unsigned, SourceLocation, char>(

1, NumConcatenated, Length * CharByteWidth), 1, NumConcatenated, Length * CharByteWidth),

alignof(StringLiteral)); alignof(StringLiteral));

return new (Mem) return new (Mem)

StringLiteral(EmptyShell(), NumConcatenated, Length, CharByteWidth); StringLiteral(EmptyShell(), NumConcatenated, Length, CharByteWidth);

} }

void StringLiteral::outputString(raw_ostream &OS) const { void StringLiteral::outputString(raw_ostream &OS) const {

switch (getKind()) { switch (getKind()) {

case Unevaluated:

aaron.ballmanUnsubmitted

Done

switch (getKind()) {

- case Unevaluated: // fallthrough. no prefix.

+ case Unevaluated:

case Ordinary:

case Ordinary: case Ordinary:

aaron.ballmanUnsubmitted

Done

switch (getKind()) {

case Unevaluated:

- break; // no prefic

case Ascii: break; // no prefix.

break; // no prefix. break; // no prefix.

case Wide: OS << 'L'; break; case Wide: OS << 'L'; break;

case UTF8: OS << "u8"; break; case UTF8: OS << "u8"; break;

case UTF16: OS << 'u'; break; case UTF16: OS << 'u'; break;

case UTF32: OS << 'U'; break; case UTF32: OS << 'U'; break;

} }

OS << '"'; OS << '"';

static const char Hex[] = "0123456789ABCDEF"; static const char Hex[] = "0123456789ABCDEF";

▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines

/// string. /// string.

/// ///

SourceLocation SourceLocation

StringLiteral::getLocationOfByte(unsigned ByteNo, const SourceManager &SM, StringLiteral::getLocationOfByte(unsigned ByteNo, const SourceManager &SM,

const LangOptions &Features, const LangOptions &Features,

const TargetInfo &Target, unsigned *StartToken, const TargetInfo &Target, unsigned *StartToken,

unsigned *StartTokenByteOffset) const { unsigned *StartTokenByteOffset) const {

assert((getKind() == StringLiteral::Ordinary || assert((getKind() == StringLiteral::Ordinary ||

getKind() == StringLiteral::UTF8) && getKind() == StringLiteral::UTF8 ||

getKind() == StringLiteral::Unevaluated) &&

"Only narrow string literals are currently supported"); "Only narrow string literals are currently supported");

// Loop over all of the tokens in this string until we find the one that // Loop over all of the tokens in this string until we find the one that

// contains the byte we're looking for. // contains the byte we're looking for.

unsigned TokNo = 0; unsigned TokNo = 0;

unsigned StringOffset = 0; unsigned StringOffset = 0;

if (StartToken) if (StartToken)

TokNo = *StartToken; TokNo = *StartToken;

▲ Show 20 Lines • Show All 3,892 Lines • Show Last 20 Lines

clang/lib/Lex/LiteralSupport.cpp

Show First 20 Lines • Show All 81 Lines • ▼ Show 20 Lines static DiagnosticBuilder Diag(DiagnosticsEngine *Diags,

const char *TokRangeEnd, unsigned DiagID) { const char *TokRangeEnd, unsigned DiagID) {

SourceLocation Begin = SourceLocation Begin =

Lexer::AdvanceToTokenCharacter(TokLoc, TokRangeBegin - TokBegin, Lexer::AdvanceToTokenCharacter(TokLoc, TokRangeBegin - TokBegin,

TokLoc.getManager(), Features); TokLoc.getManager(), Features);

return Diags->Report(Begin, DiagID) << return Diags->Report(Begin, DiagID) <<

MakeCharSourceRange(Features, TokLoc, TokBegin, TokRangeBegin, TokRangeEnd); MakeCharSourceRange(Features, TokLoc, TokBegin, TokRangeBegin, TokRangeEnd);

} }

static bool IsEscapeValidInUnevaluatedStringLiteral(char Escape) {

shafikUnsubmitted

Done

Should we use `Is` as a prefix here? Right now it sounds like we are modifying something.

aaron.ballmanUnsubmitted

Done

+1, I think Is would be an improvement.

switch (Escape) {

case '\'':

case '"':

case '?':

case '\\':

case 'a':

case 'b':

aaron.ballmanUnsubmitted

Done

Do you intend to miss a bunch of escapes like \' and \r (etc)?

cor3ntinAuthorUnsubmitted

Done

\' is there. I am less sure about '\r' and '\a', for example. This is something I realized after writing P2361.
What does '\a' in a static assert mean? Even '\r' is not so obvious.

aaron.ballmanUnsubmitted

Done

Looking at the list again, I think only \a is really of interest here. I know some folks like @jfb have mentioned that \a could be used to generate an alert sound on a terminal, which is a somewhat useful feature for a failed static assertion if you squint at it hard enough.

But the rest of the missing ones do seem more questionable to support.

aaron.ballmanUnsubmitted

Done

@jfb and @cor3ntin -- any opinions on whether \a should be supported? My opinion is that it should be supported because it has some utility for anyone running the compiler from a command line, but it's a pretty weak opinion.

erichkeaneUnsubmitted

Done

I might consider rejecting ANY character escape in the less-than-32 part of the table.

For consistency at least, I don't see value in allowing \a if we're rejecting layout things like \t.

aaron.ballmanUnsubmitted

Done

But that's just it, we're accepting \t and \n with this code.

erichkeaneUnsubmitted

Done

Ah! I missed that this is an allow-list instead of a deny-list. That makes me way more comfortable with this code.

IMO, I'd suggest we allow '\r' (since wouldn't we have problems on Windows at that point, being unable to accept a printable newline for Windows?), but disallow \a for now unless someone comes up with a really good reason to allow it.
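To illustrate what the allow-list approach means in practice, the sketch below applies the same set of accepted characters as `IsEscapeValidInUnevaluatedStringLiteral` in this diff to a whole string body. `isEscapeAllowedSketch` and `findFirstRejectedEscape` are hypothetical helpers, not part of the patch, and the sketch does not model universal character names, which the thread says are handled elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same allow-list as the function in the diff: anything not named here is
// rejected by default, including hex and octal escapes.
bool isEscapeAllowedSketch(char Escape) {
  switch (Escape) {
  case '\'':
  case '"':
  case '?':
  case '\\':
  case 'a':
  case 'b':
  case 'f':
  case 'n':
  case 'r':
  case 't':
  case 'v':
    return true;
  }
  return false;
}

// Returns the index of the backslash that starts the first escape outside
// the allow-list, or std::string::npos if every escape is accepted.
std::size_t findFirstRejectedEscape(const std::string &Body) {
  for (std::size_t I = 0; I + 1 < Body.size(); ++I) {
    if (Body[I] != '\\')
      continue;
    if (!isEscapeAllowedSketch(Body[I + 1]))
      return I;
    ++I; // Skip the escaped character so an escaped backslash is read once.
  }
  return std::string::npos;
}
```

Because the default is rejection, adding or dropping an escape such as `\a` is a one-line change to the `switch`.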

case 'f':

aaron.ballmanUnsubmitted

Done

We're still missing support for some escape characters from: http://eel.is/c++draft/lex#nt:simple-escape-sequence-char

Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?

cor3ntinAuthorUnsubmitted

Done

> Just to verify, UCNs have already been handled by the time we get here, so we don't need to care about those, correct?

They are dealt with elsewhere, yes (and supported).

case 'n':

erichkeaneUnsubmitted

Done

For future clarification, the ones from the 'simple' list here: https://en.cppreference.com/w/cpp/language/escape

that we are missing are: \a \b \f and \v.

I personally think I'm ok with that until someone else says they care.

case 'r':

case 't':

case 'v':

return true;

}

return false;

}

/// ProcessCharEscape - Parse a standard C escape sequence, which can occur in /// ProcessCharEscape - Parse a standard C escape sequence, which can occur in

/// either a character or a string literal. /// either a character or a string literal.

static unsigned ProcessCharEscape(const char *ThisTokBegin, static unsigned ProcessCharEscape(const char *ThisTokBegin,

const char *&ThisTokBuf, const char *&ThisTokBuf,

const char *ThisTokEnd, bool &HadError, const char *ThisTokEnd, bool &HadError,

FullSourceLoc Loc, unsigned CharWidth, FullSourceLoc Loc, unsigned CharWidth,

DiagnosticsEngine *Diags, DiagnosticsEngine *Diags,

const LangOptions &Features) { const LangOptions &Features,

StringLiteralEvalMethod EvalMethod) {

const char *EscapeBegin = ThisTokBuf; const char *EscapeBegin = ThisTokBuf;

bool Delimited = false; bool Delimited = false;

bool EndDelimiterFound = false; bool EndDelimiterFound = false;

erichkeaneUnsubmitted

Done

This is like the 3rd time we're using 'Unevaluated' as a bool parameter. I have a pretty strong preference for making it a scoped-enum in 'Basic' somewhere.

cor3ntinAuthorUnsubmitted

Done

Any suggestion for where to

cor3ntinAuthorUnsubmitted

Done

NVM
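The scoped-enum preference raised at the top of this thread can be sketched as follows. The enum name matches the one the patch ends up using; `allowsEncodingPrefix` is a hypothetical function chosen only to show a call site.

```cpp
#include <cassert>
#include <type_traits>

enum class StringLiteralEvalMethod { Evaluated, Unevaluated };

// Hypothetical helper: unevaluated strings never take an encoding prefix.
bool allowsEncodingPrefix(StringLiteralEvalMethod Method) {
  return Method == StringLiteralEvalMethod::Evaluated;
}

// A scoped enum does not convert from bool, so a stray `true` or `false`
// at a call site is a compile error rather than a silent mix-up.
static_assert(!std::is_convertible<bool, StringLiteralEvalMethod>::value,
              "bool must not convert to the scoped enum");
```

A call such as `allowsEncodingPrefix(StringLiteralEvalMethod::Unevaluated)` states its intent at the call site, which a bare `true` would not.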

// Skip the '\' char. // Skip the '\' char.

++ThisTokBuf; ++ThisTokBuf;

// We know that this character can't be off the end of the buffer, because // We know that this character can't be off the end of the buffer, because

// that would have been \", which would not have been the end of string. // that would have been \", which would not have been the end of string.

unsigned ResultChar = *ThisTokBuf++; unsigned ResultChar = *ThisTokBuf++;

char Escape = ResultChar;

switch (ResultChar) { switch (ResultChar) {

// These map to themselves. // These map to themselves.

case '\\': case '\'': case '"': case '?': break; case '\\': case '\'': case '"': case '?': break;

// These have fixed mappings. // These have fixed mappings.

case 'a': case 'a':

// TODO: K&R: the meaning of '\\a' is different in traditional C // TODO: K&R: the meaning of '\\a' is different in traditional C

ResultChar = 7; ResultChar = 7;

▲ Show 20 Lines • Show All 187 Lines • ▼ Show 20 Lines else

diag::ext_unknown_escape) diag::ext_unknown_escape)

<< "x" + llvm::utohexstr(ResultChar); << "x" + llvm::utohexstr(ResultChar);

break; break;

} }

if (Delimited && Diags) { if (Delimited && Diags) {

if (!EndDelimiterFound) if (!EndDelimiterFound)

Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf, Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,

diag::err_expected) diag::err_expected)

aaron.ballmanUnsubmitted

Done

diag::err_unevaluated_string_invalid_escape_sequence)

- << std::string(1, EscapeBegin[1]);

+ << StringRef(&EscapeBegin[1], 1);

}

return ResultChar;

<< tok::r_brace; << tok::r_brace;

else if (!HadError) { else if (!HadError) {

Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf, Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,

Features.CPlusPlus23 ? diag::warn_cxx23_delimited_escape_sequence Features.CPlusPlus23 ? diag::warn_cxx23_delimited_escape_sequence

: diag::ext_delimited_escape_sequence) : diag::ext_delimited_escape_sequence)

<< /*delimited*/ 0 << (Features.CPlusPlus ? 1 : 0); << /*delimited*/ 0 << (Features.CPlusPlus ? 1 : 0);

} }

if (EvalMethod == StringLiteralEvalMethod::Unevaluated &&

!IsEscapeValidInUnevaluatedStringLiteral(Escape)) {

Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,

diag::err_unevaluated_string_invalid_escape_sequence)

<< StringRef(EscapeBegin, ThisTokBuf - EscapeBegin);

}

return ResultChar; return ResultChar;

} }

static void appendCodePoint(unsigned Codepoint, static void appendCodePoint(unsigned Codepoint,

llvm::SmallVectorImpl<char> &Str) { llvm::SmallVectorImpl<char> &Str) {

char ResultBuf[4]; char ResultBuf[4];

char *ResultPtr = ResultBuf; char *ResultPtr = ResultBuf;

if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr)) if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))

▲ Show 20 Lines • Show All 1,393 Lines • ▼ Show 20 Lines if (begin[1] == 'u' || begin[1] == 'U' || begin[1] == 'N') {

PP.Diag(Loc, diag::err_character_too_large); PP.Diag(Loc, diag::err_character_too_large);

} }

++buffer_begin; ++buffer_begin;

continue; continue;

} }

unsigned CharWidth = getCharWidth(Kind, PP.getTargetInfo()); unsigned CharWidth = getCharWidth(Kind, PP.getTargetInfo());

uint64_t result = uint64_t result =

ProcessCharEscape(TokBegin, begin, end, HadError, ProcessCharEscape(TokBegin, begin, end, HadError,

FullSourceLoc(Loc,PP.getSourceManager()), FullSourceLoc(Loc, PP.getSourceManager()), CharWidth,

CharWidth, &PP.getDiagnostics(), PP.getLangOpts()); &PP.getDiagnostics(), PP.getLangOpts(),

StringLiteralEvalMethod::Evaluated);

*buffer_begin++ = result; *buffer_begin++ = result;

} }

unsigned NumCharsSoFar = buffer_begin - &codepoint_buffer.front(); unsigned NumCharsSoFar = buffer_begin - &codepoint_buffer.front();

if (NumCharsSoFar > 1) { if (NumCharsSoFar > 1) {

if (isOrdinary() && NumCharsSoFar == 4) if (isOrdinary() && NumCharsSoFar == 4)

PP.Diag(Loc, diag::warn_four_char_character_literal); PP.Diag(Loc, diag::warn_four_char_character_literal);

▲ Show 20 Lines • Show All 91 Lines • ▼ Show 20 Lines

/// hexadecimal-escape-sequence hexadecimal-digit /// hexadecimal-escape-sequence hexadecimal-digit

/// universal-character-name: /// universal-character-name:

/// \u hex-quad /// \u hex-quad

/// \U hex-quad hex-quad /// \U hex-quad hex-quad

/// hex-quad: /// hex-quad:

/// hex-digit hex-digit hex-digit hex-digit /// hex-digit hex-digit hex-digit hex-digit

/// \endverbatim /// \endverbatim

/// ///

StringLiteralParser:: StringLiteralParser::StringLiteralParser(ArrayRef<Token> StringToks,

StringLiteralParser(ArrayRef<Token> StringToks, Preprocessor &PP,

Preprocessor &PP) StringLiteralEvalMethod EvalMethod)

: SM(PP.getSourceManager()), Features(PP.getLangOpts()), : SM(PP.getSourceManager()), Features(PP.getLangOpts()),

Target(PP.getTargetInfo()), Diags(&PP.getDiagnostics()), Target(PP.getTargetInfo()), Diags(&PP.getDiagnostics()),

MaxTokenLength(0), SizeBound(0), CharByteWidth(0), Kind(tok::unknown), MaxTokenLength(0), SizeBound(0), CharByteWidth(0), Kind(tok::unknown),

ResultPtr(ResultBuf.data()), hadError(false), Pascal(false) { ResultPtr(ResultBuf.data()), EvalMethod(EvalMethod), hadError(false),

Pascal(false) {

init(StringToks); init(StringToks);

} }

void StringLiteralParser::init(ArrayRef<Token> StringToks){ void StringLiteralParser::init(ArrayRef<Token> StringToks){

// The literal token may have come from an invalid source location (e.g. due // The literal token may have come from an invalid source location (e.g. due

// to a PCH error), in which case the token length will be 0. // to a PCH error), in which case the token length will be 0.

if (StringToks.empty() || StringToks[0].getLength() < 2) if (StringToks.empty() || StringToks[0].getLength() < 2)

return DiagnoseLexingError(SourceLocation()); return DiagnoseLexingError(SourceLocation());

// Scan all of the string portions, remember the max individual token length, // Scan all of the string portions, remember the max individual token length,

// computing a bound on the concatenated string length, and see whether any // computing a bound on the concatenated string length, and see whether any

// piece is a wide-string. If any of the string portions is a wide-string // piece is a wide-string. If any of the string portions is a wide-string

// literal, the result is a wide-string literal [C99 6.4.5p4]. // literal, the result is a wide-string literal [C99 6.4.5p4].

assert(!StringToks.empty() && "expected at least one token"); assert(!StringToks.empty() && "expected at least one token");

MaxTokenLength = StringToks[0].getLength(); MaxTokenLength = StringToks[0].getLength();

assert(StringToks[0].getLength() >= 2 && "literal token is invalid!"); assert(StringToks[0].getLength() >= 2 && "literal token is invalid!");

SizeBound = StringToks[0].getLength()-2; // -2 for "". SizeBound = StringToks[0].getLength() - 2; // -2 for "".

Kind = StringToks[0].getKind();

hadError = false; hadError = false;

// Implement Translation Phase #6: concatenation of string literals // Determines the kind of string from the prefix

Kind = tok::string_literal;

aaron.ballmanUnsubmitted

Done

hadError = false;

- // Determines the kind of string from the prefix

+ // Determines the kind of string from the prefix.

Kind = tok::string_literal;

aaron.ballmanUnsubmitted

Done

This means we're looping over (almost) all the string tokens three times -- once here, once below on line 1562, and again on 1605.

erichkeaneUnsubmitted

Done

Hrm.... this is unfortunate. Is there no way to combine the loops? I guess (hope?) that the list of tokens is at least going to be short...
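
One way the loops could be combined is to fold the prefix check into the pass that already computes the size bound and the maximum token length. The sketch below is only an illustration of that idea: `Tok`, `TokKind`, `ScanResult` and `scanStringTokens` are hypothetical stand-ins, not clang's `Token` or `StringLiteralParser`, and diagnostics are reduced to a single error flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for clang's token kinds and Token class.
enum class TokKind { string_literal, wide_string_literal, utf8_string_literal };

struct Tok {
  TokKind Kind;
  std::size_t Length; // spelling length, including prefix and quotes
};

struct ScanResult {
  std::size_t SizeBound = 0;      // upper bound on the concatenated length
  std::size_t MaxTokenLength = 0; // longest individual token
  TokKind Kind = TokKind::string_literal;
  bool HadError = false;
};

// Single pass: size bound, maximum length and kind checks together.
ScanResult scanStringTokens(const std::vector<Tok> &Toks, bool Unevaluated) {
  ScanResult R;
  for (const Tok &T : Toks) {
    if (T.Length < 2) { // malformed token: no room for the two quotes
      R.HadError = true;
      return R;
    }
    R.SizeBound += T.Length - 2; // -2 for "".
    if (T.Length > R.MaxTokenLength)
      R.MaxTokenLength = T.Length;

    if (Unevaluated && T.Kind != TokKind::string_literal) {
      R.HadError = true; // an unevaluated string may not carry a prefix
    } else if (T.Kind != R.Kind && T.Kind != TokKind::string_literal) {
      if (R.Kind == TokKind::string_literal)
        R.Kind = T.Kind; // the first prefixed piece decides the kind
      else
        R.HadError = true; // two different prefixes cannot be concatenated
    }
  }
  return R;
}
```

Calling `scanStringTokens` with `Unevaluated` set to `true` and any prefixed token sets `HadError` while still finishing the bookkeeping, so the result is never left half-filled.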

/// (C99 5.1.1.2p1). The common case is only one string fragment. /// (C99 5.1.1.2p1). The common case is only one string fragment.

aaron.ballmanUnsubmitted

Done

for (const auto &Tok : StringToks) {

- // Unevaluated string literals can never have a prefix

+ // Unevaluated string literals can never have a prefix.

if (Unevaluated && Tok.getKind() != tok::string_literal) {

aaron.ballmanUnsubmitted

Done

Looks like this comment is still missing punctuation.

aaron.ballmanUnsubmitted

Done

Kind = tok::string_literal;

- auto CheckStringKind = [&](const Token &Tok) {

+ auto DiagWrongStringKind = [&](const Token &Tok) {

if (isUnevaluated() && Tok.getKind() != tok::string_literal) {

When I hear "check" I think it'll return a value; I think this name is a bit more clear.

for (const Token &Tok : StringToks) { for (const Token &Tok : StringToks) {

if (Tok.getLength() < 2) if (Tok.getLength() < 2)

return DiagnoseLexingError(Tok.getLocation()); return DiagnoseLexingError(Tok.getLocation());

aaron.ballmanUnsubmitted

Done

This diagnostic might be somewhat odd for Pascal strings because those sort of have a prefix but it's not really the kind of prefix we're talking about. I don't know of a better way to word the diagnostic though. If you think of a way to improve it, then yay, but otherwise, I think it's fine as-is.

// The string could be shorter than this if it needs cleaning, but this is a // The string could be shorter than this if it needs cleaning, but this is a

// reasonable bound, which is all we need. // reasonable bound, which is all we need.

assert(Tok.getLength() >= 2 && "literal token is invalid!"); assert(Tok.getLength() >= 2 && "literal token is invalid!");

SizeBound += Tok.getLength() - 2; // -2 for "". SizeBound += Tok.getLength() - 2; // -2 for "".

// Remember maximum string piece length. // Remember maximum string piece length.

if (Tok.getLength() > MaxTokenLength) if (Tok.getLength() > MaxTokenLength)

MaxTokenLength = Tok.getLength(); MaxTokenLength = Tok.getLength();

// Remember if we see any wide or utf-8/16/32 strings. // Remember if we see any wide or utf-8/16/32 strings.

// Also check for illegal concatenations. // Also check for illegal concatenations.

if (Tok.isNot(Kind) && Tok.isNot(tok::string_literal)) { if (isUnevaluated() && Tok.getKind() != tok::string_literal) {

if (Diags)

Diags->Report(Tok.getLocation(), diag::err_unevaluated_string_prefix);

hadError = true;

} else if (Tok.isNot(Kind) && Tok.isNot(tok::string_literal)) {

aaron.ballmanUnsubmitted

Done

Doesn't returning here leave the object in a partially-initialized state? That seems bad.

if (isOrdinary()) { if (isOrdinary()) {

Kind = Tok.getKind(); Kind = Tok.getKind();

} else { } else {

if (Diags) if (Diags)

Diags->Report(Tok.getLocation(), diag::err_unsupported_string_concat); Diags->Report(Tok.getLocation(), diag::err_unsupported_string_concat);

hadError = true; hadError = true;

} }

▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines if (ThisTokEnd[-1] != '"') {

expandUCNs(ExpandedUDSuffix, UDSuffix); expandUCNs(ExpandedUDSuffix, UDSuffix);

UDSuffix = ExpandedUDSuffix; UDSuffix = ExpandedUDSuffix;

} }

// C++11 [lex.ext]p8: At the end of phase 6, if a string literal is the // C++11 [lex.ext]p8: At the end of phase 6, if a string literal is the

// result of a concatenation involving at least one user-defined-string- // result of a concatenation involving at least one user-defined-string-

// literal, all the participating user-defined-string-literals shall // literal, all the participating user-defined-string-literals shall

// have the same ud-suffix. // have the same ud-suffix.

if (UDSuffixBuf != UDSuffix) { bool UnevaluatedStringHasUDL = isUnevaluated() && !UDSuffix.empty();

if (UDSuffixBuf != UDSuffix || UnevaluatedStringHasUDL) {

aaron.ballmanUnsubmitted

Done

// have the same ud-suffix.

- const bool UnevaluatedStringHasUDL = Unevaluated && !UDSuffix.empty();

+ bool UnevaluatedStringHasUDL = Unevaluated && !UDSuffix.empty();

if (UDSuffixBuf != UDSuffix || UnevaluatedStringHasUDL) {

if (Diags) { if (Diags) {

SourceLocation TokLoc = StringToks[i].getLocation(); SourceLocation TokLoc = StringToks[i].getLocation();

if (UnevaluatedStringHasUDL) {

Diags->Report(TokLoc, diag::err_unevaluated_string_udl)

erichkeaneUnsubmitted

Done

Is this OK? It looks like we're passing a ton of parameters to a diag type that doesn't have any wildcards?

aaron.ballmanUnsubmitted

Done

Good catch! The first two are not helpful (the diag engine will silently ignore them), but the second two are for underlines in the diagnostic and are useful.

<< SourceRange(TokLoc, TokLoc);

} else {

Diags->Report(TokLoc, diag::err_string_concat_mixed_suffix) Diags->Report(TokLoc, diag::err_string_concat_mixed_suffix)

<< UDSuffixBuf << UDSuffix << UDSuffixBuf << UDSuffix

aaron.ballmanUnsubmitted

Done

: diag::err_string_concat_mixed_suffix)

- << UDSuffixBuf << UDSuffix

<< SourceRange(UDSuffixTokLoc, UDSuffixTokLoc)

<< SourceRange(TokLoc, TokLoc);

}

hadError = true;

cor3ntinAuthorUnsubmitted

Done

These are actually used by `err_string_concat_mixed_suffix`.

erichkeaneUnsubmitted

Done

Right, I guess it is just super awkward to have unused parameters passed like this. I know we only check the other direction, but it seems awkward. Aaron, thoughts?

aaron.ballmanUnsubmitted

Done

I'd split it into two calls at this point. e.g.,

if (UnevaluatedStringHasUDL)
  Diags->Report(TokLoc, diag::err_unevaluated_string_udl) << ...;
else
  Diags->Report(TokLoc, diag::err_string_concat_mixed_suffix) << ...;

<< SourceRange(UDSuffixTokLoc, UDSuffixTokLoc) << SourceRange(UDSuffixTokLoc, UDSuffixTokLoc);

<< SourceRange(TokLoc, TokLoc); }

} }

hadError = true; hadError = true;

} }

// Strip the end quote. // Strip the end quote.

--ThisTokEnd; --ThisTokEnd;

▲ Show 20 Lines • Show All 55 Lines • ▼ Show 20 Lines if (ThisTokBuf[0] == 'R') {

if (ThisTokBuf[0] != '"') { if (ThisTokBuf[0] != '"') {

// The file may have come from PCH and then changed after loading the // The file may have come from PCH and then changed after loading the

// PCH; Fail gracefully. // PCH; Fail gracefully.

return DiagnoseLexingError(StringToks[i].getLocation()); return DiagnoseLexingError(StringToks[i].getLocation());

} }

++ThisTokBuf; // skip " ++ThisTokBuf; // skip "

// Check if this is a pascal string // Check if this is a pascal string

if (Features.PascalStrings && ThisTokBuf + 1 != ThisTokEnd && if (!isUnevaluated() && Features.PascalStrings &&

ThisTokBuf[0] == '\\' && ThisTokBuf[1] == 'p') { ThisTokBuf + 1 != ThisTokEnd && ThisTokBuf[0] == '\\' &&

ThisTokBuf[1] == 'p') {

aaron.ballmanUnsubmitted

Not Done

Is there test coverage that we diagnose this properly?

cor3ntinAuthorUnsubmitted

Done

What sort of test would you like to see?

aaron.ballmanUnsubmitted

Not Done

Pascal strings enabled and using something like `[[deprecated("\pOh no, a Pascal string!")]]` (or some other unevaluated uses).
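
To make the condition under discussion concrete, here is a minimal standalone sketch of the guard on the right-hand side of the diff: the `\p` marker only starts a Pascal string when Pascal strings are enabled and the literal is evaluated. The function `startsPascalString` and its parameters are hypothetical names for illustration; in the patch the check is inline in `StringLiteralParser::init` and uses `isUnevaluated()` and `Features.PascalStrings`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper. Body is the spelling right after the opening quote,
// e.g. for "\pOh no" it is the text beginning with a backslash and 'p'.
bool startsPascalString(std::string_view Body, bool PascalStringsEnabled,
                        bool Unevaluated) {
  // Unevaluated strings never become Pascal strings, whatever the option says.
  if (Unevaluated || !PascalStringsEnabled)
    return false;
  return Body.size() >= 2 && Body[0] == '\\' && Body[1] == 'p';
}
```

Under this reading, the attribute argument in the suggested test would not take the Pascal path; what diagnostic it then produces depends on how the escape is handled later, which this sketch does not model.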

// If the \p sequence is found in the first token, we have a pascal string // If the \p sequence is found in the first token, we have a pascal string

// Otherwise, if we already have a pascal string, ignore the first \p // Otherwise, if we already have a pascal string, ignore the first \p

if (i == 0) { if (i == 0) {

++ThisTokBuf; ++ThisTokBuf;

Pascal = true; Pascal = true;

} else if (Pascal) } else if (Pascal)

ThisTokBuf += 2; ThisTokBuf += 2;

} }

Show All 18 Lines if (ThisTokBuf[0] == 'R') {

EncodeUCNEscape(ThisTokBegin, ThisTokBuf, ThisTokEnd, EncodeUCNEscape(ThisTokBegin, ThisTokBuf, ThisTokEnd,

ResultPtr, hadError, ResultPtr, hadError,

FullSourceLoc(StringToks[i].getLocation(), SM), FullSourceLoc(StringToks[i].getLocation(), SM),

CharByteWidth, Diags, Features); CharByteWidth, Diags, Features);

continue; continue;

} }

// Otherwise, this is a non-UCN escape character. Process it. // Otherwise, this is a non-UCN escape character. Process it.

unsigned ResultChar = unsigned ResultChar =

ProcessCharEscape(ThisTokBegin, ThisTokBuf, ThisTokEnd, hadError, ProcessCharEscape(ThisTokBegin, ThisTokBuf, ThisTokEnd, hadError,

FullSourceLoc(StringToks[i].getLocation(), SM), FullSourceLoc(StringToks[i].getLocation(), SM),

CharByteWidth*8, Diags, Features); CharByteWidth * 8, Diags, Features, EvalMethod);

if (CharByteWidth == 4) { if (CharByteWidth == 4) {

// FIXME: Make the type of the result buffer correct instead of // FIXME: Make the type of the result buffer correct instead of

// using reinterpret_cast. // using reinterpret_cast.

llvm::UTF32 *ResultWidePtr = reinterpret_cast<llvm::UTF32*>(ResultPtr); llvm::UTF32 *ResultWidePtr = reinterpret_cast<llvm::UTF32*>(ResultPtr);

*ResultWidePtr = ResultChar; *ResultWidePtr = ResultChar;

ResultPtr += 4; ResultPtr += 4;

} else if (CharByteWidth == 2) { } else if (CharByteWidth == 2) {

// FIXME: Make the type of the result buffer correct instead of // FIXME: Make the type of the result buffer correct instead of

// using reinterpret_cast. // using reinterpret_cast.

llvm::UTF16 *ResultWidePtr = reinterpret_cast<llvm::UTF16*>(ResultPtr); llvm::UTF16 *ResultWidePtr = reinterpret_cast<llvm::UTF16*>(ResultPtr);

*ResultWidePtr = ResultChar & 0xFFFF; *ResultWidePtr = ResultChar & 0xFFFF;

ResultPtr += 2; ResultPtr += 2;

} else { } else {

assert(CharByteWidth == 1 && "Unexpected char width"); assert(CharByteWidth == 1 && "Unexpected char width");

*ResultPtr++ = ResultChar & 0xFF; *ResultPtr++ = ResultChar & 0xFF;

} }

assert((!Pascal || !isUnevaluated()) &&

"Pascal string in unevaluated context");

if (Pascal) { if (Pascal) {

if (CharByteWidth == 4) { if (CharByteWidth == 4) {

// FIXME: Make the type of the result buffer correct instead of // FIXME: Make the type of the result buffer correct instead of

// using reinterpret_cast. // using reinterpret_cast.

llvm::UTF32 *ResultWidePtr = reinterpret_cast<llvm::UTF32*>(ResultBuf.data()); llvm::UTF32 *ResultWidePtr = reinterpret_cast<llvm::UTF32*>(ResultBuf.data());

ResultWidePtr[0] = GetNumStringChars() - 1; ResultWidePtr[0] = GetNumStringChars() - 1;

} else if (CharByteWidth == 2) { } else if (CharByteWidth == 2) {

// FIXME: Make the type of the result buffer correct instead of // FIXME: Make the type of the result buffer correct instead of

▲ Show 20 Lines • Show All 157 Lines • ▼ Show 20 Lines if (SpellingPtr[1] == 'u' || SpellingPtr[1] == 'U' ||

if (Len > ByteNo) { if (Len > ByteNo) {

// ByteNo is somewhere within the escape sequence. // ByteNo is somewhere within the escape sequence.

SpellingPtr = EscapePtr; SpellingPtr = EscapePtr;

break; break;

} }

ByteNo -= Len; ByteNo -= Len;

} else { } else {

ProcessCharEscape(SpellingStart, SpellingPtr, SpellingEnd, HadError, ProcessCharEscape(SpellingStart, SpellingPtr, SpellingEnd, HadError,

FullSourceLoc(Tok.getLocation(), SM), FullSourceLoc(Tok.getLocation(), SM), CharByteWidth * 8,

CharByteWidth*8, Diags, Features); Diags, Features, StringLiteralEvalMethod::Evaluated);

aaron.ballmanUnsubmitted

Done

FullSourceLoc(Tok.getLocation(), SM), CharByteWidth * 8,

- Diags, Features, false);

+ Diags, Features, /*Unevaluated*/ false);

--ByteNo;

--ByteNo; --ByteNo;

} }

assert(!HadError && "This method isn't valid on erroneous strings"); assert(!HadError && "This method isn't valid on erroneous strings");

} }

return SpellingPtr-SpellingStart; return SpellingPtr-SpellingStart;

} }

/// Determine whether a suffix is a valid ud-suffix. We avoid treating reserved /// Determine whether a suffix is a valid ud-suffix. We avoid treating reserved

/// suffixes as ud-suffixes, because the diagnostic experience is better if we /// suffixes as ud-suffixes, because the diagnostic experience is better if we

/// treat it as an invalid suffix. /// treat it as an invalid suffix.

bool StringLiteralParser::isValidUDSuffix(const LangOptions &LangOpts, bool StringLiteralParser::isValidUDSuffix(const LangOptions &LangOpts,

StringRef Suffix) { StringRef Suffix) {

return NumericLiteralParser::isValidUDSuffix(LangOpts, Suffix) || return NumericLiteralParser::isValidUDSuffix(LangOpts, Suffix) ||

Suffix == "sv"; Suffix == "sv";

} }

clang/lib/Lex/PPMacroExpansion.cpp

Show First 20 Lines • Show All 1,863 Lines • ▼ Show 20 Lines	#include "clang/Basic/TransformTypeTraits.def"
}		}

SourceLocation LParenLoc = Tok.getLocation();		SourceLocation LParenLoc = Tok.getLocation();
LexNonComment(Tok);		LexNonComment(Tok);

if (!Tok.isAnnotation() && Tok.getIdentifierInfo())		if (!Tok.isAnnotation() && Tok.getIdentifierInfo())
Tok.setKind(tok::identifier);		Tok.setKind(tok::identifier);
else if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {		else if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {
StringLiteralParser Literal(Tok, *this);		StringLiteralParser Literal(Tok, *this,
		StringLiteralEvalMethod::Unevaluated);
if (Literal.hadError)		if (Literal.hadError)
		aaron.ballmanUnsubmitted Not Done Test coverage for this change?
return;		return;

Tok.setIdentifierInfo(getIdentifierInfo(Literal.GetString()));		Tok.setIdentifierInfo(getIdentifierInfo(Literal.GetString()));
Tok.setKind(tok::identifier);		Tok.setKind(tok::identifier);
} else {		} else {
Diag(Tok.getLocation(), diag::err_pp_identifier_arg_not_identifier)		Diag(Tok.getLocation(), diag::err_pp_identifier_arg_not_identifier)
<< Tok.getKind();		<< Tok.getKind();
// Don't walk past anything that's not a real token.		// Don't walk past anything that's not a real token.

clang/lib/Lex/Pragma.cpp

Show First 20 Lines • Show All 767 Lines • ▼ Show 20 Lines
// Lex a component of a module name: either an identifier or a string literal;		// Lex a component of a module name: either an identifier or a string literal;
// for components that can be expressed both ways, the two forms are equivalent.		// for components that can be expressed both ways, the two forms are equivalent.
static bool LexModuleNameComponent(		static bool LexModuleNameComponent(
Preprocessor &PP, Token &Tok,		Preprocessor &PP, Token &Tok,
std::pair<IdentifierInfo *, SourceLocation> &ModuleNameComponent,		std::pair<IdentifierInfo *, SourceLocation> &ModuleNameComponent,
bool First) {		bool First) {
PP.LexUnexpandedToken(Tok);		PP.LexUnexpandedToken(Tok);
if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {		if (Tok.is(tok::string_literal) && !Tok.hasUDSuffix()) {
StringLiteralParser Literal(Tok, PP);		StringLiteralParser Literal(Tok, PP);
		aaron.ballmanUnsubmitted Done Should this also be modified?
		cor3ntinAuthorUnsubmitted Done Probably, but because I'm not super familiar with module map things I preferred being conservative.
		aaron.ballmanUnsubmitted Done Paging @rsmith for opinions. Lacking those opinions, I think being conservative here is fine.
		aaron.ballmanUnsubmitted Not Done Pinging @ChuanqiXu for opinions.
		ChuanqiXuUnsubmitted Not Done I think both options (to modify it or not) are acceptable, because the input here should be the output of clang itself. See https://github.com/llvm/llvm-project/blob/ebd0b8a0472b865b7eb6e1a32af97ae31d829033/clang/lib/Basic/Module.cpp#L229-L231 and https://github.com/llvm/llvm-project/blob/ebd0b8a0472b865b7eb6e1a32af97ae31d829033/clang/lib/Frontend/Rewrite/FrontendActions.cpp#L238-L240. We can see there is no deprecated prefix. So while it is acceptable to modify this, since its pattern matches the paper, it doesn't really matter because we control the input completely. Personally, I prefer not to touch it: I feel this use case has not been used much, so the effort here may not be worthwhile.
if (Literal.hadError)		if (Literal.hadError)
return true;		return true;
ModuleNameComponent = std::make_pair(		ModuleNameComponent = std::make_pair(
PP.getIdentifierInfo(Literal.GetString()), Tok.getLocation());		PP.getIdentifierInfo(Literal.GetString()), Tok.getLocation());
} else if (!Tok.isAnnotation() && Tok.getIdentifierInfo()) {		} else if (!Tok.isAnnotation() && Tok.getIdentifierInfo()) {
ModuleNameComponent =		ModuleNameComponent =
std::make_pair(Tok.getIdentifierInfo(), Tok.getLocation());		std::make_pair(Tok.getIdentifierInfo(), Tok.getLocation());
} else {		} else {
▲ Show 20 Lines • Show All 298 Lines • ▼ Show 20 Lines	if (II->isStr("assert")) {
DumpAnnot.setAnnotationRange(SourceRange(Tok.getLocation()));		DumpAnnot.setAnnotationRange(SourceRange(Tok.getLocation()));
PP.EnterToken(DumpAnnot, /*IsReinject*/false);		PP.EnterToken(DumpAnnot, /*IsReinject*/false);
} else if (II->isStr("diag_mapping")) {		} else if (II->isStr("diag_mapping")) {
Token DiagName;		Token DiagName;
PP.LexUnexpandedToken(DiagName);		PP.LexUnexpandedToken(DiagName);
if (DiagName.is(tok::eod))		if (DiagName.is(tok::eod))
PP.getDiagnostics().dump();		PP.getDiagnostics().dump();
else if (DiagName.is(tok::string_literal) && !DiagName.hasUDSuffix()) {		else if (DiagName.is(tok::string_literal) && !DiagName.hasUDSuffix()) {
StringLiteralParser Literal(DiagName, PP);		StringLiteralParser Literal(DiagName, PP,
		StringLiteralEvalMethod::Unevaluated);
if (Literal.hadError)		if (Literal.hadError)
return;		return;
PP.getDiagnostics().dump(Literal.GetString());		PP.getDiagnostics().dump(Literal.GetString());
} else {		} else {
PP.Diag(DiagName, diag::warn_pragma_debug_missing_argument)		PP.Diag(DiagName, diag::warn_pragma_debug_missing_argument)
<< II->getName();		<< II->getName();
}		}
} else if (II->isStr("llvm_fatal_error")) {		} else if (II->isStr("llvm_fatal_error")) {

clang/lib/Parse/ParseDeclCXX.cpp

Show First 20 Lines • Show All 344 Lines • ▼ Show 20 Lines
/// and just before that, that extern was seen.		/// and just before that, that extern was seen.
///		///
/// linkage-specification: [C++ 7.5p2: dcl.link]		/// linkage-specification: [C++ 7.5p2: dcl.link]
/// 'extern' string-literal '{' declaration-seq[opt] '}'		/// 'extern' string-literal '{' declaration-seq[opt] '}'
/// 'extern' string-literal declaration		/// 'extern' string-literal declaration
///		///
Decl *Parser::ParseLinkage(ParsingDeclSpec &DS, DeclaratorContext Context) {		Decl *Parser::ParseLinkage(ParsingDeclSpec &DS, DeclaratorContext Context) {
assert(isTokenStringLiteral() && "Not a string literal!");		assert(isTokenStringLiteral() && "Not a string literal!");
ExprResult Lang = ParseStringLiteralExpression(false);		ExprResult Lang = ParseUnevaluatedStringLiteralExpression();

ParseScope LinkageScope(this, Scope::DeclScope);		ParseScope LinkageScope(this, Scope::DeclScope);
Decl *LinkageSpec =		Decl *LinkageSpec =
Lang.isInvalid()		Lang.isInvalid()
? nullptr		? nullptr
: Actions.ActOnStartLinkageSpecification(		: Actions.ActOnStartLinkageSpecification(
getCurScope(), DS.getSourceRange().getBegin(), Lang.get(),		getCurScope(), DS.getSourceRange().getBegin(), Lang.get(),
Tok.is(tok::l_brace) ? Tok.getLocation() : SourceLocation());		Tok.is(tok::l_brace) ? Tok.getLocation() : SourceLocation());
▲ Show 20 Lines • Show All 656 Lines • ▼ Show 20 Lines	if (Tok.is(tok::r_paren)) {

if (!isTokenStringLiteral()) {		if (!isTokenStringLiteral()) {
Diag(Tok, diag::err_expected_string_literal)		Diag(Tok, diag::err_expected_string_literal)
<< /*Source='static_assert'*/ 1;		<< /*Source='static_assert'*/ 1;
SkipMalformedDecl();		SkipMalformedDecl();
return nullptr;		return nullptr;
}		}

AssertMessage = ParseStringLiteralExpression();		AssertMessage = ParseUnevaluatedStringLiteralExpression();
if (AssertMessage.isInvalid()) {		if (AssertMessage.isInvalid()) {
SkipMalformedDecl();		SkipMalformedDecl();
return nullptr;		return nullptr;
}		}
}		}

T.consumeClose();		T.consumeClose();
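
For orientation, the two call sites switched to `ParseUnevaluatedStringLiteralExpression()` in this file correspond to linkage specifications and `static_assert` messages. The snippet below is a small illustration of those user-facing constructs (plus an attribute argument, which the summary also lists); the names `add_ints` and `old_add` are made up for the example. The commented-out lines show the kind of prefixed spelling this patch intends to diagnose via `diag::err_unevaluated_string_prefix`; whether a given compiler release accepts them is not something this sketch asserts.

```cpp
#include <cassert>

// Linkage specification: the language name is never evaluated as a string.
extern "C" int add_ints(int a, int b) { return a + b; }

// static_assert message: only ever shown in a diagnostic.
static_assert(sizeof(int) >= 2, "int is narrower than 16 bits");

// Attribute argument: also only used for diagnostics.
[[deprecated("use add_ints instead")]] int old_add(int a, int b) {
  return add_ints(a, b);
}

// Prefixed forms that the patch is meant to reject in these contexts:
//   extern L"C" int f();
//   static_assert(true, u8"message");
```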


clang/lib/Parse/ParseExpr.cpp

Show First 20 Lines • Show All 3,250 Lines • ▼ Show 20 Lines

/// form string literals, and also handles string concatenation [C99 5.1.1.2, /// form string literals, and also handles string concatenation [C99 5.1.1.2,

/// translation phase #6]. /// translation phase #6].

/// ///

/// \verbatim /// \verbatim

/// primary-expression: [C99 6.5.1] /// primary-expression: [C99 6.5.1]

/// string-literal /// string-literal

/// \verbatim /// \verbatim

ExprResult Parser::ParseStringLiteralExpression(bool AllowUserDefinedLiteral) { ExprResult Parser::ParseStringLiteralExpression(bool AllowUserDefinedLiteral) {

return ParseStringLiteralExpression(AllowUserDefinedLiteral,

/*Unevaluated=*/false);

}

ExprResult Parser::ParseUnevaluatedStringLiteralExpression() {

return ParseStringLiteralExpression(/*AllowUserDefinedLiteral=*/false,

/*Unevaluated=*/true);

}

ExprResult Parser::ParseStringLiteralExpression(bool AllowUserDefinedLiteral,

shafikUnsubmitted

Done

return ExprError();

}

- return ParseStringLiteralExpression(false, true);

+ return ParseStringLiteralExpression(/*AllowUserDefinedLiteral=*/false, /*Unevaluated=*/true);

}

ExprResult Parser::ParseStringLiteralExpression(bool AllowUserDefinedLiteral,

bool Unevaluated) {

assert(isTokenStringLiteral() && "Not a string literal!"); assert(isTokenStringLiteral() && "Not a string literal!");

// String concat. Note that keywords like __func__ and __FUNCTION__ are not // String concat. Note that keywords like __func__ and __FUNCTION__ are not

// considered to be strings for concatenation purposes. // considered to be strings for concatenation purposes.

SmallVector<Token, 4> StringToks; SmallVector<Token, 4> StringToks;

do { do {

StringToks.push_back(Tok); StringToks.push_back(Tok);

ConsumeStringToken(); ConsumeStringToken();

} while (isTokenStringLiteral()); } while (isTokenStringLiteral());

if (Unevaluated) {

assert(!AllowUserDefinedLiteral && "UDL are always evaluated");

return Actions.ActOnUnevaluatedStringLiteral(StringToks);

}

// Pass the set of string tokens, ready for concatenation, to the actions. // Pass the set of string tokens, ready for concatenation, to the actions.

return Actions.ActOnStringLiteral(StringToks, return Actions.ActOnStringLiteral(StringToks,

AllowUserDefinedLiteral ? getCurScope() AllowUserDefinedLiteral ? getCurScope()

: nullptr); : nullptr);

} }

/// ParseGenericSelectionExpression - Parse a C11 generic-selection /// ParseGenericSelectionExpression - Parse a C11 generic-selection

/// [C11 6.5.1.1]. /// [C11 6.5.1.1].

▲ Show 20 Lines • Show All 195 Lines • ▼ Show 20 Lines bool Parser::ParseExpressionList(SmallVectorImpl<Expr *> &Exprs,

while (true) { while (true) {

if (ExpressionStarts) if (ExpressionStarts)

ExpressionStarts(); ExpressionStarts();

ExprResult Expr; ExprResult Expr;

if (getLangOpts().CPlusPlus11 && Tok.is(tok::l_brace)) { if (getLangOpts().CPlusPlus11 && Tok.is(tok::l_brace)) {

Diag(Tok, diag::warn_cxx98_compat_generalized_initializer_lists); Diag(Tok, diag::warn_cxx98_compat_generalized_initializer_lists);

Expr = ParseBraceInitializer(); Expr = ParseBraceInitializer();

} else } else

aaron.ballmanUnsubmitted

Not Done

Can revert these two changes now.

Expr = ParseAssignmentExpression(); Expr = ParseAssignmentExpression();

aaron.ballmanUnsubmitted

Not Done

I'm surprised we need special logic in ParseExpressionList() for handling unevaluated string literals; I would have expected that to be needed when parsing a string literal. Nothing changed in the grammar for http://eel.is/c++draft/expr.post.general#nt:expression-list (or initializer-list), so these changes seem wrong. Can you explain the changes a bit more?

cor3ntinAuthorUnsubmitted

Done

We use ParseExpressionList when parsing attribute arguments, and some attributes take an unevaluated string as an argument. I agree with you that I'd rather find a better solution for attributes, but I came up empty. There is no further reason for this change, and you are right that it does not match the grammar.

aaron.ballmanUnsubmitted

Not Done

I was thinking we'd use a new kind of evaluation context for this. We'd enter the evaluation context when we know we need to parse an expression that is an unevaluated string literal, and the string literal parser would pay attention to it. This would require knowing up front when we want to parse an unevaluated string literal, but we should have that information available to us at parse time (I think).
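
The evaluation-context idea can be pictured as an RAII scope that the parser enters before parsing a string it knows to be unevaluated, and that the string literal parser consults. The following is a toy model only: `ParserState`, `UnevaluatedStringScope`, `prefixAllowed` and `scopeRestoresState` are hypothetical names, and clang's real expression evaluation contexts are considerably richer.

```cpp
#include <cassert>

// Hypothetical parser state holding just the one flag of interest.
struct ParserState {
  bool InUnevaluatedString = false;
};

// RAII scope: sets the flag on entry and restores the old value on exit.
class UnevaluatedStringScope {
  ParserState &State;
  bool Saved;

public:
  explicit UnevaluatedStringScope(ParserState &S)
      : State(S), Saved(S.InUnevaluatedString) {
    State.InUnevaluatedString = true;
  }
  ~UnevaluatedStringScope() { State.InUnevaluatedString = Saved; }
  UnevaluatedStringScope(const UnevaluatedStringScope &) = delete;
  UnevaluatedStringScope &operator=(const UnevaluatedStringScope &) = delete;
};

// Stand-in for the string literal parser asking about the current context.
bool prefixAllowed(const ParserState &S) { return !S.InUnevaluatedString; }

// Small driver: prefixes are allowed outside the scope and not inside it.
bool scopeRestoresState() {
  ParserState S;
  bool Before = prefixAllowed(S);
  bool Inside;
  {
    UnevaluatedStringScope Scope(S);
    Inside = prefixAllowed(S);
  }
  bool After = prefixAllowed(S);
  return Before && !Inside && After;
}
```

The appeal of this shape is that the string literal parser needs no extra parameter; the cost, as noted in the comment, is that the caller must know up front that an unevaluated string is coming.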

cor3ntinAuthorUnsubmitted

Done

After offline discussion, I think what we want is a ParseAttributeArgumentList function that knows whether the Nth argument is an unevaluated string (by modifying tablegen) and parses it accordingly. It would take care of all attributes automatically. Alas, that's a tad more involved.

aaron.ballmanUnsubmitted

Not Done

+1. I agree it's more involved, but it's also a more general solution that fits nicely in the parser design (we do this sort of thing for other parts of attribute parsing).
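
A rough picture of the table-driven approach: some generated table records, per attribute, which argument positions are unevaluated strings, and the argument-list parser looks that up for the Nth argument. The sketch below assumes a hand-written table in place of tablegen output; `AttrStringArgs`, `StringArgTable`, `isUnevaluatedStringArg` and the `example_attr` entry are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical record; in the real design this would come from tablegen.
struct AttrStringArgs {
  std::string_view Name;
  std::uint32_t Mask; // bit N set => argument N is an unevaluated string
};

constexpr AttrStringArgs StringArgTable[] = {
    {"deprecated", 0b01},   // first argument is the message
    {"nodiscard", 0b01},    // first argument is the reason
    {"example_attr", 0b10}, // made-up attribute: second argument is a string
};

// Returns true if argument N of the named attribute should be parsed as an
// unevaluated string literal rather than as an ordinary expression.
bool isUnevaluatedStringArg(std::string_view Attr, unsigned N) {
  if (N >= 32)
    return false;
  for (const AttrStringArgs &E : StringArgTable)
    if (E.Name == Attr)
      return ((E.Mask >> N) & 1u) != 0;
  return false; // unknown attributes fall back to ordinary expressions
}
```

An argument-list parser could call `isUnevaluatedStringArg(Name, Index)` before each argument and pick the unevaluated string path or the assignment-expression path accordingly, leaving `ParseExpressionList` itself untouched.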

if (EarlyTypoCorrection) if (EarlyTypoCorrection)

Expr = Actions.CorrectDelayedTyposInExpr(Expr); Expr = Actions.CorrectDelayedTyposInExpr(Expr);

if (Tok.is(tok::ellipsis)) if (Tok.is(tok::ellipsis))

Expr = Actions.ActOnPackExpansion(Expr.get(), ConsumeToken()); Expr = Actions.ActOnPackExpansion(Expr.get(), ConsumeToken());

else if (Tok.is(tok::code_completion)) { else if (Tok.is(tok::code_completion)) {

// There's nothing to suggest in here as we parsed a full expression. // There's nothing to suggest in here as we parsed a full expression.


clang/lib/Sema/SemaDeclCXX.cpp

Show First 20 Lines • Show All 16,465 Lines • ▼ Show 20 Lines
/// the '{'. ExternLoc is the location of the 'extern', Lang is the		/// the '{'. ExternLoc is the location of the 'extern', Lang is the
/// language string literal. LBraceLoc, if valid, provides the location of		/// language string literal. LBraceLoc, if valid, provides the location of
/// the '{' brace. Otherwise, this linkage specification does not		/// the '{' brace. Otherwise, this linkage specification does not
/// have any braces.		/// have any braces.
Decl *Sema::ActOnStartLinkageSpecification(Scope *S, SourceLocation ExternLoc,		Decl *Sema::ActOnStartLinkageSpecification(Scope *S, SourceLocation ExternLoc,
Expr *LangStr,		Expr *LangStr,
SourceLocation LBraceLoc) {		SourceLocation LBraceLoc) {
StringLiteral *Lit = cast<StringLiteral>(LangStr);		StringLiteral *Lit = cast<StringLiteral>(LangStr);
if (!Lit->isOrdinary()) {		assert(Lit->isUnevaluated() && "Unexpected string literal kind");
		aaron.ballmanUnsubmitted Done Test coverage for changes?
		cor3ntinAuthorUnsubmitted Done There are some in dcl.link/p2.cpp
Diag(LangStr->getExprLoc(), diag::err_language_linkage_spec_not_ascii)
aaron.ballmanUnsubmitted Done This diagnostic can be removed from DiagnosticSemaKinds.td now.
<< LangStr->getSourceRange();
return nullptr;
}

StringRef Lang = Lit->getString();		StringRef Lang = Lit->getString();
LinkageSpecDecl::LanguageIDs Language;		LinkageSpecDecl::LanguageIDs Language;
if (Lang == "C")		if (Lang == "C")
Language = LinkageSpecDecl::lang_c;		Language = LinkageSpecDecl::lang_c;
else if (Lang == "C++")		else if (Lang == "C++")
Language = LinkageSpecDecl::lang_cxx;		Language = LinkageSpecDecl::lang_cxx;
else {		else {
▲ Show 20 Lines • Show All 448 Lines • ▼ Show 20 Lines	bool InTemplateDefinition =
getLangOpts().CPlusPlus && CurContext->isDependentContext();		getLangOpts().CPlusPlus && CurContext->isDependentContext();

if (!Failed && !Cond && !InTemplateDefinition) {		if (!Failed && !Cond && !InTemplateDefinition) {

SmallString<256> MsgBuffer;		SmallString<256> MsgBuffer;
llvm::raw_svector_ostream Msg(MsgBuffer);		llvm::raw_svector_ostream Msg(MsgBuffer);
if (AssertMessage) {		if (AssertMessage) {
const auto *MsgStr = cast<StringLiteral>(AssertMessage);		const auto *MsgStr = cast<StringLiteral>(AssertMessage);
if (MsgStr->isOrdinary())
Msg << MsgStr->getString();		Msg << MsgStr->getString();
else
MsgStr->printPretty(Msg, nullptr, getPrintingPolicy());
}		}

Expr *InnerCond = nullptr;		Expr *InnerCond = nullptr;
std::string InnerCondDescription;		std::string InnerCondDescription;
std::tie(InnerCond, InnerCondDescription) =		std::tie(InnerCond, InnerCondDescription) =
findFailedBooleanCondition(Converted.get());		findFailedBooleanCondition(Converted.get());
if (InnerCond && isa<ConceptSpecializationExpr>(InnerCond)) {		if (InnerCond && isa<ConceptSpecializationExpr>(InnerCond)) {
// Drill down into concept specialization expressions to see why they		// Drill down into concept specialization expressions to see why they

clang/lib/Sema/SemaExpr.cpp


if (S.LookupLiteralOperator(Scope, R, llvm::ArrayRef(ArgTy, Args.size()),

/*AllowRaw*/ false, /*AllowTemplate*/ false,

/*AllowStringTemplatePack*/ false,

/*DiagnoseMissing*/ true) == Sema::LOLR_Error)

return ExprError();

return S.BuildLiteralOperatorCall(R, OpNameInfo, Args, LitEndLoc);

}

ExprResult Sema::ActOnUnevaluatedStringLiteral(ArrayRef<Token> StringToks) {

StringLiteralParser Literal(StringToks, PP,

StringLiteralEvalMethod::Unevaluated);

aaron.ballman (inline comment, Done), suggested edit:

- StringLiteralParser Literal(StringToks, PP, true);

+ StringLiteralParser Literal(StringToks, PP, /*Unevaluated*/ true);

if (Literal.hadError)

return ExprError();

SmallVector<SourceLocation, 4> StringTokLocs;

for (const Token &Tok : StringToks)

StringTokLocs.push_back(Tok.getLocation());

StringLiteral *Lit = StringLiteral::Create(

Context, Literal.GetString(), StringLiteral::Unevaluated, false, {},

&StringTokLocs[0], StringTokLocs.size());

if (!Literal.getUDSuffix().empty()) {

SourceLocation UDSuffixLoc =

getUDSuffixLoc(*this, StringTokLocs[Literal.getUDSuffixToken()],

Literal.getUDSuffixOffset());

return ExprError(Diag(UDSuffixLoc, diag::err_invalid_string_udl));

}

return Lit;

}

/// ActOnStringLiteral - The specified tokens were lexed as pasted string

/// fragments (e.g. "foo" "bar" L"baz"). The result string has to handle string

/// concatenation ([C99 5.1.1.2, translation phase #6]), so it may come from

/// multiple tokens. However, the common case is that StringToks points to one

/// string.

///

ExprResult

Sema::ActOnStringLiteral(ArrayRef<Token> StringToks, Scope *UDLScope) {


clang/lib/Sema/SemaExprCXX.cpp


if (const PointerType *ToPtrType = ToType->getAs<PointerType>())
...
// We don't allow UTF literals to be implicitly converted		// We don't allow UTF literals to be implicitly converted
break;		break;
case StringLiteral::Ordinary:		case StringLiteral::Ordinary:
return (ToPointeeType->getKind() == BuiltinType::Char_U ||		return (ToPointeeType->getKind() == BuiltinType::Char_U ||
ToPointeeType->getKind() == BuiltinType::Char_S);		ToPointeeType->getKind() == BuiltinType::Char_S);
case StringLiteral::Wide:		case StringLiteral::Wide:
return Context.typesAreCompatible(Context.getWideCharType(),		return Context.typesAreCompatible(Context.getWideCharType(),
QualType(ToPointeeType, 0));		QualType(ToPointeeType, 0));
		case StringLiteral::Unevaluated:
		assert(false && "Unevaluated string literal in expression");
		break;
}		}
}		}
}		}

return false;		return false;
}		}

static ExprResult BuildCXXCastArgument(Sema &S,		static ExprResult BuildCXXCastArgument(Sema &S,

clang/lib/Sema/SemaInit.cpp


static StringInitFailureKind IsStringInit(Expr *Init, const ArrayType *AT,
...
case StringLiteral::Wide:		case StringLiteral::Wide:
if (Context.typesAreCompatible(Context.getWideCharType(), ElemTy))		if (Context.typesAreCompatible(Context.getWideCharType(), ElemTy))
return SIF_None;		return SIF_None;
if (ElemTy->isCharType() || ElemTy->isChar8Type())		if (ElemTy->isCharType() || ElemTy->isChar8Type())
return SIF_WideStringIntoChar;		return SIF_WideStringIntoChar;
if (IsWideCharCompatible(ElemTy, Context))		if (IsWideCharCompatible(ElemTy, Context))
return SIF_IncompatWideStringIntoWideChar;		return SIF_IncompatWideStringIntoWideChar;
return SIF_Other;		return SIF_Other;
		case StringLiteral::Unevaluated:
		assert(false && "Unevaluated string literal in initialization");
		break;
}		}

llvm_unreachable("missed a StringLiteral kind?");		llvm_unreachable("missed a StringLiteral kind?");
}		}

static StringInitFailureKind IsStringInit(Expr *init, QualType declType,		static StringInitFailureKind IsStringInit(Expr *init, QualType declType,
ASTContext &Context) {		ASTContext &Context) {
const ArrayType *arrayType = Context.getAsArrayType(declType);		const ArrayType *arrayType = Context.getAsArrayType(declType);

clang/test/CXX/dcl.dcl/dcl.link/p2.cpp

	// RUN: %clang_cc1 -std=c++11 -verify %s			// RUN: %clang_cc1 -std=c++11 -verify %s

	extern "C" {			extern "C" {
	extern R"(C++)" { }			extern R"(C++)" { }
	}			}

	#define plusplus "++"			#define plusplus "++"
	extern "C" plusplus {			extern "C" plusplus {
	}			}

	extern u8"C" {} // expected-error {{string literal in language linkage specifier cannot have an encoding-prefix}}			extern u8"C" {} // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
	extern L"C" {} // expected-error {{string literal in language linkage specifier cannot have an encoding-prefix}}			extern L"C" {} // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
	extern u"C++" {} // expected-error {{string literal in language linkage specifier cannot have an encoding-prefix}}			extern u"C++" {} // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
	extern U"C" {} // expected-error {{string literal in language linkage specifier cannot have an encoding-prefix}}			extern U"C" {} // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
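
The four changed expectations above exercise one rule: a string literal in a linkage specification may be a raw string, but it may not carry an encoding prefix (`u8`, `u`, `U`, `L`). As a rough illustration only, the rule can be modelled on the literal's spelling. `hasEncodingPrefix` is a hypothetical helper written for this note; it is not part of the patch and is not how Clang implements the check.

```cpp
#include <initializer_list>
#include <string_view>

// Hypothetical helper (not Clang code): returns true when the spelling of a
// string literal begins with an encoding prefix, optionally followed by the
// raw-string marker R.
bool hasEncodingPrefix(std::string_view spelling) {
  // Try "u8" before "u" so the longer prefix is not mistaken for the shorter.
  for (std::string_view prefix : {"u8", "u", "U", "L"}) {
    if (spelling.substr(0, prefix.size()) == prefix) {
      std::string_view rest = spelling.substr(prefix.size());
      // A prefix only counts when the literal itself follows immediately.
      return rest.substr(0, 1) == "\"" || rest.substr(0, 2) == "R\"";
    }
  }
  return false;
}
```

Under this model `"C"` and `R"(C++)"` have no prefix, while `u8"C"`, `L"C"`, `u"C++"` and `U"C"` do, which matches the lines that now expect the "cannot have an encoding prefix" error.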

clang/test/CXX/dcl.dcl/p4-0x.cpp

...
struct U {		struct U {
constexpr operator long() const { return 0; } // expected-note {{candidate}}		constexpr operator long() const { return 0; } // expected-note {{candidate}}
};		};

static_assert(S(true), "");		static_assert(S(true), "");
static_assert(S(false), "not so fast"); // expected-error {{not so fast}}		static_assert(S(false), "not so fast"); // expected-error {{not so fast}}
static_assert(T(), "");		static_assert(T(), "");
static_assert(U(), ""); // expected-error {{ambiguous}}		static_assert(U(), ""); // expected-error {{ambiguous}}

static_assert(false, L"\x14hi" "!" R"x(")x"); // expected-error {{static assertion failed: L"\024hi!\""}}		static_assert(false, L"\x14hi" // expected-error {{an unevaluated string literal cannot have an encoding prefix}} \
		// expected-error {{invalid escape sequence '\x14' in an unevaluated string literal}}
		"!"
		R"x(")x");
		aaron.ballman (inline comment, Done): Can you add the newline back to the end of the file?

clang/test/FixIt/fixit-static-assert.cpp

	// RUN: %clang_cc1 -std=c++14 %s -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s			// RUN: %clang_cc1 -std=c++14 %s -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
	// Ensure no warnings are emitted in c++17.			// Ensure no warnings are emitted in c++17.
	// RUN: %clang_cc1 -std=c++17 %s -verify=cxx17			// RUN: %clang_cc1 -std=c++17 %s -verify=cxx17
	// RUN: %clang_cc1 -std=c++14 %s -fixit-recompile -fixit-to-temporary -Werror			// RUN: %clang_cc1 -std=c++14 %s -fixit-recompile -fixit-to-temporary -Werror

	// cxx17-no-diagnostics			// cxx17-no-diagnostics

	static_assert(true && "String");			static_assert(true && "String");
	// CHECK-DAG: {[[@LINE-1]]:20-[[@LINE-1]]:22}:","			// CHECK-DAG: {[[@LINE-1]]:20-[[@LINE-1]]:22}:","

	// String literal prefixes are good.			// String literal prefixes are good.
	static_assert(true && R"(RawString)");			static_assert(true && R"(RawString)");
	// CHECK-DAG: {[[@LINE-1]]:20-[[@LINE-1]]:22}:","			// CHECK-DAG: {[[@LINE-1]]:20-[[@LINE-1]]:22}:","
	static_assert(true && L"RawString");
	// CHECK-DAG: {[[@LINE-1]]:20-[[@LINE-1]]:22}:","

	static_assert(true);			static_assert(true);
	// CHECK-DAG: {[[@LINE-1]]:19-[[@LINE-1]]:19}:", \"\""			// CHECK-DAG: {[[@LINE-1]]:19-[[@LINE-1]]:19}:", \"\""

	// While its technically possible to transform this to			// While its technically possible to transform this to
	// static_assert(true, "String") we don't attempt this fix.			// static_assert(true, "String") we don't attempt this fix.
	static_assert("String" && true);			static_assert("String" && true);
	// CHECK-DAG: {[[@LINE-1]]:31-[[@LINE-1]]:31}:", \"\""			// CHECK-DAG: {[[@LINE-1]]:31-[[@LINE-1]]:31}:", \"\""

	// Don't be smart and look in parentheses.			// Don't be smart and look in parentheses.
	static_assert((true && "String"));			static_assert((true && "String"));
	// CHECK-DAG: {[[@LINE-1]]:33-[[@LINE-1]]:33}:", \"\""			// CHECK-DAG: {[[@LINE-1]]:33-[[@LINE-1]]:33}:", \"\""

clang/test/SemaCXX/static-assert.cpp

	template<typename T> struct S {			template<typename T> struct S {
	static_assert(sizeof(T) > sizeof(char), "Type not big enough!"); // expected-error {{static assertion failed due to requirement 'sizeof(char) > sizeof(char)': Type not big enough!}} \			static_assert(sizeof(T) > sizeof(char), "Type not big enough!"); // expected-error {{static assertion failed due to requirement 'sizeof(char) > sizeof(char)': Type not big enough!}} \
	// expected-note {{1 > 1}}			// expected-note {{1 > 1}}
	};			};

	S<char> s1; // expected-note {{in instantiation of template class 'S<char>' requested here}}			S<char> s1; // expected-note {{in instantiation of template class 'S<char>' requested here}}
	S<int> s2;			S<int> s2;

	static_assert(false, L"\xFFFFFFFF"); // expected-error {{static assertion failed: L"\xFFFFFFFF"}}			static_assert(false, L"\xFFFFFFFF"); // expected-error {{an unevaluated string literal cannot have an encoding prefix}} \
	static_assert(false, u"\U000317FF"); // expected-error {{static assertion failed: u"\U000317FF"}}			// expected-error {{invalid escape sequence '\xFFFFFFFF' in an unevaluated string literal}}
				static_assert(false, u"\U000317FF"); // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
	static_assert(false, u8"Ω"); // expected-error {{static assertion failed: u8"\316\251"}}			// FIXME: render this as u8"\u03A9"
	static_assert(false, L"\u1234"); // expected-error {{static assertion failed: L"\x1234"}}			static_assert(false, u8"Ω"); // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
	static_assert(false, L"\x1ff" "0\x123" "fx\xfffff" "goop"); // expected-error {{static assertion failed: L"\x1FF""0\x123""fx\xFFFFFgoop"}}			static_assert(false, L"\u1234"); // expected-error {{an unevaluated string literal cannot have an encoding prefix}}
				static_assert(false, L"\x1ff" // expected-error {{an unevaluated string literal cannot have an encoding prefix}} \
				// expected-error {{invalid escape sequence '\x1ff' in an unevaluated string literal}}
				"0\x123" // expected-error {{invalid escape sequence '\x123' in an unevaluated string literal}}
				"fx\xfffff" // expected-error {{invalid escape sequence '\xfffff' in an unevaluated string literal}}
				"goop");

				static_assert(false, "\'\"\?\\\a\b\f\n\r\t\v"); // expected-error {{'"?\<U+0007><U+0008>}}
				static_assert(true, "\xFF"); // expected-error {{invalid escape sequence '\xFF' in an unevaluated string literal}}
				static_assert(true, "\123"); // expected-error {{invalid escape sequence '\123' in an unevaluated string literal}}
				static_assert(true, "\pOh no, a Pascal string!"); // expected-warning {{unknown escape sequence '\p'}} \
				// expected-error {{invalid escape sequence '\p' in an unevaluated string literal}}
	static_assert(false, R"(a			static_assert(false, R"(a
	\tb			\tb
	c			c
	)"); // expected-error@-3 {{static assertion failed: a\n\tb\nc\n}}			)"); // expected-error@-3 {{static assertion failed: a\n\tb\nc\n}}

	static_assert(false, "\u0080\u0081\u0082\u0083\u0099\u009A\u009B\u009C\u009D\u009E\u009F");			static_assert(false, "\u0080\u0081\u0082\u0083\u0099\u009A\u009B\u009C\u009D\u009E\u009F");
	// expected-error@-1 {{static assertion failed: <U+0080><U+0081><U+0082><U+0083><U+0099><U+009A><U+009B><U+009C><U+009D><U+009E><U+009F>}}			// expected-error@-1 {{static assertion failed: <U+0080><U+0081><U+0082><U+0083><U+0099><U+009A><U+009B><U+009C><U+009D><U+009E><U+009F>}}

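
The updated expectations in this test suggest a simple classification of escape sequences in a `static_assert` message: simple escapes such as `\n` or `\\` and universal character names such as `\u0080` are accepted, while numeric escapes (`\xFF`, `\123`) and unknown escapes (`\p`) produce "invalid escape sequence ... in an unevaluated string literal". The sketch below models that classification on the body of a non-raw literal. `firstInvalidEscape` is a hypothetical name, the function is not taken from the patch, and it deliberately does not validate the digits of a universal character name or handle named escapes.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not Clang code): scans the body of a non-raw string
// literal and returns the first escape sequence that the tests above treat as
// invalid in an unevaluated string, or std::nullopt if there is none.
std::optional<std::string> firstInvalidEscape(std::string_view body) {
  const std::string_view simple = "'\"?\\abfnrtv";
  for (std::size_t i = 0; i < body.size(); ++i) {
    if (body[i] != '\\')
      continue;
    const std::size_t start = i++;
    if (i == body.size())
      return std::string(body.substr(start)); // lone trailing backslash
    const char c = body[i];
    if (simple.find(c) != std::string_view::npos)
      continue; // simple escape sequence: accepted
    if (c == 'u' || c == 'U')
      continue; // universal character name: accepted, digits not checked here
    std::size_t end = i + 1;
    if (c == 'x') {
      // Hexadecimal escape: consume every following hex digit.
      while (end < body.size() &&
             std::isxdigit(static_cast<unsigned char>(body[end])))
        ++end;
    } else if (c >= '0' && c <= '7') {
      // Octal escape: at most three octal digits in total.
      while (end < body.size() && end < i + 3 && body[end] >= '0' &&
             body[end] <= '7')
        ++end;
    }
    return std::string(body.substr(start, end - start));
  }
  return std::nullopt;
}
```

Applied to the bodies used in the test, the helper reports `\xFF`, `\123`, `\p` and `\xfffff` as the offending sequences and reports nothing for the string of simple escapes or the `\u0080...` message.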

clang/www/cxx_status.html


<tr>

<td>Making non-encodable string literals ill-formed</td>

<td class="full" align="center">Clang 14</td>

</tr>

<tr>

<td>Unevaluated strings</td>

<td class="partial" align="center"><details><summary>Clang 17 (Partial)</summary>

Attributes arguments don't yet parse as unevaluated string literals.

</details>

</td>

</tr>

aaron.ballman (inline comment, Not Done), suggested edit:

<td class="partial" align="center"><details><summary>Clang 17 (Partial)</summary>

- Attributes don't require unevaluated string literals in this release

+ Attributes arguments don't yet parse as unevaluated string literals.

</details</td>

</tr>

<tr>

<td>Add @, $, and ` to the basic character set</td>

</tr>

<tr>

<td>constexpr cast from <tt>void*</tt></td>



Diff 538078
