This is an archive of the discontinued LLVM Phabricator instance.


Differential D92257

[clang-format] Add option to control the space at the front of a line comment
ClosedPublic

Authored by HazardyKnusperkeks on Nov 27 2020, 9:24 PM.


Details

Reviewers

MyDeveloperDay
klimek
krasimir
curdeius

Commits

rG772eb24e0062: [clang-format] Add option to control the spaces in a line comment
rG4ad41f1daf0f: Revert "[clang-format] Add option to control the spaces in a line comment"
rG078f30e04d1f: [clang-format] Add option to control the spaces in a line comment

Summary

Add a minimum and a maximum number of allowed spaces in the line comment prefix.
Both values are needed: a maximum of 0 for what I intend, and only a minimum of 1 to keep the old behavior.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

HazardyKnusperkeks created this revision. Nov 27 2020, 9:24 PM

Herald added a project: Restricted Project. Nov 27 2020, 9:24 PM

Herald added a subscriber: cfe-commits.

HazardyKnusperkeks requested review of this revision. Nov 27 2020, 9:24 PM

Harbormaster completed remote builds in B80400: Diff 308135. Nov 27 2020, 10:12 PM

MyDeveloperDay added inline comments. Nov 28 2020, 2:20 AM

clang/docs/ClangFormatStyleOptions.rst
2727	I'm personally not a massive fan of stuffing -1 into an unsigned, but I understand why it's there. If this were signed and it was actually -1, would the algorithm be substantially worse in your view?
clang/lib/Format/BreakableToken.cpp
797	my assumption is when Maximum is -1 this is a very large -ve number correct? is that defined behavior for drop_back()

HazardyKnusperkeks added inline comments. Nov 28 2020, 5:17 AM

clang/docs/ClangFormatStyleOptions.rst
2727	I'm no fan of unsigned in any way, but that seems to be the way in clang-format. Making it signed would require a few more checks when using it, but I don't see any problem in that. I would just also make the Minimum signed then, to be consistent. While parsing the style I would add checks to ensure Minimum is never negative and Maximum is always greater than or equal to -1. Should that print any warnings? Is there a standard way of doing so? Or should it just be silently corrected?
clang/lib/Format/BreakableToken.cpp
797	Since we check beforehand SpacesInPrefix is larger than Maximum there is no problem.
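
To illustrate the exchange above about an unsigned Maximum of -1: a minimal sketch, not the code from the patch. The helper name `limitPrefixSpaces` and its parameters are hypothetical, and `std::string_view::remove_suffix` stands in for the `drop_back()` call under discussion. It shows why the subtraction cannot wrap as long as the comparison is done first.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper, not the clang-format implementation: trims spaces off
// the end of a comment prefix such as "//   " until at most Maximum remain.
// Maximum is unsigned, so a configured -1 arrives as the largest unsigned
// value and the comparison below is never true for it.
std::string_view limitPrefixSpaces(std::string_view Prefix,
                                   unsigned SpacesInPrefix, unsigned Maximum) {
  assert(SpacesInPrefix <= Prefix.size() && "prefix shorter than its spaces");
  if (SpacesInPrefix > Maximum) {
    // Checked beforehand: the difference is positive and no larger than the
    // number of spaces actually present, so nothing wraps around.
    Prefix.remove_suffix(SpacesInPrefix - Maximum);
  }
  return Prefix;
}
```

With `static_cast<unsigned>(-1)` as Maximum the prefix is returned unchanged; with a Maximum of 0 the prefix `"//   "` is cut back to `"//"`.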

I think fundamentally from my perspective this seem ok, out of interest can I ask what drove you to require it?

My assumption is that some people write comments like

// Free comment without space

and you want to be able to consistently format it to be (N spaces, as clang-format already does 1 space correct?)

//  Free comment without space

is that correct? is there a common style guide asking for that? what is the rationale

clang/lib/Format/BreakableToken.cpp
790	is this case covered by a unit test at all? sorry can you explain why you are looking for "##"?
797	yep sorry didn't see that.

In D92257#2422381, @MyDeveloperDay wrote:
I think fundamentally from my perspective this seem ok, out of interest can I ask what drove you to require it?

My assumption is that some people write comments like
// Free comment without space
and you want to be able to consistently format it to be (N spaces, as clang-format already does 1 space correct?)
//  Free comment without space
is that correct? is there a common style guide asking for that? what is the rationale

I will go for {0,0}, so no space between // and the text. I don't know of a style guide asking for it, other than my own. (Which I can dictate in my company.) :)
I have just recently started using clang-format and it does not do everything the way I want it to. On some aspects I have adapted, but on others I try to "fix" it; you can expect some more changes from me in the near future.

clang/lib/Format/BreakableToken.cpp
790	It is covered by multiple tests, that's how I was made aware of it. :) If you look at the code before it only adds a space if the old prefix is "#" not "" which is also found by `getLineCommentIndentPrefix`. As it seems in `TextProto` "" should not be touched. I can of course add a test in my test function. Now I see a change, in the code before "#" was only accepted when the language is `TextProto`, now it is always. But I think for that to happen the parser (or lexer?) should have assigned something starting with"#" as comment, right? But I can change that.

HazardyKnusperkeks added inline comments. Nov 30 2020, 12:30 PM

clang/lib/Format/BreakableToken.cpp
790	Okay # # is formatted, I try again: If you look at the code before it only adds a space if the old prefix is "#" not "`##`" which is also found by `getLineCommentIndentPrefix`. As it seems in `TextProto` "`##`" should not be touched.

MyDeveloperDay added inline comments. Dec 1 2020, 3:25 AM

clang/docs/ClangFormatStyleOptions.rst
2710	Is this change generated? with clang/doc/tools/dump_style.py or did you hand craft it? ClangFormatStyleOptions.rst is always autogenerated from running dump_style.py, any text you want in the rst needs to be present in Format.h

HazardyKnusperkeks added inline comments. Dec 2 2020, 11:57 AM

clang/docs/ClangFormatStyleOptions.rst
2710	Yes it is generated, after you told me what I have to do on D91507 I have (and will in the future) always run that script, I've not touched the file directly. But I also have not really looked at it, because it is generated. If it is a bit odd I can retake a look on other options with nested entries.

This LGTM, I'm not sure if others have any comments

This revision is now accepted and ready to land. Dec 3 2020, 12:07 PM

curdeius added a subscriber: curdeius. Dec 3 2020, 12:18 PM

curdeius added inline comments.

clang/lib/Format/Format.cpp
963	I don't know precisely the LLVM style but does it allow more than one space (as Maximum would suggest)? Are there any tests covering that? And what about other styles, no need to set min/max for them?

HazardyKnusperkeks added inline comments. Dec 3 2020, 12:26 PM

clang/lib/Format/Format.cpp
963	The part with the LLVM Style from my test case did run exactly so without any modification, so yes it allows more than one space. Since there was no option before (that I'm aware of) all other styles behaved exactly like that. I did not check the style guides if they say anything about that, I just preserved the old behavior when nothing is configured.

LGTM

clang/lib/Format/Format.cpp
963	Ok. Great

Can I assume you need someone to land this for you?

In D92257#2435701, @MyDeveloperDay wrote:

Can I assume you need someone to land this for you?

Yes I do. But I have a question, my last change is commited in your name, that means git blame would blame it on you, right?

You can set me as author:
Björn Schäpers <bjoern@hazardy.de>
My Github Account is also called HazardyKnusperkeks.

In D92257#2435899, @HazardyKnusperkeks wrote:

Yes I do. But I have a question, my last change is commited in your name, that means git blame would blame it on you, right?

You can set me as author:
Björn Schäpers <bjoern@hazardy.de>
My Github Account is also called HazardyKnusperkeks.

The process is that you add (https://llvm.org/docs/Contributing.html)

Patch By: HazardyKnusperkeks

to the commit message if the user doesn't have commit access, if you want your name against the blame then I recommend applying for commit access yourself.

let me know if you still want me to land this

In D92257#2435902, @MyDeveloperDay wrote:

The process is that you add (https://llvm.org/docs/Contributing.html)

Patch By: HazardyKnusperkeks

to the commit message if the user doesn't have commit access, if you want your name against the blame then I recommend applying for commit access yourself.

let me know if you still want me to land this

Updated. :)
Yes please land this.

In D92257#2435902, @MyDeveloperDay wrote:

The process is that you add (https://llvm.org/docs/Contributing.html)

Patch By: HazardyKnusperkeks

to the commit message if the user doesn't have commit access, if you want your name against the blame then I recommend applying for commit access yourself.

That is incorrect and does not represent the nowadays reality, i suggest that you look up the docs.

let me know if you still want me to land this

krasimir added inline comments. Dec 6 2020, 10:46 AM

clang/docs/ClangFormatStyleOptions.rst
2727	I find it confusing why we have 2, Minimum and Maximum, instead of a single one. I'm not convinced that `Maximum` is useful.
Conceptually I'd prefer a single integer option, say `LineCommentContentIndent`, that would indicate the default indent used for the content of line comments. I'd naively expect `LineCommentContentIndent =`:
0 would produce `//comment`
1 would produce `// comment` (current default)
2 would produce `//  comment`, etc.
and this will work if the input is any of `//comment`, `// comment`, or `//  comment`, etc.
An additional consideration is that line comment sections often contain additional indentation, e.g. when there is a bullet list, paragraphs, etc., and so we can't guarantee that the indent of each line comment will be less than Maximum in general. I'd expect this feature to not adjust extra indent in comments, e.g.,
// Lorem ipsum dolor sit amet,
//  consectetur adipiscing elit,
//  ...
after reformatting with `LineCommentContentIndent=0` to produce
//Lorem ipsum dolor sit amet,
// consectetur adipiscing elit,
// ...
(and vice-versa, after reformatting with `LineCommentContentIndent=1`). This may well be handled by code, I just wasn't sure by looking at the code and test examples.
clang/lib/Format/BreakableToken.cpp
790	Thanks for the analysis! I wrote the text proto comment detection. I believe the current clang-format is buggy in that it should transform `##comment` into `## comment` for text proto (and similarly for all other `KnownTextProtoPrefixes` in `getLineCommentIndentCommentPrefix`), so this `substr(0, 2) != "##"` is unnecessary and I should go ahead and update and add tests for that.
clang/unittests/Format/FormatTestComments.cpp
3405	This is desired, AFAIK, and due to the normalization behavior while reflowing: when a comment line exceeds the comment limit and is broken up into a new line, the full range of blanks is replaced with a newline (https://github.com/llvm/llvm-project/blob/ddb002d7c74c038b64dd9d3c3e4a4b58795cf1a6/clang/lib/Format/BreakableToken.cpp#L66). Note that reflowing copies the extra indent of the line, e.g.,
// line limit V
// heading
//   line is
//   long long long long
gets reformatted as
// line limit V
// heading
//   line is
//   long long
//   long long
so if, for ranges of blanks of size S>1, we copied the (S-1) blanks to the beginning of the next line, we would get undesired cascading comment reflows with longer and longer indents.

I tend to agree with @krasimir I don't see where you really use Maximum to mean anything, the nested configuration seems perhaps unnecessarily confusing?

In D92257#2435906, @lebedev.ri wrote:

That is incorrect and does not represent the nowadays reality, i suggest that you look up the docs.

let me know if you still want me to land this

What/Where are the docs? I read https://llvm.org/docs/Contributing.html before hand and just now https://llvm.org/docs/CodeReview.html

I will update the patch, but that won't happen before the weekend.

clang/docs/ClangFormatStyleOptions.rst
2727	I was actually going for only one value, but while writing the tests I came to the conclusion that before my change only a minimum of 1 is enforced. But that may very well be because of what you call line comment sections; I did not consider that. That's why I chose a minimum and maximum. I will modify the patch to one value and will also add tests for the sections. But for that I need to remember if I added or removed spaces, right? Is there already infrastructure for that? Or is there any documentation of the various steps clang-format takes to parse and format code? Until now I tried to understand what's going on through single stepping with the debugger (quite time consuming).
clang/lib/Format/BreakableToken.cpp
790	In that case I will remove that check, but as said there were many tests which failed without it; I will have to adapt them too.
clang/unittests/Format/FormatTestComments.cpp
3405	Okay, I mean the space between `sit` and `amet`, while the spaces between `Lorem` and `ipsum`, and `dolor` and `sit` are kept.

In D92257#2437918, @HazardyKnusperkeks wrote:

What/Where are the docs? I read https://llvm.org/docs/Contributing.html before hand and just now https://llvm.org/docs/CodeReview.html

My comment was addressed at @MyDeveloperDay
https://llvm.org/docs/DeveloperPolicy.html#commit-messages

If you’re not the original author, ensure the ‘Author’ property of the commit
is set to the original author and the ‘Committer’ property is set to yourself.
You can use a command similar to git commit --amend --author="John Doe <jdoe@llvm.org>"
to correct the author property if it is incorrect.
See Attribution of Changes for more information including the method
we used for attribution before the project migrated to git.

IOW @HazardyKnusperkeks's request was correct.

I will update the patch, but that won't happen before the weekend.

In D92257#2435906, @lebedev.ri wrote:

That is incorrect and does not represent the nowadays reality, i suggest that you look up the docs.

let me know if you still want me to land this

Yes I agree I hadn’t seen that the process had changed,

This is one reason why I don’t like landing patches for others, this just confirms that in the future I will generally request people apply for commit access themselves.

Could we consider dropping the maximum?

This revision now requires changes to proceed. Dec 7 2020, 11:17 PM

In D92257#2438926, @MyDeveloperDay wrote:

Yes I agree I hadn’t seen that the process had changed,

This is one reason why I don’t like landing patches for others, this just confirms that in the future I will generally request people apply for commit access themselves.

And where do I do that? Also, I did not think I would have a chance of getting access so early.

And where do I do that? Also, I did not think I would have a chance of getting access so early.

https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

HazardyKnusperkeks mentioned this in D93163: [clang-format] Fix handling of ## comments in TextProto. Dec 12 2020, 8:49 AM

In D92257#2438928, @MyDeveloperDay wrote:

Could we consider dropping the maximum?

While rewriting the tests with the unmodified clang-format I just confirmed that currently only the minimum of 1 is enforced, there is no maximum. I.e.

//x
int x;
//   y
int y;

will be formatted as

// x
int x;
//   y
int y;

So for what I want to do I can:

1. Enforce the Minimum (for LLVM Style 1)
2. Enforce an optional Maximum (but what if the Maximum is set to 0, which is what I want, given that the Minimum is 1?)
3. Stay with the current design of a Minimum/Maximum pair (with a practically unbounded Maximum)

I would choose #3; in that case I only have to add a test for line comment sections (and most likely adapt the implementation).

/// A List:
///  * Foo
///  * Bar
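
The Minimum/Maximum pair described in this comment can be sketched for a single comment line as follows. This is an illustration only, not the patch's code: `adjustLineComment` is a hypothetical name, it only looks at lines that start with `//`, and it ignores comment sections and the check on the first character of the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites the spaces between "//" and the text so that
// their count lies within [Minimum, Maximum]. Passing the largest unsigned
// value (an unsigned -1) as Maximum leaves the upper end unbounded.
std::string adjustLineComment(const std::string &Line, unsigned Minimum,
                              unsigned Maximum) {
  if (Line.compare(0, 2, "//") != 0)
    return Line; // Not a line comment; leave it untouched.
  const std::size_t TextStart = Line.find_first_not_of(' ', 2);
  if (TextStart == std::string::npos)
    return Line; // No text after the slashes; nothing to align.
  assert(TextStart >= 2 && "search started after the slashes");
  const unsigned Spaces = static_cast<unsigned>(TextStart - 2);
  // Apply the minimum first and the maximum last, so the maximum takes
  // precedence if the two contradict each other.
  const unsigned Wanted = std::min(std::max(Spaces, Minimum), Maximum);
  return "//" + std::string(Wanted, ' ') + Line.substr(TextStart);
}
```

With a Minimum of 1 and an unbounded Maximum, `//x` gains one space and `//   y` keeps its three, matching the example above; with {0,0} both lose all their spaces.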

This didn't really address the comments, what is the point of the maximum? what if the maximum is > the ColumnLimit?

In D92257#2452063, @MyDeveloperDay wrote:

This didn't really address the comments, what is the point of the maximum?

My goal is to remove all spaces between // and the content (with the exception of comment sections, as I have learned here), and do not break the current behavior in any way, and currently it seems to work with an unlimited maximum, and a minimum of 1.

In D92257#2452063, @MyDeveloperDay wrote:

what if the maximum is > the ColumnLimit?

Most likely whatever happens now if the space in the comment is larger than the ColumnLimit. But one could just remove spaces as needed.
A more interesting question would be what happens if the minimum is larger than the ColumnLimit. For that one would have to decide which is more important. I would go with the ColumnLimit and reduce the minimum, but maybe that could be handled with the penalties? Although I have to admit that I don't understand where and how they are used.

I'm back!
I've reworked the change to correctly(*) work with line comment sections.

*: That is to be discussed, currently there is one change in behavior which is not covered through previous tests and one test which is failing. I will highlight that in inline comments.

Harbormaster completed remote builds in B83534: Diff 313774. Dec 27 2020, 1:13 AM

In D92257#2452063, @MyDeveloperDay wrote:

This didn't really address the comments, what is the point of the maximum? what if the maximum is > the ColumnLimit?

Current behavior if there are more spaces than ColumnLimit: Do not format the comment at all. Now if the minimum is larger than the ColumnLimit we will obey the limit and then normal behavior kicks in, this is also covered in the tests.

clang/unittests/Format/FormatTestComments.cpp
3135–3141	Here the test fails, because `commen1` gets a space added and `commen3` belongs to the same section, thus also gets an additional space. I see three options:
The whole keeping indentation in a section is wrong.
Disable the mechanic for text proto.
Adapt the test.
3393	Here is the difference. Before this would have been formatted as
// if (ret1) {
// return 2;
//}
So only one space was added for the `if`; it did not keep the indentation of the `return` and did not add a space to `}`. I think this is much better and also basically what @krasimir requested.
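
The section behaviour discussed in these inline comments, where one adjustment is applied to a whole comment section so that relative indentation survives, could be sketched roughly as below. This is a simplified illustration and not the implementation in BreakableToken.cpp: `adjustSection` is a hypothetical name, every line is assumed to start with `//`, and a negative Maximum is used here to mean "no upper bound".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: decides the change in leading spaces from the first
// line of the section and applies the same change to every other line.
std::vector<std::string> adjustSection(std::vector<std::string> Lines,
                                       int Minimum, int Maximum) {
  auto countSpaces = [](const std::string &Line) {
    assert(Line.compare(0, 2, "//") == 0 && "expected a line comment");
    std::size_t TextStart = Line.find_first_not_of(' ', 2);
    if (TextStart == std::string::npos)
      TextStart = Line.size();
    return static_cast<int>(TextStart) - 2;
  };
  if (Lines.empty())
    return Lines;
  const int First = countSpaces(Lines.front());
  int Wanted = std::max(First, Minimum);
  if (Maximum >= 0)
    Wanted = std::min(Wanted, Maximum);
  const int Delta = Wanted - First; // Spaces added (>0) or removed (<0).
  for (std::string &Line : Lines) {
    const int Old = countSpaces(Line);
    if (static_cast<std::size_t>(2 + Old) == Line.size())
      continue; // No text on this line; leave it as it is.
    const int New = std::max(0, Old + Delta);
    Line = "//" + std::string(static_cast<std::size_t>(New), ' ') +
           Line.substr(static_cast<std::size_t>(2 + Old));
  }
  return Lines;
}
```

Applied with Minimum 1 and no Maximum to a commented-out `if` block written without spaces, every line of the section moves by one space, including the line holding `}`.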

HazardyKnusperkeks edited the summary of this revision. Dec 27 2020, 1:34 AM

My assumption is that you want to stick with the minimum and maximum is that correct?

In D92257#2497535, @MyDeveloperDay wrote:

My assumption is that you want to stick with the minimum and maximum is that correct?

Otherwise I have to make a breaking change, or not achieve at all what I want. So either abandon this or we need a minimum and maximum.
Although right now I have changed an existing test because in that case the behavior changed (as noted inline) and a test case still to decide how to advance.

If that little break is not acceptable I see no further basis to pursue this and have to decide to drop it totally or only apply it locally.

And thanks for bringing this up again, currently I have very little time to work on clang-format and only react on the mails. Even while I have many things I want to add/change or cases where it is currently misformatted.

MyDeveloperDay accepted this revision. Jan 17 2021, 3:20 AM

This revision is now accepted and ready to land. Jan 17 2021, 3:20 AM

Rebased
Fixed(?) the last UnitTest, please take a look @krasimir

Harbormaster completed remote builds in B85569: Diff 317294. Jan 18 2021, 2:59 AM

Closed by commit rG078f30e04d1f: [clang-format] Add option to control the spaces in a line comment (authored by HazardyKnusperkeks). Jan 28 2021, 10:00 PM

This revision was automatically updated to reflect the committed changes.

HazardyKnusperkeks added a commit: rG078f30e04d1f: [clang-format] Add option to control the spaces in a line comment.

HazardyKnusperkeks added a reverting change: rG4ad41f1daf0f: Revert "[clang-format] Add option to control the spaces in a line comment". Jan 29 2021, 12:31 AM

krasimir added inline comments. Jan 29 2021, 1:06 AM

clang/unittests/Format/FormatTestComments.cpp
3135–3141	This test change looks OK.

The previous one broke a (format) test in polly. This led me to change the one breaking behavior; now it is not breaking anymore.
A comment starting with } will only be indented if it is in a comment section which will get an indentation. Test case is adapted.

I will wait a few days; if there is no negative feedback I will push it again.

HazardyKnusperkeks reopened this revision. Jan 29 2021, 1:06 PM

This revision is now accepted and ready to land. Jan 29 2021, 1:06 PM

HazardyKnusperkeks added a commit: rG4ad41f1daf0f: Revert "[clang-format] Add option to control the spaces in a line comment". Jan 29 2021, 1:07 PM

Harbormaster completed remote builds in B87203: Diff 320196. Jan 29 2021, 1:44 PM

LGTM. Could you please give us a link to the failing test in Polly? May be GitHub or buildbot.

In D92257#2532071, @curdeius wrote:

LGTM. Could you please give us a link to the failing test in Polly? May be GitHub or buildbot.

No problem:

http://lab.llvm.org:8011/#builders/10/builds/2294

I have a script that runs clang-format -n on various directories in clang
that are clang format clean, polly is one of them because they have clang
format as a unit test

I use this to ensure I don’t regress behaviour

Maybe we should formalise this with some sort of clang-format-check cmake
rule

Mydeveloperday

In D92257#2532077, @MyDeveloperDay wrote:

I have a script that runs clang-format -n on various directories in clang
that are clang format clean, polly is one of them because they have clang
format as a unit test

I use this to ensure I don’t regress behaviour

Maybe we should formalise this with some sort of clang-format-check cmake
rule

Mydeveloperday

That would be ok for me.

This revision was landed with ongoing or failed builds. Feb 1 2021, 1:49 PM

Closed by commit rG772eb24e0062: [clang-format] Add option to control the spaces in a line comment (authored by HazardyKnusperkeks).

This revision was automatically updated to reflect the committed changes.

HazardyKnusperkeks added a commit: rG772eb24e0062: [clang-format] Add option to control the spaces in a line comment.

Hi guys，i found SpacesInLineCommentPrefix does not support other encoding such as utf8 ，
I am curious why there is a isAlphanumeric limit in BreakableLineCommentSection::BreakableLineCommentSection() ?
I want to make some contribution to make it support utf8, what should i do ?

In D92257#3003281, @byronhe wrote:

Hi guys，i found SpacesInLineCommentPrefix does not support other encoding such as utf8 ，
I am curious why there is a isAlphanumeric limit in BreakableLineCommentSection::BreakableLineCommentSection() ?
I want to make some contribution to make it support utf8, what should i do ?

see https://llvm.org/docs/Contributing.html but in essence:

open a bug at https://bugs.llvm.org/
clone the llvm repo from GitHub
build the repo
add unit tests in clang/unittests/Format that show the problem
add the code that fixes the issue
upload a diff of the patch to reviews.llvm.org
add clang-format project tag and at least me as a reviewer and I can help fill in the rest

This sounds like a great new contributor idea.. go for it! I'll support this.

I missed the most important step!

Add LLVM contributor to your CV.

No seriously I mean it. I interview people all the time, if I saw that on a CV, it would immediately start a conversation about what/who/why you did it. (allowing me to look up your contribution)

As an interviewer, Contributing to open source is a great thing!

In D92257#3003281, @byronhe wrote:

Hi guys，i found SpacesInLineCommentPrefix does not support other encoding such as utf8 ，
I am curious why there is a isAlphanumeric limit in BreakableLineCommentSection::BreakableLineCommentSection() ?
I want to make some contribution to make it support utf8, what should i do ?

The isAlphanumeric is there to not break doxygen like comments for example.

I'm very interested in how you want to tackle that problem. :)

In D92257#3004563, @HazardyKnusperkeks wrote:

The isAlphanumeric is there to not break doxygen like comments for example.

I'm very interested in how you want to tackle that problem. :)

Sorry for digging out this months long thing but I encountered this in one project full of non-alphanumeric comment. If I understood the problem correctly, is it more like avoiding symbols like @ / * rather than avoiding non-ASCII characters like CJK's? In this case, would just not changing leading spaces when it is started with symbols make this option usable for wider amount of comment content? :)

CoelacanthusHex added a subscriber: CoelacanthusHex. Jan 17 2022, 8:09 PM

In D92257#3249650, @ksyx wrote:

Sorry for digging out this months long thing but I encountered this in one project full of non-alphanumeric comment. If I understood the problem correctly, is it more like avoiding symbols like @ / * rather than avoiding non-ASCII characters like CJK's? In this case, would just not changing leading spaces when it is started with symbols make this option usable for wider amount of comment content? :)

So we basically have to choose between using this simple trick, but excluding non latin characters, or creating a list where we do not add spaces, which is possibly incomplete.

But please create a patch and we will evaluate it.
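
To make the trade-off in this exchange concrete, here is a minimal sketch of the "simple trick": a space is only inserted when the text starts with an ASCII letter or digit. The names `isAsciiAlphanumeric` and `addSpaceIfAlphanumeric` are hypothetical and the check is written out by hand rather than calling the real `isAlphanumeric`, which is assumed here to be ASCII-only as the thread implies.

```cpp
#include <cassert>
#include <string>

// Hand-written ASCII check, standing in for the isAlphanumeric call
// mentioned in the thread. Bytes of a multi-byte UTF-8 sequence never match.
bool isAsciiAlphanumeric(char C) {
  return (C >= '0' && C <= '9') || (C >= 'a' && C <= 'z') ||
         (C >= 'A' && C <= 'Z');
}

// Hypothetical sketch: inserts one space after "//" only when the comment
// text begins with an ASCII letter or digit. Doxygen-like prefixes such as
// "///", "//!" or "//@" are left alone, and so is non-Latin text.
std::string addSpaceIfAlphanumeric(const std::string &Line) {
  if (Line.size() < 3 || Line.compare(0, 2, "//") != 0)
    return Line;
  assert(Line.size() > 2 && "indexing the first character after the slashes");
  if (!isAsciiAlphanumeric(Line[2]))
    return Line;
  return "// " + Line.substr(2);
}
```

The alternative raised by @ksyx would invert the test: keep a list of symbols that must not be followed by a space and add the space for everything else, at the risk of that list being incomplete.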

ksyx mentioned this in D118869: [clang-format] Non-latin comment prefix whitespace. Feb 2 2022, 7:37 PM

penagos mentioned this in D120188: Fix extraneous whitespace addition in line comments on clang-format directives. Feb 19 2022, 10:44 AM

curdeius mentioned this in rGd9567babef30: Fix extraneous whitespace addition in line comments on clang-format directives. Feb 20 2022, 12:54 PM

Revision Contents

Path                                              Size
clang/docs/ClangFormatStyleOptions.rst            20 lines
clang/include/clang/Format/Format.h               23 lines
clang/lib/Format/BreakableToken.h                 5 lines
clang/lib/Format/BreakableToken.cpp               53 lines
clang/lib/Format/Format.cpp                       16 lines
clang/lib/Format/NamespaceEndCommentsFixer.cpp    9 lines
clang/unittests/Format/FormatTest.cpp             20 lines
clang/unittests/Format/FormatTestComments.cpp     177 lines

Diff 308135

clang/docs/ClangFormatStyleOptions.rst

SpacesInContainerLiterals (``bool``)
ObjC and Javascript array and dict literals).		ObjC and Javascript array and dict literals).

.. code-block:: js		.. code-block:: js

true: false:		true: false:
var arr = [ 1, 2, 3 ]; vs. var arr = [1, 2, 3];		var arr = [ 1, 2, 3 ]; vs. var arr = [1, 2, 3];
f({a : 1, b : 2, c : 3}); f({a: 1, b: 2, c: 3});		f({a : 1, b : 2, c : 3}); f({a: 1, b: 2, c: 3});

		SpacesInLineComments (``SpacesInLineComment``)
		MyDeveloperDay: Is this change generated with clang/docs/tools/dump_style.py, or did you hand-craft it? ClangFormatStyleOptions.rst is always autogenerated by running dump_style.py; any text you want in the rst needs to be present in Format.h.
		HazardyKnusperkeks (Author): Yes, it is generated. After you told me what I have to do on D91507, I have always run that script (and will in the future); I have not touched the file directly. But I also have not really looked at it, because it is generated. If it is a bit odd, I can take another look at other options with nested entries.
		How many spaces are allowed at the start of a line comment. To disable the
		maximum set it to ``-1``, apart from that the maximum takes precedence
		over the minimum.
		Minimum = 1 Maximum = -1
		// One space is forced
		// but more spaces are possible

		Minimum = 0
		Maximum = 0
		//Forces to start every comment directly after the slashes

		Nested configuration flags:


		* ``unsigned Minimum`` The minimum number of spaces at the start of the comment.

		* ``unsigned Maximum`` The maximum number of spaces at the start of the comment.
		MyDeveloperDay: I'm personally not a massive fan of stuffing -1 into an unsigned, but I understand why it's there. If this were signed and actually -1, would the algorithm be substantially worse in your view?
		HazardyKnusperkeks (Author): I'm no fan of unsigned in any way, but that seems to be the way in clang-format. Making it signed would require a few more checks on when to use it, but I don't see any problem with that. I would then also make the Minimum signed, just to be consistent. While parsing the style I would add checks to ensure Minimum is never negative and Maximum is always greater than or equal to -1. Should that print any warnings? Is there a standard way of doing so? Or should it just be silently corrected?
		krasimir: I find it confusing that we have two values, Minimum and Maximum, instead of a single one. I'm not convinced that `Maximum` is useful. Conceptually I'd prefer a single integer option, say `LineCommentContentIndent`, that would indicate the default indent used for the content of line comments. I'd naively expect `LineCommentContentIndent =`:
		  0 would produce `//comment`
		  1 would produce `// comment` (current default)
		  2 would produce `//  comment`, etc.
		and this would work whether the input is `//comment`, `// comment`, or `//  comment`, etc.
		An additional consideration is that line comment sections often contain additional indentation, e.g. when there is a bullet list, paragraphs, etc., so we can't guarantee that the indent of each line comment will be less than Maximum in general. I'd expect this feature to not adjust extra indent in comments, e.g.,
		  // Lorem ipsum dolor sit amet,
		  //  consectetur adipiscing elit,
		  //  ...
		after reformatting with `LineCommentContentIndent=0` to produce
		  //Lorem ipsum dolor sit amet,
		  // consectetur adipiscing elit,
		  // ...
		(and vice versa, after reformatting with `LineCommentContentIndent=1`). This may well be handled by the code; I just wasn't sure from looking at the code and test examples.
		HazardyKnusperkeks (Author): I was actually going for only one value, but while writing the tests I came to the conclusion that before my change only a minimum of 1 was enforced. But that may very well be because of what you call line comment sections; I did not consider that. That's why I chose a minimum and a maximum. I will modify the patch to use one value and will also add tests for the sections. But for that I need to remember whether I added or removed spaces, right? Is there already infrastructure for that? Or is there any documentation of the various steps clang-format takes to parse and format code? Until now I have tried to understand what's going on by single-stepping with the debugger (quite time-consuming).
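
The section behaviour krasimir asks for (move the first line to the configured indent and keep every other line's extra indentation relative to it) can be sketched as follows. This is a minimal illustration, not code from the patch: `reindentSection` and `TargetIndent` are hypothetical names, and each line is reduced to its count of spaces after `//`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: Spaces[i] is the number of spaces after "//" on line i
// of one comment section. Every line is shifted by the same amount, chosen so
// that the first line ends up with TargetIndent spaces. Lines that would go
// below zero are clamped to zero.
std::vector<unsigned> reindentSection(const std::vector<unsigned> &Spaces,
                                      unsigned TargetIndent) {
  std::vector<unsigned> Result;
  if (Spaces.empty())
    return Result;
  const long Shift =
      static_cast<long>(TargetIndent) - static_cast<long>(Spaces.front());
  for (unsigned S : Spaces) {
    const long Shifted = static_cast<long>(S) + Shift;
    Result.push_back(Shifted < 0 ? 0u : static_cast<unsigned>(Shifted));
  }
  return Result;
}
```

For the "Lorem ipsum" example, the input counts `{1, 2, 2}` with a target of 0 become `{0, 1, 1}`, and applying a target of 1 to that result restores `{1, 2, 2}`.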


SpacesInParentheses (``bool``)		SpacesInParentheses (``bool``)
If ``true``, spaces will be inserted after ``(`` and before ``)``.		If ``true``, spaces will be inserted after ``(`` and before ``)``.

.. code-block:: c++		.. code-block:: c++

true: false:		true: false:
t f( Deleted & ) & = delete; vs. t f(Deleted &) & = delete;		t f( Deleted & ) & = delete; vs. t f(Deleted &) & = delete;


clang/include/clang/Format/Format.h


/// If ``true``, spaces may be inserted into C style casts.		/// If ``true``, spaces may be inserted into C style casts.
/// \code		/// \code
/// true: false:		/// true: false:
/// x = ( int32 )y vs. x = (int32)y		/// x = ( int32 )y vs. x = (int32)y
/// \endcode		/// \endcode
bool SpacesInCStyleCastParentheses;		bool SpacesInCStyleCastParentheses;

		/// Control of spaces within a single line comment
		struct SpacesInLineComment {
		/// The minimum number of spaces at the start of the comment.
		unsigned Minimum;
		/// The maximum number of spaces at the start of the comment.
		unsigned Maximum;
		};

		/// How many spaces are allowed at the start of a line comment. To disable the
		/// maximum set it to ``-1``, apart from that the maximum takes precedence
		/// over the minimum.
		/// \code Minimum = 1 Maximum = -1
		/// // One space is forced
		/// // but more spaces are possible
		///
		/// Minimum = 0
		/// Maximum = 0
		/// //Forces to start every comment directly after the slashes
		/// \endcode
		SpacesInLineComment SpacesInLineComments;

/// If ``true``, spaces will be inserted after ``(`` and before ``)``.		/// If ``true``, spaces will be inserted after ``(`` and before ``)``.
/// \code		/// \code
/// true: false:		/// true: false:
/// t f( Deleted & ) & = delete; vs. t f(Deleted &) & = delete;		/// t f( Deleted & ) & = delete; vs. t f(Deleted &) & = delete;
/// \endcode		/// \endcode
bool SpacesInParentheses;		bool SpacesInParentheses;

/// If ``true``, spaces will be inserted after ``[`` and before ``]``.		/// If ``true``, spaces will be inserted after ``[`` and before ``]``.
return AccessModifierOffset == R.AccessModifierOffset &&
R.SpaceBeforeRangeBasedForLoopColon &&		R.SpaceBeforeRangeBasedForLoopColon &&
SpaceInEmptyBlock == R.SpaceInEmptyBlock &&		SpaceInEmptyBlock == R.SpaceInEmptyBlock &&
SpaceInEmptyParentheses == R.SpaceInEmptyParentheses &&		SpaceInEmptyParentheses == R.SpaceInEmptyParentheses &&
SpacesBeforeTrailingComments == R.SpacesBeforeTrailingComments &&		SpacesBeforeTrailingComments == R.SpacesBeforeTrailingComments &&
SpacesInAngles == R.SpacesInAngles &&		SpacesInAngles == R.SpacesInAngles &&
SpacesInConditionalStatement == R.SpacesInConditionalStatement &&		SpacesInConditionalStatement == R.SpacesInConditionalStatement &&
SpacesInContainerLiterals == R.SpacesInContainerLiterals &&		SpacesInContainerLiterals == R.SpacesInContainerLiterals &&
SpacesInCStyleCastParentheses == R.SpacesInCStyleCastParentheses &&		SpacesInCStyleCastParentheses == R.SpacesInCStyleCastParentheses &&
		SpacesInLineComments.Minimum == R.SpacesInLineComments.Minimum &&
		SpacesInLineComments.Maximum == R.SpacesInLineComments.Maximum &&
SpacesInParentheses == R.SpacesInParentheses &&		SpacesInParentheses == R.SpacesInParentheses &&
SpacesInSquareBrackets == R.SpacesInSquareBrackets &&		SpacesInSquareBrackets == R.SpacesInSquareBrackets &&
SpaceBeforeSquareBrackets == R.SpaceBeforeSquareBrackets &&		SpaceBeforeSquareBrackets == R.SpaceBeforeSquareBrackets &&
BitFieldColonSpacing == R.BitFieldColonSpacing &&		BitFieldColonSpacing == R.BitFieldColonSpacing &&
Standard == R.Standard && TabWidth == R.TabWidth &&		Standard == R.Standard && TabWidth == R.TabWidth &&
StatementMacros == R.StatementMacros && UseTab == R.UseTab &&		StatementMacros == R.StatementMacros && UseTab == R.UseTab &&
UseCRLF == R.UseCRLF && TypenameMacros == R.TypenameMacros;		UseCRLF == R.UseCRLF && TypenameMacros == R.TypenameMacros;
}		}

clang/lib/Format/BreakableToken.h

private:
// then the original prefix is "// ".		// then the original prefix is "// ".
SmallVector<StringRef, 16> OriginalPrefix;		SmallVector<StringRef, 16> OriginalPrefix;

// Prefix[i] contains the intended leading "//" with trailing spaces to		// Prefix[i] contains the intended leading "//" with trailing spaces to
// account for the indentation of content within the comment at line i after		// account for the indentation of content within the comment at line i after
// formatting. It can be different than the original prefix when the original		// formatting. It can be different than the original prefix when the original
// line starts like this:		// line starts like this:
// //content		// //content
// Then the original prefix is "//", but the prefix is "// ".		// Then the original prefix is "//", but the prefix could be "// ", if adding
SmallVector<StringRef, 16> Prefix;		// spaces is desired.
		SmallVector<std::string, 16> Prefix;

SmallVector<unsigned, 16> OriginalContentColumn;		SmallVector<unsigned, 16> OriginalContentColumn;

/// The token to which the last line of this breakable token belongs		/// The token to which the last line of this breakable token belongs
/// to; nullptr if that token is the initial token.		/// to; nullptr if that token is the initial token.
///		///
/// The distinction is because if the token of the last line of this breakable		/// The distinction is because if the token of the last line of this breakable
/// token is distinct from the initial token, this breakable token owns the		/// token is distinct from the initial token, this breakable token owns the
/// whitespace before the token of the last line, and the whitespace manager		/// whitespace before the token of the last line, and the whitespace manager
/// must be able to modify it.		/// must be able to modify it.
FormatToken *LastLineTok = nullptr;		FormatToken *LastLineTok = nullptr;
};		};
} // namespace format		} // namespace format
} // namespace clang		} // namespace clang

#endif		#endif

clang/lib/Format/BreakableToken.cpp

for (const FormatToken *CurrentTok = &Tok;
for (size_t i = FirstLineIndex, e = Lines.size(); i < e; ++i) {		for (size_t i = FirstLineIndex, e = Lines.size(); i < e; ++i) {
Lines[i] = Lines[i].ltrim(Blanks);		Lines[i] = Lines[i].ltrim(Blanks);
// We need to trim the blanks in case this is not the first line in a		// We need to trim the blanks in case this is not the first line in a
// multiline comment. Then the indent is included in Lines[i].		// multiline comment. Then the indent is included in Lines[i].
StringRef IndentPrefix =		StringRef IndentPrefix =
getLineCommentIndentPrefix(Lines[i].ltrim(Blanks), Style);		getLineCommentIndentPrefix(Lines[i].ltrim(Blanks), Style);
assert((TokenText.startswith("//") || TokenText.startswith("#")) &&		assert((TokenText.startswith("//") || TokenText.startswith("#")) &&
"unsupported line comment prefix, '//' and '#' are supported");		"unsupported line comment prefix, '//' and '#' are supported");
OriginalPrefix[i] = Prefix[i] = IndentPrefix;		OriginalPrefix[i] = IndentPrefix;
if (Lines[i].size() > Prefix[i].size() &&		const auto SpacesInPrefix =
isAlphanumeric(Lines[i][Prefix[i].size()])) {		std::count(IndentPrefix.begin(), IndentPrefix.end(), ' ');
if (Prefix[i] == "//")
Prefix[i] = "// ";		if (SpacesInPrefix < Style.SpacesInLineComments.Minimum &&
else if (Prefix[i] == "///")		Lines[i].size() > IndentPrefix.size() &&
Prefix[i] = "/// ";		isAlphanumeric(Lines[i][IndentPrefix.size()]) &&
else if (Prefix[i] == "//!")		(Style.Language != FormatStyle::LK_TextProto ||
Prefix[i] = "//! ";		OriginalPrefix[i].substr(0, 2) != "##")) {
		MyDeveloperDay: Is this case covered by a unit test at all? Sorry, can you explain why you are looking for "##"?
		HazardyKnusperkeks (Author): It is covered by multiple tests; that's how I was made aware of it. :) If you look at the code before it only adds a space if the old prefix is "#" not "" which is also found by `getLineCommentIndentPrefix`. As it seems in `TextProto` "" should not be touched. I can of course add a test in my test function. Now I see a change: in the code before, "#" was only accepted when the language is `TextProto`; now it always is. But I think for that to happen the parser (or lexer?) should have classified something starting with "#" as a comment, right? But I can change that.
		HazardyKnusperkeks (Author): Okay, # # is formatted, I try again: If you look at the code before, it only adds a space if the old prefix is "#", not "`##`", which is also found by `getLineCommentIndentPrefix`. As it seems, in `TextProto` "`##`" should not be touched.
		krasimir: Thanks for the analysis! I wrote the text proto comment detection. I believe the current clang-format is buggy in that it should transform `##comment` into `## comment` for text proto (and similarly for all other `KnownTextProtoPrefixes` in `getLineCommentIndentCommentPrefix`), so this `substr(0, 2) != "##"` is unnecessary, and I should go ahead and update and add tests for that.
		HazardyKnusperkeks (Author): In that case I will remove that check, but as I said, many tests failed without it, so I will have to adapt them too.
else if (Prefix[i] == "///<")		Prefix[i] = IndentPrefix.str();
Prefix[i] = "///< ";		Prefix[i].append(Style.SpacesInLineComments.Minimum - SpacesInPrefix,
else if (Prefix[i] == "//!<")		' ');
Prefix[i] = "//!< ";		} else if (SpacesInPrefix > Style.SpacesInLineComments.Maximum) {
else if (Prefix[i] == "#" &&		Prefix[i] =
Style.Language == FormatStyle::LK_TextProto)		IndentPrefix
Prefix[i] = "# ";		.drop_back(SpacesInPrefix - Style.SpacesInLineComments.Maximum)
		MyDeveloperDay: My assumption is that when Maximum is -1 this is a very large negative number, correct? Is that defined behavior for drop_back()?
		HazardyKnusperkeks (Author): Since we check beforehand that SpacesInPrefix is larger than Maximum, there is no problem.
		MyDeveloperDay: Yep, sorry, didn't see that.
		.str();
		} else {
		Prefix[i] = IndentPrefix.str();
}		}

Tokens[i] = LineTok;		Tokens[i] = LineTok;
Content[i] = Lines[i].substr(IndentPrefix.size());		Content[i] = Lines[i].substr(IndentPrefix.size());
OriginalContentColumn[i] =		OriginalContentColumn[i] =
StartColumn + encoding::columnWidthWithTabs(OriginalPrefix[i],		StartColumn + encoding::columnWidthWithTabs(OriginalPrefix[i],
StartColumn,		StartColumn,
Style.TabWidth, Encoding);		Style.TabWidth, Encoding);
Whitespaces.replaceWhitespace(*Tokens[LineIndex],
/*Newlines=*/1,		/*Newlines=*/1,
/*Spaces=*/LineColumn,		/*Spaces=*/LineColumn,
/*StartOfTokenColumn=*/LineColumn,		/*StartOfTokenColumn=*/LineColumn,
/*IsAligned=*/true,		/*IsAligned=*/true,
/*InPPDirective=*/false);		/*InPPDirective=*/false);
}		}
if (OriginalPrefix[LineIndex] != Prefix[LineIndex]) {		if (OriginalPrefix[LineIndex] != Prefix[LineIndex]) {
// Adjust the prefix if necessary.		// Adjust the prefix if necessary.
		bool RemoveWhitespace =
// Take care of the space possibly introduced after a decoration.		OriginalPrefix[LineIndex].size() > Prefix[LineIndex].size();
assert(Prefix[LineIndex] == (OriginalPrefix[LineIndex] + " ").str() &&		int SpacesToRemove = RemoveWhitespace ? OriginalPrefix[LineIndex].size() -
"Expecting a line comment prefix to differ from original by at most "		Prefix[LineIndex].size()
"a space");		: 0;
		int SpacesToAdd = RemoveWhitespace ? 0
		: Prefix[LineIndex].size() -
		OriginalPrefix[LineIndex].size();
Whitespaces.replaceWhitespaceInToken(		Whitespaces.replaceWhitespaceInToken(
tokenAt(LineIndex), OriginalPrefix[LineIndex].size(), 0, "", "",		tokenAt(LineIndex), OriginalPrefix[LineIndex].size() - SpacesToRemove,
/*InPPDirective=*/false, /*Newlines=*/0, /*Spaces=*/1);		/*ReplaceChars=*/SpacesToRemove, "", "", -/*InPPDirective=*/false,
		/*Newlines=*/0, /*Spaces=*/SpacesToAdd);
}		}
}		}

void BreakableLineCommentSection::updateNextToken(LineState &State) const {		void BreakableLineCommentSection::updateNextToken(LineState &State) const {
if (LastLineTok) {		if (LastLineTok) {
State.NextToken = LastLineTok->Next;		State.NextToken = LastLineTok->Next;
}		}
}		}
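
The exchange about `drop_back` and a Maximum of -1 comes down to unsigned arithmetic. The sketch below reduces the prefix adjustment in this diff to a count of spaces, to show why the subtraction cannot wrap. It is a simplified illustration: `adjustLeadingSpaces` and `NoMaximum` are hypothetical names, and the `isAlphanumeric` and text proto conditions from the diff are left out.

```cpp
#include <cassert>
#include <climits>

// Same value as the -1u used for "no maximum" in the diff.
constexpr unsigned NoMaximum = UINT_MAX;

// Returns the number of spaces the prefix should have after adjustment.
unsigned adjustLeadingSpaces(unsigned SpacesInPrefix, unsigned Minimum,
                             unsigned Maximum) {
  if (SpacesInPrefix < Minimum)
    return Minimum; // pad up to the minimum
  if (SpacesInPrefix > Maximum) {
    // Only reached when Maximum < SpacesInPrefix, so this unsigned
    // subtraction cannot wrap around. With Maximum == NoMaximum the
    // condition is never true, because no unsigned value exceeds UINT_MAX.
    const unsigned ToDrop = SpacesInPrefix - Maximum;
    return SpacesInPrefix - ToDrop; // equals Maximum
  }
  return SpacesInPrefix; // already within [Minimum, Maximum]
}
```

Under these assumptions, `adjustLeadingSpaces(3, 1, NoMaximum)` leaves three spaces in place, while `adjustLeadingSpaces(3, 0, 0)` removes all of them.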

clang/lib/Format/Format.cpp

IO.mapOptional("SpacesBeforeTrailingComments",
Style.SpacesBeforeTrailingComments);		Style.SpacesBeforeTrailingComments);
IO.mapOptional("SpacesInAngles", Style.SpacesInAngles);		IO.mapOptional("SpacesInAngles", Style.SpacesInAngles);
IO.mapOptional("SpacesInConditionalStatement",		IO.mapOptional("SpacesInConditionalStatement",
Style.SpacesInConditionalStatement);		Style.SpacesInConditionalStatement);
IO.mapOptional("SpacesInContainerLiterals",		IO.mapOptional("SpacesInContainerLiterals",
Style.SpacesInContainerLiterals);		Style.SpacesInContainerLiterals);
IO.mapOptional("SpacesInCStyleCastParentheses",		IO.mapOptional("SpacesInCStyleCastParentheses",
Style.SpacesInCStyleCastParentheses);		Style.SpacesInCStyleCastParentheses);
		IO.mapOptional("SpacesInLineComments", Style.SpacesInLineComments);
IO.mapOptional("SpacesInParentheses", Style.SpacesInParentheses);		IO.mapOptional("SpacesInParentheses", Style.SpacesInParentheses);
IO.mapOptional("SpacesInSquareBrackets", Style.SpacesInSquareBrackets);		IO.mapOptional("SpacesInSquareBrackets", Style.SpacesInSquareBrackets);
IO.mapOptional("SpaceBeforeSquareBrackets",		IO.mapOptional("SpaceBeforeSquareBrackets",
Style.SpaceBeforeSquareBrackets);		Style.SpaceBeforeSquareBrackets);
IO.mapOptional("BitFieldColonSpacing", Style.BitFieldColonSpacing);		IO.mapOptional("BitFieldColonSpacing", Style.BitFieldColonSpacing);
IO.mapOptional("Standard", Style.Standard);		IO.mapOptional("Standard", Style.Standard);
IO.mapOptional("StatementMacros", Style.StatementMacros);		IO.mapOptional("StatementMacros", Style.StatementMacros);
IO.mapOptional("TabWidth", Style.TabWidth);		IO.mapOptional("TabWidth", Style.TabWidth);
static void mapping(IO &IO, FormatStyle::RawStringFormat &Format) {
IO.mapOptional("Language", Format.Language);		IO.mapOptional("Language", Format.Language);
IO.mapOptional("Delimiters", Format.Delimiters);		IO.mapOptional("Delimiters", Format.Delimiters);
IO.mapOptional("EnclosingFunctions", Format.EnclosingFunctions);		IO.mapOptional("EnclosingFunctions", Format.EnclosingFunctions);
IO.mapOptional("CanonicalDelimiter", Format.CanonicalDelimiter);		IO.mapOptional("CanonicalDelimiter", Format.CanonicalDelimiter);
IO.mapOptional("BasedOnStyle", Format.BasedOnStyle);		IO.mapOptional("BasedOnStyle", Format.BasedOnStyle);
}		}
};		};

		template <> struct MappingTraits<FormatStyle::SpacesInLineComment> {
		static void mapping(IO &IO, FormatStyle::SpacesInLineComment &Space) {
		// Transform the maximum to signed, to parse "-1" correctly
		int signedMaximum = static_cast<int>(Space.Maximum);
		IO.mapOptional("Minimum", Space.Minimum);
		IO.mapOptional("Maximum", signedMaximum);
		Space.Maximum = static_cast<unsigned>(signedMaximum);

		if (Space.Maximum != -1u) {
		Space.Minimum = std::min(Space.Minimum, Space.Maximum);
		}
		}
		};
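
The effect of this mapping can be sketched without the YAML machinery. In the sketch below, which is only an illustration, `normalizeParsed` is a hypothetical helper and the struct is a stand-in for `FormatStyle::SpacesInLineComment`. The maximum is read as a signed value so that `-1` parses, then converted to unsigned, and the minimum is clamped to the maximum unless the maximum is disabled.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Stand-in for FormatStyle::SpacesInLineComment from the diff.
struct SpacesInLineComment {
  unsigned Minimum;
  unsigned Maximum;
};

// Hypothetical helper mirroring the mapping logic: SignedMaximum is the value
// as read from the configuration, where -1 means "no maximum".
SpacesInLineComment normalizeParsed(unsigned Minimum, int SignedMaximum) {
  // Converting -1 to unsigned is well defined and yields UINT_MAX (== -1u).
  SpacesInLineComment Result{Minimum, static_cast<unsigned>(SignedMaximum)};
  if (Result.Maximum != UINT_MAX)
    Result.Minimum = std::min(Result.Minimum, Result.Maximum);
  return Result;
}
```

This matches the expectations in the FormatTest.cpp hunk of this revision: a minimum of 2 with a maximum of 0 yields a minimum of 0, and a minimum of 2 with a maximum of 1 yields a minimum of 1.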

// Allows to read vector<FormatStyle> while keeping default values.		// Allows to read vector<FormatStyle> while keeping default values.
// IO.getContext() should contain a pointer to the FormatStyle structure, that		// IO.getContext() should contain a pointer to the FormatStyle structure, that
// will be used to get default values for missing keys.		// will be used to get default values for missing keys.
// If the first element has no Language specified, it will be treated as the		// If the first element has no Language specified, it will be treated as the
// default one for the following elements.		// default one for the following elements.
template <> struct DocumentListTraits<std::vector<FormatStyle>> {		template <> struct DocumentListTraits<std::vector<FormatStyle>> {
static size_t size(IO &IO, std::vector<FormatStyle> &Seq) {		static size_t size(IO &IO, std::vector<FormatStyle> &Seq) {
return Seq.size();		return Seq.size();
FormatStyle getLLVMStyle(FormatStyle::LanguageKind Language) {
LLVMStyle.UseTab = FormatStyle::UT_Never;		LLVMStyle.UseTab = FormatStyle::UT_Never;
LLVMStyle.ReflowComments = true;		LLVMStyle.ReflowComments = true;
LLVMStyle.SpacesInParentheses = false;		LLVMStyle.SpacesInParentheses = false;
LLVMStyle.SpacesInSquareBrackets = false;		LLVMStyle.SpacesInSquareBrackets = false;
LLVMStyle.SpaceInEmptyBlock = false;		LLVMStyle.SpaceInEmptyBlock = false;
LLVMStyle.SpaceInEmptyParentheses = false;		LLVMStyle.SpaceInEmptyParentheses = false;
LLVMStyle.SpacesInContainerLiterals = true;		LLVMStyle.SpacesInContainerLiterals = true;
LLVMStyle.SpacesInCStyleCastParentheses = false;		LLVMStyle.SpacesInCStyleCastParentheses = false;
		LLVMStyle.SpacesInLineComments = {/*Minimum=*/1, /*Maximum=*/-1u};
		curdeius: I don't know the LLVM style precisely, but does it allow more than one space (as Maximum would suggest)? Are there any tests covering that? And what about other styles, no need to set min/max for them?
		HazardyKnusperkeks (Author): The part of my test case that uses the LLVM style ran exactly like that without any modification, so yes, it allows more than one space. Since there was no option before (that I'm aware of), all other styles behaved exactly like that. I did not check whether the style guides say anything about this; I just preserved the old behavior when nothing is configured.
		curdeius: Ok. Great
LLVMStyle.SpaceAfterCStyleCast = false;		LLVMStyle.SpaceAfterCStyleCast = false;
LLVMStyle.SpaceAfterLogicalNot = false;		LLVMStyle.SpaceAfterLogicalNot = false;
LLVMStyle.SpaceAfterTemplateKeyword = true;		LLVMStyle.SpaceAfterTemplateKeyword = true;
LLVMStyle.SpaceAroundPointerQualifiers = FormatStyle::SAPQ_Default;		LLVMStyle.SpaceAroundPointerQualifiers = FormatStyle::SAPQ_Default;
LLVMStyle.SpaceBeforeCtorInitializerColon = true;		LLVMStyle.SpaceBeforeCtorInitializerColon = true;
LLVMStyle.SpaceBeforeInheritanceColon = true;		LLVMStyle.SpaceBeforeInheritanceColon = true;
LLVMStyle.SpaceBeforeParens = FormatStyle::SBPO_ControlStatements;		LLVMStyle.SpaceBeforeParens = FormatStyle::SBPO_ControlStatements;
LLVMStyle.SpaceBeforeRangeBasedForLoopColon = true;		LLVMStyle.SpaceBeforeRangeBasedForLoopColon = true;

clang/lib/Format/NamespaceEndCommentsFixer.cpp

while (Tok && !Tok->is(tok::l_brace)) {
name += " ";		name += " ";
Tok = Tok->getNextNonComment();		Tok = Tok->getNextNonComment();
}		}
}		}
return name;		return name;
}		}

std::string computeEndCommentText(StringRef NamespaceName, bool AddNewline,		std::string computeEndCommentText(StringRef NamespaceName, bool AddNewline,
const FormatToken *NamespaceTok) {		const FormatToken *NamespaceTok,
		unsigned SpacesToAdd) {
std::string text = "// ";		std::string text = "//";
		text.append(SpacesToAdd, ' ');
text += NamespaceTok->TokenText;		text += NamespaceTok->TokenText;
if (NamespaceTok->is(TT_NamespaceMacro))		if (NamespaceTok->is(TT_NamespaceMacro))
text += "(";		text += "(";
else if (!NamespaceName.empty())		else if (!NamespaceName.empty())
text += ' ';		text += ' ';
text += NamespaceName;		text += NamespaceName;
if (NamespaceTok->is(TT_NamespaceMacro))		if (NamespaceTok->is(TT_NamespaceMacro))
text += ")";		text += ")";
for (size_t I = 0, E = AnnotatedLines.size(); I != E; ++I) {
if (EndCommentNextTok && EndCommentNextTok->is(tok::comment))		if (EndCommentNextTok && EndCommentNextTok->is(tok::comment))
EndCommentNextTok = EndCommentNextTok->Next;		EndCommentNextTok = EndCommentNextTok->Next;
if (!EndCommentNextTok && I + 1 < E)		if (!EndCommentNextTok && I + 1 < E)
EndCommentNextTok = AnnotatedLines[I + 1]->First;		EndCommentNextTok = AnnotatedLines[I + 1]->First;
bool AddNewline = EndCommentNextTok &&		bool AddNewline = EndCommentNextTok &&
EndCommentNextTok->NewlinesBefore == 0 &&		EndCommentNextTok->NewlinesBefore == 0 &&
EndCommentNextTok->isNot(tok::eof);		EndCommentNextTok->isNot(tok::eof);
const std::string EndCommentText =		const std::string EndCommentText =
computeEndCommentText(NamespaceName, AddNewline, NamespaceTok);		computeEndCommentText(NamespaceName, AddNewline, NamespaceTok,
		Style.SpacesInLineComments.Minimum);
if (!hasEndComment(EndCommentPrevTok)) {		if (!hasEndComment(EndCommentPrevTok)) {
bool isShort = I - StartLineIndex <= kShortNamespaceMaxLines + 1;		bool isShort = I - StartLineIndex <= kShortNamespaceMaxLines + 1;
if (!isShort)		if (!isShort)
addEndComment(EndCommentPrevTok, EndCommentText, SourceMgr, &Fixes);		addEndComment(EndCommentPrevTok, EndCommentText, SourceMgr, &Fixes);
} else if (!validEndComment(EndCommentPrevTok, NamespaceName,		} else if (!validEndComment(EndCommentPrevTok, NamespaceName,
NamespaceTok)) {		NamespaceTok)) {
updateEndComment(EndCommentPrevTok, EndCommentText, SourceMgr, &Fixes);		updateEndComment(EndCommentPrevTok, EndCommentText, SourceMgr, &Fixes);
}		}
StartLineIndex = SIZE_MAX;		StartLineIndex = SIZE_MAX;
}		}
return {Fixes, 0};		return {Fixes, 0};
}		}

} // namespace format		} // namespace format
} // namespace clang		} // namespace clang
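
The change to `computeEndCommentText` above replaces the hard-coded `"// "` with `"//"` followed by `SpacesToAdd` spaces, where the caller passes `Style.SpacesInLineComments.Minimum`. A reduced sketch of that construction follows. It is an illustration only: `endCommentText` is a hypothetical name, the keyword is fixed to `namespace`, and the macro and newline handling of the real function are omitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced version of the end-comment construction: "//" plus
// the configured number of spaces, the keyword, and the optional name.
std::string endCommentText(const std::string &NamespaceName,
                           unsigned SpacesToAdd) {
  std::string Text = "//";
  Text.append(SpacesToAdd, ' ');
  Text += "namespace";
  if (!NamespaceName.empty()) {
    Text += ' ';
    Text += NamespaceName;
  }
  return Text;
}
```

With one space this gives `// namespace Foo`, and with zero spaces `//namespace Foo`.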

clang/unittests/Format/FormatTest.cpp


CHECK_PARSE("RawStringFormats:\n"
" Delimiters:\n"		" Delimiters:\n"
" - 'cc'\n"		" - 'cc'\n"
" - 'cpp'\n"		" - 'cpp'\n"
" EnclosingFunctions:\n"		" EnclosingFunctions:\n"
" - 'C_CODEBLOCK'\n"		" - 'C_CODEBLOCK'\n"
" - 'CPPEVAL'\n"		" - 'CPPEVAL'\n"
" CanonicalDelimiter: 'cc'",		" CanonicalDelimiter: 'cc'",
RawStringFormats, ExpectedRawStringFormats);		RawStringFormats, ExpectedRawStringFormats);

		CHECK_PARSE("SpacesInLineComments:\n"
		" Minimum: 0\n"
		" Maximum: 0",
		SpacesInLineComments.Minimum, 0u);
		EXPECT_EQ(Style.SpacesInLineComments.Maximum, 0u);
		Style.SpacesInLineComments.Minimum = 1;
		CHECK_PARSE("SpacesInLineComments:\n"
		" Minimum: 2",
		SpacesInLineComments.Minimum, 0u);
		CHECK_PARSE("SpacesInLineComments:\n"
		" Maximum: -1",
		SpacesInLineComments.Maximum, -1u);
		CHECK_PARSE("SpacesInLineComments:\n"
		" Minimum: 2",
		SpacesInLineComments.Minimum, 2u);
		CHECK_PARSE("SpacesInLineComments:\n"
		" Maximum: 1",
		SpacesInLineComments.Maximum, 1u);
		EXPECT_EQ(Style.SpacesInLineComments.Minimum, 1u);
}		}

TEST_F(FormatTest, ParsesConfigurationWithLanguages) {		TEST_F(FormatTest, ParsesConfigurationWithLanguages) {
FormatStyle Style = {};		FormatStyle Style = {};
Style.Language = FormatStyle::LK_Cpp;		Style.Language = FormatStyle::LK_Cpp;
CHECK_PARSE("Language: Cpp\n"		CHECK_PARSE("Language: Cpp\n"
"IndentWidth: 12",		"IndentWidth: 12",
IndentWidth, 12u);		IndentWidth, 12u);

clang/unittests/Format/FormatTestComments.cpp

EXPECT_EQ("id {\n"
"}",		"}",
format("id {k:val#comment comment\n"		format("id {k:val#comment comment\n"
"# line line\n"		"# line line\n"
"a:1}",		"a:1}",
getTextProtoStyleWithColumns(20)));		getTextProtoStyleWithColumns(20)));
// Aligns trailing comments.		// Aligns trailing comments.
EXPECT_EQ("k: val # commen1\n"		EXPECT_EQ("k: val # commen1\n"
" # commen2\n"		" # commen2\n"
" # commen3\n"		" # commen3\n"
"# commen4\n"		"# commen4\n"
"a: 1 # commen5\n"		"a: 1 # commen5\n"
" # commen6\n"		" # commen6\n"
" # commen7",		" # commen7",
format("k:val#commen1 commen2\n"		format("k:val#commen1 commen2\n"
" # commen3\n"		" # commen3\n"
		HazardyKnusperkeks (Author): Here the test fails because `commen1` gets a space added and `commen3` belongs to the same section, so it also gets an additional space. I see three options: (1) the whole idea of keeping indentation within a section is wrong; (2) disable the mechanism for text proto; (3) adapt the test.
		krasimir: This test change looks OK.
"# commen4\n"
"a:1#commen5 commen6\n"
" #commen7",
getTextProtoStyleWithColumns(20)));
}

TEST_F(FormatTestComments, BreaksBeforeTrailingUnbreakableSequence) {
// The end of /* trail */ is exactly at 80 columns, but the unbreakable
[...]
EXPECT_EQ("/**\n"
[...]
" * to break}\n"
" */\n",
format("/**\n"
" * @param {l1 long1 to break}\n"
" */\n",
JSStyle20));
}

		TEST_F(FormatTestComments, SpaceAtLineCommentBegin) {
		FormatStyle Style = getLLVMStyle();
		StringRef NoTextInComment = " // \n"
		"\n"
		"void foo() {// \n"
		"// \n"
		"}";

		EXPECT_EQ("//\n"
		"\n"
		"void foo() { //\n"
		" //\n"
		"}",
		format(NoTextInComment, Style));

		Style.SpacesInLineComments.Minimum = 0;
		EXPECT_EQ("//\n"
		"\n"
		"void foo() { //\n"
		" //\n"
		"}",
		format(NoTextInComment, Style));

		Style = getLLVMStyle();
		StringRef Code = "//Free comment without space\n"
		"// Free comment with 3 spaces\n"
		"///Free Doxygen without space\n"
		"/// Free Doxygen with 3 spaces\n"
		"namespace Foo {\n"
		"bool bar(bool b) {\n"
		" bool ret1 = true; ///<Doxygenstyle without space\n"
		" bool ret2 = true; ///< Doxygenstyle with 3 spaces\n"
		" if (b) {\n"
		" //Foo\n"
		" // In function comment\n"
		" ret2 = false;\n"
		" } // End of if\n"
		" return ret1 && ret2;\n"
		"}\n"
		"}\n"
		"\n"
		"namespace Bar {\n"
		"int foo();\n"
		"} // namespace Bar\n"
		"//@Nothing added because of the non ascii char\n"
		"//@ Nothing removed because of the non ascii char\n";

		EXPECT_EQ("// Free comment without space\n"
		"// Free comment with 3 spaces\n"
		"/// Free Doxygen without space\n"
		"/// Free Doxygen with 3 spaces\n"
		"namespace Foo {\n"
		"bool bar(bool b) {\n"
		" bool ret1 = true; ///< Doxygenstyle without space\n"
		" bool ret2 = true; ///< Doxygenstyle with 3 spaces\n"
		" if (b) {\n"
		" // Foo\n"
		" // In function comment\n"
		" ret2 = false;\n"
		" } // End of if\n"
		" return ret1 && ret2;\n"
		"}\n"
		"} // namespace Foo\n"
		"\n"
		"namespace Bar {\n"
		"int foo();\n"
		"} // namespace Bar\n"
		"//@Nothing added because of the non ascii char\n"
		"//@ Nothing removed because of the non ascii char\n",
		format(Code, Style));

		Style.SpacesInLineComments = {0, 0};
		EXPECT_EQ("//Free comment without space\n"
		"//Free comment with 3 spaces\n"
		"///Free Doxygen without space\n"
		"///Free Doxygen with 3 spaces\n"
		"namespace Foo {\n"
		"bool bar(bool b) {\n"
		" bool ret1 = true; ///<Doxygenstyle without space\n"
		" bool ret2 = true; ///<Doxygenstyle with 3 spaces\n"
		" if (b) {\n"
		" //Foo\n"
		" //In function comment\n"
		" ret2 = false;\n"
		" } //End of if\n"
		" return ret1 && ret2;\n"
		"}\n"
		"} //namespace Foo\n"
		"\n"
		"namespace Bar {\n"
		"int foo();\n"
		"} //namespace Bar\n"
		"//@Nothing added because of the non ascii char\n"
		"//@ Nothing removed because of the non ascii char\n",
		format(Code, Style));

		Style.SpacesInLineComments = {2, -1u};
		EXPECT_EQ("// Free comment without space\n"
		"// Free comment with 3 spaces\n"
		"/// Free Doxygen without space\n"
		"/// Free Doxygen with 3 spaces\n"
		"namespace Foo {\n"
		"bool bar(bool b) {\n"
		" bool ret1 = true; ///< Doxygenstyle without space\n"
		" bool ret2 = true; ///< Doxygenstyle with 3 spaces\n"
		" if (b) {\n"
		" // Foo\n"
		" // In function comment\n"
		" ret2 = false;\n"
		" } // End of if\n"
		" return ret1 && ret2;\n"
		"}\n"
		"} // namespace Foo\n"
		"\n"
		"namespace Bar {\n"
		"int foo();\n"
		"} // namespace Bar\n"
		"//@Nothing added because of the non ascii char\n"
		"//@ Nothing removed because of the non ascii char\n",
		format(Code, Style));

		Style = getLLVMStyleWithColumns(20);
		HazardyKnusperkeks (Author): Here is the difference. Before, this would have been formatted as `// if (ret1) {`, `// return 2;`, `//}` (one line each). So only one space was added for the `if`, the indentation of the `return` was not kept, and no space was added to `}`. I think this is much better and also basically what @krasimir requested.
		StringRef WrapCode = "//Lorem ipsum dolor sit amet\n"
		"\n"
		"// Lorem ipsum dolor sit amet\n"
		"\n"
		"void f() {//Hello World\n"
		"}";

		EXPECT_EQ("// Lorem ipsum dolor\n"
		"// sit amet\n"
		"\n"
		"// Lorem ipsum\n"
		"// dolor sit amet\n" // Why are here the spaces dropped?
		krasimir: This is desired, AFAIK, and is due to the normalization behavior while reflowing: when a comment line exceeds the comment limit and is broken up onto a new line, the full range of blanks is replaced with a newline (https://github.com/llvm/llvm-project/blob/ddb002d7c74c038b64dd9d3c3e4a4b58795cf1a6/clang/lib/Format/BreakableToken.cpp#L66). Note that reflowing copies the extra indent of the line, e.g., the lines `// line limit V`, `// heading`, `// line is`, `// long long long long` get reformatted as `// line limit V`, `// heading`, `// line is`, `// long long`, `// long long`. So if, for ranges of blanks of size S>1, we copied the remaining (S-1) blanks to the beginning of the next line, we would get undesired cascading comment reflows with longer and longer indents.
		HazardyKnusperkeks (Author): Okay, I mean the spaces between `sit` and `amet`, while the spaces between `Lorem` and `ipsum`, and between `dolor` and `sit`, are kept.
		"\n"
		"void f() { // Hello\n"
		" // World\n"
		"}",
		format(WrapCode, Style));

		Style.SpacesInLineComments = {0, 0};
		EXPECT_EQ("//Lorem ipsum dolor\n"
		"//sit amet\n"
		"\n"
		"//Lorem ipsum\n"
		"//dolor sit amet\n"
		"\n"
		"void f() { //Hello\n"
		" //World\n"
		"}",
		format(WrapCode, Style));

		Style.SpacesInLineComments = {1, 1};
		EXPECT_EQ("// Lorem ipsum dolor\n"
		"// sit amet\n"
		"\n"
		"// Lorem ipsum\n"
		"// dolor sit amet\n"
		"\n"
		"void f() { // Hello\n"
		" // World\n"
		"}",
		format(WrapCode, Style));

		Style.SpacesInLineComments = {3, 3};
		EXPECT_EQ("// Lorem ipsum\n"
		"// dolor sit amet\n"
		"\n"
		"// Lorem ipsum\n"
		"// dolor sit amet\n"
		"\n"
		"void f() { // Hello\n"
		" // World\n"
		"}",
		format(WrapCode, Style));
		}
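
The `{0, 0}`, `{1, 1}` and `{3, 3}` cases above suggest that the number of spaces after the comment marker is forced into the configured range. The following is a simplified sketch of that idea for a single `//` line. It assumes `Min <= Max`. The helper `adjustPrefixSpaces` is hypothetical: it handles only the plain `//` marker and ignores the section-wide indentation handling and the special cases (such as `//@`) that the actual change deals with.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not clang-format's implementation: rewrites the run of
// spaces that follows a leading "//" so that its length lies in [Min, Max].
// Max == -1u (the largest unsigned value) acts as "no upper limit".
std::string adjustPrefixSpaces(const std::string &Line, unsigned Min,
                               unsigned Max) {
  assert(Min <= Max && "range must be normalized");
  if (Line.compare(0, 2, "//") != 0)
    return Line; // not a line comment, leave untouched
  if (Line.size() > 2 && Line[2] == '/')
    return Line; // other markers such as "///" are out of scope here
  // Find where the comment text starts, skipping the existing spaces.
  std::size_t TextBegin = Line.find_first_not_of(' ', 2);
  if (TextBegin == std::string::npos)
    return "//"; // comment without text: keep just the marker
  unsigned Existing = static_cast<unsigned>(TextBegin - 2);
  // Clamp the existing count into [Min, Max].
  unsigned Wanted = std::min(std::max(Existing, Min), Max);
  return "//" + std::string(Wanted, ' ') + Line.substr(TextBegin);
}
```

For example, with the range `{1, -1u}` a comment written as `//Foo` gains one space, while a comment that already has three spaces keeps them; with `{0, 0}` all spaces after the marker are removed.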

} // end namespace
} // end namespace format
} // end namespace clang
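
The reflow behavior described in the inline discussion of the wrapping test (the whole run of blanks at a break point is replaced by the line break, while other runs of blanks stay as they are) can be illustrated with a small greedy line breaker. This is only a sketch under stated assumptions: `breakAtBlanks` is a hypothetical helper that works on the comment text without its `//` prefix, and it is far simpler than the logic in BreakableToken.cpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: greedily breaks Text into lines of at most Limit
// characters. At each break the complete run of blanks is dropped, so no
// blanks are carried over to the start of the next line.
std::vector<std::string> breakAtBlanks(const std::string &Text,
                                       std::size_t Limit) {
  std::vector<std::string> Lines;
  std::size_t Start = 0;
  while (Text.size() - Start > Limit) {
    // Last blank that still lets the current line fit into Limit columns.
    std::size_t Break = Text.find_last_of(' ', Start + Limit);
    if (Break == std::string::npos || Break < Start)
      break; // no usable break point, keep the rest as one long line
    // Widen the break point to the full run of blanks around it.
    std::size_t RunBegin = Break;
    while (RunBegin > Start && Text[RunBegin - 1] == ' ')
      --RunBegin;
    if (RunBegin == Start)
      break; // only leading blanks, nothing to put on this line
    std::size_t RunEnd = Break + 1;
    while (RunEnd < Text.size() && Text[RunEnd] == ' ')
      ++RunEnd;
    Lines.push_back(Text.substr(Start, RunBegin - Start));
    Start = RunEnd; // the whole run of blanks is replaced by the break
  }
  if (Start < Text.size())
    Lines.push_back(Text.substr(Start));
  return Lines;
}
```

With a limit of 16, the text `Lorem  ipsum  dolor  sit  amet` (two blanks between words) is split after `ipsum`: the two blanks at the break disappear, while the double blanks inside each resulting line are kept.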
