Adding a minimum and a maximum number of allowed spaces in the line comment prefix.
The both values are needed for what I intend (a maximum of 0), and keeping the old behavior (only a minimum of 1).
Details
- Reviewers
MyDeveloperDay klimek krasimir curdeius
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3064 | I'm personally not a massive fan of stuffing -1 into an unsigned but I understand why its there, if this was an signed and it was actually -1 would the algorithm be substantially worse in your view? | |
clang/lib/Format/BreakableToken.cpp | ||
794 | my assumption is when Maximum is -1 this is a very large -ve number correct? is that defined behavior for drop_back() |
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3064 | I'm no fan if unsigned in any way, but that seems to be the way in clang-format. While parsing the style I would add checks to ensure Minimum is never negative and Maximum is always greater or equal to -1, should that print any warnings? Is there a standard way of doing so? Or should it be just silently corrected? | |
clang/lib/Format/BreakableToken.cpp | ||
794 | Since we check beforehand SpacesInPrefix is larger than Maximum there is no problem. |
I think fundamentally from my perspective this seem ok, out of interest can I ask what drove you to require it?
My assumption is that some people write comments like
// Free comment without space
and you want to be able to consistently format it to be (N spaces, as clang-format already does 1 space correct?)
// Free comment without space
is that correct? is there a common style guide asking for that? what is the rationale
clang/lib/Format/BreakableToken.cpp | ||
---|---|---|
787 | is this case covered by a unit test at all? sorry can you explain why you are looking for "##"? | |
794 | yep sorry didn't see that. |
I will go for {0,0}, so no space between // and the text, I don't know about a style guide asking for it, other than my own. (Which I can dictate in my company.) :)
I have just recently started using clang-format and it does not everything the the way I want to, on some aspects I have adapted, but on others I try to "fix" it, you can expect some more changes from me in the next time.
clang/lib/Format/BreakableToken.cpp | ||
---|---|---|
787 | It is covered by multiple tests, that's how I was made aware of it. :) Now I see a change, in the code before "#" was only accepted when the language is TextProto, now it is always. But I think for that to happen the parser (or lexer?) should have assigned something starting with"#" as comment, right? But I can change that. |
clang/lib/Format/BreakableToken.cpp | ||
---|---|---|
787 | Okay # # is formatted, I try again: |
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3047 | Is this change generated? with clang/doc/tools/dump_style.py or did you hand craft it? ClangFormatStyleOptions.rst is always autogenerated from running dump_style.py, any text you want in the rst needs to be present in Format.h |
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3047 | Yes it is generated, after you told me what I have to do on D91507 I have (and will in the future) always run that script, I've not touched the file directly. But I also have not really looked at it, because it is generated. If it is a bit odd I can retake a look on other options with nested entries. |
clang/lib/Format/Format.cpp | ||
---|---|---|
976 | I don't know precisely the LLVM style but does it allow more than one space (as Maximum would suggest)? |
clang/lib/Format/Format.cpp | ||
---|---|---|
976 | The part with the LLVM Style from my test case did run exactly so without any modification, so yes it allows more than one space. Since there was no option before (that I'm aware of) all other styles behaved exactly like that. I did not check the style guides if they say anything about that, I just preserved the old behavior when nothing is configured. |
Yes I do. But I have a question, my last change is commited in your name, that means git blame would blame it on you, right?
You can set me as author:
Björn Schäpers <bjoern@hazardy.de>
My Github Account is also called HazardyKnusperkeks.
The process is that you add (https://llvm.org/docs/Contributing.html)
Patch By: HazardyKnusperkeks
to the commit message if the user doesn't have commit access, if you want your name against the blame then I recommend applying for commit access yourself.
let me know if you still want me to land this
That is incorrect and does not represent the nowadays reality, i suggest that you look up the docs.
let me know if you still want me to land this
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3064 | I find it confusing why we have 2, Minimum and Maximum, instead of a single one.
and this will work with if the input is any of //comment, // comment, or // comment, etc. An additional consideration is that line comment sections often contain additional indentation, e.g. when there is a bullet list, paragraphs, etc. and so we can't guarantee that the indent of each line comment will be less than Maximum in general. I'd expect this feature to not adjust extra indent in comments, e.g., // Lorem ipsum dolor sit amet, // consectetur adipiscing elit, // ... after reformatting with LineCommentContentIndent=0 to produce //Lorem ipsum dolor sit amet, // consectetur adipiscing elit, // ... (and vice-versa, after reformatting with LineCommentContentIndent=1). | |
clang/lib/Format/BreakableToken.cpp | ||
787 | Thanks for the analysis! | |
clang/unittests/Format/FormatTestComments.cpp | ||
3411 | This is desired, AFAIK, and due to the normalization behavior while reflowing: when a comment line exceeds the comment limit and is broken up into a new line, the full range of blanks is replaced with a newline. (https://github.com/llvm/llvm-project/blob/ddb002d7c74c038b64dd9d3c3e4a4b58795cf1a6/clang/lib/Format/BreakableToken.cpp#L66). // line limit V // heading // *line is // long long long long get reformatted as // line limit V // heading // *line is // long long // long long so if for ranges of blanks longer of size S>1 we copied the (S-1) blanks at the beginning of the next line, we would have cascading comment reflows undesired with longer and longer indents. |
I tend to agree with @krasimir I don't see where you really use Maximum to mean anything, the nested configuration seems perhaps unnecessarily confusing?
What/Where are the docs? I read https://llvm.org/docs/Contributing.html before hand and just now https://llvm.org/docs/CodeReview.html
I will update the patch, but that won't happen before the weekend.
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3064 | I was actually going for only one value, but while writing the tests I came to the conclusion that before my change is only enforced a minimum of 1. But that very well may be because of what you call line comment sections, I did not consider that. That's why I chose a minimum and maximum. I will modify the patch to one value and will also add tests for the sections. But for that I need to remember if I added or removed spaces, right? Is there already infrastructure for that? Or is there any documentation of the various steps clang-format takes to parse and format code? Until now I tried to understand what's going on through single stepping with the debugger (quite time consuming). | |
clang/lib/Format/BreakableToken.cpp | ||
787 | In that case I will remove that check, but as said there were many tests which failed without it, I will have to adapt them too. | |
clang/unittests/Format/FormatTestComments.cpp | ||
3411 | Okay, I mean the spaced between sit and amet, while the spaces between Lorem and ipsum, and dolor and sit is kept. |
My comment was addressed at @MyDeveloperDay
https://llvm.org/docs/DeveloperPolicy.html#commit-messages
If you’re not the original author, ensure the ‘Author’ property of the commit is set to the original author and the ‘Committer’ property is set to yourself. You can use a command similar to git commit --amend --author="John Doe <jdoe@llvm.org>" to correct the author property if it is incorrect. See Attribution of Changes for more information including the method we used for attribution before the project migrated to git.
IOW @HazardyKnusperkeks's request was correct.
I will update the patch, but that won't happen before the weekend.
Yes I agree I hadn’t seen that the process had changed,
This is one reason why I don’t like landing patches for others, this just confirms that in the future I will generally request people apply for commit access themselves.
And where do I do that? Also I did not think I would not have a chance of getting the access so early.
And where do I do that? Also I did not think I would not have a chance of getting the access so early.
https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access
While rewriting the tests with the unmodified clang-format I just confirmed that currently only the minimum of 1 is enforced, there is no maximum. I.e.
//x int x; // y int y;
will be formatted as
// x int x; // y int y;
So for what I want to do I can:
- Enforce the Minimum (for LLVM Style 1)
- Enforce an optional Maximum (but what if the Maximum is set to 0, what I want to do, given that the Minimum is 1?)
- Stay with the current design of a Minimum/Maximum Pair with (practically unbounded Maximum)
I would choose #3, in that case I only have to add a test for line comment sections (and most likely adapt the implementation).
/// A List: /// * Foo /// * Bar
This didn't really address the comments, what is the point of the maximum? what if the maximum is > the ColumnLimit?
My goal is to remove all spaces between // and the content (with the exception of comment sections, as I have learned here), and do not break the current behavior in any way, and currently it seems to work with an unlimited maximum, and a minimum of 1.
Most likely what ever happens now if the space in the comment is larger than the ColumnLimit. But one could just remove spaces as needed.
A more interesting question would be what happens if the minimum is larger than the ColumnLimit?. For that one had to decide which is more important, I would go with the ColumnLimit and reduce the minimum, but maybe that could be handled with the penalties? Although I have to admit that I don't understand where and how they are used.
I'm back!
I've reworked the change to correctly(*) work with line comment sections.
*: That is to be discussed, currently there is one change in behavior which is not covered through previous tests and one test which is failing. I will highlight that in inline comments.
Current behavior if there are more spaces than ColumnLimit: Do not format the comment at all. Now if the minimum is larger than the ColumnLimit we will obey the limit and then normal behavior kicks in, this is also covered in the tests.
clang/docs/ClangFormatStyleOptions.rst | ||
---|---|---|
3064 |
| |
clang/unittests/Format/FormatTestComments.cpp | ||
3141–3147 | Here the test fails, because commen1 gets a space added and commen3 belongs to the same section, thus also gets an additional space.
| |
3399 | Here is the difference. Before this would have been formatted as // if (ret1) { // return 2; //} So only one space added for the if, it did not keep the indentation of the return and not adding a space to }. |
My assumption is that you want to stick with the minimum and maximum is that correct?
Otherwise I have to make a breaking change, or not achieve at all what I want. So either abandon this or we need a minimum and maximum.
Although right now I have changed an existing test because in that case the behavior changed (as noted inline) and a test case still to decide how to advance.
If that little break is not acceptable I see no further base to pursue this and have to decide to drop it totally or only apply it locally.
And thanks for bringing this up again, currently I have very little time to work on clang-format and only react on the mails. Even while I have many things I want to add/change or cases where it is currently misformatted.