This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Add the possibility to align assignments spanning empty lines or comments
ClosedPublic

Authored by tinloaf on Jan 3 2021, 4:33 AM.

Details

Summary

Currently, empty lines and comments break alignment of assignments on consecutive
lines. This makes the AlignConsecutiveAssignments option an enum that allows controlling
whether empty lines or empty lines and comments should be ignored when aligning
assignments.

Diff Detail

Event Timeline

tinloaf requested review of this revision.Jan 3 2021, 4:33 AM
tinloaf created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2021, 4:33 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Two remarks, first: I made the change only for AlignConsecutiveAssignments, but I think it should be consistent for AlignConsecutiveMacros, AlignConsecutiveBitFields and AlignConsecutiveDeclarations, as well. If this change is accepted, I will make corresponding changes to the other three.

Second: Sorry if this change pops up twice. I botched the first revision by not understanding arcanist.

tinloaf updated this revision to Diff 314284.Jan 3 2021, 7:43 AM

Fix formatting issue resulting from ancient clang-format.

MyDeveloperDay retitled this revision from [format] Add the possibility to align assignments spanning empty lines or comments to [clang-format] Add the possibility to align assignments spanning empty lines or comments.Jan 3 2021, 9:22 AM
MyDeveloperDay added a reviewer: curdeius.
MyDeveloperDay added a project: Restricted Project.
MyDeveloperDay added a comment.EditedJan 3 2021, 9:27 AM

by and large, this looks pretty good to me..lets have some other comment

clang/docs/ClangFormatStyleOptions.rst
198

Did you regenerate this or make the changes by hand?

you need to run clang/tools/dump_style.py

clang/include/clang/Format/Format.h
110
clang/lib/Format/WhitespaceManager.cpp
390

add "." at the end

MyDeveloperDay added inline comments.Jan 3 2021, 9:30 AM
clang/unittests/Format/FormatTest.cpp
12581

can you add an example with a block comment that spans multiple lines

e.g.

int a = 5
/*
 * block comment
*/
int oneTwoThree = 123;
int a = 5
///
/// block comment
///
int oneTwoThree = 123;
tinloaf updated this revision to Diff 314293.Jan 3 2021, 10:19 AM
tinloaf marked 3 inline comments as done.

Address MyDeveloperDay's review comments.

tinloaf marked an inline comment as done.Jan 3 2021, 10:21 AM
tinloaf added inline comments.
clang/include/clang/Format/Format.h
110

Sorry - I messed up my first try using arcanist, now I just used 'git diff' and forgot about -U999999. Should be fixed in the newest diff.

MyDeveloperDay added inline comments.Jan 3 2021, 12:26 PM
clang/include/clang/Format/Format.h
141

Is there a case for aligning over empty lines and comments?

int a           = 5;

/* comment */
int oneTwoThree = 123;
clang/lib/Format/Format.cpp
140

move true and false to the bottom and separate with

// For backward compatibility.

comment (see below)

clang/lib/Format/WhitespaceManager.cpp
365

could this be an enum?

enum {
    None
    AcrossEmptyLines,
    AcrossComments,
    AcrossEmptyLinesAndComments,
}
curdeius added inline comments.Jan 3 2021, 12:39 PM
clang/lib/Format/WhitespaceManager.cpp
418–433

Nit: unnecessary parentheses around !AcrossEmpty.
Same around !(LineIsComment && AcrossComments).
Maybe you might factor out a bool variable for this condition?

tinloaf marked an inline comment as done.Jan 3 2021, 12:46 PM
tinloaf added inline comments.
clang/include/clang/Format/Format.h
141

Not sure I understand what you mean. Currently, the Option AcrossComments includes 'across empty lines'. So, there currently is no case for "across comments, but not across empty lines". I'm not sure if that is really something people want to do. Do you think so? I can add it.

MyDeveloperDay added inline comments.Jan 3 2021, 10:53 PM
clang/include/clang/Format/Format.h
141

I could see a case where you might want to begin a new "alignment group" by leaving a blank line.

/* align these 3 */
int a = 5;
/* align these 3 */
int b = 6;
/* align these 3 */
int c = 7;

/* align these 2 which are longer */
int d           = 5
/* align these 2 which are longer */
int oneTwoThree = 123;
tinloaf updated this revision to Diff 314361.Jan 4 2021, 4:58 AM

Address reviewer comments, noteworthy changes:

  • Add an option to span alignment only across comments, not across newlines
  • Factor out spanning options to AlignTokens to an enum
tinloaf marked 5 inline comments as done.Jan 4 2021, 5:01 AM
tinloaf added inline comments.
clang/include/clang/Format/Format.h
141

Yes, good point. I've added the option AlignAcrossEmptyLinesAndComments, and made AlignAcrossComments do what the name suggests.

clang/lib/Format/WhitespaceManager.cpp
365

Yes, that's probably cleaner, though that means I need a rather ugly switch statement to convert one enum into the other where AlignTokens is called. Maybe one could address this with some template magic, but that's probably not less ugly than the switch statement.

418–433

Good point. I factored this out into two booleans, which should improve readability.

tinloaf updated this revision to Diff 314377.Jan 4 2021, 7:22 AM
tinloaf marked 3 inline comments as done.

Fix bogus default case reported by clang-tidy.

MyDeveloperDay accepted this revision.Jan 4 2021, 8:31 AM

This LGTM, I'm not sure if others have any further comments

Ideally we should add a comment to the release notes, (which you could do via a separate NFC commit if you wanted)

This revision is now accepted and ready to land.Jan 4 2021, 8:31 AM

Ideally we should add a comment to the release notes, (which you could do via a separate NFC commit if you wanted)

Thank, I'll do that. But before that, I will create a revision that adds the same feature for the other three alignment types.

HazardyKnusperkeks requested changes to this revision.Jan 4 2021, 12:28 PM

Ideally we should add a comment to the release notes, (which you could do via a separate NFC commit if you wanted)

Thank, I'll do that. But before that, I will create a revision that adds the same feature for the other three alignment types.

I think it should happen in this revision so that it is atomically.

But I'm willing to be overruled.

Apart from the 2 comments and the release notes nice change! (And I really have to think about what my new setting will be.)

clang/include/clang/Format/Format.h
128

You are adding a Tab here, or do I misinterpret this? Shouldn't that be fixed by clang-format itself?

clang/lib/Format/WhitespaceManager.cpp
647

All other variables start with an upper case char, maybe we should keep it that way?

This revision now requires changes to proceed.Jan 4 2021, 12:28 PM
tinloaf added inline comments.Jan 4 2021, 3:22 PM
clang/include/clang/Format/Format.h
128

I'm not sure, honestly. I did auto-format everything using clang-format with the provided style. I think I know what you mean (the little double arrows here in the side-by-side view?), but if I open the raw diff ("View Options" above, "Show Raw File (Right)"), it does not seem to contain tabs. Also, the file on my disk does not contain tabs.

Thank you for submitting this patch?

clang/include/clang/Format/Format.h
128

@HazardyKnusperkeks to my knowledge, its not a tab

MyDeveloperDay added inline comments.Jan 5 2021, 2:34 AM
clang/docs/ClangFormatStyleOptions.rst
198–199

As you plan to do the other cases won't this introduce

AlignBitFieldStyle
and
AlignMacroStyle

why not have 1 AlignConsecutiveStyle that can be used for all 3, then you don't need the other enum TokenAlignmentStyle which means you won't need to map from one to the other?

tinloaf marked 2 inline comments as done.Jan 5 2021, 3:32 AM
tinloaf added inline comments.
clang/docs/ClangFormatStyleOptions.rst
198–199

Yes, it would. I was under the assumption that every multiple-choice option in the options should have its own enum, for type safety. But yes, that would be way more elegant (and make the diff much smaller). I'll change it!

clang/include/clang/Format/Format.h
128

Great, I'll mark this as done then.

MyDeveloperDay added inline comments.Jan 5 2021, 3:50 AM
clang/docs/ClangFormatStyleOptions.rst
198–199

we don't make that distinction for those options that use true/false so I'm not sure why enums need to be any different

it may require some changes to the doc/tools/dump_style.py to ensure it gives meaningful documentation, but some consistency between the values might help a little (especially as we'd like to use the same AlignTokens algortihm)

You'd end up with 3 functions to map the various styles to you TokenAlignmentStyle anyway, and hence you'd lose the type safety

To be honest I get a bit annoyed at the use of "None" "Never" "false", "No" anyway, so I don't feel we are in a great place in terms of consistency.

tinloaf updated this revision to Diff 315664.Jan 10 2021, 7:10 AM
tinloaf marked an inline comment as done.
  • Unify the type of the options for AlignConsecutive{BitFields, Declarations, Assignments, Macros}
  • Implement across-alignment for bit fields, declarations, assignments and macros
  • Add tests for across-alignment for bit fields, declarations, assignments and macros
  • Add changelog entry

Ideally we should add a comment to the release notes, (which you could do via a separate NFC commit if you wanted)

Thank, I'll do that. But before that, I will create a revision that adds the same feature for the other three alignment types.

I think it should happen in this revision so that it is atomically.

Done - this revision now also includes the functionality for bit fields, macros and declarations.

tinloaf marked 3 inline comments as done.Jan 10 2021, 7:16 AM

This is now an all-in-one revision, including bit fields, assignments, declarations and macros. I did not reproduce the full "across empty lines, across comments, across empty lines *and* comments" test suite for all four of them, since they all use the same logic (and code, mostly). AlignConsecutiveAssignments is tested extensively, AlignConsecutiveBitFields and AlignConsecutiveDeclarations only have tests for AlignAcrossEmptyLinesAndComments (since they delegate to the same AlignTokens method), and AlignConsecutiveMacros has its own set of tests, since it uses a different implementation (of the same logic, basically).

clang/docs/ClangFormatStyleOptions.rst
198–199

I have unified the type of the style options. I think the solution for the documentation is Okay, though not ideal. Each of the options' docs now starts with an example including the relevant tokens (e.g., bit field ':'s for AlignConsecutiveBitFields), but the examples illustrating the individual options always use assignments. I'm not sure how I could work around this using only a single type.

tinloaf marked an inline comment as done.Jan 10 2021, 7:17 AM
tinloaf updated this revision to Diff 315665.Jan 10 2021, 7:20 AM

Fix merge conflict in the release notes.

MyDeveloperDay accepted this revision.Jan 10 2021, 8:04 AM

This LGTM

Apart from my inline comment and the pre merge check this is superb. (I will not accept it, until we have reached a conclusion for the documentation.)

clang/docs/ClangFormatStyleOptions.rst
428

This are now the examples from aligning the assignment. We now know that the rules for the alignments are the same, but someone new to it, just reading the documentation may be puzzled.

As @MyDeveloperDay said, most likely the script for generation has to be adapted?

The question here is now doing it now, or in a different change? It should happen before LLVM 12 will be released.

Apart from my inline comment and the pre merge check this is superb. (I will not accept it, until we have reached a conclusion for the documentation.)

Regarding the pre-merge check: it is clang-format which failed. My local clang-format claims that the code is formatted according to the project's .clang-format (git-clang-format tells me that there are no modifications), but the clang-format on the server seems to disagree. I can of course just apply the supplied patch, but I would like to fix this so that it won't take me two tries in the future. Which version of clang-format is running on the build server? I'm running 12.0.0.

clang/docs/ClangFormatStyleOptions.rst
428

I agree that this is not a good solution. I can probably adapt the doc generation script (have not yet looked at it in depth), but I currently have no good idea how to put the information into the source - i.e., where the dump script should get the info from. Here is an idea, which doesn't strike me as very elegant:

As far as I can tell, the script first takes the documentation block of the defined symbol (e.g., AlignConsecutiveDeclarations), and if the type of that symbol is an enum, then appends a list of possible enum values together with the documentation block of the respective enum value.

My suggestion would be to introduce some kind of tag into the documentation block of the defined symbol ( DOC_IGNORE_ENUM or something?), that, if present, deactivates this behavior and only takes the documentation of the defined symbol. I would then have to hand-code the different possible values and the respective examples into the documentation blocks of AlignConsecutiveAssignments, AlignConsecutiveMacros, AlignConsecutiveBitFields and AlignConsecutiveDeclarations. The documentation for AlignConsecutiveStyle would be ignored.

Would that be a feasible solution?

tinloaf updated this revision to Diff 316068.Jan 12 2021, 5:27 AM

Fixed formatting issue.

I think I would remove the code examples from the "AlignConsecutive style" to avoid confusion (that would be the first change)

then perhaps regenerate and update the documentation so we can then see, its seems most have a "code block", can the specific examples not go into that area?

And just have the enum explain its different configuration options?

/// \brief If ``true``, aligns consecutive C/C++ preprocessor macros.
///
/// This will align C/C++ preprocessor macros of consecutive lines.
/// Will result in formattings like
/// \code
///   #define SHORT_NAME       42
///   #define LONGER_NAME      0x007f
///   #define EVEN_LONGER_NAME (2)
///   #define foo(x)           (x * x)
///   #define bar(y, z)        (y + z)
/// \endcode
bool AlignConsecutiveMacros;

I think I would remove the code examples from the "AlignConsecutive style" to avoid confusion (that would be the first change)

then perhaps regenerate and update the documentation so we can then see, its seems most have a "code block", can the specific examples not go into that area?

And just have the enum explain its different configuration options?

/// \brief If ``true``, aligns consecutive C/C++ preprocessor macros.
///
/// This will align C/C++ preprocessor macros of consecutive lines.
/// Will result in formattings like
/// \code
///   #define SHORT_NAME       42
///   #define LONGER_NAME      0x007f
///   #define EVEN_LONGER_NAME (2)
///   #define foo(x)           (x * x)
///   #define bar(y, z)        (y + z)
/// \endcode
bool AlignConsecutiveMacros;

I think that's a good way to go.

HazardyKnusperkeks requested changes to this revision.Jan 15 2021, 2:59 PM
This revision now requires changes to proceed.Jan 15 2021, 2:59 PM
tinloaf updated this revision to Diff 317166.Jan 16 2021, 4:33 AM

Fixed documentation for the AlignConsecutive* options. I have used the suggested approach: Remove all documentation from AlignConsecutiveStyle and instead put the documentation of the possible values and what they do in the AlignConsecutive* options' docs.

I had to make two small changes to the documentation dump script for this:

  • Non-documented options in enums are silently ignored instead of raising an error, since AlignConsecutiveStyle has non-documented options now.
  • Indentation in the code-block lines is preserved, so that we can have indented code blocks in the documentation of the possible values.
tinloaf marked an inline comment as done.Jan 16 2021, 4:34 AM

Not what I had in mind, but for me that's okay. I can not say anything to the change of the script though.

What I hat in mind was explaining (maybe in length) the enum and for the alignment settings just show an example with None and one of the other options, while referring to the enum documentation for all options.

clang/include/clang/Format/Format.h
87

If you don't document this with /// you don't need to remove the error from the script, or am I mistaken?

Thie LGTM, just to check a few nits

/usr/bin/sphinx-build -n ./docs ./html
/clang/llvm-project/clang/docs/ClangFormatStyleOptions.rst:330: WARNING: Bullet list ends without a blank line; unexpected unindent.
/clang/llvm-project/clang/docs/ClangFormatStyleOptions.rst:404: WARNING: Bullet list ends without a blank line; unexpected unindent.
/clang/llvm-project/clang/docs/ClangFormatStyleOptions.rst:479: WARNING: Bullet list ends without a blank line; unexpected unindent.

Thank you for changing the dump_style to make this clearer

MyDeveloperDay accepted this revision.Jan 17 2021, 2:59 AM
MyDeveloperDay added inline comments.
clang/include/clang/Format/Format.h
103

Nit: maybe a colon on the end what do you think?

159

Nit: no need for a comma I think

This revision is now accepted and ready to land.Jan 17 2021, 11:42 AM

How is it going? I would like to add some new alignment options and it would be stupid not to do it on top of this.

How is it going? I would like to add some new alignment options and it would be stupid not to do it on top of this.

Sorry, much to do work-wise currently. I'll get to the last details today and finish this!

How is it going? I would like to add some new alignment options and it would be stupid not to do it on top of this.

Sorry, much to do work-wise currently. I'll get to the last details today and finish this!

No problem, and no need to haste. I know that sometimes the available time is way to short.

tinloaf marked 3 inline comments as done.Jan 21 2021, 6:41 AM
tinloaf added inline comments.
clang/include/clang/Format/Format.h
87

Unfortunately no: The dump script's state machine never goes into the InEnum state, and no Enum object with the correct name is created at all. This then breaks the doc generation for the fields using that enum. Sure, I could probably work around that, but that would probably be a more invasive change than silently ignoring undocumented enum members.

tinloaf updated this revision to Diff 318181.Jan 21 2021, 6:41 AM
tinloaf marked an inline comment as done.

Fix spelling issues in doc

tinloaf accepted this revision.Jan 21 2021, 6:43 AM

Fixed the last spelling issues in the documentation. If the automated checks pass, this is ready for takeoff from my side. Thanks for your reviews!

tinloaf updated this revision to Diff 318182.Jan 21 2021, 6:46 AM

Of course there was another merge conflict in the release notes. Mental note to self: Put release note changes in separate commit.

tinloaf accepted this revision.Jan 21 2021, 6:53 AM
tinloaf updated this revision to Diff 318193.Jan 21 2021, 7:17 AM

Fix error introduced by rebase

tinloaf accepted this revision.Jan 21 2021, 9:10 AM
curdeius added inline comments.Jan 22 2021, 1:31 AM
clang/docs/tools/dump_format_style.py
194

Just a nit, maybe we should have a warning at least?

tinloaf added inline comments.Jan 22 2021, 3:34 AM
clang/docs/tools/dump_format_style.py
194

Not sure. After this commit gets in, we'll always have an enum (AlignConsecutiveStyle) with five undocumented options. Thus, every run of this script would emit five warnings, and these warnings would become "intended behavior". I think this would probably confuse people?

Sure, one could implement some more elaborate way of detecting an intentionally undocumented enum, but I guess this would mean a larger modification of this state machine, and I'd rather not lump this together with this PR.

If I understand the LLVM mailing list correctly, everything in the repo by the 26th will still go into LLVM 12, right? Any chance we could get this in by then?

Do you have commit access? If no, please write your Name Surname <email@address> that should be used for the commit.

Do you have commit access?

Since this is my first contribution to LLVM ever, probably not.

If no, please write your Name Surname <email@address> that should be used for the commit.

Lukas Barth <mail@lukas-barth.net>

Thanks a lot!

I did not re-accept this, because of the script change. I'm okay with it, but I never looked really at the script. I think it should be changed afterwards to error or warn again, but not on this enum. Maybe one could annotate enums which don't have to be fully documented?

I'm okay with making that change in another revision which does not land in LLVM 12.

Do you have commit access?

Since this is my first contribution to LLVM ever, probably not.

That's what I thought, but I got it.
https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

I did not re-accept this, because of the script change. I'm okay with it, but I never looked really at the script. I think it should be changed afterwards to error or warn again, but not on this enum. Maybe one could annotate enums which don't have to be fully documented?

Yes, I think that's the best option. I'll take a look at it. I'll probably just introduce a new state to the state machine, InIgnoredEnum or something, which does nothing but look for the end of the enum.

That's what I thought, but I got it.
https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

Oh, that's quite a permissive policy! Maybe I'll try to get commit access - but for this first commit, the commit-by-proxy thing is probably a good idea. :)

I've committted it, but... something went wrong and you're not visible as the author. Sorry for that @tinloaf.

Fixed!

Thanks a lot!