This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] use spaces for alignment with UT_ForContinuationAndIndentation
ClosedPublic

Authored by fickert on Feb 24 2020, 1:24 AM.

Details

Summary

Use spaces instead of tabs for alignment with UT_ForContinuationAndIndentation to make the code aligned for any tab/indent width.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38381

Diff Detail

Event Timeline

fickert created this revision.Feb 24 2020, 1:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2020, 1:24 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I added an IsAligned flag to ParenState and Change to track if an indent was created for alignment. I adapted the corresponding test cases where alignments were previously done with tabs, including a test case where leading whitespace in a comment is interpreted as alignment.

This is my first contribution to clang-format so I appreciate any suggestions.

If you are changing unit tests then I'm nervous, imagine if this many tests change what the impact is on a large code base. I think we need to understand more about why you are making this change and why you think its ok to change/remove existing tests which assert some behavior which up to now we have considered correct.

you need to add any diff as a full context diff (normally by adding -U 9999999) to the diff command

MyDeveloperDay requested changes to this revision.Feb 24 2020, 4:10 AM
This revision now requires changes to proceed.Feb 24 2020, 4:10 AM
fickert updated this revision to Diff 246176.Feb 24 2020, 4:24 AM
fickert marked an inline comment as done.

uploaded full context diff

I understand the caution around changing the unit tests. The tests I modified assumed that alignment would be made with mixed tabs and spaces, which is arguably not the expected behavior with UT_ForContinuationAndIndentation (see https://bugs.llvm.org/show_bug.cgi?id=38381). I removed a test case that was a duplicate (the one in line 10293 is identical to the one in line 10182, and there are no changes to the formatting configuration in between). I added two test cases to ensure that the behavior is consistent for alignment in multi-line comments, e.g. the formatted code should be

{
\t/* aaaa
\t   bbbb
}

instead of

{
\t/* aaaa
\t\t bbbb
}

when using UT_ForContinuationAndIndentation and a tab width of 2.

MyDeveloperDay added inline comments.Feb 24 2020, 5:44 AM
clang/unittests/Format/FormatTest.cpp
10335

Sorry this isn't clear to be which test this is a duplicate of, as this is an example where there are tabs in the comment and trailing tabs

Is this intentional, or is it an oversight?

This is the key question, which I'm not sure how we can answer without the help of the original authors @djasper and @klimek

For example I can't understand with the example below if the TabWith is 8 why are you putting 13 spaces in, rather than the original test which had 1 tab (assuming 8 characters) and 5 spaces? (to make up 13)

Could you explain why this is the correct change for this example why its correct to put 13 spaces? (is this not continuation or indentation?)

Tab.UseTab = FormatStyle::UT_ForContinuationAndIndentation;
Tab.TabWidth = 8;
Tab.IndentWidth = 8;
verifyFormat("class X {\n"
             "\tvoid f() {\n"
             "\t\tsomeFunction(parameter1,\n"
             "\t\t.............parameter2);\n"
             "\t}\n"
             "};",
             Tab);
fickert marked an inline comment as done.Feb 24 2020, 6:02 AM
fickert added inline comments.
clang/unittests/Format/FormatTest.cpp
10335

This test case is identical to the one in line 10182-10190. There are several test cases that appear multiple times, but usually with changes to the formatting settings in between. This is not the case for these two test cases (no changes in between), so this one seems to be obsolete.

Is this intentional, or is it an oversight?

This is the key question, which I'm not sure how we can answer without the help of the original authors @djasper and @klimek

For example I can't understand with the example below if the TabWith is 8 why are you putting 13 spaces in, rather than the original test which had 1 tab (assuming 8 characters) and 5 spaces? (to make up 13)

Could you explain why this is the correct change for this example why its correct to put 13 spaces? (is this not continuation or indentation?)

Tab.UseTab = FormatStyle::UT_ForContinuationAndIndentation;
Tab.TabWidth = 8;
Tab.IndentWidth = 8;
verifyFormat("class X {\n"
             "\tvoid f() {\n"
             "\t\tsomeFunction(parameter1,\n"
             "\t\t.............parameter2);\n"
             "\t}\n"
             "};",
             Tab);

In this example, the first two tabs are for indentation, and the remaining whitespace before parameter2 is for alignment (not continuation) with parameter1. If tabs are used in the alignment, the formatted code will only look aligned with the correct tab width, whereas if the alignment is done with spaces, the code will look aligned with arbitrary tab width.

I believe this to be the expected behavior for UT_ForContinuationAndIndentation, and tabs should only be used for alignment with UT_Always, but I would also be happy about comments from the original authors.

I tend to agree with your assessment, I think its worth waiting for some others to comment.

clang/unittests/Format/FormatTest.cpp
10335

thank you for highlighting that.

Digging through the history, it seems this behavior was explicitly desired: https://reviews.llvm.org/D19028. @mxbOctasic was the original author and perhaps could comment. Regardless, I think there should be a new option.

From the original commit, it does seem that this behavior was desired.

However, from the name of the option and description in the documentation ("Use tabs only for line continuation and indentation"), I would expect it to use tabs only for indentation and continuation, but spaces for alignment. The current behavior uses spaces for some types of alignment that appear in the middle of a line (e.g. consecutive assignments/declarations), but not for others that occur at the beginning of a line (e.g. function parameters). The bug report seems to indicate that other people also expected a different behavior given the description.

Does anyone else want to comment? Otherwise I can move the changed behavior into a new option; in that case, I'd be happy about naming suggestions.

I want to believe that it was a mistake on the original developers part, but I just can't tell.

A search of github shows people using this in their repos

https://github.com/search?q=ForContinuationAndIndentation&type=Code

I almost feel unless the original author can comment we are at an impasse, maybe we have to add:

ForAllContinuationAndIndentation

And if at some point in the future we can collapse both options into 1 then that would be better, but I just don't feel qualified to say change everyone when I don't use that option myself because I don't use tabs.

Whatever you add need to be heavily document that we mean everywhere, and if you ever find any cases where its doesn't work we intend for them to be changed. i.e. be very explicit in the review/commit comments

fickert updated this revision to Diff 253327.Mar 28 2020, 3:29 AM

I moved the changed behavior to a new option (UT_AlignWithSpaces), and added corresponding unit tests by copying and adapting the tests for UT_ForContinuationAndIndentation.

fickert added inline comments.Mar 28 2020, 3:38 AM
clang/unittests/Format/FormatTest.cpp
10164–10165

This test seems to be missing the tab configuration (Tab) as an argument to format, and the same test with missing Tab argument appears again in line 10337. Adding the Tab argument will also change the expected result. I think the test case here should look like this:

EXPECT_EQ("{\n"
          "        /*\n"
          "         * Comment\n"
          "         */\n"
          "        int i;\n"
          "}",
          format("{\n"
                 "\t/*\n"
                 "\t * Comment\n"
                 "\t */\n"
                 "\t int i;\n"
                 "}",
                 Tab));

and a similar change should be made to the one in line 10337.

I assume this should be fixed in a separate commit?

fickert updated this revision to Diff 253349.Mar 28 2020, 9:01 AM

Added documentation and missing test cases for 0-width tabs.

MyDeveloperDay accepted this revision.Apr 2 2020, 12:14 AM

This LGTM, thank you

This revision is now accepted and ready to land.Apr 2 2020, 12:14 AM

Great, thank you!

I do not have push access to the repository, so someone else would need to commit this for me: Maximilian Fickert <maximilian.fickert@gmail.com>

Should I add the fix for the test cases missing the tab configuration (see my last inline comment: https://reviews.llvm.org/D75034#inline-703423)? Or submit it as a separate change set?

fickert updated this revision to Diff 256938.Apr 13 2020, 1:31 AM

rebased, fixed two unit tests

This revision was automatically updated to reflect the committed changes.