This is an archive of the discontinued LLVM Phabricator instance.

clang-format: tweak formatting of variable initialization blocks
ClosedPublic

Authored by Typz on Feb 14 2018, 6:52 AM.

Details

Summary

This patch changes the behavior of PenaltyBreakBeforeFirstCallParameter
so that is does not apply when the brace comes after an assignment.

This way, variable initialization is wrapped more like an initializer
than like a function call, which seems more consistent with user
expectations.

With PenaltyBreakBeforeFirstCallParameter=200, this gives the following
code: (with Cpp11BracedListStyle=false)

Before :

const std::unordered_map<std::string, int> Something::MyHashTable =
    { { "aaaaaaaaaaaaaaaaaaaaa", 0 },
      { "bbbbbbbbbbbbbbbbbbbbb", 1 },
      { "ccccccccccccccccccccc", 2 } };

After :

const std::unordered_set<std::string> Something::MyUnorderedSet = {
  { "aaaaaaaaaaaaaaaaaaaaa", 0 },
  { "bbbbbbbbbbbbbbbbbbbbb", 1 },
  { "ccccccccccccccccccccc", 2 }
};

Diff Detail

Event Timeline

Typz created this revision.Feb 14 2018, 6:52 AM

What you are doing makes sense to me. My only hesitation is whether we should maybe completely disallow breaking between = and { when Cpp11BracedListStyle is false instead of solving this via penalties. Specifically,

                                           | 80 column limit
SomethingReallyLong<SomethingReallyLong> =
    { { a, b, c },
      { a, b, c } };

Might not be as good as

                                           | 80 column limit
SomethingReallyLong<
    SomethingReallyLong> = {
  { a, b, c },
  { a, b, c }
};

in this mode.

unittests/Format/FormatTest.cpp
6889

Upper camel case for variable names according to LLVM style.

Typz added a comment.Feb 15 2018, 5:16 AM

Tweaking the penalty handling would still be needed in any case, since the problem happens also when Cpp11BracedListStyle is true (though the effect is more subtle)

Generally, I think it is better to just rely on penalties, since it gives a way to compare and weight each solution. Then each style can decide what should break first: e.g. a style may also have a lower PenaltyBreakAssignment, thus wrapping after the equal sign would be expected...

Typz updated this revision to Diff 134411.Feb 15 2018, 5:42 AM
Typz marked an inline comment as done.

Address review comments.

Tweaking the penalty handling would still be needed in any case, since the problem happens also when Cpp11BracedListStyle is true (though the effect is more subtle)

I don't understand this. Which style do you actually care about? With Cpp11BracedListStyle=true or =false?

Generally, I think it is better to just rely on penalties, since it gives a way to compare and weight each solution. Then each style can decide what should break first: e.g. a style may also have a lower PenaltyBreakAssignment, thus wrapping after the equal sign would be expected...

And with my reasoning, I think exactly the opposite. Penalties are nice, but if the behavior is generally unwanted, than it's very hard to predict in which situations it might still occur.

Typz added a comment.Feb 27 2018, 6:04 AM

Tweaking the penalty handling would still be needed in any case, since the problem happens also when Cpp11BracedListStyle is true (though the effect is more subtle)

I don't understand this. Which style do you actually care about? With Cpp11BracedListStyle=true or =false?

I use the Cpp11BracedListStyle=false style.
However, I think the current behavior is wrong also when Cpp11BracedListStyle=true. Here the behavior in this case:

Before :

const std::unordered_map<std::string, int> Something::MyHashTable =
    {{ "aaaaaaaaaaaaaaaaaaaaa", 0 },
     { "bbbbbbbbbbbbbbbbbbbbb", 1 },
     { "ccccccccccccccccccccc", 2 }};

After :

const std::unordered_set<std::string> Something::MyUnorderedSet = {
    { "aaaaaaaaaaaaaaaaaaaaa", 0 },
    { "bbbbbbbbbbbbbbbbbbbbb", 1 },
    { "ccccccccccccccccccccc", 2 }};

Generally, I think it is better to just rely on penalties, since it gives a way to compare and weight each solution. Then each style can decide what should break first: e.g. a style may also have a lower PenaltyBreakAssignment, thus wrapping after the equal sign would be expected...

And with my reasoning, I think exactly the opposite. Penalties are nice, but if the behavior is generally unwanted, than it's very hard to predict in which situations it might still occur.

Yet on the other hand enforcing too much can lead to cases where the optimizer fails to find a solution, and ends up simply not formatting the line: which is fine is the code is already formatted (but if there were the case we would not need the tool at all :-) )
The beauty of penalties is that one can tweak the different penalties so that the behavior is "generally" what would be expected: definitely not predictable, but then there are always cases where the "regular" style does not work, and fallback solutions must be used... (for exemple, I would prefer never to never break after an assignment : yet sometimes it is needed...)

I can change the patch to disallow breaking, but this would be a slightly more risky change, with possibly more side-effects; and i think that would not solve the issue when Cpp11BracedListStyle=true.

Tweaking the penalty handling would still be needed in any case, since the problem happens also when Cpp11BracedListStyle is true (though the effect is more subtle)

I don't understand this. Which style do you actually care about? With Cpp11BracedListStyle=true or =false?

I use the Cpp11BracedListStyle=false style.
However, I think the current behavior is wrong also when Cpp11BracedListStyle=true. Here the behavior in this case:

Before :

const std::unordered_map<std::string, int> Something::MyHashTable =
    {{ "aaaaaaaaaaaaaaaaaaaaa", 0 },
     { "bbbbbbbbbbbbbbbbbbbbb", 1 },
     { "ccccccccccccccccccccc", 2 }};

After :

const std::unordered_set<std::string> Something::MyUnorderedSet = {
    { "aaaaaaaaaaaaaaaaaaaaa", 0 },
    { "bbbbbbbbbbbbbbbbbbbbb", 1 },
    { "ccccccccccccccccccccc", 2 }};

"Before" is the desired behavior. I don't know whether other people have extensively analyzed how to format all the various C++11 braced lists, but we have at Google and think this is good. It is formalized here:
https://google.github.io/styleguide/cppguide.html#Braced_Initializer_List_Format

We prefer breaking before "{" at the higher syntactic level or essentially before the imaginary function call in place of the braced list.

Generally, I think it is better to just rely on penalties, since it gives a way to compare and weight each solution. Then each style can decide what should break first: e.g. a style may also have a lower PenaltyBreakAssignment, thus wrapping after the equal sign would be expected...

And with my reasoning, I think exactly the opposite. Penalties are nice, but if the behavior is generally unwanted, than it's very hard to predict in which situations it might still occur.

Yet on the other hand enforcing too much can lead to cases where the optimizer fails to find a solution, and ends up simply not formatting the line: which is fine is the code is already formatted (but if there were the case we would not need the tool at all :-) )
The beauty of penalties is that one can tweak the different penalties so that the behavior is "generally" what would be expected: definitely not predictable, but then there are always cases where the "regular" style does not work, and fallback solutions must be used... (for exemple, I would prefer never to never break after an assignment : yet sometimes it is needed...)

I can change the patch to disallow breaking, but this would be a slightly more risky change, with possibly more side-effects; and i think that would not solve the issue when Cpp11BracedListStyle=true.

It is the better call here. I understand that people that set Cpp11BracedListStyle to false want this behavior and I think it'll always be a surprise when clang-format breaks before the "{". The chance of not coming up with a formatting because of this is somewhat slim. It only adds an additional two characters (we would also not break before the "="). It'd be really weird to only have these exact two line length have a special behavior (slightly shorter - everything fits on one line, slightly longer - a split between type name and variable is necessary).

There is no issue for Cpp11BracedListStyle=true, the behavior is as desired.

Typz updated this revision to Diff 136081.Feb 27 2018, 8:02 AM

Prevent breaking between = and { when Cpp11BracedListStyle=false.

djasper added inline comments.Feb 27 2018, 8:47 AM
lib/Format/TokenAnnotator.cpp
2183

Why is this necessary?

unittests/Format/FormatTest.cpp
6655

How does this penalty influence the test?

Typz added inline comments.Feb 27 2018, 9:30 AM
lib/Format/TokenAnnotator.cpp
2183

Otherwise the PenaltyBreakBeforeFirstCallParameter is used, which is not consistent with the concept of !Cpp11BraceListStyle (e.g. consider this is an initializer), and gives the following format, which is certainly not the expected result:

const std::unordered_map<std::string, int>
    MyHashTable = { { \"aaaaaaaaaaaaaaaaaaaaa\", 0 },
                    { \"bbbbbbbbbbbbbbbbbbbbb\", 1 },
                    { \"ccccccccccccccccccccc\", 2 } };

(again, the issue only happens when PenaltyBreakBeforeFirstCallParameter is increased, e.g. 200 in my case. The default value is 19, so formatting is not affected)

unittests/Format/FormatTest.cpp
6655

Because breaking after the opening brace is affected by the PenaltyBreakBeforeFirstCallParameter : this is an init braced-list, but it is considered as any function.

Specifically, my patch needs (see prev. comment) to change the penalty for breaking after the brace, but this applies only when Cpp11BracedListStyle=false, as per your earlier comment. So this test just verify that the behavior is indeed not affected.

djasper accepted this revision.Mar 22 2018, 8:04 AM

Generally looks good.

lib/Format/TokenAnnotator.cpp
2183

I think PenaltyBreakBeforeFirstCallParameter should not be applied with !Cpp11BracedListStyle whenever a brace is found. Cpp11BracedList style means that braces are to be treated like function calls, but without it, this doesn't make sense. I think this is in some ways better than looking for the "= {".

unittests/Format/FormatTest.cpp
6655

I see.

This revision is now accepted and ready to land.Mar 22 2018, 8:04 AM
Typz updated this revision to Diff 147002.May 16 2018, 1:04 AM
Typz marked 6 inline comments as done.

Address review comment & rebase

This revision was automatically updated to reflect the committed changes.