This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Fix AlignOperands when BreakBeforeBinaryOperators is set
ClosedPublic

Authored by Typz on Apr 25 2017, 1:24 AM.

Details

Summary

Even when BreakBeforeBinaryOperators is set, AlignOperands kept
aligning the beginning of the line, even when it could align the
actual operands (e.g. after an assignment).

With this patch, the operands are actually aligned, and the operator
gets aligned with the equal sign:

int aaaaa = bbbbbb
          + cccccc;

This not happen in tests, to avoid 'breaking' the indentation:

if (aaaaa
    && bbbbb)
  return;

Event Timeline

Typz created this revision.Apr 25 2017, 1:24 AM
Typz added a comment.Apr 25 2017, 1:27 AM

This patch fixes the behavior to match the name of the option, but maybe the original behavior was actually by design: should I instead add an option (e.g. make an enum out of a AlignOperands) instead?

djasper edited edge metadata.May 17 2017, 5:53 AM

The current behavior here is actually intended. If you'd like the other format, we probably need to add a style option. What style guide are you basing this on?

unittests/Format/FormatTest.cpp
2476 ↗(On Diff #96519)

This looks very inconsistent to me.

Typz added a comment.May 17 2017, 9:31 AM

we are using this style at our company, not sure if it is used elsewhere; I will check.

however, it seems to me that this behavior does not match the name of the option : AlignOperands does not align the operands anymore when BreakBeforeBinaryOperators is set...
so, for this patch to go forward, should I change AlignOperands into OperandAlignment enum, with 3 options? Something like

  • OA_NotAligned, same as AlignOperands=false
  • OA_AlignOperator, same the current AlignOperands=true
  • OA_AlignOperands, same as AlignOperands=true when BreakBeforeBinaryOperators=false but my "new" mode when BreakBeforeBinaryOperators=true
unittests/Format/FormatTest.cpp
2476 ↗(On Diff #96519)

not sure what you mean, I do not really understand how this expression was aligned before the patch...
it is not so much better in this case with the patch, but the '&&' is actually right-aligned with the '=' sign.

Typz added inline comments.May 17 2017, 9:41 AM
unittests/Format/FormatTest.cpp
2476 ↗(On Diff #96519)

Seeing the test just before, I see (when breaking after operators) that the operands are actually right-aligned, e.g. all operators are on the same column.

So should it not be the same when breaking before the operator as well (independently from my patch, actually)?

bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
                   + aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
                   + aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
              == aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
                       * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\n"
                   + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\n"
          && aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
                   * aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
               > ccccccccccccccccccccccccccccccccccccccccc;

Not sure I like this right-alignment thing, but at least I start to understand how we get this output (and this may be another option to prefer left-alignment?)

When you say "this doesn't happen in tests", do you mean this never happens when there are parentheses around the expression?

unittests/Format/FormatTest.cpp
2476 ↗(On Diff #96519)

The right alignment is not relevant. I think I just got "playful" when writing this test.

What's happening here is that we align the operators of subsequent lines to the previous operand. You are right that this is not intuitive wrt. the option's name, but it is the behavior that we intended and that we had seen in styles that use this.

Now, what we also do is add another ContinuationIndentWidth to highlight operator precedence. Basically think of this as replacing parentheses that you would put there instead:

int x = (aaaaa
         * bbbbb) // The "*" is intendet a space because of the paren.
        + ccccc;

int x = aaaaa
            * bbbbb // Instead we use ContinuationIndentWidth here.
        + ccccc;

What I mean about this being inconsistent is that now you align the highest-level operands because as you say that's what's expected from the option, but you still don't align other precedences, which seems wrong. If you really wanted to align all operands, that would look like:

bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
           + aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
           + aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
          == aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
           * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
           + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
          && aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
           * aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
           > ccccccccccccccccccccccccccccccccccccccccc;

But then, that's really a tough expression to understand. I mean, this is a fabricated example and possibly doesn't happen often in real life, but I am slightly worried about it and the inconsistency here.

Typz added a comment.May 19 2017, 5:55 AM

When you say "this doesn't happen in tests", do you mean this never happens when there are parentheses around the expression?

By 'test' I meant 'conditional construct' : it happens when there are parentheses around the expression, but it does not happen when these parentheses are the parentheses from a if (...)

Typz added inline comments.May 23 2017, 2:40 AM
unittests/Format/FormatTest.cpp
2476 ↗(On Diff #96519)

So how should we proceed?

As I understand it, there are 2 separate issues here:

  • The current behavior is actually the expected one, and should not be modified
  • The indentation due to 'fake parenthesis' is not very explicit (even with the current implementation)

Changing AlignOperands into an OperandAlignment enum, with 3 options (OA_NotAligned, same as AlignOperands=false; OA_AlignOperator (or OA_Align), same the current AlignOperands=true; and OA_AlignOperands (or OA_StrictAlign), with my new behavior), would solve the first problem. Would it be acceptable?

I don't think this is related to the extra indentation issue, the problem is already present, and the existing behavior is not that consistent or clear... But I think the 'fake parenthesis' indentation could easily be changed, independently from my patch, to use a dedicated operatorPrecedenceIndentWith instead of ContinuationIndentWidth (or a boolean indentOperatorPrecedence) : setting it would allow to force all operators aligned, independenty from the other options (and in a separate patch?).

What do you think?

Typz updated this revision to Diff 100095.May 24 2017, 8:20 AM

add new value to AlignOperands to support this mode.
fix some corner cases.

In D32478#759347, @Typz wrote:

When you say "this doesn't happen in tests", do you mean this never happens when there are parentheses around the expression?

By 'test' I meant 'conditional construct' : it happens when there are parentheses around the expression, but it does not happen when these parentheses are the parentheses from a if (...)

Are you sure? From reading the code, it seems that this happens exactly after "=" and "return". What's the behavior for function calls?

function(aaaaaaa //
+ bbbbb);

Or for expressions with just parens?

int a = (aaaaaa //
+ bbbbbb);

What if doing this would violate the ContinuationIndentWidth?

T t = aaaaa //
&& bbbbb;
lib/Format/ContinuationIndenter.cpp
759 ↗(On Diff #100095)

Fix style.

949 ↗(On Diff #100095)

I am not yet convinced you need a new flag in ParenState here. I guess you need it because the operators can have varying length and so you cannot just change LastSpace here?

Typz added a comment.May 26 2017, 1:56 AM

It indeed does not happens inside any parenthesis (it would actually make things completely unreadable if there are imbricated parenthesis), and may get indented less than the ContinuationIndentWidth.

Typz added inline comments.May 26 2017, 2:00 AM
lib/Format/ContinuationIndenter.cpp
949 ↗(On Diff #100095)

exactly. This is set when passing through the equal/return sign, but indent must be adjusted for each line individually based on the actual size of that line's leading operator.

Typz marked an inline comment as done.May 26 2017, 3:52 AM
Typz updated this revision to Diff 100384.May 26 2017, 3:52 AM

fix style

In all honesty, I think this style isn't thought out well enough. It really is a special case for only "=" and "return" and even there, it has many cases where it simply doesn't make sense. And then you have cases like this:

bool = aaaaaa //
    == bbbb //
    && ccccc;

Where the syntactic structure is lost entirely.

On top of that it has runtime downsides for all clang-format users because ParenState gets larger and more costly compare. As such, I am against moving forward with this. Can you remind me again, which coding style suggests this format?

unittests/Format/FormatTest.cpp
2619 ↗(On Diff #100384)

I think this is wrong and we should not indent like this.

Typz added a comment.EditedMay 26 2017, 4:51 AM

In all honesty, I think this style isn't thought out well enough. It really is a special case for only "=" and "return" and even there, it has many cases where it simply doesn't make sense. And then you have cases like this:

bool = aaaaaa //
    == bbbb //
    && ccccc;

Where the syntactic structure is lost entirely.

It is not lost, extra indent for 'virtual' parenthesis is still there:

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

On top of that it has runtime downsides for all clang-format users because ParenState gets larger and more costly compare. As such, I am against moving forward with this. Can you remind me again, which coding style suggests this format?

This is just a single extra bit (and there are still less than 16 such bits), so it does change the size of ParenState. As for the compare cost, I think it is within reach of the compiler's optimization, but it may indeed have a slight impact.

In D32478#765548, @Typz wrote:

In all honesty, I think this style isn't thought out well enough. It really is a special case for only "=" and "return" and even there, it has many cases where it simply doesn't make sense. And then you have cases like this:

bool = aaaaaa //
    == bbbb //
    && ccccc;

Where the syntactic structure is lost entirely.

It is not lost, extra indent for 'virtual' parenthesis is still there:

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

Ah, right, I was thinking about something else (commas where we don't add extra indentation. Anyhow, I don't think what you are writing is what clang-format produces. How could it indent relative to the "&&" when placing the "=="? It doesn't know how far to unindent at that point, I think.

On top of that it has runtime downsides for all clang-format users because ParenState gets larger and more costly compare. As such, I am against moving forward with this. Can you remind me again, which coding style suggests this format?

This is just a single extra bit (and there are still less than 16 such bits), so it does change the size of ParenState. As for the compare cost, I think it is within reach of the compiler's optimization, but it may indeed have a slight impact.

Typz added a comment.May 26 2017, 6:03 AM

I actually don't know how, but it still manages somehow : I rebuilt this exact patch to ensure I gave you the correct output.
And the same behavior can be seen in the test cases, where the operator with highest precedence is aligned with the equal sign.

In D32478#765583, @Typz wrote:

I actually don't know how, but it still manages somehow : I rebuilt this exact patch to ensure I gave you the correct output.
And the same behavior can be seen in the test cases, where the operator with highest precedence is aligned with the equal sign.

So, it's formatting like this?

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

auto a = aaaaaa    //
           == bbbb //
       + ccccc;

While that's interesting and I'd like to figure out how it actually works, I also think it's really weird to indent the inner "==" differently based on the kind of operator around it. The existing way to format these operators is nice and consistent here with always adding a multiple of four spaces.

Typz added a comment.EditedMay 26 2017, 6:53 AM

Nop, it's formatted like this:

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

bool a = aaaaaa //
      == bbbb   //
             + ccccc;

The current way to format operators is not affected: the indentation is done as "usual", then they are unindented by the operator width to keep the alignment...

djasper added a comment.EditedMay 26 2017, 7:00 AM
In D32478#765642, @Typz wrote:

Nop, it's formatted like this:

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

bool a = aaaaaa //
      == bbbb   //
             + ccccc;

The current way to format operators is not affected: the indentation is done as "usual", then they are unindented by the operator width to keep the alignment...

Ah damn. Didn't think about the precedences. What I wanted was for the highest level to have a one char operator, e.g.

bool a = aaaaaa //
           << bbbb   //
       | ccccc;
In D32478#765548, @Typz wrote:

In all honesty, I think this style isn't thought out well enough. It really is a special case for only "=" and "return" and even there, it has many cases where it simply doesn't make sense. And then you have cases like this:

bool = aaaaaa //
    == bbbb //
    && ccccc;

Where the syntactic structure is lost entirely.

It is not lost, extra indent for 'virtual' parenthesis is still there:

bool a = aaaaaa   //
          == bbbb //
      && ccccc;

On top of that it has runtime downsides for all clang-format users because ParenState gets larger and more costly compare. As such, I am against moving forward with this. Can you remind me again, which coding style suggests this format?

This is just a single extra bit (and there are still less than 16 such bits), so it does change the size of ParenState. As for the compare cost, I think it is within reach of the compiler's optimization, but it may indeed have a slight impact.

I would still be interested in a coding style that recommends this format.

Typz added a comment.May 26 2017, 7:49 AM

Ah damn. Didn't think about the precedences. What I wanted was for the highest level to have a one char operator, e.g.

bool a = aaaaaa //
           << bbbb   //
       | ccccc;

Almost, but not quite:

bool a = aaaaaa   //
          << bbbb //
       | ccccc;

i.e. the identifiers is indented by 4 characters, then the operator is 'unindented.

I would still be interested in a coding style that recommends this format.

We are using this at our compagny, but this is neither public nor open source.
I am sure I saw this style somewhere else, but I cannot remember where...
(This is also the style recommended for webkit's when returning booleans : https://webkit.org/code-style-guidelines/#indentation-wrap-bool-op, but probably because the continuation indent is 4...
and I am not sure this guide in Boost's Spirit library counts either: http://www.boost.org/doc/libs/1_56_0/libs/spirit/doc/html/spirit/notes/style_guide.html )

Typz marked 7 inline comments as done.Jun 20 2017, 6:10 AM

This style is used in the Skia project.

Typz updated this revision to Diff 103195.Jun 20 2017, 6:11 AM

Rebase & fix indentation when newline is inserted after return or equal.

I don't want to move forward with this patch. But adding Manuel as another reviewer to sanity-check.

include/clang/Format/Format.h
167 ↗(On Diff #103195)

The name is not intuitive. I don't think this is any more or less strict than the other version.

unittests/Format/FormatTest.cpp
2781 ↗(On Diff #103195)

Comment seems to belong to "+ b" so should be aligned to it.

Typz added inline comments.Jun 29 2017, 6:07 AM
include/clang/Format/Format.h
167 ↗(On Diff #103195)

It is a bit stricter in the sense that it really aligns operands, not operator with first operand... But this is indeed not very intuitive.

Any suggestion?

Typz updated this revision to Diff 106221.Jul 12 2017, 8:17 AM
Typz marked 3 inline comments as done.

Rename option to AlignAfterOperator

Typz added a comment.Jul 12 2017, 9:32 AM

I renamed the option to AlignAfterOperator, is it acceptable now?
(I also thought of UnindentOperator, but I think it is better to keep the notion of alignment)

Note that this style is used in multiple open-source projects: Skia, parts of QtCreator code base (e.g. in multiple plugins: clang, cpptools, android...), the OpenEXR library...

klimek edited edge metadata.Nov 9 2017, 1:00 AM
In D32478#806757, @Typz wrote:

I renamed the option to AlignAfterOperator, is it acceptable now?
(I also thought of UnindentOperator, but I think it is better to keep the notion of alignment)

Note that this style is used in multiple open-source projects: Skia, parts of QtCreator code base (e.g. in multiple plugins: clang, cpptools, android...), the OpenEXR library...

Sorry for the long response time. I don't see this style rule expressed explicitly in the Skia or QtCreator style guides - is this something that just happens to be done sometimes in the code bases? I have problems understanding the rules from the discussion (as well as the code / test tbh), is there a really concise way to give me the core of the idea?

Typz added a comment.Nov 9 2017, 2:39 AM

Sorry for the long response time. I don't see this style rule expressed explicitly in the Skia or QtCreator style guides - is this something that just happens to be done sometimes in the code bases?

This is present in code base. Those project's style guides are not precise enough, and do not document the behavior for this case.

I have problems understanding the rules from the discussion (as well as the code / test tbh), is there a really concise way to give me the core of the idea?

The case I am trying to address is to really keep the _operands_ aligned after an assignment or return, in case line is wrapped *before* the operator.

int a = b
      + c;
return a
     + b;

while the current behavior with Style.AlignOperands = true; BreakBeforeBinaryOperators = true is to align the wrapped operator with the previous line's operand:

int a = b
        + c;
return a
       + b;

In the discussion, it appeared that this behavior is not a error (with respect to the name), but an expected behavior for most coding rules: hence the introduction of a third AlignOperands mode ("AlignAfterOperator"), to handle this new case.

From there the code actually got a bit more complex to support various corner cases (e.g. operators with different number of characters, wrapperd first line, do not unindent more than the enclosing brace...)

klimek added a comment.Nov 9 2017, 2:55 AM
In D32478#920275, @Typz wrote:

Sorry for the long response time. I don't see this style rule expressed explicitly in the Skia or QtCreator style guides - is this something that just happens to be done sometimes in the code bases?

This is present in code base. Those project's style guides are not precise enough, and do not document the behavior for this case.

I have problems understanding the rules from the discussion (as well as the code / test tbh), is there a really concise way to give me the core of the idea?

The case I am trying to address is to really keep the _operands_ aligned after an assignment or return, in case line is wrapped *before* the operator.

int a = b
      + c;
return a
     + b;

while the current behavior with Style.AlignOperands = true; BreakBeforeBinaryOperators = true is to align the wrapped operator with the previous line's operand:

int a = b
        + c;
return a
       + b;

In the discussion, it appeared that this behavior is not a error (with respect to the name), but an expected behavior for most coding rules: hence the introduction of a third AlignOperands mode ("AlignAfterOperator"), to handle this new case.

From there the code actually got a bit more complex to support various corner cases (e.g. operators with different number of characters, wrapperd first line, do not unindent more than the enclosing brace...)

Checking out skia and looking at wrapped '+', I see various different formats, and random sampling I've found none so far that would fit your proposed rule. Unless I'm missing something, I'd agree with Daniel; this is not a rule that's widely used, and I'd say reformatting a code base to the clang-formatted variant will not regress readability.

Typz added a comment.Nov 9 2017, 4:36 AM

Unless I'm missing something, I'd agree with Daniel; this is not a rule that's widely used, and I'd say reformatting a code base to the clang-formatted variant will not regress readability.

Unfortunately coding rules are not just about readability, but also about consistency at entreprise scale, thus as a general rule we should not change the established rules in an organizatoin.
There is history in rules, we have code which already uses these rules, so for consistency we must stick to the old rules (even if we would not necessarily choose the same rules if we were to start from scratch).

In D32478#920345, @Typz wrote:

Unless I'm missing something, I'd agree with Daniel; this is not a rule that's widely used, and I'd say reformatting a code base to the clang-formatted variant will not regress readability.

Unfortunately coding rules are not just about readability, but also about consistency at entreprise scale, thus as a general rule we should not change the established rules in an organizatoin.
There is history in rules, we have code which already uses these rules, so for consistency we must stick to the old rules (even if we would not necessarily choose the same rules if we were to start from scratch).

My guess is (but I might be wrong): if your code base is large enough and you search your internal code base for how well these rules are obeyed, I'd expect there to be a large number of violations (given that there is no automation to detect or fix those violations). In the end, getting actual consistency will require effort - either changing all violations, or changing everything to a consistent state with automation.

Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2019, 1:52 AM
MyDeveloperDay added a project: Restricted Project.Oct 1 2019, 1:39 AM
Typz updated this revision to Diff 226393.Oct 25 2019, 3:00 AM

Rebase on top of D50078

MyDeveloperDay added inline comments.
clang/include/clang/Format/Format.h
187

I think you are missing an entry in the operator== in Format.h

Typz marked 2 inline comments as done.Oct 25 2019, 3:32 AM
Typz added inline comments.
clang/include/clang/Format/Format.h
187

It is there already, since this is not a new field, but just changing the type of an existing field.

MyDeveloperDay accepted this revision.Oct 25 2019, 4:23 AM

I've read back through the previous comments, and I'm slightly struggling to understand the reason for the objections. (other than the normal public style)

I'd tend to be wary of the "if its not in a public style guide, its not coming in" rule despite me understanding the rationale for that stance many people have non-public projects but still have millions of lines of code to support:

Many public projects have often already adopted a .clang-format file anyway and as such their style is defined by only what clang-format currently supports and hence it cannot by definition then define a style that clang-format keeps altering the code away from.

So I baulk at the idea that a rule that introduces a new style can't come in, My rule of thumb is don't mess up anyone's existing code, don't make others pay too much for your feature, help us look after this excellent tool by helping the project (bugs, reviews).

With that in mind, I've gone back to the original documentation for AlignOperands and BreakBeforeBinaryOperators, and I'm afraid I cannot see the logic in the current capabilities that align code nicely with BreakBeforeBinaryOperators turned off

but not nicely with BreakBeforeBinaryOperators turned on

If I understand correctly, this new setting (which will be off by default) will allow the following for those that choose it.

and it may be the OCD in me, but that re-alignment makes me feel happy, for this reason alone I'd give this a LGTM, as long as your willing to help us support this behaviour going forward.

That's my rationale for accepting, plus the authors 2.5 years of rebasing a feature, shows a dedication which I am assuming extends to continued support.

my 2c worth.

(images from https://zed0.co.uk/clang-format-configurator/) - 10.0.0+b452de0

clang/include/clang/Format/Format.h
187

sound good.

clang/unittests/Format/FormatTest.cpp
4147

Nit: I get nervous when we change tests, can we simply add a new duplicated line

This revision is now accepted and ready to land.Oct 25 2019, 4:23 AM
This revision was automatically updated to reflect the committed changes.
Typz marked 2 inline comments as done.