This is an archive of the discontinued LLVM Phabricator instance.

clang-format: do not add extra indent when wrapping last parameter
Needs ReviewPublic

Authored by Typz on Feb 1 2018, 2:35 AM.

Details

Summary

There should be no extra indent when wrapping only the expression used
as last argument. This is consistent with the behavior when the first
(and only) argument's expression is wrapped.

foo(a, bbbbbbbbbbbbbbbbbb +
       ccccccccccccccccc);
foo(bbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

This does not affect all other cases, where the argument itself is
wrapped:

foo(a,
    bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);
foo(bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);

Diff Detail

Event Timeline

Typz created this revision.Feb 1 2018, 2:35 AM

I am against this change. The current behavior here is intentional and IMO more consistent. If there is more than one precedence level in a set of parentheses, we add the additional indentation. If you don't like it, surround it with extra parentheses.

Generally, it'd be useful to have a "before" and "after" example in the patch description. That way, we can get more feedback from other people more easily.

Typz added a comment.EditedFeb 1 2018, 5:04 AM

I doubt this particular case was really intentional, esp. since this case never happens in the tests. I think it is more a side-effect of the (general) indent in "fake" parenthesis.
Here is an exemple:

Before this change:

foo(a, bbbbbbbbbbbbbbbbbb +
           ccccccccccccccccc);
foo(bbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);
foo(a,
    bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);
foo(bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);

After this change:

foo(a, bbbbbbbbbbbbbbbbbb +
       ccccccccccccccccc);
foo(bbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);
foo(a,
    bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);
foo(bbbbbbbbbbbbbbbbb +
        ccccccccccccccccc,
    d);

i.e. this patch only affect the 'first' scenario (e.g. wrapping expression in last argument) consistent with the second one (e.g. wrapping expression in first and only argument)

You might doubt it, but having written the code I can tell you that it's the case. Shame on me for not writing a test, though.

I see the argument why this indentation is not necessary in exactly the case where the last parameter is multi-line and not wrapped to a new line itself: You always have some indentation anyway because of the preceding parameter on the same line.
However, for me the consistency is more important here, i.e. achieving that we don't have a relative indentation change between:

foo(a, bbbbbbbbbbbbbbbbbb +
           ccccccccccccccccc);

and

foo(a,
    bbbbbbbbbbbbbbbbbb +
        ccccccccccccccccc);

This formatting can easily alter between these two when line length vary slightly and I think being able to pattern match that easily.
Yes, that means it is not consistent with:

foo(bbbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

But there is actually a substantial difference in structure and so, I think it is reasonable to not be consistent there.

Typz added a comment.Feb 1 2018, 6:22 AM

You might doubt it, but having written the code I can tell you that it's the case.

Ok, you win :-)

I see the argument why this indentation is not necessary in exactly the case where the last parameter is multi-line and not wrapped to a new line itself: You always have some indentation anyway because of the preceding parameter on the same line.
However, for me the consistency is more important here, i.e. achieving that we don't have a relative indentation change between:
[...]
This formatting can easily alter between these two when line length vary slightly and I think being able to pattern match that easily.

Not sure what you mean: a diff would not be trivial in either case...

Yes, that means it is not consistent with:

foo(bbbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

But there is actually a substantial difference in structure and so, I think it is reasonable to not be consistent there.

It's reasonable from the perspective of the tool [i.e. to make the code of clang-format consistent], but IMHO not so consistent from the perspective of a user:
I (and most people I believe) would not manually add this seemingly unnecessary indentation, so I would prefer the tool to do the same; and fortunately it seems pretty trivial to implement.

Is there a way this can go in anyway, for exemple with a setting?

I don't mean trivial with a diff. What I mean is, users will find it surprising if whether or not a parameter gets wrapped leads to a different indentation internal to that parameter. I have not heard of a single user that would be surprised by this extra indentation.

I don't think this is worth an extra setting. I'll be somewhat resistant to getting this in because a) I think it's wrong and b) never change a running system (users will have at least gotten used to it). But I also don't want to assume total power here, I do not care that much about this issue. I'll add some other clang-format authors as reviewers here and you can also feel free to add more people. If the common opinion is that your change is good, I am happy to move forward.

Ah, Manuel and Krasimir are already on this thread, maybe they can comment? I also added Chandler and Sam who I know care about formatting somewhat.

We could adapt the single-argument version instead, turning:

foo(bbbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

into:

foo(bbbbbbbbbbbbbbbbbb +
        ccccccccccccccccc);
Typz added a comment.EditedFeb 9 2018, 2:10 AM

We could adapt the single-argument version instead, turning:

foo(bbbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

into:

foo(bbbbbbbbbbbbbbbbbb +
        ccccccccccccccccc);

We could indeed, but that would probably make neither djasper or me happy :-)
And it would make the meaning of 'AlignOperands' even more convoluted...

Typz added a comment.Feb 28 2018, 6:55 AM

What I mean is, users will find it surprising if whether or not a parameter gets wrapped leads to a different indentation internal to that parameter. I have not heard of a single user that would be surprised by this extra indentation.

well you have now...
I configured the tool to align AlignOperands, to I expected the operands of my expressions to be aligned, even if they are in a function call.

(Btw, I don't think the comma used to separate function parameters are actually considered "operators" by the C++ standard, so technically this is not a case of multiple precedences, and users may not expect it to be handled this way...)

But you *do* want extra indentation in the case of:

function(aaaaa, 
         bbbbb +
         cccccc);

I understand you argument, but I don't agree at the moment. As is (without getting more feedback from others that clang-format is behaving unexpected here), I do not want to move forward with this change.

Typz added a comment.Feb 28 2018, 9:33 AM

But you *do* want extra indentation in the case of:

function(aaaaa, 
         bbbbb +
         cccccc);

I understand you argument, but I don't agree at the moment. As is (without getting more feedback from others that clang-format is behaving unexpected here), I do not want to move forward with this change.

Indeed, because as much as we want to keep things aligned (and avoid extra indentation), we want above all to make code "easier to read": we think keeping alignment is indeed helping, but that keeping the alignment when it adds confusion is not...
Now let's hear what others are thinking about this behavior.

(BTW, I imagine there is no point introducing an option for this specific, if this does not get accepted as default? It may be done as part of https://reviews.llvm.org/D32478 to minimize options, in the new mode which tries to align more strictly... But last time you looked at it you were not very enthousiastic about it either...)

Are you sure that you are even addressing an important case? I have done some research on our codebase and this is something that happens incredibly rarely. The reason is that you have to have a very specific combination of line length, where the last parameter does not fit on one line if indented back to align with the open paren while it does fit on multiple lines if indented right of the comma. In all of LLVM/Clang, there seem to be only about 50 cases. How certain are you that people actually care?

Typz added a comment.Mar 2 2018, 4:12 AM

If people don't care about this case, we might as well merge this :-) Just kidding.

The tweak matches both our expectation, the auto-indent behaviour of IDE (not to be trusted, but still probably of 'default' behaviour for many people, esp. when you don't yet use a formatter), and its seems our code base is generally formatted like this. I think it is quite more frequent than 50 times in whole LLVM/Clang, but I have no actual numbers; do you have a magic tool to search for such specific "code pattern", or just run clang-format with one option then the next and count the differences ?

We could adapt the single-argument version instead, turning:

foo(bbbbbbbbbbbbbbbbbb +
    ccccccccccccccccc);

into:

foo(bbbbbbbbbbbbbbbbbb +
        ccccccccccccccccc);

I have a mild preference for this, and it seems more consistent. But I don't have a strong preference.

However, for each of the other cases shown in the thread so far, i strongly prefer clang-format's current behavior.

If people don't care about this case, we might as well merge this :-) Just kidding.

The tweak matches both our expectation, the auto-indent behaviour of IDE (not to be trusted, but still probably of 'default' behaviour for many people, esp. when you don't yet use a formatter), and its seems our code base is generally formatted like this. I think it is quite more frequent than 50 times in whole LLVM/Clang, but I have no actual numbers; do you have a magic tool to search for such specific "code pattern", or just run clang-format with one option then the next and count the differences ?

I just tweaked a search based on regular expressions. Fundamentally something like a line with an open paren and a comma that doesn't end in a comma gives a reasonable first approximation. But yes, first formatting with one option, then reformatting and looking at numbers of diffs would be interesting. And I would bet that even in your codebase diffs are few.

Also double-checked with Richard Smith and he agrees that the current behavior is preferable. For comma and plus this doesn't seem overly important, but making it:

aaaaaaaaaa(bbbbbbbbb + ccccccccccc *
                       ddddddddd);

seems really bad to him as this suggests that we are adding both ccccccccccc and ddddddddd.

aaaaaaaaaa(bbbbbbbbb + ccccccccccc *
                           ddddddddd);

seems clearer. And for consistency reasons, we should not treat those two cases differently.

Typz added a comment.EditedMar 2 2018, 7:01 AM

Also double-checked with Richard Smith and he agrees that the current behavior is preferable. For comma and plus this doesn't seem overly important, but making it:

aaaaaaaaaa(bbbbbbbbb + ccccccccccc *
                       ddddddddd);

seems really bad to him as this suggests that we are adding both ccccccccccc and ddddddddd.

aaaaaaaaaa(bbbbbbbbb + ccccccccccc *
                           ddddddddd);

seems clearer.

And I fully agree with this!
But I don't think that would be affected by my patch... I hope it does not at least, and I'll try to double check (and add a test).

And for consistency reasons, we should not treat those two cases differently.

The consistency is related only to the implementation of clang-format: for a language (and its users) they are not the same things, comma separating arguments are not operators and are used to separate different expressions.

We have talked about that and none of us agree.

chandlerc resigned from this revision.Mar 26 2018, 1:12 PM

Since this seems not going anywhere, removing it from my review dashboard.

Typz added a comment.Jun 8 2018, 7:18 AM

Would it be acceptable to introduce an option to allow enabling this behavior?
I mean would it have a chance of being integrated, or must I keep maintaining a fork of clang-format...

The normal rule for formatting options apply. If you can dig up a public style guide and a project of reasonable size where it is used, we can add an option.

Typz added a comment.Jun 12 2018, 7:29 AM

The normal rule for formatting options apply. If you can dig up a public style guide and a project of reasonable size where it is used, we can add an option.

I don't want to be rude, but it seems to me that in this context this response is just a polite way of saying "no" : as discussed already on this patch, this is indeed a corner case, and probably not documented anywere, and as far as I understand, the current behavior is not referenced in llvm or google coding rule either. This is simply the styling that the maintainers find the most appropriate.

Hence my question: I know the "rules", but I want to know if you would be open to introducing options for tweaking this, in case people do not agree this is the most appropriate. Typically, for such corner cases I could imagine a nested option, similar to custom brace wrapping, so that the "basic" namespace option is not poluted, but further customization can be defined in a nested "advanced" option.

You are right that this behavior is what the code authors, but also many other people, like to have and so it is what is engrained in clang-format. There are likely about a million things that fall into the same category. Now we might find that the current default is actually wrong and we have changed many defaults in the past. I don't believe that's the case here and there are more opinions on this thread.

If people still disagree, there is the question of whether a flag should be introduced, and for every single flag that was introduced, somebody did not agree with what clang-format was doing. We have established the bar that has to be met in order for the cost-benefit-ratio of options to be beneficial. I don't see any reason to make an exception here. The fact that it is a rare corner case makes it less likely to be important enough to carry the weight of an option. Creating a second option class (nested, hidden, ...) does not seem to improve the situation to me. That will just create a subsequent mass of additional flags that people have wanted in the past or will want in the future that don't actually carry their weight (in my opinion - and likely speaking for most other clang-format contributors). Discoverability is only one of the concerns about the costs of flags.

So in short, unless you can actually meet the usual requirements for flags, the answer is: no.