clang-format: introduce `CaseBlockIndent` to control indent in switch
Needs ReviewPublic

Authored by Typz on Feb 12 2018, 5:46 AM.

Details

Summary

When a block is started after a case label, clang-format does add extra
indent to the content of this block: the block content is indented one
level (with respect to the switch) while the closing brace is not
indented, and closes at switch level.

This gives the following code:

switch (x) {
case A: {
  stuff();
} break;
}

This makes it easy to confuse the closing brace of the 'case' with that
the whole 'switch', especially if more code is added after the brace:

switch (x) {
case A: {
  stuff();
}
  moreStuff();
  break;
}

This patch introduces a new CaseBlockIndent switch, which provides
alternative formatting for these cases:

  • CaseBlockIndent = None : default behavior, same behavior as before
  • CaseBlockIndent = ClosingBrace : indent the closing brace to the

same level as its content.

switch (x) {
case A: {
  stuff();
  } break;
}
  • CaseBlockIndent = Block : add an extra level of indent for the

content of the block.

switch (x) {
case A: {
    stuff();
  } break;
}

Diff Detail

Typz created this revision.Feb 12 2018, 5:46 AM

Do you have a reference to style guides recommending any of this?

Typz added a comment.Feb 13 2018, 5:11 AM

Do you have a reference to style guides recommending any of this?

Unfortunately not... I think this is a really advanced topic, and often not recommended way to format code at all (e.g. "if you need a block, you may as well use a function"). I can find the current implementation documented in google style guide, where there is an extra indent inside switch [e.g. IndentCaseLabels], which alleviates the ambiguity. But it is not documented (or shown) in the LLVM coding style, where it makes the code IMO really difficult to follow.

I believe the 2 extra modes I introduce should provide for most cases, leaving the decision to user:

  • None mode matches google-style, and is perfectly fine when IndentCaseLabels == true
  • ClosingBrace mode is "logical" with what people may want (e.g. have a scope for the whole case block), but is not syntaxically correct and may thus mislead
  • Block mode is syntaxically correct, but somewhat breaks the switch layout by moving break to different indentation levels depending on the use of braces

Which mode is use is then left to the user, depending on the "goals" of their coding style.

Also, maybe this is a small-enough detail that it would be better moved to the BraceWrapping field, to allow finely tuning the expected behavior (e.g. in Custom brace mode) while keeping the 'main' options simple (even though it is not technically a way to wrap the braces).

To me none of these options make sense. For the case you are describing, I agree that the current behavior is not ideal, but neither are any of the alternatives. However, I think that's fine. We'll just format those cases in a somewhat weird way and users can either accept that or change their code to not need it. I don't particularly care which of these options we go with, but I would be really hesitant to introduce an style flag for it. This is so rare, that almost no one needs this flag (or should need this flag). Thus, the cost of the flag (discoverability of flags for users, increased maintenance cost, etc.) strongly outweigh the benefit. IMO, for the same reason, this specific issue will not become part of the style guide of projects.

If you want to change something, I'd vote for making clang fall back to this behavior:

case A:
  {
    stuff();
  }
  moreStuff();
  break;
}

i.e. not let it put the "{" on the same line as the "case ..." if there is a trailing statement (other than "break;" maybe).

Typz added a comment.Feb 13 2018, 6:09 AM

We'll just format those cases in a somewhat weird way and users can either accept that or change their code to not need it.

I think we have a really diverging opinion on this. From my experience, people will mostly accept the output of the formatter without question : therefore I think we should strive to have the formatter format things as "correctly" as possible. And looking at the existing options and various complex formatting algorithms implemented I think this reasoning has been applied to some extent :-)
So I personnally would be willing to add some code/options even for such kind of corner cases; but then I am not the one maintaining the clang-format project.

I don't particularly care which of these options we go with, but I would be really hesitant to introduce an style flag for it. This is so rare, that almost no one needs this flag (or should need this flag). Thus, the cost of the flag (discoverability of flags for users, increased maintenance cost, etc.) strongly outweigh the benefit.

Any change will still require a new flag somewhere, since the currently implemented behavior is :
1- the documented way to format in Google style
2- the expected format in LLVM style, according to the clang-format unit test

It should thus probably not be changed.

IMO, for the same reason, this specific issue will not become part of the style guide of projects.

I definitely agree on this, but it is actually part of some styles (e.g. Google's)

If you want to change something, I'd vote for making clang fall back to this behavior:

case A:
  {
    stuff();
  }
  moreStuff();
  break;
}

i.e. not let it put the "{" on the same line as the "case ..." if there is a trailing statement (other than "break;" maybe).

I am fine with that formatting (though I did not implement it since it requires tweaking the code in more places, instead of essentially modifying a single function like I did).

We'll just format those cases in a somewhat weird way and users can either accept that or change their code to not need it.

I think we have a really diverging opinion on this. From my experience, people will mostly accept the output of the formatter without question : therefore I think we should strive to have the formatter format things as "correctly" as possible. And looking at the existing options and various complex formatting algorithms implemented I think this reasoning has been applied to some extent :-)
So I personnally would be willing to add some code/options even for such kind of corner cases; but then I am not the one maintaining the clang-format project.

I have no problem with making clang-format format things "correctly" and I am happy to add code. But I think we are doing the average clang-format user a disservice if we provide options for every such corner case. Let's settle on one good-enough behavior and go with that for everything/everyone.

I don't particularly care which of these options we go with, but I would be really hesitant to introduce an style flag for it. This is so rare, that almost no one needs this flag (or should need this flag). Thus, the cost of the flag (discoverability of flags for users, increased maintenance cost, etc.) strongly outweigh the benefit.

Any change will still require a new flag somewhere, since the currently implemented behavior is :
1- the documented way to format in Google style
2- the expected format in LLVM style, according to the clang-format unit test

It should thus probably not be changed.

I don't think that's true for the alternative I am suggesting.

IMO, for the same reason, this specific issue will not become part of the style guide of projects.

I definitely agree on this, but it is actually part of some styles (e.g. Google's)

If you want to change something, I'd vote for making clang fall back to this behavior:

case A:
  {
    stuff();
  }
  moreStuff();
  break;
}

i.e. not let it put the "{" on the same line as the "case ..." if there is a trailing statement (other than "break;" maybe).

I am fine with that formatting (though I did not implement it since it requires tweaking the code in more places, instead of essentially modifying a single function like I did).

As we can implement this without additional flags (it doesn't contradict any style guide I know of), I think this would be strictly preferable.

Typz added a comment.Feb 13 2018, 7:12 AM

It is explicitly documented in google style guide: https://google.github.io/styleguide/cppguide.html#Loops_and_Switch_Statements :

case blocks in switch statements can have curly braces or not, depending on your preference. If you do include curly braces they should be placed as shown below.

If not conditional on an enumerated value, switch statements should always have a default case (in the case of an enumerated value, the compiler will warn you if any values are not handled). If the default case should never execute, simply assert:

switch (var) {
  case 0: {  // 2 space indent
    ...      // 4 space indent
    break;
  }
  case 1: {
    ...
    break;
  }
  default: {
    assert(false);
  }
}

So IMHO we cannot just change the current (or default) behaviour.

It is explicitly documented in google style guide: https://google.github.io/styleguide/cppguide.html#Loops_and_Switch_Statements :

case blocks in switch statements can have curly braces or not, depending on your preference. If you do include curly braces they should be placed as shown below.

If not conditional on an enumerated value, switch statements should always have a default case (in the case of an enumerated value, the compiler will warn you if any values are not handled). If the default case should never execute, simply assert:

switch (var) {
  case 0: {  // 2 space indent
    ...      // 4 space indent
    break;
  }
  case 1: {
    ...
    break;
  }
  default: {
    assert(false);
  }
}

So IMHO we cannot just change the current (or default) behaviour.

My proposal does not contradict this style guide as the case in question is not included in the example. It gives absolutely no guidance on how to format this case.

If anything, you could argue that it tells the user never to have something outside of the braces that make up a case statement (keep in mind that the style guide does not only give formatting advise).

Typz added a comment.Feb 13 2018, 8:48 AM

Hum, not sure I fully got your proposal. So you actually mean that clang-format you format like this (with 4-space indent for clarity):

switch (x) {
case 0:
    break;
case 1: {
    foo();
    break;
}
case 2: {
    foo();
} break;
case 3:
    {
        foo();
    }
    bar();
    break;
}

Is this right?

If so it does not really help me: I don't care so much how it is formatted, but I think the current way is way too error prone (and I cannot change the style to indent the case blocks themself). So i'll have to keep a patch in my fork :-(
Or maybe the behavior should be dependant on IndentCaseLabels (though this would change LLVM style formatting) ?

Yes, that's what I mean. What do you mean, the style is too error prone?

Typz added a comment.Feb 15 2018, 6:53 AM

Yes, that's what I mean. What do you mean, the style is too error prone?

When IndentCaseLabels is false, the code gets formatted in a way that "hides" the structure of the code, by indenting the end of the case in the exact same way as the end of a switch, which is error prone IMHO.
Indentation should "highlight" the structure of the code, to make it easier to understand, and thus avoid this kind of confusion.

That doesn't explain to me how this is error prone. I can't think how you'd create an error by this that would not be caught by the compiler. Much less if you consistently use clang-format.

It's fundamentally what you get for IndentCaseLabels: false. Even without braces you have this issue to some extend. But LLVM has several thousand switches, about a thousand with braces formatted this way and I am not aware of a single time this has caused a bug or even confusion.

And to me (but this is obviously objective), your suggestions hide the structure of the code even more as they lead to a state where the closing brace is not aligned with the start of the line/statement that contains the opening braces. That looks like a bug to me more than anything else and I don't think there is another situation where clang-format would do that.

Typz added a comment.Feb 15 2018, 7:34 AM

A user can create an error by reasoning like this, after the code has been formatted this way (a long time ago) : "oh, I need to make an extra function call, but in all cases.... ah, here is the end of the switch, let's put my function call here".

I am not saying it happens often, I am just saying it breaks indentation : i.e. that two nested blocks should not close at the same depth. Breaking such things makes code harder to read/understand, and when you don't properly get the code you can more easily make a mistake. Obviously it should be caught in testing, but it is not always.

That said, I am not trying to say in any way that my approach is better. I think both CaseBlockIndent = Block or your variant (with the extra line break before opening brace; but applying it all the time) will indeed be consistent with the code and not break indentation; keeping the current behavior when IndentCaseLabels = true is also not a problem IMHO.

And to me (but this is obviously objective), your suggestions hide the structure of the code even more as they lead to a state where the closing brace is not aligned with the start of the line/statement that contains the opening braces. That looks like a bug to me more than anything else and I don't think there is another situation where clang-format would do that.

The exact same thing happens for functions blocks, class blocks and control statements, depending on BraceWrapping modes. Actually, IMO wrapping the opening brace should probably be done according to BraceWrapping.AfterControlStatement flag.

// BraceWrapping.AfterControlStatement = false
switch (x) {
case 0: {
    foo();
    break;
  }
}
// BraceWrapping.AfterControlStatement = true
switch (x)
{
case 0:
  {
    foo();
    break;
  }
}

But I agree the CaseBlockIndent = ClosingBrace mode is definitely not consistent with the code. I think it is clearer this way, but that is definitely my own subjective opinion: in this case, I accept to lose the extra indent for the sake of compactness and to somehow mimic the "regular" case blocks (e.g. which do not include an actual block), but that is indeed really my personnal way to read code.

A user can create an error by reasoning like this, after the code has been formatted this way (a long time ago) : "oh, I need to make an extra function call, but in all cases.... ah, here is the end of the switch, let's put my function call here".

And then trying to format it with clang-format they'll be immediately thrown off because clang-format gets the indentation wrong :).
But I don't want to argue this to death. I understand what you are saying. I just don't think any of your suggested formats make this situation better.

I am not saying it happens often, I am just saying it breaks indentation : i.e. that two nested blocks should not close at the same depth. Breaking such things makes code harder to read/understand, and when you don't properly get the code you can more easily make a mistake. Obviously it should be caught in testing, but it is not always.

That said, I am not trying to say in any way that my approach is better. I think both CaseBlockIndent = Block or your variant (with the extra line break before opening brace; but applying it all the time) will indeed be consistent with the code and not break indentation; keeping the current behavior when IndentCaseLabels = true is also not a problem IMHO.

An extra thing to consider is that my approach is also consistent with having something before this block:

case A:
  {
    f();
    g();
  }
  h();
  break;
case B:
  f();
  {
    g();
  }
  h();
  break;

And to me (but this is obviously objective), your suggestions hide the structure of the code even more as they lead to a state where the closing brace is not aligned with the start of the line/statement that contains the opening braces. That looks like a bug to me more than anything else and I don't think there is another situation where clang-format would do that.

The exact same thing happens for functions blocks, class blocks and control statements, depending on BraceWrapping modes. Actually, IMO wrapping the opening brace should probably be done according to BraceWrapping.AfterControlStatement flag.

// BraceWrapping.AfterControlStatement = false
switch (x) {
case 0: {
    foo();
    break;
  }
}
// BraceWrapping.AfterControlStatement = true
switch (x)
{
case 0:
  {
    foo();
    break;
  }
}

But I agree the CaseBlockIndent = ClosingBrace mode is definitely not consistent with the code. I think it is clearer this way, but that is definitely my own subjective opinion: in this case, I accept to lose the extra indent for the sake of compactness and to somehow mimic the "regular" case blocks (e.g. which do not include an actual block), but that is indeed really my personnal way to read code.

I don't agree that that's the same thing. The closing brace is still neatly aligned with the line of the opening brace (which happens to be just the opening brace).

Typz added a comment.Feb 28 2018, 6:49 AM

I don't agree that that's the same thing. The closing brace is still neatly aligned with the line of the opening brace (which happens to be just the opening brace).

This invariant is not really applicable to switch statements, where code of each "branch" is already indented with no opening brace. :-)

I can try to change the mode so that we get either "google" style:

switch (x) {
case 0: {
  foo():
} break;
}

or "regular indentation" mode (e.g. what would look like no special indentation rule):

switch (x) {
case 0:
  {
    foo():
  }
  break;
}

Generally, would a flag for CaseBlockIndent be acceptable for this purpose ?
Or should the behavior simply be selected depending on IndentCaseLabels ?

New options for this would not be acceptable IMO. Too much cost for too little benefit.

I'd suggest to first make the change to fall back to the style with a regular block when there are statements other than break after the closing brace. That is always bad, no matter the indentation of the case labels:

switch (x) {
  case 1: {
    doSomething();
  }
    doSomethingElse();
    break;
}

Fixing this is good no matter what.

And then the second question is whether this style should be used or not (with IndentCaseLabels: false):

switch (x) {
case 1: {
  doSomething();
}
}

Pro: Saves horizontal and vertical space.
Con: It's weird to have to braces in the same column.

I don't personally have an opinion here, but I'll check with a few LLVM developers who work with LLVM style.

Got two responses so far.

  1. Having to closing braces in the same column is weird, but not that weird. Consider
namespace n {
void f() {
  ...
}
}
  1. Richard Smith (code owner of Clang) says that he doesn't really like the two closing braces in the same column, but he always cheats by removing the braces for the last case label / default. You never need them. In any case he'd strongly prefer the current behavior over adding an extra break and more indentation.

In conclusion, I don't think we want to move forward with making clang-format do

switch (x) {
case 1:
  {
    doSomething();
  }
}

even with IndentCaseLabels: false.

Typz updated this revision to Diff 158313.Jul 31 2018, 9:59 AM

Regenerate documentation