This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/docs/
-
docs/
-
CodingStandards.rst

Differential D26943

[CodingStandards] Add style guide rule about "if" statements and loops.
AbandonedPublic

Authored by jlebar on Nov 21 2016, 3:17 PM.

Download Raw Diff

Details

Reviewers

MatzeB
rnk

Summary

It seems that most LLVM code omits braces on, and only on, single-line
"if" and loop bodies.

Diff Detail

Build Status

Buildable 1470
Build 1470: arc lint + arc unit

Event Timeline

jlebar updated this revision to Diff 78795.Nov 21 2016, 3:17 PM

jlebar retitled this revision from to [CodingStandards] Add style guide rule about "if" statements and loops..

jlebar updated this object.

jlebar added reviewers: rnk, MatzeB.

jlebar added subscribers: llvm-commits, arsenm.

I feel like this should live down near "Don’t use else after a return". Seems kind of weird to put this before an admonition to, you know, indent code at all. We should probably delete that too to make the document more concise.

LGTM (though I would leave this open for a while until after the thanksgiving break to hear different opinions).

I would add an example. Maybe:

if (SomethingIsOdd)
  return;

if (CoolInstruction *CI = dyn_cast<CoolInstruction>(I)) {
  if (CI.IsBaz())
    ++BazInstructions;
}

This revision is now accepted and ready to land.Nov 21 2016, 3:23 PM

And this example:

if (I.IsEven()) {
  HandleEven(I);
} else {
  HandleOdd(I):
}

to make it clear we want braces for if else.

In D26943#601946, @MatzeB wrote:
And this example:
if (I.IsEven()) {
  HandleEven(I);
} else {
  HandleOdd(I):
}
to make it clear we want braces for if else.

Or rather:

if (I.isEven()) {
  handleEven(I);
} else {
  handleOdd(I):
}

to be in line with function name conventions.

I feel like this should live down near "Don’t use else after a return"

I put it here because the overall section header is "Mechanical Source Issues". But now I see that e.g. "Spaces Before Parentheses" is down at the bottom of the doc, so...I now have no idea how this is structured.

to make it clear we want braces for if else.

We do? Elsewhere in the style guide they have unbraced if/else, e.g. following "or better yet (in this case)".

In D26943#601951, @jlebar wrote:

I feel like this should live down near "Don’t use else after a return"

I put it here because the overall section header is "Mechanical Source Issues". But now I see that e.g. "Spaces Before Parentheses" is down at the bottom of the doc, so...I now have no idea how this is structured.

to make it clear we want braces for if else.

We do? Elsewhere in the style guide they have unbraced if/else, e.g. following "or better yet (in this case)".

There's definitely prior art in LLVM for unbraced else if chains.

Personally, I think I would prefer that we add braces only if one of the bodies of the code path requires it:

OK:

if (first)
  if (second)
    third();

!OK:

if (first)
  if (second) {
    third();
    fourth();
  }

OK:

if (first) {
  if (second) {
    third();
    fourth();
  }
}

In D26943#601951, @jlebar wrote:

I feel like this should live down near "Don’t use else after a return"

I put it here because the overall section header is "Mechanical Source Issues". But now I see that e.g. "Spaces Before Parentheses" is down at the bottom of the doc, so...I now have no idea how this is structured.

to make it clear we want braces for if else.

We do? Elsewhere in the style guide they have unbraced if/else, e.g. following "or better yet (in this case)".

Ok, I grepped around llvm source and the majority of cases is indeed without braces. Then add this to the examples:

if (I.isEven())
  handleEven(I);
else
  handleOdd(I);

In D26943#601961, @compnerd wrote:

Personally, I think I would prefer that we add braces only if one of the bodies of the code path requires it:

Just for the sake of discussion (I personally would prefer braces). This would result in this in the extreme case:

if (foo)
  for (Baz b : bazes)
    if (b)
      ++BCount;
    else
      ++NotBCount;
else
  foobar();

In D26943#601970, @MatzeB wrote:
In D26943#601961, @compnerd wrote:

Personally, I think I would prefer that we add braces only if one of the bodies of the code path requires it:

Just for the sake of discussion (I personally would prefer braces). This would result in this in the extreme case:
if (foo)
  for (Baz b : bazes)
    if (b)
      ++BCount;
    else
      ++NotBCount;
else
  foobar();

+1 for braces

In D26943#601963, @MatzeB wrote:
We do? Elsewhere in the style guide they have unbraced if/else, e.g. following "or better yet (in this case)".

Ok, I grepped around llvm source and the majority of cases is indeed without braces. Then add this to the examples:
if (I.isEven())
  handleEven(I);
else
  handleOdd(I);

I'm OK with all this. But can you clarify for these:

if (I.isEven()) {
  handleEven(I);
  handleEven(I);
} else    // Should we open a brace here?
  handleOdd(I);

if (I.isEven())     // Should we open a brace here?
  handleEven(I); 
else {
  handleOdd(I);
  handleOdd(I);
}

I remember @majnemer asking for bracing both path when one is braced.

Although, for really simple cases it's probably fine to omit them:
if (auto Foo = bar())
  for (auto Baz : Foo.stuff())
    bazIt(Baz);
These types of constructs are one of my favorite parts of idiomatic LLVM code. I'd hate to lose them.

I'm sympathetic to this idiom, but I can't come up with an easy way to write a rule to allow it:

for (Foo *f : foos)
  if (f->isSomething())
    return true;
return false;

I also don't like it when it gets to three blocks.

for (Foo *f : foos)
  if (auto *b = dyn_cast<Bar>(f->getBar()))
    if (b->isBaz())
      return true;
return false;

It's too clever.

I'd rather have an easy rule of thumb that requires this:

for (Foo *f : foos) {
  if (f->isSomething())
    return true;
}
return false;

I've also seen this before, and I want to make sure I don't see it again:

for (Foo *f : foos)
  if (f->isSomething()) {
    stmt1();
    stmt2();
  }

In D26943#604404, @rnk wrote:

I'm sympathetic to this idiom, but I can't come up with an easy way to write a rule to allow it:

Does it need to be codified in full? I think that putting more braces should be allowed and the coding guidelines should dictate where they can be omitted.

(1) If a nested statement has braces, the enclosing statement should have braces.
(2) If the immediate nesting has only one indented line, braces can be omitted, unless (1) applies.

The intent of (2) is to require braces in cases like this:

for (auto a : Items)
  // Calculate foo for each item.
  a.foo();

Does it need to be codified in full?

I think we stand to benefit greatly from having an unambiguous statement of the rules, yes.

The cost of not having such a statement is that each reviewer has their own set of rules in their head, and so we waste human time during reviews with reviewers explaining their mental model and asking reviewees to change their code. And that's in the best case, where the reviewees don't argue back.

In D26943#604639, @jlebar wrote:

The cost of not having such a statement is that each reviewer has their own set of rules in their head,

That's a problem with reviewers.

The guidelines should be pretty clear: readability, and consistency. Regulating every little thing is unnecessary.

In D26943#604786, @kparzysz wrote:

In D26943#604639, @jlebar wrote:

The cost of not having such a statement is that each reviewer has their own set of rules in their head,

That's a problem with reviewers.

Not only, this is part of the "consistency" you mentioned in your second sentence.

The guidelines should be pretty clear: readability, and consistency. Regulating every little thing is unnecessary.

As a few others have indicated, I really don't like the proposed rule as written. I very much want to continue to write:

if (...)
  for (...)
    ...;

and

if (...)
  for (...) {
    ...;
    ...;
  }

I don't think there is any interesting ambiguity in these cases, and the braces seem to add clutter with little value.

I'd like a rule that strikes at the heart of the issue: use braces when they make the structure more clear. But I understand that this is subjective and we really should have a rule that is relatively mechanical for consistency's sake and gets close enough to the spirit of this to be a good fit.

Here is my attempt:

Omit braces when there is no ambiguity due to nesting and else statements.

Common cases of this are flagged with compiler warnings:

if (...)
  if (...)
    ...;
  else // warning: ambiguous else!
    ...;

Even if the ambiguity happens to dodge the compiler warnings or other checking mechanism, you should still use braces:

if (...)
  for (...)
    if (...)
      ...;
    else // Still bad!
      ...;

I like this (or something like this) formulation because:

It doesn't talk about number of lines. Do comments count as lines? what if the *condition* is multiple lines but the statement isn't? I don't like using line count here.

It precisely matches what compilers are trying to warn on already.

It gives grounds for a reviewer to argue that a highly confusing pattern, even if it doesn't literally result in ambiguity, is still sufficiently opaque as to require curlies. We could add some examples here if this actually comes up.

The primary alternative I see is tweaking the above to also require braces on all surrounding constructs once they are required for one. From discussions with Richard Smith, I think this is more common in Clang's code whereas the formulation I suggested is more common in LLVM code.

The reason I (quite strongly) prefer the formulation I proposed is in no small part because clang-format has completely removed the failure mode where I mis-indent things. As a consequence, I prefer the more minimal & brief approach to braces. I feel like it balances clarity and brevity well.

If we can't agree on a single consistent style between these two alternatives, then perhaps we end up saying both are acceptable and letting code reviewers choose, but I would generally prefer having clear guidance here.

My two cents.
-Chandler

In D26943#604951, @mehdi_amini wrote:

Not only, this is part of the "consistency" you mentioned in your second sentence.

I meant consistency with the surrounding code---that is much less subjective.

In D26943#605034, @chandlerc wrote:

If we can't agree on a single consistent style between these two alternatives, then perhaps we end up saying both are acceptable and letting code reviewers choose, but I would generally prefer having clear guidance here.

I agree with this, except that I'd say "letting the authors choose". There is more than one way to make code fit in with the rest and be readable.

In D26943#605046, @kparzysz wrote:

In D26943#604951, @mehdi_amini wrote:

Not only, this is part of the "consistency" you mentioned in your second sentence.

I meant consistency with the surrounding code---that is much less subjective.

Then I have a strong disagreement on this:

clang-format totally changed the way I deal with formatting, and having rules that can be taught to an automatic tool removes any subjectivity and discussion about the formatting.
What makes the code "clear" and "readable" beyond the various tastes of contributors, *is* consistency across the codebase. Less brain power used to adapt to different styles is a non-negligible productivity gain to me.

In D26943#605082, @mehdi_amini wrote:

Then I have a strong disagreement on this:

clang-format totally changed the way I deal with formatting, and having rules that can be taught to an automatic tool removes any subjectivity and discussion about the formatting.

What makes the code "clear" and "readable" beyond the various tastes of contributors, *is* consistency across the codebase. Less brain power used to adapt to different styles is a non-negligible productivity gain to me.

My concern about automatic formatting is that it may damage context-specific formatting. Consider this example:

unsigned FlagAB = FlagAlpha | FlagBravo;
unsigned FlagAC = FlagAlpha             | FlagCharlie;
unsigned FlagBC =             FlagBravo | FlagCharlie;

Here the spacing is intended to separate the individual flags in columns, and doing do may improve readability of this piece of code, which could otherwise appear cluttered. Automatic formatting could remove all of it, since it is not aware of the author's intent. I know this is a trivial example, but I hope that it illustrates the concept.

In my experience, readability is affected to different degrees by different formatting elements. Some factors are highly important (like spacing, e.g (((x-1)|(x+1)*(x-2))+x)+(x-1)*(x+2) vs (((x-1) | (x+1)*(x-2)) + x) + (x-1)*(x+2), which makes a big difference in visual parsing), others are less so. While I agree with the insistence on the important ones, I prefer to be lax about the others. Small local differences are a natural consequence of having a large number of contributors and the evolution of the formatting guidelines themselves. If at one point we decided that single-line if/for/while statements should also be bracketed, would we require that in a new code added to a file that consistently didn't have them? If so, it would be inconsistent with the existing code, if not, it would contravene the guidelines. Sweeping changes throughout the code base are a real pain for out-of-tree developers, so hopefully that wouldn't be considered. IMO, it's best to allow a degree of flexibility and just live with it.

In D26943#605378, @kparzysz wrote:

My concern about automatic formatting is that it may damage context-specific formatting. Consider this example:
[...]

For specific cases (formatting table, etc.), you can disable clang-format by marking the begin/end of the section with a comment, see for example https://github.com/llvm-mirror/lldb/blob/master/source/Commands/CommandObjectType.cpp#L101

If at one point we decided that single-line if/for/while statements should also be bracketed, would we require that in a new code added to a file that consistently didn't have them? If so, it would be inconsistent with the existing code, if not, it would contravene the guidelines.

I *always* clang-format the statement I change in my patch, irrespectively of the rest of the file. Hopefully with time we'll touch enough lines that non-conformant code will slowly disappear.
I've also seen people clang-formatting a file as a whole as a pre-step before performing large changes to it.

Sweeping changes throughout the code base are a real pain for out-of-tree developers, so hopefully that wouldn't be considered.

We are quite consistent about not limiting ourselves upstream to "please" out-of-tree developers, I wouldn't take this into consideration here either. (See also for example the recent massive clang-formatting change in LLDB, where they reformatted the full codebase in a single commit.)

ABataev mentioned this in D43627: [DEBUGINFO] Add flag for DWARF2 or less to use sections as references..Mar 1 2018, 6:57 AM

jlebar abandoned this revision.Feb 11 2020, 7:30 AM

Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2020, 7:31 AM

Revision Contents

Path

Size

llvm/

docs/

CodingStandards.rst

7 lines

Diff 78795

llvm/docs/CodingStandards.rst

	Show First 20 Lines • Show All 504 Lines • ▼ Show 20 Lines
	unreadable, and it is not worth dealing with.			unreadable, and it is not worth dealing with.

	As always, follow the `Golden Rule`_ above: follow the style of			As always, follow the `Golden Rule`_ above: follow the style of
	existing code if you are modifying and extending it. If you like four spaces of			existing code if you are modifying and extending it. If you like four spaces of
	indentation, DO NOT do that in the middle of a chunk of code with two spaces			indentation, DO NOT do that in the middle of a chunk of code with two spaces
	of indentation. Also, do not reindent a whole source file: it makes for			of indentation. Also, do not reindent a whole source file: it makes for
	incredible diffs that are absolutely worthless.			incredible diffs that are absolutely worthless.

				Omit braces for (and only for) one-line "if" and loop bodies
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				If the body of an if statement or a loop is a single line, omit the curly
				braces. If the if statement or loop is multiple lines long, do not omit the
				braces.

	Indent Code Consistently			Indent Code Consistently
	^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^

	Okay, in your first year of programming you were told that indentation is			Okay, in your first year of programming you were told that indentation is
	important. If you didn't believe and internalize this then, now is the time.			important. If you didn't believe and internalize this then, now is the time.
	Just do it. With the introduction of C++11, there are some new formatting			Just do it. With the introduction of C++11, there are some new formatting
	challenges that merit some suggestions to help have consistent, maintainable,			challenges that merit some suggestions to help have consistent, maintainable,
	and tool-friendly formatting and indentation.			and tool-friendly formatting and indentation.
	▲ Show 20 Lines • Show All 1,119 Lines • Show Last 20 Lines