It seems that most LLVM code omits braces on, and only on, single-line
"if" and loop bodies.
Details
Diff Detail
- Build Status
Buildable 1470 Build 1470: arc lint + arc unit
Event Timeline
I feel like this should live down near "Don’t use else after a return". Seems kind of weird to put this before an admonition to, you know, indent code at all. We should probably delete that too to make the document more concise.
LGTM (though I would leave this open for a while until after the thanksgiving break to hear different opinions).
I would add an example. Maybe:
if (SomethingIsOdd) return; if (CoolInstruction *CI = dyn_cast<CoolInstruction>(I)) { if (CI.IsBaz()) ++BazInstructions; }
And this example:
if (I.IsEven()) { HandleEven(I); } else { HandleOdd(I): }
to make it clear we want braces for if else.
Or rather:
if (I.isEven()) { handleEven(I); } else { handleOdd(I): }
to be in line with function name conventions.
I feel like this should live down near "Don’t use else after a return"
I put it here because the overall section header is "Mechanical Source Issues". But now I see that e.g. "Spaces Before Parentheses" is down at the bottom of the doc, so...I now have no idea how this is structured.
to make it clear we want braces for if else.
We do? Elsewhere in the style guide they have unbraced if/else, e.g. following "or better yet (in this case)".
Personally, I think I would prefer that we add braces only if one of the bodies of the code path requires it:
OK:
if (first) if (second) third();
!OK:
if (first) if (second) { third(); fourth(); }
OK:
if (first) { if (second) { third(); fourth(); } }
Ok, I grepped around llvm source and the majority of cases is indeed without braces. Then add this to the examples:
if (I.isEven()) handleEven(I); else handleOdd(I);
Just for the sake of discussion (I personally would prefer braces). This would result in this in the extreme case:
if (foo) for (Baz b : bazes) if (b) ++BCount; else ++NotBCount; else foobar();
I'm OK with all this. But can you clarify for these:
if (I.isEven()) { handleEven(I); handleEven(I); } else // Should we open a brace here? handleOdd(I);
if (I.isEven()) // Should we open a brace here? handleEven(I); else { handleOdd(I); handleOdd(I); }
I remember @majnemer asking for bracing both path when one is braced.
Although, for really simple cases it's probably fine to omit them:
if (auto Foo = bar()) for (auto Baz : Foo.stuff()) bazIt(Baz);These types of constructs are one of my favorite parts of idiomatic LLVM code. I'd hate to lose them.
I'm sympathetic to this idiom, but I can't come up with an easy way to write a rule to allow it:
for (Foo *f : foos) if (f->isSomething()) return true; return false;
I also don't like it when it gets to three blocks.
for (Foo *f : foos) if (auto *b = dyn_cast<Bar>(f->getBar())) if (b->isBaz()) return true; return false;
It's too clever.
I'd rather have an easy rule of thumb that requires this:
for (Foo *f : foos) { if (f->isSomething()) return true; } return false;
I've also seen this before, and I want to make sure I don't see it again:
for (Foo *f : foos) if (f->isSomething()) { stmt1(); stmt2(); }
Does it need to be codified in full? I think that putting more braces should be allowed and the coding guidelines should dictate where they can be omitted.
(1) If a nested statement has braces, the enclosing statement should have braces.
(2) If the immediate nesting has only one indented line, braces can be omitted, unless (1) applies.
The intent of (2) is to require braces in cases like this:
for (auto a : Items) // Calculate foo for each item. a.foo();
Does it need to be codified in full?
I think we stand to benefit greatly from having an unambiguous statement of the rules, yes.
The cost of not having such a statement is that each reviewer has their own set of rules in their head, and so we waste human time during reviews with reviewers explaining their mental model and asking reviewees to change their code. And that's in the best case, where the reviewees don't argue back.
That's a problem with reviewers.
The guidelines should be pretty clear: readability, and consistency. Regulating every little thing is unnecessary.
Not only, this is part of the "consistency" you mentioned in your second sentence.
The guidelines should be pretty clear: readability, and consistency. Regulating every little thing is unnecessary.
As a few others have indicated, I really don't like the proposed rule as written. I very much want to continue to write:
if (...) for (...) ...;
and
if (...) for (...) { ...; ...; }
I don't think there is any interesting ambiguity in these cases, and the braces seem to add clutter with little value.
I'd like a rule that strikes at the heart of the issue: use braces when they make the structure more clear. But I understand that this is subjective and we really should have a rule that is relatively mechanical for consistency's sake and gets close enough to the spirit of this to be a good fit.
Here is my attempt:
Omit braces when there is no ambiguity due to nesting and else statements.
Common cases of this are flagged with compiler warnings:
if (...) if (...) ...; else // warning: ambiguous else! ...;
Even if the ambiguity happens to dodge the compiler warnings or other checking mechanism, you should still use braces:
if (...) for (...) if (...) ...; else // Still bad! ...;
I like this (or something like this) formulation because:
- It doesn't talk about number of lines. Do comments count as lines? what if the *condition* is multiple lines but the statement isn't? I don't like using line count here.
- It precisely matches what compilers are trying to warn on already.
- It gives grounds for a reviewer to argue that a highly confusing pattern, even if it doesn't literally result in ambiguity, is still sufficiently opaque as to require curlies. We could add some examples here if this actually comes up.
The primary alternative I see is tweaking the above to also require braces on all surrounding constructs once they are required for one. From discussions with Richard Smith, I think this is more common in Clang's code whereas the formulation I suggested is more common in LLVM code.
The reason I (quite strongly) prefer the formulation I proposed is in no small part because clang-format has completely removed the failure mode where I mis-indent things. As a consequence, I prefer the more minimal & brief approach to braces. I feel like it balances clarity and brevity well.
If we can't agree on a single consistent style between these two alternatives, then perhaps we end up saying both are acceptable and letting code reviewers choose, but I would generally prefer having clear guidance here.
My two cents.
-Chandler
I agree with this, except that I'd say "letting the authors choose". There is more than one way to make code fit in with the rest and be readable.
Then I have a strong disagreement on this:
- clang-format totally changed the way I deal with formatting, and having rules that can be taught to an automatic tool removes any subjectivity and discussion about the formatting.
- What makes the code "clear" and "readable" beyond the various tastes of contributors, *is* consistency across the codebase. Less brain power used to adapt to different styles is a non-negligible productivity gain to me.
My concern about automatic formatting is that it may damage context-specific formatting. Consider this example:
unsigned FlagAB = FlagAlpha | FlagBravo; unsigned FlagAC = FlagAlpha | FlagCharlie; unsigned FlagBC = FlagBravo | FlagCharlie;
Here the spacing is intended to separate the individual flags in columns, and doing do may improve readability of this piece of code, which could otherwise appear cluttered. Automatic formatting could remove all of it, since it is not aware of the author's intent. I know this is a trivial example, but I hope that it illustrates the concept.
In my experience, readability is affected to different degrees by different formatting elements. Some factors are highly important (like spacing, e.g (((x-1)|(x+1)*(x-2))+x)+(x-1)*(x+2) vs (((x-1) | (x+1)*(x-2)) + x) + (x-1)*(x+2), which makes a big difference in visual parsing), others are less so. While I agree with the insistence on the important ones, I prefer to be lax about the others. Small local differences are a natural consequence of having a large number of contributors and the evolution of the formatting guidelines themselves. If at one point we decided that single-line if/for/while statements should also be bracketed, would we require that in a new code added to a file that consistently didn't have them? If so, it would be inconsistent with the existing code, if not, it would contravene the guidelines. Sweeping changes throughout the code base are a real pain for out-of-tree developers, so hopefully that wouldn't be considered. IMO, it's best to allow a degree of flexibility and just live with it.
For specific cases (formatting table, etc.), you can disable clang-format by marking the begin/end of the section with a comment, see for example https://github.com/llvm-mirror/lldb/blob/master/source/Commands/CommandObjectType.cpp#L101
If at one point we decided that single-line if/for/while statements should also be bracketed, would we require that in a new code added to a file that consistently didn't have them? If so, it would be inconsistent with the existing code, if not, it would contravene the guidelines.
I *always* clang-format the statement I change in my patch, irrespectively of the rest of the file. Hopefully with time we'll touch enough lines that non-conformant code will slowly disappear.
I've also seen people clang-formatting a file as a whole as a pre-step before performing large changes to it.
Sweeping changes throughout the code base are a real pain for out-of-tree developers, so hopefully that wouldn't be considered.
We are quite consistent about not limiting ourselves upstream to "please" out-of-tree developers, I wouldn't take this into consideration here either. (See also for example the recent massive clang-formatting change in LLDB, where they reformatted the full codebase in a single commit.)