Page MenuHomePhabricator

[clang-format] Add Left/Right Const (East/West , Before/After) fixer capability
Needs ReviewPublic

Authored by MyDeveloperDay on Nov 3 2019, 7:52 AM.

Details

Summary

Developers these days seem to argue over east vs west const like they used to argue over tabs vs whitespace or the various bracing style. These previous arguments were mainly eliminated with tools like clang-format that allowed those rules to become part of your style guide. Anyone who has been using clang-format in a large team over the last couple of years knows that we don't have those religious arguments any more, and code reviews are more productive.

https://www.youtube.com/watch?v=fv--IKZFVO8
https://mariusbancila.ro/blog/2018/11/23/join-the-east-const-revolution/
https://www.youtube.com/watch?v=z6s6bacI424

The purpose of this revision is to try to do the same for the East/West const discussion. Move the debate into the style guide and leave it there!

In addition to the new ConstStyle: Right or ConstStyle: Left there is an additional command-line argument --const-style=left/right/before/after/east/west which would allow an individual developer to switch the source back and forth to their own style for editing, and back to the committed style before commit. (you could imagine an IDE might offer such a switch)

The revision works by implementing a separate pass of the Annotated lines much like the SortIncludes and then create replacements for constant type declarations.

Diff Detail

Event Timeline

MyDeveloperDay created this revision.Nov 3 2019, 7:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2019, 7:52 AM
Herald added a subscriber: mgorny. · View Herald Transcript

Support for template object const type declarations

A demo of clang-format -i --const-style=east *.cpp *.h being run on clang/lib/Format can be see here D69780: [clang-format] DO NOT COMMIT - Demo of East/West Const Fixer on clang-format

lebedev.ri added inline comments.
clang/lib/Format/Format.cpp
780

Based on code reviews, this should be CS_West.

MyDeveloperDay marked 2 inline comments as done.Nov 3 2019, 11:57 PM
MyDeveloperDay added inline comments.
clang/lib/Format/Format.cpp
780

I would tend to agree for the LLVM project itself, but I really don't feel I can change this due to the huge number of projects which have BasedOnStyle: LLVM (not to mention most other style inherit from LLVM like Google, Chromium etc...)

I think this needs to be an opt-in change in the .clang-format (at least for now)

lebedev.ri marked an inline comment as done.Nov 4 2019, 12:10 AM
lebedev.ri added inline comments.
clang/lib/Format/Format.cpp
780

There is already a number of stylistic LLVM-specific decisions in LLVM style

  • 80-col wrap
  • pointer alignment to the right
  • spaces instead of tabs
  • ???

I think it first and foremost is an LLVM style, and i don't
really recall any guarantees that it must not/will not change.

MyDeveloperDay marked 2 inline comments as done.Nov 4 2019, 1:07 AM
MyDeveloperDay added inline comments.
clang/lib/Format/Format.cpp
780

Again I do agree, but I think those rules were established before clang-format really gained such popular usage, those coming to clang-format changed those probably straight away if they didn't agree with their established their style.

Changing something under everyone now feels like I'd have a different agenda (for west const to win), and that that is not my goal. I'm simply trying to facilitate compliance.

I think we can agree the east/west const rebellion is a contentious issue, (as where the previous formatting wars!), my aim here is to help try and remove the discussion from the debate and let people move on! but I think what it doesn't need is someone like clang-format being considered the bad guy by dictating a default style globally.

I did a quick search on github, the hits to "const int" and "int const" alone are too eye-watering. I really don't want to break the internet with this.

An example of even how setting the style to be West will generate new changes on a previously formatted lib/Format

$ git diff .
diff --git a/clang/lib/Format/WhitespaceManager.cpp b/clang/lib/Format/WhitespaceManager.cpp
old mode 100644
new mode 100755
index 5a44500d355..0274d35bad7
--- a/clang/lib/Format/WhitespaceManager.cpp
+++ b/clang/lib/Format/WhitespaceManager.cpp
@@ -585,7 +585,7 @@ void WhitespaceManager::alignConsecutiveDeclarations() {
   //   SomeVeryLongType const& v3;
   AlignTokens(
       Style,
-      [](Change const &C) {
+      [](const Change &C) {
         // tok::kw_operator is necessary for aligning operator overload
         // definitions.
         if (C.Tok->isOneOf(TT_FunctionDeclarationName, tok::kw_operator))

@pollydev too (a previously clang-format clean directory) will generate changes running clang-format --const-style=west -i -n *.cpp

ScopBuilder.cpp:74:9: warning: code should be clang-formatted [-Wclang-format-violations]
static int const MaxDimensionsInAccessRange = 9;
        ^
ScopInfo.cpp:113:2: warning: code should be clang-formatted [-Wclang-format-violations]
int const polly::MaxDisjunctsInDomain = 20;
....
 ^
ScopInfo.cpp:118:9: warning: code should be clang-formatted [-Wclang-format-violations]
static int const MaxDisjunctsInContext = 4;

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

East/West I think is a term coined I think by Simon Brand (@TartanLlama https://twitter.com/tartanllama) or maybe Dan Saks before, I did a quick check on github and both "west const","east const" ,"left const" and "right const" are all used in commit messages, I'd really like to keep the East/West just because that's the name of the rebellion (not affiliated!), but we could alias to Left and Right (like we sometimes do for true/false when we overload the options), I've also seen const Before and const After used. again if people don't think its too much overload I'd be happy to add those if it means people get the gist a little clearer.

I'd probably agree with the CVQualifierStyle although not such a nice command-line name, what also concerns me is sometimes you see const volatile type and sometimes volatile const type sometimes type const volatile almost feels like volatile might needs its own VolatileStyle allowing

const volatile int   (ConstStyle: West,  VolatileStyle: East)
volatile const int   (ConstStyle: West,  VolatileStyle: West)
int const volatile   (ConstStyle: East,  VolatileStyle: East)
int volatile const   (ConstStyle: West,  VolatileStyle: West)

Whilst I agree one option would be better, I think we'll end up being asked for more flexible support, such is the way for people to want their own style.

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

East/West I think is a term coined I think by Simon Brand (@TartanLlama https://twitter.com/tartanllama) or maybe Dan Saks before, I did a quick check on github and both "west const","east const" ,"left const" and "right const" are all used in commit messages, I'd really like to keep the East/West just because that's the name of the rebellion (not affiliated!), but we could alias to Left and Right (like we sometimes do for true/false when we overload the options), I've also seen const Before and const After used. again if people don't think its too much overload I'd be happy to add those if it means people get the gist a little clearer.

I think east/west is more of a C++ community term than a C community one (and it's a newer phrase). I'd be fine with before/after as well as left/right, with an alias for east/west, but I'd like the canonical name to not be east/west.

I'd probably agree with the CVQualifierStyle although not such a nice command-line name, what also concerns me is sometimes you see const volatile type and sometimes volatile const type sometimes type const volatile almost feels like volatile might needs its own VolatileStyle allowing

const volatile int   (ConstStyle: West,  VolatileStyle: East)
volatile const int   (ConstStyle: West,  VolatileStyle: West)
int const volatile   (ConstStyle: East,  VolatileStyle: East)
int volatile const   (ConstStyle: West,  VolatileStyle: West)

Whilst I agree one option would be better, I think we'll end up being asked for more flexible support, such is the way for people to want their own style.

If we wanted to add more flexible support, we could do so with a placeholder syntax, like CVQualifierStyle: volatile const %ident% restrict or some such, so I don't think such a name fully closes the design space. However, if we do want to stick with separate styles for each of these, we should have explicit comments and tests showing this should only work for const qualifiers.

MyDeveloperDay retitled this revision from [clang-format] Add East Const / West Const fixer to [clang-format] Add Left/Right Const (East/West , Before/After) fixer capability.
MyDeveloperDay edited the summary of this revision. (Show Details)
  • Switch default to be Left/Right const
  • Add support for East/West and Before/After
  • Add volatile unit tests
  • Cover comment case (the only build failure in clang/lib caused after running clang-format -i --const-style=east)

@pollydev too (a previously clang-format clean directory) will generate changes running clang-format --const-style=west -i -n *.cpp

ScopBuilder.cpp:74:9: warning: code should be clang-formatted [-Wclang-format-violations]
static int const MaxDimensionsInAccessRange = 9;
        ^
ScopInfo.cpp:113:2: warning: code should be clang-formatted [-Wclang-format-violations]
int const polly::MaxDisjunctsInDomain = 20;
....
 ^
ScopInfo.cpp:118:9: warning: code should be clang-formatted [-Wclang-format-violations]
static int const MaxDisjunctsInContext = 4;

Changes in clang-format 'break' Polly regularly. When that happens, we run ninja polly-update-format and commit. Polly should not be a reason to not make changes in clang-format.

(Sorry for arriving at this late)

At a strategic level, I have some concerns here: the fact that clang-format generally doesn't touch the token sequence isn't an accident.
e.g. formatting int x;; will insert a newline rather than deleting the redundant semicolon. Like the one in this patch, that would be a useful feature, but it's a path the clang-format authors deliberately decided to close off.

The conservative approach means that pseudo-parser gets things wrong, we get broken formatting, and ~never broken code.
This means:

  • a simpler mental model. People are happier to blindly apply it, not to require review of formatting changes, etc
  • the ability to require clang-formatting as a pre-commit hook, knowing that at worst people will have to badly format their code
  • a lesser risk when rolling out new versions of clang-format from (our org does this from head weekly)

(There are exceptions, e.g. comment splitting, but they're pretty rare)

@klimek in case he has thoughts.

clang/lib/Format/Format.cpp
780

For non-style reasons, I do think this should be CS_Leave in every built-in style, so that adjusting const would have to at least be explicitly in .clang-format.

(Sorry for arriving at this late)

At a strategic level, I have some concerns here: the fact that clang-format generally doesn't touch the token sequence isn't an accident.
e.g. formatting int x;; will insert a newline rather than deleting the redundant semicolon. Like the one in this patch, that would be a useful feature, but it's a path the clang-format authors deliberately decided to close off.

I would agree that in the beginning that was true, but now with sorting of includes/using we are making replacements to move things around, The fact we are using the same Replacement ideas that clang-tidy uses for fix-its I feel this isn't a huge change of direction.

Since I also added the dry-run approach, I also think we can use this kind of approach to validate code for conformance (and not necessarily replace the actual code). But at least in the initial tests I've done on relatively large codebases this actually works better than I anticipated.

This is a requested feature both in our bug database but also on stack overflow, this felt like a natural fit (and an interesting challenge)

(Sorry for arriving at this late)

At a strategic level, I have some concerns here: the fact that clang-format generally doesn't touch the token sequence isn't an accident.
e.g. formatting int x;; will insert a newline rather than deleting the redundant semicolon. Like the one in this patch, that would be a useful feature, but it's a path the clang-format authors deliberately decided to close off.

I would agree that in the beginning that was true, but now with sorting of includes/using we are making replacements to move things around,

Yes. Making include-sorting a style option (and then turning that option on for google style) has been controversial and in hindsight a mistake IMO. I believe making it (only) a command-line flag is an option we should consider now (and should have considered then).

There are important differences here: include insertion (and I think using too) is less likely to hit pseudoparser problems, but more likely that bad fixes change program behavior.

The fact we are using the same Replacement ideas that clang-tidy uses for fix-its I feel this isn't a huge change of direction.

clang-tidy and clang-format are very different tools with very different philosophies. clang-tidy is fairly successful and widely used, but clang-format much more so.

I definitely agree with the default being "do nothing", its not just google style, nearly everyone is using a style which is derived from one of these base styles, that said turning on a const style by default would IMHO be a mistake, the level of churn could be immense.

I do get the concerns over the matching of "lexed" tokens rather than perhaps a semantic analysis could lead to false positives, that's harder to tell without running this over a larger amount of code and then running the unit tests of those projects, I feel the saving grace is the scenarios are limited to where the word const can appear in the language and so this should allow us to limit the possible failure scenarios a little.

I just didn't want to do this in clang-tidy, because I find the whole -fix part of clang-tidy unsatisfactory unless i'm dealing with just one checker at a time.

NOTE: Add foo() const override and foo() const override LLVM_READONLY test examples as these currently fail
MyDeveloperDay set the repository for this revision to rG LLVM Github Monorepo.

Fix an issue with foo() const override/final
Be more specific about this being turned off in all the base styles

Make the patch file correctly.

Build result: pass - 59989 tests passed, 0 failed and 763 were skipped.
Log files: console-log.txt, CMakeCache.txt

MyDeveloperDay marked an inline comment as done.Nov 17 2019, 4:31 AM
  • Rebase
  • Add Release note
  • Remove repeated lines cause by patch creation artefact
JonasToth added inline comments.
clang/include/clang/Format/Format.h
1110

Drive-By question: https://reviews.llvm.org/D54395 implements clang-tidy part of adding const to variables. It would be great to use the same underlying functionality, e.g. this enum for determining which const-style to use.
Is this header-file always accessible from clang-tidy-code? Or can e.g. clang-format be deactivated while building leading to exclusion of these bits?

MyDeveloperDay marked an inline comment as done.Nov 19 2019, 7:44 AM
MyDeveloperDay added inline comments.
clang/include/clang/Format/Format.h
1110

Doesn't clang-tidy include the clangFormat library for the purposes of reformatting after a replacement? in which case this should be linked in should it not?

from clang-tidy CMakeLists.txt

LINK_LIBS
  clangAnalysis
  clangAST
  clangASTMatchers
  clangBasic
  clangFormat
  clangFrontend
  clangLex
  clangRewrite
  clangSema
  clangSerialization
  clangTooling
  clangToolingCore
  )
MyDeveloperDay marked an inline comment as done.Nov 19 2019, 7:45 AM

-update with missing files
-clang-format test

klimek added a comment.Dec 2 2019, 2:23 AM

I'm not generally opposed to this, given that
a) clang-format already changes code; I think by now we're not fixing double semicolon mainly for workflow reasons, would be fine to add
b) the implementation is very self contained

clang/docs/ClangFormatStyleOptions.rst
1378

Personally, I'm somewhat against having 3 different aliases for the options. I'd chose one, even though it doesn't make everybody happy, and move on. I'm fine with East/West as long as the documentation makes it clear what it is.

clang/lib/Format/EastWestConstFixer.cpp
143

This function is super large - can we split it up?

clang/unittests/Format/FormatTest.cpp
30–32

I'm somewhat opposed to introducing these macros in this patch. If we want them, we should create an extra patch, and figure out ho to use them in all format tests.

MyDeveloperDay marked an inline comment as done.Dec 3 2019, 12:23 AM
MyDeveloperDay added inline comments.
clang/docs/ClangFormatStyleOptions.rst
1378

If I have to drop the other options, I think I'd want to go with East/West const as I feel it has more momentum, just letting people know before I change the code back (to my original patch ;-) )

https://www.youtube.com/watch?v=gRmI_gsNqcI

aaron.ballman added inline comments.Dec 3 2019, 5:13 AM
clang/docs/ClangFormatStyleOptions.rst
1378

@klimek I requested that we do not go with East/West the options and I'm still pretty insistent on it. East/West is a kitschy way to phrase it that I think is somewhat US-centric (where we make a pretty big distinction between the east and west coasts). I do not want to have to mentally map left/right to the less-clear east/west in the config file. Would you be fine if we only had Left/Right instead of East/West? I would be fine with that option, but figured enough people like the cute East/West designation that we might as well support it.

MyDeveloperDay marked an inline comment as done.Dec 3 2019, 5:27 AM
MyDeveloperDay added inline comments.
clang/docs/ClangFormatStyleOptions.rst
1378

Just for a reference, I'm not from the US and I think east/west still translates pretty well. I was happy to support the others.

As a consensus how about I drop Before/After and Keep Left/Right and West/East const support? (as no one is asking for Before/After?) can we handle having 2 options to aid language issues?

As a consensus how about I drop Before/After and Keep Left/Right and West/East const support? (as no one is asking for Before/After?) can we handle having 2 options to aid language issues?

FWIW as a non-US person (but english speaker) my preferences would be:

  1. east/west (this is what the community uses AFAICT)
  2. left/right
  3. both as synonyms

I don't think synonyms buy much, people who don't understand "west" will still have to work on projects where the config says "west".
And printing the config will use a different label, which is mildly confusing.

If there's strong sentiment to only go with East/West, I'll not block the feature to argue about the names further. That said, I still think East/West is a kitschy phrasing and I've seen multiple people get temporarily confused by it (while wearing the bracelets, no less) during in-person discussions. I think that either left/right or before/after are immediately clear to everyone -- I've never experienced someone getting confused by it, unlike east/west.

ilya added a subscriber: ilya.Dec 5 2019, 9:39 AM
ilya added a comment.Dec 9 2019, 12:33 PM

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

+1.
In addition to volatile and restrict, in my organization we'd also be interested in applying this to address space qualifiers (custom keywords added in our downstream fork). Can this be generalized to accept a map of qualifier keywords and its desired orientation?

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

+1.
In addition to volatile and restrict, in my organization we'd also be interested in applying this to address space qualifiers (custom keywords added in our downstream fork). Can this be generalized to accept a map of qualifier keywords and its desired orientation?

Could you give me an example of what you mean?

ilya added a comment.Dec 9 2019, 3:25 PM

I like the functionality, but am slightly opposed to using "east/west" terminology -- that's not a ubiquitous phrase and it takes a bit of thinking before it makes sense. I think "left/right" is likely to be more universally understood.

Also, should this apply to other qualifiers like volatile or restrict? If so, the name ConstStyle should probably be CVQualifierStyle or something else.

+1.
In addition to volatile and restrict, in my organization we'd also be interested in applying this to address space qualifiers (custom keywords added in our downstream fork). Can this be generalized to accept a map of qualifier keywords and its desired orientation?

Could you give me an example of what you mean?

As described in section 5 of the embedded C standard ISO/IEC J TC1 SC22 WG14 N1169, our downstream llvm fork supports several address spaces, let's denote them __as1 and __as2. These address space qualifiers behave similarly to const, so the following are semantically equal:

__as1 int foo;
int __as1 bar;

So for us it would be nice to be able to specify alignment options for a dynamic list of custom qualifiers. Something like the following:

QualifierStyle:
  const:     Right
  volatile:  Right
  __as1:     Right
  __as2:     Right

But I don't know if this can be (easily) supported in a .clang-format file, since the style options are defined as (static) enums. I realize my proposal might be out of the scope of this patch, but I wanted to get some opinions from the community.

But I don't know if this can be (easily) supported in a .clang-format file, since the style options are defined as (static) enums. I realize my proposal might be out of the scope of this patch, but I wanted to get some opinions from the community.

just thinking out loud, I think this could be achieved in the config by a list of dictionaries

QualifierStyle:
   -   const:     Right
   -   volatile:  Right
   -   __as1:     Right
   -   __as2:     Right

I think we'd need some ordering information (which is why I was thinking maybe having this in a list (so the order of the list defines the left -> right order)

const __as1 int foo; -> int const __as1 foo;

if the order is:

QualifierStyle:
   -   {  "const":   "Right" }
   -   {  "__as1":     "Right" }

but

const __as1 int foo; -> int __as1 const foo;

if the order is

QualifierStyle:
   -   {  "__as1":     "Right" }
   -   {  "const":   "Right" }

I think this would likely make everything much more complicated, but perhaps we should think about this for the configuration at least now so we future proof ourselves.

QualifierStyle:
   -   {  "__as1":     "Left" }
   -   {  "const":   "Right" }
   -   {  "__as2":     "Left" }
   -   {  "volatile":  "Left" }

would give

const volatile __as1 int foo; -> __as1 volatile int const foo;

ilya added a comment.Dec 10 2019, 10:00 AM

I think this would likely make everything much more complicated, but perhaps we should think about this for the configuration at least now so we future proof ourselves.

What you've outlined looks good to me.
Of course we will be happy to contribute, given there will be an interest from the community, but as you said, for now at least let's keep this in mind for the configuration and naming.

Rondom added a subscriber: Rondom.Wed, Jan 29, 2:11 AM