Page MenuHomePhabricator

[clang-tidy] check to find declarations declaring more than one name
Needs RevisionPublic

Authored by firolino on Dec 9 2016, 9:49 AM.

Details

Summary

This check can be used to find declarations, which declare more than one name.
It helps improving readability and prevents potential bugs caused by inattention
and C/C++ syntax specifics.

In addition, appropriate fix-it hints are provided and all user-intended
indentation will be preserved. For example:

{
  long ** lint1, lint2 = 0, * lint3, **linn;

  const int cx = 1, cy = 2;

  int const CS :: * pp = &CS::a, CS::* const qq = &CS::a;

  decltype(int()) declint1 = 5, declint2 = 3;
  
  typedef int ta, tb;
}

will be transformed to:

{
  long ** lint1;
  long lint2 = 0;
  long * lint3;
  long **linn;
  
  const int cx = 1;
  const int cy = 2;
  
  int const CS :: * pp = &CS::a;
  int const CS::* const qq = &CS::a;
  
  decltype(int()) declint1 = 5;
  decltype(int()) declint2 = 3;
  
  typedef int ta;
  typedef int tb;
}

Only declarations within a compound statement are matched. Meaning, global declarations
and function parameters are not matched. Moreover, it does not match on the following:

{
  for(int i = 0, j = 0;;);
}

This check will be used for:

in the future and provide appropriate options for it.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
firolino updated this revision to Diff 81964.Dec 19 2016, 9:07 AM

Added support for new test cases:

int S::*mdpa1[2] = {&S::a, &S::a}, var1 = 0;
// CHECK-MESSAGES: :[[@LINE-1]]:9: warning: declaration statement can be split
// CHECK-FIXES: {{^        }}int S::*mdpa1[2] = {&S::a, &S::a};
// CHECK-FIXES: {{^        }}int var1 = 0;

int S :: * * mdpa2[2] = {&p, &pp2}, var2 = 0;
// CHECK-MESSAGES: :[[@LINE-1]]:9: warning: declaration statement can be split
// CHECK-FIXES: {{^        }}int S :: * * mdpa2[2] = {&p, &pp2};
// CHECK-FIXES: {{^        }}int var2 = 0;

void (S::*mdfp1)() = &S::f, (S::*mdfp2)() = &S::f;
// CHECK-MESSAGES: :[[@LINE-1]]:9: warning: declaration statement can be split
// CHECK-FIXES: {{^        }}void (S::*mdfp1)() = &S::f;
// CHECK-FIXES: {{^        }}void (S::*mdfp2)() = &S::f;

void (S::*mdfpa1[2])() = {&S::f, &S::f}, (S::*mdfpa2)() = &S::f;
// CHECK-MESSAGES: :[[@LINE-1]]:9: warning: declaration statement can be split
// CHECK-FIXES: {{^        }}void (S::*mdfpa1[2])() = {&S::f, &S::f};
// CHECK-FIXES: {{^        }}void (S::*mdfpa2)() = &S::f;

void (S::**mdfpa3[2])() = {&mdfpa1[0], &mdfpa1[1]}, (S::*mdfpa4)() = &S::f;
// CHECK-MESSAGES: :[[@LINE-1]]:9: warning: declaration statement can be split
// CHECK-FIXES: {{^        }}void (S::**mdfpa3[2])() = {&mdfpa1[0], &mdfpa1[1]};
// CHECK-FIXES: {{^        }}void (S::*mdfpa4)() = &S::f;
aaron.ballman added inline comments.Dec 19 2016, 12:36 PM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
23

We may want to consider adding this to ASTMatchers.h at some point (can be done in a future patch).

26

This should accept by const std::string &.

33

is a object -> is an object

35

Why do we not want to match this?

45

No need to compare against nullptr explicitly (I think we have a clang-tidy check that will warn on this, in fact).

54

Please don't use auto as the type is not spelled out explicitly in the initialization. Same elsewhere, though it is fine with diag() and in for loop initializers.

62

This reads a bit awkwardly since it doesn't explain why the code is problematic. Perhaps: "multiple declarations should be split into individual declarations to improve readability" or something like that?

68

No need for the newline.

82

Missing a full stop at the end of the sentence. Also, no hyphen in preprocessor, and it should be "add an appropriate".

84

Will this (and the below \n) be converted into a CRLF automatically on platforms where that is the newline character sequence in the source?

88

This should probably use a Twine to avoid a bunch of copies.

102

You can drop the const from the dyn_cast, here and elsewhere.

108

Why references and not pointers? Perhaps this needs a comment.

132

You may want to include tests for pathological code, like:

int *(i), (*j), (((k)));
133

m should be capitalized, and you can elide the braces.

138

Why are you calling the Internal version of this function?

149

Capitalize "u".

161

Instead of adding yet another custom check for whether something is whitespace, can you use isWhitespace() from CharInfo.h, and special case \n? Also, why unsigned char instead of char?

201

Why the extra compound statement?

202

Should be I instead of i.

247

I think what @malcolm.parsons was getting at is that we have talked about automatically running clang-format over changes made from the FixIt engine instead of trying to add custom logic to all of the places we use FixIts.

clang-tidy/readability/ReadabilityTidyModule.cpp
64

Should this also be registered as an alias for cert-dcl04-c and CppCoreGuideline ES.10, or is there work left to be done for those checks?

clang-tidy/utils/LexerUtils.h
15

Please remove this include.

docs/ReleaseNotes.rst
142

that -> than

docs/clang-tidy/checks/readability-one-name-per-declaration.rst
7

Remove the comma. which -> that.

8

improving -> improve

test/clang-tidy/readability-one-name-per-declaration-complex.cpp
245

Can you add a test with:

template <typename A, typename B> // Should not be touched
void f(A, B);
firolino marked 18 inline comments as done.Dec 23 2016, 5:19 AM

Addressed suggestions from @aaron.ballman Thanks!

clang-tidy/readability/OneNamePerDeclarationCheck.cpp
23

Is there a // FIXME: Once I am in AstMatcher.h, remove me. OK?

26

Since I need a copy, it is more efficient to pass-by-value instead pass-by-reference + copy. So, I get a benefit for rvalues and move will be applied. Search for "want speed? pass by value!" articles.

35

If we decide, whether we transform

class A { 
} Object1, Object2;

to

class A { 
} Object1, 
Object2;

or

class A { 
} 
Object1, 
Object2;

I might consider adding support for it. Moreover, this kind of definition is usually seen globally and I don't know how to handle globals yet. See http://lists.llvm.org/pipermail/cfe-dev/2015-November/046262.html

54

Alright. What about:

const auto FirstVarIt = DeclStmt->getDeclGroup().begin();
const auto DL = SM.getDecomposedLoc(Loc);
62

Yes. Sounds better.

84

Actually, I was thinking about this but I don't have a Windows etc. with a running llvm environment. Is it OK, if I leave a FIXME for now?

102

Alright, but can you explain why?

108

// Typedefs on function pointers are covered into LValueReferenceType's ?

138

Because I need its QualType.

161

CharInfo, nice! unsigned char because there is no negative ASCII. Even isWhitespace has unsigned char.

clang-tidy/readability/ReadabilityTidyModule.cpp
64

I will include Options for these checks in the future, since they have special rules when to split a declaration. See https://reviews.llvm.org/D25024#inline-221483

aaron.ballman added inline comments.Dec 24 2016, 7:56 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
23

You could put in a FIXME, or just create the ASTMatchers patch and mark this patch as dependent on that one.

35

I think this should be handled. It can be handled in either of the forms you show, or by saying:

A Object1;
A Object2;

If all of these turn out to be a problem, we can still diagnose without providing a fixit.

As for globals in general, they can be handled in a separate patch once we figure out the declaration grouping.

54

Iterators are fine (because they usually involve a lot of template boilerplate for containers and their interfaces are regular and common enough), but the call to getDecomposedLoc() should not use auto -- the reader requires too much internal knowledge to understand what the return type is.

84

I think it's fine with or without the fixme (it's certainly not the first instance of \n in the compiler).

102

The const on the declaration is sufficient to ensure const correctness. It's morally equivalent to:

void f(int i) {
  const int ci = i; // No need to const_cast this either.
}
108

So should this be specific to function pointer types? If so, I don't see that predicated anywhere.

138

You should not call functions that have Internal in the name (generally speaking). Those are generally considered to be warts we want to remove once we get rid of the last external caller.

All of your uses of Type are as a clang::Type *, so are you sure you need the QualType at all? (Btw, you should consider a better name than Type.)

161

Hmm, I'm used to seeing characters using char and arithmetic types using unsigned char, but I suppose I've also seen UTF-8 code points represented as unsigned char, so good enough for me. :-)

firolino marked 27 inline comments as done.Dec 28 2016, 7:09 AM
firolino added inline comments.
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

OK. I will try to split the object definition from the class definition, as you have suggested. Thus, I can kick out the tagDecl-matcher as well. If there is no easy way to do this, it will be reported anyway but without a fixit.

Note for me: Update documentation!

102

Ah, got it. Thought first, you meant to remove any const in that dyn_cast line.

138

operator-> is overloaded in QualType... I was so confused all the time. Changed completely to Type.

firolino marked 3 inline comments as done.Dec 28 2016, 7:34 AM
firolino added inline comments.
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

What about

struct S {
} S1;

I would like to report this too, since two names are being declared here. S and S1. What do you think?

firolino added inline comments.Dec 29 2016, 6:58 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35
struct {
} nn1, nn2;

Shall we ignore anonymous definitions?

firolino added inline comments.Dec 29 2016, 7:16 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

To be more precise: Warn and provide a fixit for struct S {} S1. Only warn for struct {} nn1, nn2.

aaron.ballman added inline comments.Dec 29 2016, 7:33 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

What about

struct S {
} S1;

I would like to report this too, since two names are being declared here. S and S1. What do you think?

I don't think that this should be diagnosed. For one, this is a relatively common pattern (arguably more common than struct S { } S1, S2;), but also, it names two very distinct entities (a type and a variable).

struct {
} nn1, nn2;

Shall we ignore anonymous definitions?

Yes, we basically have to.

firolino marked 5 inline comments as done.Jan 3 2017, 2:15 AM
firolino added inline comments.
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

Transforming

typedef struct X { int t; } X, Y;

to

typedef struct X { int t; };
typedef X X;
typedef X Y;

will be valid, but looks odd.

firolino updated this revision to Diff 82855.Jan 3 2017, 2:48 AM
firolino updated this object.
firolino marked an inline comment as done.
  • applied suggestions from Aaron
  • added support and test cases for TagDecl e.g.
struct S {} S1, S2;
aaron.ballman added inline comments.Jan 3 2017, 6:41 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

I am on the fence about this transformation -- it does not declare new objects, but instead declares new types. A very common pattern in some libraries (such as the Win32 SDK) is to declare:

typedef struct Blah {

} Blah, *PBlah, **PPBlah;

Because of that pattern, diagnosing the above typedef is likely to be very chatty in some projects, and I don't know that the fixit provides much real benefit for types.

At least for the CERT DCL04-C version of this rule, that code should not be flagged. Do you think any of the variations of this check should flag it?

firolino added inline comments.Jan 3 2017, 10:54 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

Maybe we should reconsider where to split and where not. The original idea was to minimize confusion on code like

int* a,b;

whereas

struct S {} *s1, s2;

seems not so confusing as above. However,

struct S {}* s1, s2;

looks dangerous again.

Even following code looks unfamiliar to me.

typedef struct Blah {} Blah, *PBlah;

It really depends on the beholder. So, how about letting this check split everything and provide options for maximum flexibility? So specific rulesets or the user may configure it as they wish. We could add for example split-after-tag-def, split-tag-typedef in addition to cppcore and cert and use a default setting (split-after-tag-def=true, split-tag-typedef=false).

aaron.ballman added inline comments.Jan 4 2017, 7:58 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

I've typically found that users go with whatever the default options are, rather than specify a slew of options, and if the default options produce an onerous check, they simply disable the entire check. I'd rather not push the design off to the users unless there's really no better alternative.

I think that the only "dangerous" operations this sort of check is meant to catch is when the declaration group uses different specifiers for the declarator or when there's an initializer on one declarator but none of the rest. I think the other operations are more likely to be a consistency thing than a bug-prone thing. e.g.,

// Differing specifiers
int i, j, k; // Not at all likely to be mistaken code.
int i, *j, &k; // Not really likely to be mistaken code.
int *i, j; // More likely to be mistaken code.
int i, *j; // Not really likely to be mistaken code.
typedef int *one, two, three; // More likely to be mistaken code.
typedef int one, *two, **three; // Not really likely to be mistaken code.

// Initialization
int i = 10, j = 10; // Not at all likely to be mistaken code.
int i = 10, j; // Not really likely to be mistaken code.
int i, j = 10; // More likely to be mistaken code.

I think declarations that declare both types and variables should ignore the type name and focus only on the variable name following the same heuristics. e.g.,

struct S {} s; // Not at all likely to be mistaken code.
struct S {} s, *t; // Not really likely to be mistaken code.
struct S {} *s, t; // More likely to be mistaken code.

I could imagine an option like FlagPossiblyDangerous or something else; if the flag is set, the check diagnoses based on some heuristic of likely dangerous operations and if the flag is not set, the check flags all declarators in the same declaration.

What do you think?

firolino added inline comments.Jan 5 2017, 8:24 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

We are judging from our perspective what is "likely dangerous" and this might not agree with the users view. This is why I don't wanna force any limitations based on our decisions. This check might not be used only to reduce "readability error" but even for a simple style guide enforcement that wants one variable declaration per line.

I know that int i, j, k; is not bad at all or likely to be mistaken, but style guides from openssl or the linux kernel are enforcing one declaration per line.

I've typically found that users go with whatever the default options are

So, why not simple use options and set the default whatever we think is right. We are not loosing anything with this way, just getting more flexible. Anyone else can reset the options to their needs.

I would think of following options:
Style = Default, CppCore, Cert
SplitTagDef = True, False
SplitTagTypeDef = True, False

and with default:

Style = Default     # splits anything, regardless of tokens and initializers
SplitTagDef = True
SplitTagTypeDef = False

The check will be registered under cert with Style = Cert and CppCore for cppcore respectively.

And since it is topic related, there are still two things on my todo list:

  • attributes
  • members
struct alignas(8) S {} s1, s2;
alignas(8) char a, b;
[[maybe_unused]] bool x, y;

struct S {
  int a, b;
};

Should we include this as well, or leave it for future work?

aaron.ballman added inline comments.Jan 5 2017, 9:40 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

So, why not simple use options and set the default whatever we think is right. We are not loosing anything with this way, just getting more flexible. Anyone else can reset the options to their needs.

I think we are describing the same solution, but perhaps from different angles. I am proposing that we have this checker flag everything, but that we have a sensible set of default options that reduce the number of diagnostics issued for existing code bases. I fear that if the default behavior of this check is to point out *every* declaration group with more than one name being declared, the output will be way too chatty and so the entire check is disabled. I think that if we had a default behavior that only pointed out the suspicious declarations, that would reduce the "noise" by default. One way to test this would be to run the check over some large code bases of various style guides to see how many results there are. I would recommend running it over the LLVM and Clang code bases (which do not have a restriction on declarations) and OpenSSL (which does have this restriction) and report back the diagnostics you get from each (as well as totals). Feel free to pick other/additional code bases to test again.

Should we include this as well, or leave it for future work?

It's up to you, but I think it's reasonable to leave for future work. Attributes are likely to be somewhat harder than members in many regards due to where attributes can be written. I suspect that members can be handled like variable declarations in general.

firolino added inline comments.Jan 5 2017, 2:57 PM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
35

OK, I get what you mean. Btw, thanks for taking your time! This review gets longer and longer :)

alexfh requested changes to this revision.Jan 17 2017, 5:39 AM

A few more comments.

clang-tidy/readability/OneNamePerDeclarationCheck.cpp
153

I wonder whether re-lexing the whole range and working on tokens instead would be a better and more robust approach? You wouldn't have to deal with whitespace and comments then. What do you think?

163

Why do we want long int and not long?

clang-tidy/readability/OneNamePerDeclarationCheck.h
25

nit: move the private section to the end.

clang-tidy/utils/LexerUtils.cpp
41–56

Is it significantly more efficient than

Token Tok;
do {
  Tok = getPreviousNonCommentToken(Context, Location);
} while (!Tok.isOneOf(tok::unknown, TokenToFind));
return Tok.getLocation();

?

If it's not, then this function might be not so much useful to be here. At least, its interface is rather specific to the use case you have. Why, for example, it returns a location instead of the token itself?

This revision now requires changes to proceed.Jan 17 2017, 5:39 AM
firolino added inline comments.Jan 21 2017, 6:15 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
153

Sounds interesting! I am gonna give this a try.

163

Sorry, misleading comment.

clang-tidy/readability/OneNamePerDeclarationCheck.h
25

LLVM Coding Standards is putting the private section at the beginning as well. Even in the code base, it seems to be the general rule. While you are reading code from top to bottom, you need to get the necessary context (variables and their types) first, before proceeding to the functional code. Otherwise, you would have to scroll all the time down to the private variables - when passing its usage in a function - to get its type. This would impair the reading fluency and thus the straight-forwarding understanding of the code.

clang-tidy/utils/LexerUtils.cpp
41–56

More efficient? In worst-case (token not found) yes, in my use-case probably not that significant. You are right, changing the interface to Token findTokenBackward(...) seems to be more generic than returning the location directly.

firolino added inline comments.Jan 21 2017, 6:17 AM
clang-tidy/readability/OneNamePerDeclarationCheck.h
25

Or is it, because there is just a function and not a private variable?

@aaron.ballman I am going to implement CppCore, Cert and Everything and test it on some projects and provide the results next week, to find out which one to set Default.

firolino marked 7 inline comments as done.Jan 21 2017, 6:35 AM
  • nothing special, just went through some open comments and marked them as Done.
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
247

Sounds good! Will clang-format be able to detect the code format in the future?

aaron.ballman added inline comments.Jan 23 2017, 7:38 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
37–38

Is there a case where this could happen? I would have imagined this as an assert(), if anything.

alexfh added inline comments.Jan 24 2017, 10:39 AM
clang-tidy/readability/OneNamePerDeclarationCheck.cpp
130

Should we suggest using x = ...; in C++11 code?

clang-tidy/readability/OneNamePerDeclarationCheck.h
25

I don't think there's a rule to this effect in LLVM Coding Standards or a widely accepted convention, it's rather my personal preference (maybe because it's consistent with the Google C++ Style Guide, https://google.github.io/styleguide/cppguide.html#Declaration_Order). Also, the argument you're using doesn't seem to apply here, since the private implementation details aren't used in the public part below it (because, there are no non-empty definitions in it).

This comment was removed by JonasToth.

@firolino do you plan on finishing this?

firolino marked an inline comment as done.Apr 9 2018, 6:33 AM

@firolino do you plan on finishing this?

Sometimes the universe is really funny! I have just re-started working on it today.

Hi @firolino,
I implemented a less general version of you check https://reviews.llvm.org/D51949 as I wanted to achieve a slightly different goal (and this revision seems to be abonded). My aim is to have general transformation capabilities to separate declarations (variables for now, as these refactorings are most important) that can isolate single variables, or split whole declarations.

I used parts of your test-suite as it is very complete. If you plan to continue working on this check, I'd propose to include it into the other (right now almost) finished version.

My Best, Jonas

Hi @firolino,
I implemented a less general version of you check https://reviews.llvm.org/D51949 as I wanted to achieve a slightly different goal (and this revision seems to be abonded). My aim is to have general transformation capabilities to separate declarations (variables for now, as these refactorings are most important) that can isolate single variables, or split whole declarations.

I used parts of your test-suite as it is very complete. If you plan to continue working on this check, I'd propose to include it into the other (right now almost) finished version.

My Best, Jonas

Thanks for the information. Looking forward to your final patch!