This is an archive of the discontinued LLVM Phabricator instance.

Add new IdentifierNaming check
ClosedPublic

Authored by berenm on Jul 3 2015, 8:39 AM.

Details

Reviewers
alexfh
Summary

This check will try to enforce coding guidelines on the identifiers naming.
It supports lower_case, UPPER_CASE, camelBack and CamelCase casing and
tries to convert from one to another if a mismatch is detected.

It also supports a fixed prefix and suffix that will be prepended or appended
to the identifiers, regardless of the casing.

Many configuration options are available, in order to be able to create
different rules for different kind of identifier. In general, the
rules are falling back to a more generic rule if the specific case is not
configured.

Diff Detail

Event Timeline

berenm updated this revision to Diff 29025.Jul 3 2015, 8:39 AM
berenm retitled this revision from to Add new IdentifierNaming check.
berenm updated this object.
berenm added a reviewer: alexfh.
berenm added a subscriber: cfe-commits.
alexfh edited edge metadata.Aug 5 2015, 6:50 AM

Some initial comments, mostly style-related. It takes some time to get used to the coding style used in LLVM/Clang. One notable thing: though we use auto sometimes, we don't use the almost-always-auto style. Please see the inline comments for specific cases.

clang-tidy/readability/IdentifierNamingCheck.cpp
64

I'd name this StyleKind as most other enums in the LLVM code.

70

The only use of this array seems to be a simplification of a single loop (the other one can iterate over StyleNames as per the comment below). I'd remove it and instead added a "StyleKindCount" element in the enum and use a normal for loop.

88

This could be written nicer using llvm::StringSwitch.

88

Not sure why we would need alternative spellings of these options. That seems to just add complexity with no benefit.

106

Why not iterate over StyleNames here?

152

No need to cast ValueDecl returned by DeclRefExpr::getDecl() to its base class.

174

Please move the detection of the KindName and Style to a separate function. This method is too large.

177

Please resolve the node only once (as it's done on line 168) and then just check it using isa<...>(...).

179

Would it be better to have these in a single array next to StyleNames? Also, this code could return a StyleKind (now StyleConst), and you would need to get the corresponding NamingStyle and KindName once.

This code would suddenly become much leaner.

191

Please remove empty lines before }.

476

Please make this a function. This method is too large to define every utility function inside it.

478

No need to explicitly convert to StringRef here as well.

506

Please make this a function. This method is too large to define every utility function inside it.

506

Please make Name a StringRef.

507

This doesn't look like a valid use case for auto. static llvm::Regex Splitter(...); is shorter and more readable.

Same applies to Words, Substrs and Groups below.

Also, you don't need to explicitly create a StringRef, just leave the string literal.

510

Why not a SmallVector<StringRef,...>?

513

I think, this may be a StringRef.

520

The use of std::move here doesn't make sense to me: Groups[...] is a StringRef, which isn't moved faster than it is copied.

582

We use auto when the type is spelled in the initializer and in a few other cases, e.g. when iterating over a map, and the element type is somewhat tricky to spell correctly. In other cases please use concrete types.

608

clang::DeclarationNameInfo DeclRange(...) is better, IMO.

611

nit: Diagn is not a common abbreviation in clang-tidy code. Diag would be better.

clang-tidy/readability/IdentifierNamingCheck.h
43

I don't see what you gain by using std::move here. How about changing the type of Prefix and Suffix to const std::string & and getting rid of std::move? That's a more common pattern.

test/clang-tidy/readability-identifier-naming.cpp
2

I recently learned that RUN: lines can be split:

// RUN: some-long-command ... \
// RUN:   -command-continues ... \
// RUN:   -command-continues ... \
// RUN:   -command-continues ... \
// RUN:   -command-continues ... \
// RUN:   finally-done
9

No need to repeat the check name in each message ( [readability-identifier-naming]). It's enough to have it in the first CHECK-MESSAGES line. All other CHECK-MESSAGES lines can omit it.

berenm added a comment.Aug 5 2015, 7:34 AM

Thanks for the review, I will fix the various style issues and come back with an updated version.

clang-tidy/readability/IdentifierNamingCheck.cpp
88

The idea was to provide a way easier to type than the full names. This may not be very useful, and maybe a single simple name for each case scheme would be enough.

179

The problem is that sometimes, the current code falls back to a more generic naming style, but the "kind name" is still trying to describe the original declaration.

For example, if no style is defined for methods, then it will try to use a more generic "function" style, but the warning will still be "invalid case style for method xxx".

Maybe this is superfluous and I can drop it. It don't see many cases anyway (my original code was covering more cases - too many - and it seemed sensible at that time).

513

I think I had some issues when trying to use StringRef around here, I will investigate a bit more.

alexfh added inline comments.Aug 5 2015, 8:42 AM
clang-tidy/readability/IdentifierNamingCheck.cpp
88

Clang-tidy options are usually read from a file, so minimizing typing is not really important.

179

The problem is that sometimes, the current code falls back to a more
generic naming style, but the "kind name" is still trying to describe the
original declaration.

I see. It might be possible to split the mapping of types to style kinds and handling missing styles. E.g. a function can return "Method" (which should be SK_Method according to [1], btw) and then a caller would check whether the corresponding NamingStyle is configured, and if needed fall back to a more generic category. WDYT?

[1] http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

berenm added inline comments.Aug 5 2015, 9:46 AM
clang-tidy/readability/IdentifierNamingCheck.cpp
179

I think it would be difficult to map the declaration types to style kinds without checking the styles that have been provided and falling back to another style at the same time.

For example, a constexpr method is currently selecting the styles in this preferred order :
constexpr method > [public/protected/private] method > method > function.

The order is debatable, but we cannot have the constexpr method to public/protected/private method fallback, if there is no style for constexpr methods, without reading again the method declaration.

It might be OK that the warning message do not use the exact identifier kind name, and one can even think it is better to have a warning message that tells the user which style it used instead of which kind of identifier was checked.

alexfh added inline comments.Aug 6 2015, 6:47 AM
clang-tidy/readability/IdentifierNamingCheck.cpp
179

we cannot have the constexpr method to public/protected/private method fallback, if there is no style for
constexpr methods, without reading again the method declaration.

So the issue is caused by the fact that the categories are partially overlapping and don't form a hierarchy. Not sure whether this is convenient for the end-user and if it's a good model, but this can fit into the approach: the function can return an ordered list of StyleKinds that could then be looked up in the configured NamingStyles. I don't insist on this specific implementation, but it might end up being more elegant solution.

berenm updated this revision to Diff 31447.EditedAug 6 2015, 7:38 AM
berenm edited edge metadata.

Here is an updated version with some style fixes, and function splits.

The styles are still selected the same way as before (no split between finding the best type style and falling back to available style), but there is only the StyleKind returned and the warning message is deduced from the StyleNames (lower case conversion and underscores replaced with spaces).

I can still implement the mechanism suggested in comment http://reviews.llvm.org/D10933#218645 if you find it is cleaner.

berenm marked 24 inline comments as done.Aug 6 2015, 7:45 AM
berenm added inline comments.
clang-tidy/readability/IdentifierNamingCheck.cpp
514

I replaced with StringRef everywhere it looked sensible and the functions return a std::string, as they are constructing a new string when computing the fixup.

576

DeclRange is actually the SourceRange and not the DeclarationNameInfo

Sorry for the delay. A few more comments, otherwise looks really nice.

clang-tidy/readability/IdentifierNamingCheck.cpp
167

Why do you need the outermost parentheses?

182

nit: Please remove the empty lines before closing braces.

534

That doesn't seem like a useful warning message. If the tool didn't manage to understand the naming style, it may be interesting to the tool developer, but not to a user. Please make this a DEBUG(llvm::dbgs() << ...).

540

clang:: is superfluous here, the code is already inside the clang namespace.

555

What does P stand for? I'd prefer a more meaningful name for the loop variable.

berenm marked 2 inline comments as done.Aug 17 2015, 2:29 AM
berenm added inline comments.
clang-tidy/readability/IdentifierNamingCheck.cpp
167

For some unknown reason, I always thought parentheses were required around regex alternation... Well, I now know they are not.

berenm updated this revision to Diff 32277.Aug 17 2015, 2:30 AM

Here is an updated version with the latest comments fixed.

berenm marked 6 inline comments as done.Aug 17 2015, 2:32 AM
alexfh accepted this revision.Aug 17 2015, 4:56 AM
alexfh edited edge metadata.

Looks good with a couple of comments. Tell me if you need me to submit the patch for you, once you address the comments.

Thank you for the awesome new check!

clang-tidy/readability/IdentifierNamingCheck.cpp
154

nit: Maybe remove the bool variable and just return false; instead of Matches = false;?

if (Name.startswith(Style.Prefix))
  Name = Name.drop_front(Style.Prefix.size());
else
  return false;

if (Name.endswith(Style.Suffix))
  Name = Name.drop_back(Style.Suffix.size());
else
  return false;

return Matchers[static_cast<size_t>(Style.Case)].match(Name);
test/clang-tidy/readability-identifier-naming.cpp
71

Looking at the tests once again, I find that they should be more strict. E.g. the presence of foo_ns doesn't say anything about whether the replacement was done in the right place (e.g. if the code replaces a wrong token and the result is foo_ns FOO_NS {, the test would pass). Please add more context to the CHECK-FIXES lines, maybe even whole lines including start- and end-of-line anchors ({{^}} and {{$}} respectively).

Thanks!

This revision is now accepted and ready to land.Aug 17 2015, 4:56 AM

Looks good with a couple of comments. Tell me if you need me to submit the patch for you, once you address the comments.

If you tell me how I should proceed, I can maybe do it myself (do I just have to send the patch to the cfe-commits list ?).

Looks good with a couple of comments. Tell me if you need me to submit the patch for you, once you address the comments.

If you tell me how I should proceed, I can maybe do it myself (do I just have to send the patch to the cfe-commits list ?).

The question was whether you have commit rights to the LLVM Subversion repository. Apparently, no, so someone needs to commit the patch for you. I'll happily do so once you update it.

berenm updated this revision to Diff 32294.Aug 17 2015, 6:41 AM
berenm edited edge metadata.

Indeed, I probably don't have rights on Clang/LLVM repositories.

I have updated the patch with stricter tests - and found a missing break statement (clang-tidy/readability/IdentifierNamingCheck.cpp:201).

I also realized that the DeclRefExpr matcher used to find references to declared identifiers is only matching references to values and not to classes.

For example, when referencing the class in the out-of-line class static member declaration (test/clang-tidy/readability-identifier-naming.cpp:133 and test/clang-tidy/readability-identifier-naming.cpp:136), or when using a class identifier as the type of a typedef, (test/clang-tidy/readability-identifier-naming.cpp:243). Or even when declaring a variable of a type whose name must be fixed.

Is there any matcher that could be used for finding references to a TypeDecl? Should I go through calls to getDeclContext, getType, or getUnderlyingType depending on the context? Any easier way I don't know about?

Indeed, I probably don't have rights on Clang/LLVM repositories.

I have updated the patch with stricter tests - and found a missing break statement (clang-tidy/readability/IdentifierNamingCheck.cpp:201).

Behold the magic power of testing ;)

I also realized that the DeclRefExpr matcher used to find references to declared identifiers is only matching references to values and not to classes.

For example, when referencing the class in the out-of-line class static member declaration (test/clang-tidy/readability-identifier-naming.cpp:133 and test/clang-tidy/readability-identifier-naming.cpp:136), or when using a class identifier as the type of a typedef, (test/clang-tidy/readability-identifier-naming.cpp:243). Or even when declaring a variable of a type whose name must be fixed.

Is there any matcher that could be used for finding references to a TypeDecl? Should I go through calls to getDeclContext, getType, or getUnderlyingType depending on the context? Any easier way I don't know about?

Yep, renaming classes is not as trivial as it might seem. You'll need to look for TypeLocs (the loc() matcher) of QualTypes and NestedNameSpecifiers referring to the old class name. You'd also need to match constructor declarations and UsingDecls separately.

Here's a rough list of matchers that search for the name of a class:

loc(qualType(hasDeclaration(namedDecl(hasName(old_class)))))
loc(nestedNameSpecifier(specifiesType(
              hasDeclaration(recordDecl(hasName(old_class))))))
usingDecl(hasAnyUsingShadowDecl(hasTargetDecl(hasName(old_class))))
recordDecl(hasName(old_class))
constructorDecl(ofClass(hasName(old_class)))

I suggest that you add a bunch of FIXMEs in the test and we can iterate on the check in follow-up patches. Similar issues apply to namespace and function renaming.

What needs to be tested:

  • for classes: forward declarations, definitions, constructors, destructors, using declarations, type aliases, use of the class name in new, casting, qualified names (Class::member) (also resulting from injected class name usage - Class::Class::Class), use of the class name in templates and macros (to ensure there are no duplicate or incorrect fixes), and probably something else that I forgot.
  • for functions/methods: declarations, calls, using declarations, taking address of, with qualified names and without, probably something else.
  • for namespaces: use in qualified names and using declarations and directives.
berenm updated this revision to Diff 32304.Aug 17 2015, 8:18 AM

Alright!

Here is the latest revision then, with FIXMEs comments mentioning this missing feature. I will work on implementing it and come back to you when i've got something working.

Thanks for your time and your advices!

Another couple of comments (fine for a follow-up).

test/clang-tidy/readability-identifier-naming.cpp
72

Please add a test (with a CHECK-FIXES or a FIXME) for the usages of a namespace in qualified names (e.g. ::foo_ns::inline_namespace::my_enumeration).

209

Add a FIXME or a CHECK-FIXES for the class name in destructor.

alexfh closed this revision.Aug 19 2015, 4:17 AM

Committed revision 245429.