This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add a check for enforcing minimum length for variable names
ClosedPublic

Authored by 0x8000-0000 on Mar 1 2021, 7:50 PM.

Details

Summary

Add a check for enforcing minimum length for variable names. A default minimum length of three characters is applied to regular variables (including function parameters). Loop counters and exception variables have a minimum of two characters. Additionally, the 'i', 'j' and 'k' are accepted as legacy values.

All three sizes, as well as the list of accepted legacy loop counter names are configurable.

Diff Detail

Event Timeline

0x8000-0000 created this revision.Mar 1 2021, 7:50 PM
0x8000-0000 requested review of this revision.Mar 1 2021, 7:50 PM

Rebased on main branch. Added missing documentation and release notes.

Clean-up messy rebase for the checks/list.rst .

Eugene.Zelenko added inline comments.Mar 1 2021, 8:10 PM
clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
34 ↗(On Diff #327344)

Please don't use auto unless type is explicitly spelled in same statement or iterator.

93 ↗(On Diff #327344)

Please don't use auto unless type is explicitly spelled in same statement or iterator.

clang-tools-extra/docs/clang-tidy/checks/readability-variable-length.rst
6 ↗(On Diff #327344)

Please synchronize with statement in Release Notes.

9 ↗(On Diff #327344)

See other checks documentation as example of options descriptions.

15 ↗(On Diff #327344)

Please highlight default value with single back-quote. Same below.

Eugene.Zelenko set the repository for this revision to rG LLVM Github Monorepo.Mar 1 2021, 8:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2021, 8:11 PM
Eugene.Zelenko retitled this revision from Add a check for enforcing minimum length for variable names to [clang-tidy] Add a check for enforcing minimum length for variable names.Mar 1 2021, 8:11 PM

Please can you upload diffs with full context or using arcanist.

It may be worth adding a case where if the Minimum<Type>NameLength is set to 0, it won't check that type, or even bother registering its matchers.

It may be worth adding unless(isImplicit()) to each varDecl() matcher, save any nasties from compiler generated decls.

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
23 ↗(On Diff #327344)

If you make this regex this could be changed to "^[ijk]$".

30 ↗(On Diff #327344)

For better or worse, we typically use parentheses instead of braced initialization for the member init list.

43–50 ↗(On Diff #327344)

You shouldn't be storing the default value, you should be storing the value read from config.

54–55 ↗(On Diff #327344)

This code will match i, j and k in this example

for (int i = 4, j = 5;;)
  int k = 5;

It may be better to use this:

// To match both i and j
forStmt(hasLoopInit(declStmt(forEach(varDecl().bind("loopVar")))))
// To only match i
forStmt(hasLoopInit(declStmt(containsDeclaration(0, varDecl().bind("loopVar")))))
58–62 ↗(On Diff #327344)

This will miss the match for k in the same example:

for (int i = 4, j = 5;;)
  int k = 5;

Gotta scratch my head for this one though. Though like the case before its rather unlikely to show up in code so not the end of the world.

Also varDecl will match parameters, is this intended can you add tests and explain in documentation.
If its unintended putting unless(parmVarDecl()) in the matcher disable that.

72–74 ↗(On Diff #327344)

This message can(should) be reused for all warnings with select.
Obviously you'd put that in a char arrray and the reference it for the next 2 diags.

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.h
20 ↗(On Diff #327344)

Full-stop/period at the end of comments.

36 ↗(On Diff #327344)

I feel like regex is better suited for this purpose and just ignore any loop variables that match the regex.
Doing this will require storing a string as well as the regex for the purpose of storeOptions.

0x8000-0000 marked 10 inline comments as done.Mar 2 2021, 6:43 PM

@Eugene.Zelenko and @njames93

Thank you both for the kind and thoughtful feedback. I am incorporating all recommended changes.

First step are all the smaller scope comments, with regex support in a follow-up commit.

I will also generate the diffs in the future using the -U99999 option as suggested.

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
58–62 ↗(On Diff #327344)

Matching of function parameter names is intentional. I will make sure it is emphasized in the documentation.

Addressed most of the review comments with the exception of switching to regex for ignoring loop counters; this change is in the follow-up review.

Eugene.Zelenko added inline comments.Mar 2 2021, 6:58 PM
clang-tools-extra/docs/clang-tidy/checks/readability-variable-length.rst
9 ↗(On Diff #327344)

You still need to add Options section and mark-up for options. See other checks documentation as example.

0x8000-0000 marked an inline comment as done.Mar 2 2021, 7:13 PM
0x8000-0000 added inline comments.
clang-tools-extra/docs/clang-tidy/checks/readability-variable-length.rst
9 ↗(On Diff #327344)

Oh, sorry, now I see what you mean. I have sampled randomly a handful of documentation snippets, and didn't find anything. But after your comment, I looked at readability-identifier-naming.rst which had "options" in all their glory.

0x8000-0000 marked an inline comment as done.

Use regular expression for filtering out loop counter variables to ignore.

Fully document the options.

Eugene.Zelenko added inline comments.Mar 2 2021, 7:17 PM
clang-tools-extra/docs/clang-tidy/checks/readability-variable-length.rst
21 ↗(On Diff #327653)

It'll be good idea to use 2 symbols indention, like in code. Same in other places.

0x8000-0000 marked an inline comment as done.Mar 2 2021, 7:27 PM
0x8000-0000 marked 3 inline comments as done.

Add an option "IgnoredVariableNames" that allows filtering out acceptable variable names (similar to the loop counters); also implemented via regex.

Added tests for the IgnoredVariableNames option.

aaron.ballman added inline comments.Aug 4 2021, 8:20 AM
clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

Is there a reason this should be restricted to variables? For example, wouldn't the same functionality be useful for type names, or dare I say it, even macro names? I'm wondering if this should be readability-identifier-length to be more generic.

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
20 ↗(On Diff #327655)

Should it be possible to enforce parameters differently than local variables? (It seems a bit odd that we'd allow you to specify exception variable length separate from loops but not function parameters separate from locals.)

24 ↗(On Diff #327655)

Ignored exception names? There's a fair amount of catch (const exception &e) code in the world: https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=catch%28const+exception+%26e%29&search=Search)

26 ↗(On Diff #327655)

It looks like there will be whitespace issues with this. If the variable is a loop or exception, it'll have an extra space (loop variable name) and if it's not, it will start with a leading space ( variable name).

0x8000-0000 marked an inline comment as done.Aug 7 2021, 3:29 PM

Added regex for exception names and rebased onto current 'main' branch.

clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

I consider those to be in separate namespaces. I suppose we could have a single checker with multiple rules, but on the other hand I don't want to combine too many things into one, just because they share the "compare length" dimension.

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
20 ↗(On Diff #327655)

Exceptions and Loops historically are "throw-away" / "don't care" / "one-off" variables. Parameter names are names, and they are the interface between the outside of the routine and the processing inside; other than historical, I don't see good arguments (sic) to allow single-letter parameter names.

Note that this check will be quite noisy on a legacy code base, and people will find little reason to invest the work to remove the warnings. But if somebody starts something new and want to enforce some quality attributes, it is the right tool at the right time. There will be new projects starting in 2021 and 2022 that could benefit from this, I hope.

26 ↗(On Diff #327655)

This was recommended by a previous reviewer. Any alternative suggestion here?

Added separate check for parameter names.

Updated documentation for all options.

aaron.ballman added inline comments.Aug 9 2021, 5:40 AM
clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

I see where you're coming from, but I come down on the other side. Running a check is expensive (especially on Windows where process creation can be really slow), so having multiple checks that traverse the AST gives worse performance than having a single check that only traverses the AST once. I'd rather see related functionality grouped together and use options to control behavior when it's a natural fit to do so.

I should note that I don't mean *you* have to implement this other functionality (as part of this patch or otherwise). Just that I think we should leave the check name ambiguous enough that we could grow it to do that work in the future.

WDYT?

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
98 ↗(On Diff #365016)
114 ↗(On Diff #365016)
128 ↗(On Diff #365016)
143 ↗(On Diff #365016)
20 ↗(On Diff #327655)

Exceptions and Loops historically are "throw-away" / "don't care" / "one-off" variables. Parameter names are names, and they are the interface between the outside of the routine and the processing inside; other than historical, I don't see good arguments (sic) to allow single-letter parameter names.

I largely agree with you, but there are definitely cases where single-letter parameter names are extremely common. For example, coordinates are often x, y, and z, colors are often r, g, and b (among many others), and physics applications often do things like f = m * a with their variables. Also, there are cases where the parameter names are often not super meaningful, like with functions min and max so the parameter names may be quite short.

26 ↗(On Diff #327655)

I *think* the diagnostic will work if you write it like this:
"%select{variable |exception variable |loop variable |parameter }0name %1 is too short, expected at least %2 characters"

clang-tools-extra/test/clang-tidy/checkers/readability-variable-length.cpp
44 ↗(On Diff #365016)

What in the configuration causes n to be ignored?

0x8000-0000 marked an inline comment as done.Aug 9 2021, 5:12 PM
0x8000-0000 added inline comments.
clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

Right - that's a good point. But I'm wondering the other way; maybe the bigger check will subsume this one, and this one will become just an alias for the bigger check?

So I'm -0.1 on using the "bigger name" for the limited functionality, but if one more vote comes in saying to go readability-identifier-length I'll rename this (and add explicitly the scope limits in the documentation.)

clang-tools-extra/clang-tidy/readability/VariableLengthCheck.cpp
20 ↗(On Diff #327655)

Indeed - I have changed the code to meet this capability; now parameters are controlled independently of the regular variables.

clang-tools-extra/test/clang-tidy/checkers/readability-variable-length.cpp
44 ↗(On Diff #365016)

It is ignored by the default configuration. Search for "DefaultIgnoredParameterNames" above.

aaron.ballman added inline comments.Aug 10 2021, 4:04 AM
clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

Right - that's a good point. But I'm wondering the other way; maybe the bigger check will subsume this one, and this one will become just an alias for the bigger check?

The downside to that approach is that the alias is a bit confusing until its deprecation period ends and we remove it. However, that's not a huge downside., so I don't insist on the name change if you're resistant to it.

clang-tools-extra/test/clang-tidy/checkers/readability-variable-length.cpp
44 ↗(On Diff #365016)

Ah, the comment tripped me up -- can you say ignored via default configuration like below to make it more clear?

0x8000-0000 added inline comments.Aug 10 2021, 3:01 PM
clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
136

I can do the rename, but my intuition (informed somewhat by a few years in the industry) is that outside of T as template argument and X for X-Macros, there aren't many single or even two letter class names or structures out there outside of intentionally obfuscated code - thus I don't think there will be a need to have readability-identifier-length produced.

To avoid potential churn and the hassle with deprecation I'll make the change though.

clang-tools-extra/test/clang-tidy/checkers/readability-variable-length.cpp
44 ↗(On Diff #365016)

Sorry about that; fixed in next patch.

Rebased on main and renamed from readability-variable-length to readability-identifier-length.

Clean diff against origin/main

0x8000-0000 marked 3 inline comments as done.Aug 10 2021, 8:32 PM
aaron.ballman accepted this revision.Aug 11 2021, 7:39 AM

LGTM aside from some style nits (removing some top-level const we don't typically use, unnecessary paren expressions, and a typo fix), thank you for the check!

clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
100
102
116–117
130
145
clang-tools-extra/docs/clang-tidy/checks/readability-identifier-length.rst
82
This revision is now accepted and ready to land.Aug 11 2021, 7:39 AM
0x8000-0000 marked 6 inline comments as done.

Removed extra 'const' and parentheses that are not congruent with the accepted code style.

@aaron.ballman - thank you for the review; please submit on my behalf.

Reformatted the test file.

One more format nit.

@aaron.ballman - thank you for the review; please submit on my behalf.

Happy to do so, which name and email address would you like me to use for patch attribution?

The diff was generated with format-patch, so can be applied with "git am".

The envelope:

commit d98d336b452876cfbc32ad1b76319912c45b646b (HEAD -> readability-variable-name-length-main)
Author: Florin Iucha <florin@signbit.net>
Date: Mon Mar 1 00:57:01 2021 -0500

Add a checker for variable name length

It requires regular variables and parameters names to be at least
three characters long, and loop variables and exception names to be at
least two characters long. Exceptions to these minimums can be added
via regular expressions.
aaron.ballman closed this revision.Aug 12 2021, 8:34 AM

The diff was generated with format-patch, so can be applied with "git am".

The envelope:

commit d98d336b452876cfbc32ad1b76319912c45b646b (HEAD -> readability-variable-name-length-main)
Author: Florin Iucha <florin@signbit.net>
Date: Mon Mar 1 00:57:01 2021 -0500

Thanks! When I go to download the diff from Phab, it drops the envelope so all I have is the diff. :-)

I've commit on your behalf in d2c5cbc3a856c2f8bbda05540fa761bb94ba32f6