This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add a checker for long functions.
ClosedPublic

Authored by bkramer on Aug 20 2014, 8:52 AM.

Details

Summary

As this is very dependent on the code base it has some ways of configuration.
It's possible to pick between 3 modes of operation:

  • Line counting: number of lines including whitespace and comments
  • Statement counting: number of statements within compoundStmts.
  • Cyclomatic complexity: calculated from the CFG.

In addition a threshold can be picked, warnings are only emitted when it is met.
Sadly this is not configurable at runtime yet as there is some plumbing missing.

Diff Detail

Repository
rL LLVM

Event Timeline

bkramer updated this revision to Diff 12703.Aug 20 2014, 8:52 AM
bkramer retitled this revision from to [clang-tidy] Add a checker for long functions..
bkramer added reviewers: alexfh, djasper.
bkramer updated this object.
bkramer added a subscriber: Unknown Object (MLST).
alexfh edited edge metadata.Aug 25 2014, 3:06 AM

I wonder if any style guides define function size threshold in terms of cyclomatic complexity? Is it actually useful as a warning? It may be interesting to developers looking for a place in their code to start refactoring from, but this task requires a bit different representation of the results - tables or charts; diagnostic messages are sub-optimal here. I also think, that at the point you start needing a CFG, the check becomes closer to the static analyzer realm, as it provides more useful tools for path-based analysis.

In D4986#4, @alexfh wrote:

I wonder if any style guides define function size threshold in terms of cyclomatic complexity? Is it actually useful as a warning?

It doesn't seem to be very common in C++ coding styles (probably due to lack of tools). I've seen warnings for this in other languages, e.g. Microsoft's C# analyzer puts an upper bound on cyclomatic complexity. It's yet another metric that seems slightly more useful than counting lines or statements.

I couldn't fully make up my mind about the static analyzer as its goals are slightly different. It tries hard to prune dead paths to avoid false positives while we still want to warn about complex functions that are mostly dead. Not sure if that matters in practice though.

This checker is a bit of a grab bag of metrics I could think of. It needs some configuration from the user to be useful.

bkramer updated this revision to Diff 12903.Aug 25 2014, 9:17 AM
bkramer edited edge metadata.
  • Replace CFG construction with a simple branch counter as suggested by Manuel.

Counting control statements is much more straightforward, thanks. A few comments inline.

clang-tidy/misc/FunctionLength.cpp
19 ↗(On Diff #12903)

Move it to ASTMatchers.h? It's used in enough places already. IIUC, Manuel considers this a good enough way to filter out matches in template instantiations.

44 ↗(On Diff #12903)

I wonder if you see specific reasons to use presumed locations here?

65 ↗(On Diff #12903)

If someone decides to configure several thresholds, they will get duplicate "function is very large" warnings. I'd make a single "function is very large" (or maybe "function exceeds recommended size/complexity thresholds") warning, and add specific metrics as notes. Also please insert the function name to the message to make it obvious what it complains about.

unittests/clang-tidy/FunctionLengthCheckTest.cpp
1 ↗(On Diff #12903)

It would be nice to test: multiple functions, lambdas, local classes/functions.

10 ↗(On Diff #12903)

I'd make this function a bit more generic (e.g. return llvm::ErrorOr<std::vector<std::string>>) and move it to ClangTidyTest.h. Or maybe even extend runCheckOnCode.

bkramer updated this revision to Diff 13204.Sep 3 2014, 6:26 AM

Use upstreamed astmatcher.
emit messages as notes.
moar tests.
minor fixes.

alexfh added inline comments.Sep 3 2014, 8:48 AM
unittests/clang-tidy/FunctionLengthCheckTest.cpp
1 ↗(On Diff #12903)

It would be nice to test: multiple functions, lambdas, local classes/functions.

What's with this?

10 ↗(On Diff #12903)

I'd make this function a bit more generic (e.g. return
llvm::ErrorOr<std::vector<std::string>>) and move it to ClangTidyTest.h. Or
maybe even extend runCheckOnCode.

So what's with this? And even more: wouldn't it be shorter/cleaner with lit-based tests?

bkramer updated this revision to Diff 13637.Sep 12 2014, 8:58 AM
  • Added configurability via ClangTidyOptions.
  • Turned the unit test into a lit test.
alexfh added inline comments.Sep 12 2014, 9:46 AM
clang-tidy/misc/FunctionSize.cpp
21 ↗(On Diff #13637)

Please make it 800U to instantiate get<unsigned>().

clang-tidy/tool/ClangTidyMain.cpp
108 ↗(On Diff #13637)
  1. Please move configuration-related changes to a separate patch.
  2. Please break the strings after \n's (and maybe file a bug against clang-format to respect \n's located close to the column limit).
  3. s/Specifing/Specifying/
  4. It seems reasonable to allow inline configurations like in clang-format, if the -config= value starts with a '{':
clang-tidy -config='{CheckOptions: [{key: misc-function-size.LineThreshold, value: 0}, {key: misc-function-size.StatementThreshold, value: 0}]}' some-file.cpp
178 ↗(On Diff #13637)

EffectiveOptions is only used for -list-checks and -dump-config. ClangTidy itself uses OptionsProvider, so overriding configuration file needs to be done in FileOptionsProvider. Alternatively, you can do:

OptionsProvider.reset(new DefaultOptionsProvider(GlobalOptions, EffectiveOptions));

after this. I'd probably prefer the first way.

182 ↗(On Diff #13637)

This line doesn't do what you expect (mergeWith should probably have LLVM_ATTRIBUTE_UNUSED_RESULT).

bkramer updated this revision to Diff 13696.Sep 15 2014, 2:37 AM
  • Remove configuration changes, they aren't ready for submission and unrelated to this checker.
  • Request get<unsigned> to read StatementThreshold from the config.
alexfh accepted this revision.Sep 15 2014, 5:46 AM
alexfh edited edge metadata.

LG.

clang-tidy/misc/FunctionSize.h
20 ↗(On Diff #13696)

Please move private members to the end for consistency with other ClangTidy classes.

test/clang-tidy/misc-function-size.cpp
5 ↗(On Diff #13696)

It seems reasonable to tighten the test a bit by adding the 'note:' prefix to all relevant CHECK lines and to the -implicit-check-not:

-implicit-check-not='{{warning:|error:|note:}}'
...
// CHECK: note: 1 statements (threshold 0)
53 ↗(On Diff #13696)

For the lambda above the check seems to only warn on the function itself, but not on the lambda. I wonder what happens if you add statements inside the x() method. Will the warning only fire for bar2() or for both? I don't know which behavior is better, but it would be good to know what it does currently.

This revision is now accepted and ready to land.Sep 15 2014, 5:46 AM
bkramer added inline comments.Sep 15 2014, 5:56 AM
test/clang-tidy/misc-function-size.cpp
53 ↗(On Diff #13696)

We do not warn on lambdas currently but add the statements to the number of the surrounding function. For local classes we emit a warning for the function AND add the statements to the surrounding function.

Diffusion closed this revision.Sep 15 2014, 5:58 AM
Diffusion updated this revision to Diff 13705.

Closed by commit rL217768 (authored by d0k).