This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] readability-function-size: add VariableThreshold param.
ClosedPublic

Authored by lebedev.ri on Mar 17 2018, 2:04 PM.

Details

Summary

Pretty straight-forward, just count all the variable declarations in the function's body, and if more than the configured threshold - do complain.

Note that this continues perverse practice of disabling the new option by default.
I'm not certain where is the balance point between not being too noisy, and actually enforcing the good practice.
If we really want to not disable this by default, but also to not cause too many new warnings, we could default to 50 or so.
But that is a lot of variables too...

I was able to find one coding style referencing variable count:

Diff Detail

Event Timeline

lebedev.ri created this revision.Mar 17 2018, 2:04 PM
Eugene.Zelenko added inline comments.
docs/ReleaseNotes.rst
149

Will be good idea to add option to clarify improvement.

lebedev.ri marked an inline comment as done.

Adjusted ReleaseNotes change.

JonasToth added inline comments.Mar 18 2018, 3:08 AM
docs/ReleaseNotes.rst
152

I would rephrase this a little to:

Flags function bodies exceeding this number of declared variables.
test/clang-tidy/readability-function-size.cpp
118

Why is this check here and not on line 100 like the other conditions?

144

same here.

153

Could you please add a test for function having more parameters then allowed variables, but no variables?

180

This testfile is not C++17 right now, but do you know how structured bindings are handled? Maybe its worth adding another file for C++17 testing it.

Could you add a little test for range based for loops?

lebedev.ri marked 3 inline comments as done.

Address jonastoth's review notes.

  • Properly support C++17 structured bindings.
  • A few more tests
JonasToth added a comment.EditedMar 18 2018, 4:49 AM

Looks fine to me, but Aaron or someone else must accept :)

Just out of curiosity: The decomposition visit is for structured binding and the binding visit for range based for? Are there other places where binding occurs, for example in some template stuff?
Just asking if this might hit me in a other check :D

Looks fine to me, but Aaron or someone else must accept :)

Thanks for review!

Just out of curiosity: The decomposition visit is for structured binding and the binding visit for range based for? Are there other places where binding occurs, for example in some template stuff?
Just asking if this might hit me in a other check :D

The comment i added is actually wrong.
The range-based for just works anyway.
The VisitBindingDecl() is needed to get all the 'variables' declared in structured binding.
But as you can see in https://godbolt.org/g/be6Juf, the BindingDecl's are wrapped into DecompositionDecl.
And DecompositionDecl is actually inherited from VarDecl, https://github.com/llvm-mirror/clang/blob/b031fdc9b7dbd9c7f942d8060a4f00d63f3c9af2/include/clang/AST/DeclCXX.h#L3824-L3825,
so we count the DecompositionDecl in VisitVarDecl(), thus we need a VisitDecompositionDecl() that subtracts it.

This may be the wrong/incomplete solution.

docs/ReleaseNotes.rst
152

I agree that it looks gibberish. Reworded a little. Probably still not ideal.

test/clang-tidy/readability-function-size.cpp
118

It can not be there, the note: nesting level <> starts here (threshold 2) are outputted before note: <> variables (threshold 1), and filecheck respects the order of // CHECK-MESSAGES:

153

It's already checked in foo7(), but sure.

180

This testfile is not C++17 right now, but do you know how structured bindings are handled? Maybe its worth adding another file for C++17 testing it.

Nice catch!

Could you add a little test for range based for loops?

Added. I expected them to not work (after a quick look at AST), but apparently they just work..

The comment i added is actually wrong.
The range-based for just works anyway.
The VisitBindingDecl() is needed to get all the 'variables' declared in structured binding.
But as you can see in https://godbolt.org/g/be6Juf, the BindingDecl's are wrapped into DecompositionDecl.
And DecompositionDecl is actually inherited from VarDecl, https://github.com/llvm-mirror/clang/blob/b031fdc9b7dbd9c7f942d8060a4f00d63f3c9af2/include/clang/AST/DeclCXX.h#L3824-L3825,
so we count the DecompositionDecl in VisitVarDecl(), thus we need a VisitDecompositionDecl() that subtracts it.

Thanks for clarification. It feels very NOT obvious how to handle this case :)

JonasToth added inline comments.Mar 18 2018, 5:05 AM
docs/ReleaseNotes.rst
152

Its better. But the native speakers should give a final thought.

test/clang-tidy/readability-function-size.cpp
118

I see, no problems then :)

180

Added. I expected them to not work (after a quick look at AST), but apparently they just work..

Life would be so much easier if this is always the case :D

aaron.ballman added inline comments.Mar 20 2018, 9:10 AM
clang-tidy/readability/FunctionSizeCheck.cpp
30–31

This comment could be clarified. I think what it's trying to express is that the decomposition declaration was already counted as a variable declaration but we do not want to count it as such.

I think a better way to express this is to change VisitVarDecl() to check isa<DecompositionDecl>() and not increment in that case, rather than increment in one place and decrement in another. You could also check isa<ParmVarDecl>() at the same time and skip subtracting off formal parameters.

docs/ReleaseNotes.rst
152

I'd tweak slightly: Flags functions that have more than a specified number of variables declared in the body.

docs/clang-tidy/checks/readability-function-size.rst
42

This should clarify that parameters do not count as variables declared in the body.

lebedev.ri marked 5 inline comments as done.

Address @aaron.ballman's review notes.

I'm not sure the change re ParmVarDecl handling is actually correct, see testcase.

clang-tidy/readability/FunctionSizeCheck.cpp
30–31

This comment could be clarified. I think what it's trying to express is that the decomposition declaration was already counted as a variable declaration but we do not want to count it as such.

Yep.

I think a better way to express this is to change VisitVarDecl() to check isa<DecompositionDecl>() and not increment in that case, rather than increment in one place and decrement in another.

I agree.

You could also check isa<ParmVarDecl>() at the same time and skip subtracting off formal parameters.

That is a behavior-changing change, actually.
But i don't have any strong preferences here, so changed, test added.

docs/clang-tidy/checks/readability-function-size.rst
42

This gets oh so more interesting with the testcase i added.
I'm not sure whether we actually want to ignore all the ParmVarDecl, or keep

unsigned ActualNumberParameters = Func->getNumParams();
unsigned BodyVariables = FI.Variables - ActualNumberParameters;
aaron.ballman added inline comments.Mar 20 2018, 11:07 AM
test/clang-tidy/readability-function-size.cpp
207–212

I think the current behavior here is correct and the previous behavior was incorrect. However, it brings up an interesting question about what to do here:

void f() {
  struct S {
    void bar() {
      int a, b;
    }
  };
}

Does f() contain zero variables or two? I would contend that it has no variables because S::bar() is a different scope than f(). But I can see a case being made about the complexity of f() being increased by the presence of the local class definition. Perhaps this is a different facet of the test about number of types?

lebedev.ri added inline comments.Mar 20 2018, 11:12 AM
test/clang-tidy/readability-function-size.cpp
207–212

As previously briefly discussed in IRC, i strongly believe that the current behavior is correct, and readability-function-size
should analyze/diagnose the function as a whole, including all sub-classes/sub-functions.

aaron.ballman added inline comments.Mar 20 2018, 11:19 AM
test/clang-tidy/readability-function-size.cpp
207–212

Do you know of any coding standards related to this check that weigh in on this?

What do you think about this:

#define SWAP(x, y) ({__typeof__(x) temp = x; x = y; y = x;})

void f() {
  int a = 10, b = 12;
  SWAP(a, b);
}

Does f() have two variables or three? Should presence of the SWAP macro cause this code to be more complex due to having too many variables?

lebedev.ri added inline comments.Mar 20 2018, 11:23 AM
test/clang-tidy/readability-function-size.cpp
207–212

Datapoint: the doc (docs/clang-tidy/checks/readability-function-size.rst) actually already states that macros *are* counted.

.. option:: StatementThreshold

   Flag functions exceeding this number of statements. This may differ
   significantly from the number of lines for macro-heavy code. The default is
   `800`.
.. option:: NestingThreshold

    Flag compound statements which create next nesting level after
    `NestingThreshold`. This may differ significantly from the expected value
    for macro-heavy code. The default is `-1` (ignore the nesting level).
aaron.ballman added inline comments.Mar 20 2018, 11:34 AM
test/clang-tidy/readability-function-size.cpp
207–212

My concerns relate to what's considered a "variable declared in the body" (per the documentation) in relation to function complexity. To me, if the variable is not accessible lexically within the body of the function, it's not adding to the function's complexity *for local variables*. It may certainly be adding other complexity, of course.

I would have a very hard time explaining to a user that variables they cannot see or change (assuming the macro is in a header file out of their control) contribute to their function's complexity. Similarly, I would have difficulty explaining that variables in an locally declared class member function contribute to the number of variables in the outer function body, but the class data members somehow do not.

lebedev.ri added inline comments.Mar 20 2018, 11:49 AM
test/clang-tidy/readability-function-size.cpp
207–212

(per the documentation)

Please note that the word complexity is not used in the documentation, only size is.

There also is the other side of the coin:

#define simple_macro_please_ignore \
  the; \
  actual; \
  content; \
  of; \
  the; \
  foo();

// Very simple function, nothing to see.
void foo() {
  simple_macro_please_ignore();
}

#undef simple_macro_please_ignore

In other words, if we ignore macros, it would be possible to abuse them to artificially reduce complexity, by hiding it in the macros.
I agree that it's total abuse of macros, but macros are in general not nice, and it would not be good to give such things a pass.

My concerns relate to what's considered a "variable declared in the body" (per the documentation) in relation to function complexity.

Could you please clarify, at this point, your concerns are only about this new part of the check (variables), or for the entire check?

aaron.ballman added inline comments.Mar 20 2018, 12:23 PM
test/clang-tidy/readability-function-size.cpp
207–212

In other words, if we ignore macros, it would be possible to abuse them to artificially reduce complexity, by hiding it in the macros.

I don't disagree, that's why I'm trying to explore the boundaries. Your example does artificially reduce complexity. My example using swap does not -- it's an idiomatic swap macro where the inner variable declaration adds no complexity to the calling function as it's not exposed to the calling function.

Could you please clarify, at this point, your concerns are only about this new part of the check (variables), or for the entire check?

Only the new part of the check involving variables.

lebedev.ri added inline comments.Mar 21 2018, 11:55 AM
test/clang-tidy/readability-function-size.cpp
207–212

Could you please clarify, at this point, your concerns are only about this new part of the check (variables), or for the entire check?

Only the new part of the check involving variables.

OK.

This should be split into two boundaries:

  • macros
  • the nested functions/classes/methods in classes.

I *think* it may make sense to give the latter a pass, no strong opinion here.
But not macros.
(Also, i think it would be good to treat macros consistently within the check.)

Does anyone else has an opinion on how that should be handled?

JonasToth added inline comments.Mar 22 2018, 4:41 AM
test/clang-tidy/readability-function-size.cpp
207–212

what is the current behaviour for aarons nested function?
i checked cppcoreguidelines and hicpp and they did not mention such a case and i do not recall any rule that might relate to it.

I think aaron has a good point with:

I would have a very hard time explaining to a user that variables they cannot see or change (assuming the macro is in a header file out of their control) contribute to their function's complexity. Similarly, I would have difficulty explaining that variables in an locally declared class member function contribute to the number of variables in the outer function body, but the class data members somehow do not.

But I see no way to distinguish between "good" and "bad" macros, so macro expansions should add to the variable count, even though your swap macro is a valid counter example.

aaron.ballman added inline comments.Mar 22 2018, 4:51 AM
test/clang-tidy/readability-function-size.cpp
207–212

But I see no way to distinguish between "good" and "bad" macros, so macro expansions should add to the variable count, even though your swap macro is a valid counter example.

I would constrain it this way: variables declared in local class member function definitions and expression statements within a macro expansion do not contribute to the variable count, all other local variables do. e.g.,

#define SWAP(x, y) ({__typeof__(x) temp = x; x = y; y = x;})

void two_variables() {
  int a = 10, b = 12;
  SWAP(a, b);
}

void three_variables() {
  int a = 10, b = 12;
  ({__typeof__(x) temp = x; x = y; y = x;})
}

void one_variable() {
  int i = 12;
  class C {
    void four_variables() {
      int a, b, c, d;
    }
  };
}

#define FOO(x) (x + ({int i = 12; i;}))

void five_variables() {
  int a, b, c, d = FOO(100);
  float f;
}
lebedev.ri added inline comments.Mar 22 2018, 5:01 AM
test/clang-tidy/readability-function-size.cpp
207–212

I would constrain it this way: variables declared in local class member function definitions and expression statements within a macro expansion do not contribute to the variable count, all other local variables do.

But we do already count statements, branches and compound statements in all those cases in this check.
Why should variables be an exception?

aaron.ballman added inline comments.Mar 22 2018, 5:15 AM
test/clang-tidy/readability-function-size.cpp
207–212

But we do already count statements, branches and compound statements in all those cases in this check.

Why should variables be an exception?

Why should variables that are entirely inaccessible to the function count towards the function's variable complexity?

Things like macros count towards a function's line count because the macros are expanded into the function. I don't agree with this choice, but I can at least explain it to someone I'm teaching. In the case of variable declarations, I have no justification for those variables adding complexity because they cannot be named within the function even though the macro is expanded in the function. Yet the check doesn't count global variables which do add to function complexity when used within the function.

For those design reasons, I'd also be opposed to diagnosing this (assume it requires 2 variables to trigger the diagnostic):

void one_variable() {
  auto lambda = []() { int a = 12, b = 100; return a + b; };
}

which is functionally equivalent to:

void one_variable() {
  struct S {
    int operator()() { int a = 12, b = 100; return a + b; }
  } lambda;
}
Eugene.Zelenko added inline comments.Mar 22 2018, 6:55 AM
docs/ReleaseNotes.rst
149

Please rebase from trunk and use :doc: for link.

alexfh removed a reviewer: alexfh.Mar 23 2018, 7:46 AM

I have to say I disagree that either the nested struct/function or macros
(in any form) should count toward a function's total variable count.

Both are valid forms of abstraction, and they both remove complexity from
the containing function since they factor details *out of the function's
immediate lexical contents* (I avoid 'scope' as macros do pollute the
scope) in a way that improves readability.

There are cases where macros can make things more complex but it seems
unfair to flag variables declared by macros as making expanding functions
more complex.

In short, I think I agree with Aaron's last example classifications.

  • Kim

Den tors 22 mars 2018 14:56Eugene Zelenko via Phabricator via cfe-commits <
cfe-commits@lists.llvm.org> skrev:

Eugene.Zelenko added inline comments.

Comment at: docs/ReleaseNotes.rst:127

+- Added VariableThreshold option to `readability-function-size
+ <
http://clang.llvm.org/extra/clang-tidy/checks/readability-function-size.html>`_

check

Please rebase from trunk and use :doc: for link.

Repository:

rCTE Clang Tools Extra

https://reviews.llvm.org/D44602


cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

lebedev.ri planned changes to this revision.Mar 23 2018, 1:05 PM

Ok then :)
Let's only count the variables in the function itself, not in macros/nested functions/nested lambdas/nested classes/...

lebedev.ri marked 15 inline comments as done.
lebedev.ri removed a subscriber: lebedev.ri.
  • Rebased ontop of master
  • Fixed links in documentation
  • Improved test coverage
  • No longer count variables declared in nested classes.
  • No longer count variables declared in nested lambdas. FIXME: this check()` is not being called for lambdas at all. But out of the scope of this patch.
  • No longer count variables declared in macro expansion. Please see FIXME, i think this is too broad.
test/clang-tidy/readability-function-size.cpp
207–212

Ok, done. But this raises another question:

#define vardecl(type, name) type name;
void variables_15() {
  // FIXME: surely we should still warn here?
  vardecl(int, a);
  vardecl(int, b);
}

I'm guessing we want to still warn in cases like this?

JonasToth added inline comments.Mar 24 2018, 8:34 AM
test/clang-tidy/readability-function-size.cpp
207–212

how would you differentiate? I am against trying to get all macro cases right, either warn for everything in macros or nothing.

  • No longer count variables declared in macro expansion. Please see FIXME, i think this is too broad.

Ping, thoughts?

Ping.

@aaron.ballman ping, do you have any further thoughts on that macro false-negative?

#define vardecl(type, name) type name;
void variables_15() {
  // FIXME: surely we should still warn here?
  vardecl(int, a);
  vardecl(int, b);
}
// CHECK-MESSAGES: :[[@LINE-5]]:6: warning: function 'variables_15' exceeds recommended size/complexity thresholds [readability-function-size]
// CHECK-MESSAGES: :[[@LINE-6]]:6: note: 4 lines including whitespace and comments (threshold 0)
aaron.ballman added inline comments.Apr 10 2018, 9:00 AM
clang-tidy/readability/FunctionSizeCheck.cpp
31

This isn't the restriction I was envisioning. I was thinking more along the lines of:

bool VisitStmtExpr(StmtExpr *SE) override {
  ++StructNesting;
  Base::VisitStmtExpr(SE);
  --StructNesting;
  return true;
}

Basically -- treat a statement expression the same as an inner struct or lambda because the variables declared within it are scoped *only* to the statement expression.

You could argue that we should also add: if (!SE->getLocation.isMacroID()) { Base::VisitStmtExpr(SE); } to the top of the function so that you only treat statement expressions that are themselves in a macro expansion get this special treatment. e.g.,

void one_var() {
  (void)({int a = 12; a;});
}

#define M ({int a = 12; a;})
void zero_var() {
  (void)M;
}

I don't have strong opinions on this, but weakly lean towards treating all statement expressions the same way.

docs/clang-tidy/checks/readability-function-size.rst
44–46

I'd combine these into a serial list, like: Please note that function parameters and variables declared in macro expansions, lambdas, and nested class inline functions are not counted.

test/clang-tidy/readability-function-size.cpp
207–212

I'm guessing we want to still warn in cases like this?

That would be nice, yes. That's why the cut-point I was recommending were situations where the declared variables are not accessible within the function.

lebedev.ri marked 2 inline comments as done.
  • Update
  • Ignore GNU Statement Expressions too.
  • Slightly reword docs.
clang-tidy/readability/FunctionSizeCheck.cpp
31

Ah, i did not really knew about GNU Statement Expression, so i did not ignore them.
Test added, and ignored.

test/clang-tidy/readability-function-size.cpp
207–212

Ok. Any hint on how to actually do that?
I guess i could look at DeclRefExpr, but that would only count those macro variables,
that are actually used, while the check is trying not to differentiate.

lebedev.ri added a reviewer: alexfh.

After a quick IRC chat with @aaron.ballman, it was discovered that there has been a big misunderstanding:

we don't need to ignore/exempt *all* the variables declared in macros, we just need to ignore *all* GNU Statement Expressions.

Yay!

I think this is ready now, please review.

lebedev.ri marked 4 inline comments as done.Apr 11 2018, 6:46 AM
aaron.ballman accepted this revision.Apr 11 2018, 7:14 AM

Aside from some minor nits, LGTM!

clang-tidy/readability/FunctionSizeCheck.cpp
28

Comment is now stale.

31

Do you mind switching these to preincrement (and predecrement, below) since the value is not subsequently used?

This revision is now accepted and ready to land.Apr 11 2018, 7:14 AM
lebedev.ri marked 2 inline comments as done.
  • Drop stale comment
  • Switch to pre{in,de}crement

@alexfh i'm waiting for some comment from you. there was no comment added when you removed yourself as a reviewer,
so i *assume* the usual message of that time was meant - "removing from my review queue, re-add when others have accepted.".

@alexfh i'm waiting for some comment from you. there was no comment added when you removed yourself as a reviewer,
so i *assume* the usual message of that time was meant - "removing from my review queue, re-add when others have accepted.".

No, I just think that other reviewers can handle this. Feel free to commit as long as there are no open concerns.

@alexfh i'm waiting for some comment from you. there was no comment added when you removed yourself as a reviewer,
so i *assume* the usual message of that time was meant - "removing from my review queue, re-add when others have accepted.".

No, I just think that other reviewers can handle this. Feel free to commit as long as there are no open concerns.

Ah, ok, i'll commit it then, thanks!

This revision was automatically updated to reflect the committed changes.