This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] readability-function-size: add NestingThreshold param.
ClosedPublic

Authored by lebedev.ri on May 6 2017, 9:25 AM.

Details

Summary

Finds compound statements which create next nesting level after NestingThreshold and emits a warning.
Do note that it warns about each compound statement that breaches the threshold, but not any of it's sub-statements, to have readable warnings.

I was able to find only one coding style referencing nesting:

This seems too basic, i'm not sure what else to test. Are more tests needed?

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.May 6 2017, 9:25 AM
lebedev.ri updated this revision to Diff 98072.May 6 2017, 10:25 AM

Actually consider whether there are any such compound statements when outputting initial diagnostic.

aaron.ballman edited edge metadata.May 9 2017, 8:11 AM

Have you tried running this over a large code base to see how frequently the diagnostic triggers? I'm wondering why 10 was chosen as the default threshold.

clang-tidy/readability/FunctionSizeCheck.cpp
41

Comment should start with a capital letter and have punctuation.

compount -> compound

43

This can be replaced with TrackedParent.size() == Info.NestingThreshold.

75

Remove the newline here.

test/clang-tidy/readability-function-size.cpp
83–86

I think these messages would be easier to understand if they were placed near the line the diagnostic triggers.

lebedev.ri updated this revision to Diff 98345.May 9 2017, 1:30 PM
lebedev.ri marked 4 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

Updated according to review notes.
Added one more test case with macro expansion.
Added parameter to one more doc file.

Now parameter defaults to -1 by default, because testing shows that otherwise it should be as high by default as 75+

lebedev.ri added inline comments.May 9 2017, 1:31 PM
clang-tidy/readability/FunctionSizeCheck.cpp
43

(IRC)

[18:40:55] <LebedevRI> AaronBallman: are you really sure about TrackedParent.size() == Info.NestingThreshold ? TrackedParent can contain false too..
[18:43:11] <AaronBallman> LebedevRI: nope, I'm definitely wrong there. Thank you for pointing that out. :-D
[18:43:59] <AaronBallman> LebedevRI: instead, consider using llvm::count(TrackedParent, true); It's far more clear as to what you are trying to accomplish.

aaron.ballman added inline comments.May 10 2017, 8:39 AM
clang-tidy/readability/FunctionSizeCheck.h
30–35

Good catch! You should add the parenthetical that explains what the default does (as in the other cases).

33

You should add the parenthetical that explains what the default does (as in the other cases).

Revise the docs.

lebedev.ri marked 3 inline comments as done.May 10 2017, 11:13 AM
This revision is now accepted and ready to land.May 10 2017, 12:05 PM

LGTM!

Thanks.
I guess we want to wait a bit more for @alexfh to leave any comment?

LGTM!

Thanks.
I guess we want to wait a bit more for @alexfh to leave any comment?

You can deal with any of Alex's comments post-commit. If you still don't have commit privileges, now might be a good time to ask Chris Lattner for them: http://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access

alexfh requested changes to this revision.May 11 2017, 8:06 AM
alexfh added inline comments.
clang-tidy/readability/FunctionSizeCheck.cpp
44

This code used to be the single hungriest cycle eater in all of clang-tidy, so changing this code should be done very carefully.

In particular, can we avoid calling count on each compound statement? Is it possible to use an additional counter and update it whenever a tracked statement is pushed/popped?

This revision now requires changes to proceed.May 11 2017, 8:06 AM
alexfh edited reviewers, added: sbenza; removed: bkramer.May 11 2017, 8:06 AM
lebedev.ri added inline comments.May 11 2017, 9:05 AM
clang-tidy/readability/FunctionSizeCheck.cpp
44

This code used to be the single hungriest cycle eater in all of clang-tidy, so changing this code should be done very carefully.

Good to know. Are there any automated benchmarks for clang-tidy in-tree?

In particular, can we avoid calling count on each compound statement?
Is it possible to use an additional counter and update it whenever a tracked statement is pushed/popped?

Yep, sounds like a pretty safe, straight-forward optimization, will do.

lebedev.ri edited edge metadata.

As suggested by alexfh, use dedicated variable to track the current nesting level.
This results in less operations, namely, no need to foreach through the TrackedParent vector, and count how many of the entries are true each time the new CompoundStmt is encountered.
I did think about doing so in the first place, but decided against it because it seemed to be unnecessary at that time.

lebedev.ri marked an inline comment as done.May 11 2017, 10:02 AM
This revision is now accepted and ready to land.May 16 2017, 4:08 PM

Thank you.
I'll wait until after D33102 is accepted, then ask for commit rights, and if successful, will either commit myself, or will ask for it to be committed by someone else.

lebedev.ri updated this revision to Diff 102029.Jun 9 2017, 7:16 AM

No changes at all, rebased before commit.

This revision was automatically updated to reflect the committed changes.

Does this check consider else if chains to be nesting?

void nesting() { // 1
  if (true) { // 2
     int j;
  } else if (true) { // 2 or 3?
     int j;
  } else if (true) { // 2 or 4?
     int j;
  } else if (true) { // 2 or 5?
     int j;
  }
}

Does this check consider else if chains to be nesting?

void nesting() { // 1
  if (true) { // 2
     int j;
  } else if (true) { // 2 or 3?
     int j;
  } else if (true) { // 2 or 4?
     int j;
  } else if (true) { // 2 or 5?
     int j;
  }
}

Yes, this check, like does consider else if chains to be nesting, as you can see from the following output:

$ ./bin/clang-tidy -checks=readability-function-size -config='{CheckOptions: [{key: readability-function-size.NestingThreshold, value: 2}]}' /tmp/if-nest.cpp
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "/tmp/if-nest.cpp"
No compilation database found in /tmp or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
1 warning generated.
/tmp/if-nest.cpp:1:6: warning: function 'nesting' exceeds recommended size/complexity thresholds [readability-function-size]
void nesting() { // 1
     ^
/tmp/if-nest.cpp:2:13: note: nesting level 3 starts here (threshold 2)
  if (true) { // 2
            ^
/tmp/if-nest.cpp:4:10: note: nesting level 3 starts here (threshold 2)
  } else if (true) { // 2 or 3?
         ^

Which makes sense, since in AST, they are nested:

$ clang -Xclang -ast-dump -fsyntax-only  /tmp/if-nest.cpp
TranslationUnitDecl 0x560a93687b60 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x560a936880f0 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x560a93687dd0 '__int128'
|-TypedefDecl 0x560a93688160 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x560a93687df0 'unsigned __int128'
|-TypedefDecl 0x560a936884a8 <<invalid sloc>> <invalid sloc> implicit __NSConstantString 'struct __NSConstantString_tag'
| `-RecordType 0x560a93688250 'struct __NSConstantString_tag'
|   `-CXXRecord 0x560a936881b8 '__NSConstantString_tag'
|-TypedefDecl 0x560a93688540 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
| `-PointerType 0x560a93688500 'char *'
|   `-BuiltinType 0x560a93687bf0 'char'
|-TypedefDecl 0x560a936bd730 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'struct __va_list_tag [1]'
| `-ConstantArrayType 0x560a93688820 'struct __va_list_tag [1]' 1 
|   `-RecordType 0x560a93688630 'struct __va_list_tag'
|     `-CXXRecord 0x560a93688598 '__va_list_tag'
`-FunctionDecl 0x560a936bd7e0 </tmp/if-nest.cpp:1:1, line:11:1> line:1:6 nesting 'void (void)'
  `-CompoundStmt 0x560a936bdcb8 <col:16, line:11:1>
    `-IfStmt 0x560a936bdc80 <line:2:3, line:10:3>
      |-<<<NULL>>>
      |-<<<NULL>>>
      |-CXXBoolLiteralExpr 0x560a936bd8b8 <line:2:7> '_Bool' true
      |-CompoundStmt 0x560a936bd960 <col:13, line:4:3>
      | `-DeclStmt 0x560a936bd948 <line:3:6, col:11>
      |   `-VarDecl 0x560a936bd8e8 <col:6, col:10> col:10 j 'int'
      `-IfStmt 0x560a936bdc48 <line:4:10, line:10:3>
        |-<<<NULL>>>
        |-<<<NULL>>>
        |-CXXBoolLiteralExpr 0x560a936bd980 <line:4:14> '_Bool' true
        |-CompoundStmt 0x560a936bda28 <col:20, line:6:3>
        | `-DeclStmt 0x560a936bda10 <line:5:6, col:11>
        |   `-VarDecl 0x560a936bd9b0 <col:6, col:10> col:10 j 'int'
        `-IfStmt 0x560a936bdc10 <line:6:10, line:10:3>
          |-<<<NULL>>>
          |-<<<NULL>>>
          |-CXXBoolLiteralExpr 0x560a936bda48 <line:6:14> '_Bool' true
          |-CompoundStmt 0x560a936bdaf0 <col:20, line:8:3>
          | `-DeclStmt 0x560a936bdad8 <line:7:6, col:11>
          |   `-VarDecl 0x560a936bda78 <col:6, col:10> col:10 j 'int'
          `-IfStmt 0x560a936bdbd8 <line:8:10, line:10:3>
            |-<<<NULL>>>
            |-<<<NULL>>>
            |-CXXBoolLiteralExpr 0x560a936bdb10 <line:8:14> '_Bool' true
            |-CompoundStmt 0x560a936bdbb8 <col:20, line:10:3>
            | `-DeclStmt 0x560a936bdba0 <line:9:6, col:11>
            |   `-VarDecl 0x560a936bdb40 <col:6, col:10> col:10 j 'int'
            `-<<<NULL>>>

However what i'm not totally understanding right now is

/tmp/if-nest.cpp:2:13: note: nesting level 3 starts here (threshold 2)
  if (true) { // 2
            ^

If we look at clang-query output, that *should* be second nesting level...

$ clang-query-5.0 /tmp/if-nest.cpp
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "/tmp/if-nest.cpp"
No compilation database found in /tmp or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
clang-query> match compoundStmt()

Match #1:

/tmp/if-nest.cpp:1:16: note: "root" binds here
void nesting() { // 1
               ^~~~~~

Match #2:

/tmp/if-nest.cpp:2:13: note: "root" binds here
  if (true) { // 2
            ^~~~~~

Match #3:

/tmp/if-nest.cpp:4:20: note: "root" binds here
  } else if (true) { // 2 or 3?
                   ^~~~~~~~~~~~

Match #4:

/tmp/if-nest.cpp:6:20: note: "root" binds here
  } else if (true) { // 2 or 4?
                   ^~~~~~~~~~~~

Match #5:

/tmp/if-nest.cpp:8:20: note: "root" binds here
  } else if (true) { // 2 or 5?
                   ^~~~~~~~~~~~
5 matches.

Which makes sense, since in AST, they are nested:

They're not nested in the formatting, so I don't think it makes sense.

Which makes sense, since in AST, they are nested:

They're not nested in the formatting, so I don't think it makes sense.

As usual, all the meaningful review happens post-factum :)
So, it should warn on:

void yes_nesting() { // 1
  if (true) {    // 2
    int j;
  } else {
    if (true) { // 2 or 3?
      int j;
    } else {
      if (true) { // 2 or 4?
        int j;
      } else {
        if (true) { // 2 or 5?
          int j;
        }
      }
    }
  }
}

which is

`-FunctionDecl <line:12:1, line:28:1> line:12:6 yes_nesting 'void (void)'
  `-CompoundStmt <col:20, line:28:1>
    `-IfStmt <line:13:3, line:27:3>
      |-<<<NULL>>>
      |-<<<NULL>>>
      |-CXXBoolLiteralExpr <line:13:7> '_Bool' true
      |-CompoundStmt <col:13, line:15:3>
      | `-DeclStmt <line:14:5, col:10>
      |   `-VarDecl <col:5, col:9> col:9 j 'int'
      `-CompoundStmt <line:15:10, line:27:3>
        `-IfStmt <line:16:5, line:26:5>
          |-<<<NULL>>>
          |-<<<NULL>>>
          |-CXXBoolLiteralExpr <line:16:9> '_Bool' true
          |-CompoundStmt <col:15, line:18:5>
          | `-DeclStmt <line:17:7, col:12>
          |   `-VarDecl <col:7, col:11> col:11 j 'int'
          `-CompoundStmt <line:18:12, line:26:5>
            `-IfStmt <line:19:7, line:25:7>
              |-<<<NULL>>>
              |-<<<NULL>>>
              |-CXXBoolLiteralExpr <line:19:11> '_Bool' true
              |-CompoundStmt <col:17, line:21:7>
              | `-DeclStmt <line:20:9, col:14>
              |   `-VarDecl <col:9, col:13> col:13 j 'int'
              `-CompoundStmt <line:21:14, line:25:7>
                `-IfStmt <line:22:9, line:24:9>
                  |-<<<NULL>>>
                  |-<<<NULL>>>
                  |-CXXBoolLiteralExpr <line:22:13> '_Bool' true
                  |-CompoundStmt <col:19, line:24:9>
                  | `-DeclStmt <line:23:11, col:16>
                  |   `-VarDecl <col:11, col:15> col:15 j 'int'
                  `-<<<NULL>>>

but should not on what you/i posted in the previous comment.
The difference seems to be some kind of implicit CompoundStmt added by IfStmt?
Assuming that is the case, maybe this is as simple as checking whether this CompoundStmt is implicit or not?
Best to move open a new bug about this. I'll see what can be done.

Which makes sense, since in AST, they are nested:

They're not nested in the formatting, so I don't think it makes sense.

As usual, all the meaningful review happens post-factum :)

I didn't get an email about this change until it was pushed.

So, it should warn on:
...
but should not on what you/I posted in the previous comment.

Yes.

The difference seems to be some kind of implicit CompoundStmt added by IfStmt?

CompoundStmt represents the {}.

Assuming that is the case, maybe this is as simple as checking whether this CompoundStmt is implicit or not?

My prototype of this feature used ifStmt(stmt().bind("if"), unless(hasParent(ifStmt(hasElse(equalsBoundNode("if")))))).

Best to move open a new bug about this. I'll see what can be done.

Do you need me to report a bug?

Which makes sense, since in AST, they are nested:

They're not nested in the formatting, so I don't think it makes sense.

As usual, all the meaningful review happens post-factum :)

I didn't get an email about this change until it was pushed.

So, it should warn on:
...
but should not on what you/I posted in the previous comment.

Yes.

The difference seems to be some kind of implicit CompoundStmt added by IfStmt?

CompoundStmt represents the {}.

Assuming that is the case, maybe this is as simple as checking whether this CompoundStmt is implicit or not?

My prototype of this feature used ifStmt(stmt().bind("if"), unless(hasParent(ifStmt(hasElse(equalsBoundNode("if")))))).

The fix should ideally be fully contained in the TraverseStmt(), and should not be noticeably slower.

Best to move open a new bug about this. I'll see what can be done.

(that should have been "Best to open a new bug about this. I'll see what can be done.")

Do you need me to report a bug?

Yes, and assign it to me, if possible.

Hm, oh, maybe this is as simple as a partially-unintentional switch fallthrough

My prototype of this feature used ifStmt(stmt().bind("if"), unless(hasParent(ifStmt(hasElse(equalsBoundNode("if")))))).

The fix should ideally be fully contained in the TraverseStmt(), and should not be noticeably slower.

Yes, my prototype wasn't part of readability-function-size.